Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Migrate _everything_ to modules #35561

Closed

Conversation

weswigham
Copy link
Member

@weswigham weswigham commented Dec 7, 2019

The initial commit here is generated by executing https://github.com/weswigham/typeformer on our codebase. It can be regenerated as needed. That alone does not quite make this migration complete - with that done, the following needs doing to be able to adopt it (it is already editable and buildable):

  • Rewrite gulp generate-diagnostics to produce a module on each invocation
  • Use webpack and api-extractor to reproduce typescript.js, typescript.d.ts, typescriptServices.js, tsserverlibrary.js, and tsserverlibrary.d.ts from our modules.
  • Update gulp LKG to LKG all the new stuff (note: should change package.json's main to point at a module entrypoint after first module LKG)
  • Validate all other gulp tasks still function as expected

See weswigham#50 for progress on the above - I only plan on merging it into this when this whole this is near merging, to keep the rebasing easy.

@typescript-bot
Copy link
Collaborator

Thanks for the PR! It looks like you've changed the TSServer protocol in some way. Please ensure that any changes here don't break consumers of the current TSServer API. For some extra review, we'll ping @sheetalkamat, @amcasey, @mjbvz, @minestarks for you. Feel free to loop in other consumers/maintainers if necessary

@typescript-bot
Copy link
Collaborator

Thanks for the PR! It looks like you've changed 'preProcess.ts' in some way. Please ensure that any changes here don't break consumers with unique project systems such as Visual Studio. Pinging @sheetalkamat, @amcasey, and @minestarks so they are aware of the changes.

@@ -1,484 +1,422 @@
// Used by importFixes, getEditsForFileRename, and declaration emit to synthesize import module specifiers.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weswigham looks like this comment got lost during "terraforming"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, looks like I should copy the comment ranges from the namespace declaration I remove onto the first declaration inside the namespace.

@weswigham weswigham force-pushed the modules-are-the-new-black branch 7 times, most recently from cfacc0f to 987c8b9 Compare December 16, 2019 22:39
@orta orta mentioned this pull request Jan 31, 2020
4 tasks
@weswigham weswigham force-pushed the modules-are-the-new-black branch from 987c8b9 to 44fcd2d Compare February 27, 2020 19:34
@weswigham weswigham force-pushed the modules-are-the-new-black branch 4 times, most recently from 65cc165 to 00c1d9d Compare February 28, 2020 22:22
@weswigham weswigham force-pushed the modules-are-the-new-black branch from 00c1d9d to 802f955 Compare March 2, 2020 23:28
@weswigham
Copy link
Member Author

This seems to be blocked on microsoft/rushstack#1029 so long as we want to use API-Extractor to regenerate our old amalgamated .d.ts files.

@mihailik
Copy link
Contributor

Is there a rationale for this change explained anywhere?

It feels there is a chance it would set the compiler back in loading time and memory, as well as obvious risk of destabilisation due to many changes involved. The compatibility with older node or other hosts might worsen.

Following trend is an upside, but quite a vague one.

I hope I miss some obvious measurable upsides?

@csvn
Copy link

csvn commented May 18, 2020

@mihailik

The compatibility with older node or other hosts might worsen.

I'm pretty sure this will not be affected, since this PR is not doing commonJS => ESM, it's doing TS namespaces => ESM. There will likely still be a build step that bundles Typescript so that it works with Node as it did before.

I hope I miss some obvious measurable upsides?

If following coding standards by making real modules (and not using namespaces) is not enough, using modules will unlock the possibility to do more optimized builds due to treeshaking that I don't think was possible with the old approach. For example #27891 regarding the growing size of the Typescript project.

There are likely more benefits than the above, but one should not dismiss the benefit of code that is easier to learn and maintain. Future contributors will likely have an easier time to learn the code base after this change than before.

Of course, I might be mistaken. I'll just let a TS member correct any impressions I got wrong 😀

@mihailik
Copy link
Contributor

While these points would well apply to a typical project, they don't practically make sense for TypeScript.

Introducing a bundling step (which isn't there today) is a complex work, and there's virtually no dead code one could possibly shake.

@csvn
Copy link

csvn commented May 20, 2020

typescript.js is 8MB. I'm pretty sure they have not built Typescript all in a single file, so there must be a build step before Typescript is published to npm. I think there's a lot of duplicated code betweentypescript.js and tsserver.js, which are both around 8MB in size. Moving to modules and re-using code between different builds seems like one thing this PR was aimed to unlock.

@weswigham
Copy link
Member Author

Introducing a bundling step (which isn't there today)

There is a bundling step today - the TS compiler itself just performs it (stupid concatenation of namespaces, plus a manually added module.exports = ts appended to the bottom). I already have a second branch based off that one that reimplements our builds in terms of modules and indeed, it does have a proper bundler to produce the same outputs.

Part of the reason we want to do this is because we have massive piles of dead code that could be eliminated in many uses. For example, the commandline compiler doesn't need any of the language server bits.

@mihailik
Copy link
Contributor

There is a bundling step today - the TS compiler itself just performs it

Exactly what I meant, replacing a simple built-in option with a complex 3rd party piece. One is much more likely to generate support work than the other.

we have massive piles of dead code that could be eliminated in many uses. For example, the commandline compiler doesn't need any of the language server bits

I'm sorry could you be a little more specific?

It's a good convention for this repo for significant changes to be accompanied with at least a simple rationale. This here is one massive fundamental change. Way less involved changes are discussed in depth on merits, can at least an outline be here please?

Specifically on the point of piles of dead code in tsc.js -- could you give an example?

I remember from the very early days typescriptServices.js and tsc.js, had 2 carefully curated separate sets of sources. How did that end up with dead code?

But of course you've mentioned it's only a part of the reason for this change. It'd be fair to outline the key ones (maybe just edit the PR description?), to shut down all this unnecessary grumbling, wouldn't it?

Many thanks 🙏

return stableSort<[number, string]>(result, (x, y) => compareValues(x[0], y[0]));
}
export function formatSyntaxKind(kind: SyntaxKind | undefined): string {
return formatEnum(kind, (<any>ts).SyntaxKind, /*isFlags*/ false);
Copy link
Contributor

@ShuiRuTian ShuiRuTian Jul 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi.
Not pretty sure, but I think (ts).SyntaxKind could not get the enum Object. (In my simple test case, it is undefined)
I use a new script to generate a new file like diagnosticMessages.generated.json to remove const keyword and get working enums.
I change it manually, and seems hard to use script to resolve this, so this comment is probably just a remind?
Hope this help!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We compile with preserveConstEnums to keep them in the resulting file.

@frank-dspeed
Copy link

frank-dspeed commented Aug 10, 2020

as people do code for bundlers today and not direct for the targeted environment.

I like this a lot i am creating typescript-rollup a rollup based module resolver

@pauldraper
Copy link

There will likely still be a build step that bundles Typescript so that it works with Node as it did before.

As the author of #27891...why?

There's no compatibility reason for bundling...support for commonjs modules is assumed for any setup using npm packages.

And extreme web-style compression isn't needed; npm authors don't minify or strip comments and READMEs and such from the package (or if they do, they included both versions).

The size instantly drops by 4x just by not including multiple versions of the code, i.e. using reusable modules.

In fact, I'm slightly perplexed that TS struggles with this; it feels as if the bundling process it currently does is harder than what ever other npm-published project has done which is use Node.js normally.

@csvn
Copy link

csvn commented Sep 21, 2020

There will likely still be a build step that bundles Typescript so that it works with Node as it did before.

As the author of #27891...why?

There's no compatibility reason for bundling...support for commonjs modules is assumed for any setup using npm packages.

I wrote this since Typescript is not only used in Node, it's also used in browsers by various online code editors (VSCode, stackblitz, codesandbox and others) and I assumed they wanted to keep similar support for this use case. Typescript has historically been bundled as single files for various purposes, I assumed they might still want bundles for e.g. tsc/tsserver/typescript, but with shared code split off separately so they could be re-used.

Of course, it's possible to leave bundling of TS for web usage up to users. Thought I believe it's definitely a requirement to bundle TS since it's not really feasible to load thousands of files in a browsers.

@mihailik
Copy link
Contributor

TypeScript sources live in multitude of files, and generate 3 key outputs: tsc, typescript and tsserver.

The source files can cross-reference each other without ceremony: firstDefined(...) from core.ts can be referenced just by name from checker.ts.

This module migration here intends to have similar number of inputs/outputs, but:

  1. requires extra tooling for bundling
  2. requires extra 'ceremony' i.e. typically 5-10 of import statements 'tax for the very same existing logic
  3. requires extensive, non-localisable bug-prone refactoring

Now there is an ongoing discussion about merits of specific approach of doing this migration, but no clarification for the elephant in the room.

Why? What's the significant measureable upside?

1) extra tooling, 2) extra code ceremony, 3) major code churn -- each have massive costs versus current setup.

It just feels surprising to see such significant bet with vague/non-existent rationale.

@weswigham
Copy link
Member Author

The idea is that we can refactor our codebase to modules, generate the existing outputs, and then factor everything into smaller, self-contained dependencies that you can include a-la-cart - so while typescript on npm may still contain 4 bundled copies of the code for API compatibility's sake, @typescript/tsc only depends on the modules required to implement tsc, and @typescript/tsserver - only those for the server and so on. But before we can ship those smaller, piecewise packages (which definitely have tangible benefits, like shared sources within a runtime), we have to break up our codebase to be piecewise in the firstplace.

Plus, to directly counter each of your points:

requires extra tooling for bundling

Consumers of TS already use bundlers in scenarios where they make sense to use - our monolithic structure makes our internals impossible to tree shake, resulting in exceedingly large bundles. That our "typescript.js" can load the language server API in a browser with a single script tag, while great for toys without much other meat, isn't great for real at-scale applications.

requires extra 'ceremony' i.e. typically 5-10 of import statements 'tax for the very same existing logic

Yes and no. That "ceremony" is implicitly being added by the compiler right now - all those cross-file references implicitly compile to ts. namespace references. Those actually have a real cost, compared to (native) named imports across modules - on the order of multiple percentage points of performance. (See #39247 for similar discussion). Beyond performance, we literally make the tool that makes the editor take care of that "ceremony" automatically for users - and we really should be dogfooding it more.

requires extensive, non-localisable bug-prone refactoring

the "extensive non-local refactoring" is done by a tool at this point, so... 🤷 if the results are typechecked and tested, and the transform done by tools... does it matter?

@weswigham
Copy link
Member Author

For context: we've been grumbling within the team that we should do this for literally years - since before my first internship on the team. It's simply a daunting task to migrate a codebase of this size. When I migrated tslint from namespaces to modules (by hand), that was actually a proof of concept that it could be done in a way that preserved API shape for consumers, which, at the time, was the most tangible concern.

@pauldraper
Copy link

pauldraper commented Sep 21, 2020

I wrote this since Typescript is not only used in Node, it's also used in browsers by various online code editors (VSCode, stackblitz, codesandbox and others) and I assumed they wanted to keep similar support for this use case.

While there's no question that JS modules are a clustercuss, consumers of npm packages are always using CommonJS-compatible tools. That's a core part of npm.

For other hypothetical types of distributables, like a Web CDN or a Nashorn-powered JAR, the concern would be warranted, but npm packages universally assume CommonJS support, both for their own modules and their dependencies.


ship those smaller, piecewise packages

That's be great. In #27891 I mention the example of prettier (code formatter), which needs the TS parser but literally nothing else.

But even without multiple npm packages, multiple modules will be a most welcome improvement. ❤️

@frank-dspeed
Copy link

@pauldraper i simply do my own builds of typescript and nashhorn is deprecated we now got graaljs it is easy to do own typescript builds from this messy codebase first you change or overwrite the tsconfig to throw out ESNext and then your near your goal feed it into rollup and it can be adjusted the results are amazing you do not even understand the build process that they use.

i plan to automate that now with github actions that is then the final step i will publish it as typees i will put a link here when i am done at present i only did that for my personal use but i think now after 5+ years it is time for a maintained typescript fork.

@mihailik
Copy link
Contributor

The idea is that we can refactor our codebase to modules, generate the existing outputs, and then factor everything into smaller, self-contained dependencies that you can include a-la-cart - so while typescript on npm may still contain 4 bundled copies of the code for API compatibility's sake, @typescript/tsc only depends on the modules required to implement tsc, and @typescript/tsserver - only those for the server and so on.

But before we can ship those smaller, piecewise packages (which definitely have tangible benefits, like shared sources within a runtime), we have to break up our codebase to be piecewise in the firstplace.

Pretty helpful explainer! Thank you very much @weswigham

Effectively this is a first step towards an alternative package/bundling structure better suitable to tree shaking, makes a good sense.

Reasonable question: is it viable instead of that to ONLY make the emit tree-shakeable?

@DanielRosenwasser
Copy link
Member

This seems to be blocked on microsoft/rushstack#1029 so long as we want to use API-Extractor to regenerate our old amalgamated .d.ts files.

Seems like the issue is fixed!

https://twitter.com/octogonz_/status/1410298195465310208

@KilianKilmister
Copy link

Oh my god, it's finally happening!

@mihailik is raising an important point here:

There are likely more benefits than the above, but one should not dismiss the benefit of code that is easier to learn and maintain. Future contributors will likely have an easier time to learn the code base after this change than before.

When I started learning JS and node back somewhere in Node-v13 I started using ESM instead of commonjs after about one month, and I hardly ever used commonjs since then. When I started with TypeScript, the documentation and various "best practices" were already recommending to use ESModules instead of file-crossing namespaces and ///-references. So my ~3y of JS and Node, and ~2y with TypeScript --while really useful when it comes to cutting-edge code-- is suddenly a whole lot less worth in an older code base, as I'll be struggling with a bunch of the basic stuff.

The TypeScript codebase is already pretty daunting on its own because of its size, complexity and popularity. It doesn't make it easier if you aren't hardened in the use of pre-esm code.

I'm really excited for this. I dove into the TS code base a few times, but any attempts at hacking something always failed miserably. Experimenting in a foreign code base is often a lot easier when it's in ESM. More and more packages are being published with an additional ESM distribution, which makes exploring them or tracking down bugs an absolute charm, as the project-and code-shape remains clearly visible, and the comments are often not stripped.

@Jack-Works
Copy link
Contributor

I have some worries about this. I'm using some API that is marked as @internal but still exported to the ts namespace because there is no choice. Those internal functions provide some critical abilities to make the job done. If everything turned to the module and bundled by things like Rollup, they'll become truly inaccessible.

@frank-dspeed
Copy link

@Jack-Works do not worry about that once this step is done and your more familar with the new sturcture you will find easy ways to workaround that often it can be done with less effort i for my self create a lot of dev bundels from my dependencies with custom own named exports. The Rollup tooling from today can easy handle all this scenarios.

@sandersn
Copy link
Member

Superceded by jakebailey#1

@sandersn sandersn closed this May 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.