enable hashes in entry file paths #1001

Merged
merged 5 commits into from
Mar 19, 2021


evanw
Owner

@evanw evanw commented Mar 19, 2021

This PR implements the --entry-names= flag. It's similar to the --chunk-names= and --asset-names= flags except it sets the output paths for entry point files. The pattern defaults to [dir]/[name] which should be equivalent to the previous entry point output path behavior, so this should be a backward-compatible change.
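
As a rough illustration of how such a pattern could expand, here is a minimal Python sketch. The function name `render_entry_path` and the hash value are made up for this example, and this is not esbuild's actual implementation (which is written in Go); it only shows the placeholder substitution idea.

```python
import posixpath


def render_entry_path(template: str, dir_: str, name: str, hash_: str) -> str:
    """Substitute [dir], [name] and [hash] in a path template (illustrative only)."""
    path = (
        template.replace("[dir]", dir_)
        .replace("[name]", name)
        .replace("[hash]", hash_)
    )
    # An empty [dir] would leave a leading "/", so strip it and normalize.
    return posixpath.normpath(path.lstrip("/"))


# The default pattern reproduces the directory structure of the entry point.
print(render_entry_path("[dir]/[name]", "pages/home", "index", "ABCD1234"))
# A pattern with [hash] puts the (hypothetical) hash into the file name.
print(render_entry_path("[dir]/[name]-[hash]", "pages/home", "index", "ABCD1234"))
```

The file extension is left out here on purpose; the sketch only covers the part of the path that the pattern controls.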

This PR has the following consequences:

  • It is now possible for entry point output paths to contain a hash, for example by passing --entry-names=[dir]/[name]-[hash]. This means esbuild can now generate output files such that every output path has a hash in it, so it should now be possible to serve the output files with an infinite cache lifetime: they are downloaded once and then cached by the browser forever. Fixes [Feature] Toggle contenthash for all output filenames #518.

  • It is now possible to prevent the generation of subdirectories inside the output directory. Previously esbuild always replicated the directory structure of the input entry points relative to the outbase directory (which defaults to the lowest common ancestor directory across all entry points). That relative directory is now substituted into the newly-added [dir] placeholder, so you can drop the subdirectories by leaving the placeholder out, like this: --entry-names=[name].

  • Source map names should now be equal to the corresponding output file name plus an additional .map extension. Previously the hashes were content hashes, so the source map had a different hash than the corresponding output file because they had different contents. Now they have the same hash so finding the source map should now be easier (just add .map).

  • Due to the way the new hashing algorithm works, all chunks can now be generated fully in parallel instead of some chunks having to wait until their dependency chunks have been generated first. The import paths for dependency chunks are now swapped in after chunk generation in a second pass (detailed below). This could theoretically result in a speedup although I haven't done any benchmarks around this.
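
To make the [dir] behavior from the second bullet concrete, here is a small Python sketch of how the value could be derived from the entry points. The helper name `dir_placeholder_values` is hypothetical, the paths are invented, and the sketch only handles simple relative POSIX-style paths; it is an approximation of the described behavior, not esbuild's code.

```python
import posixpath


def dir_placeholder_values(entry_points, outbase=None):
    """Map each entry point to the directory that would fill [dir] (sketch only)."""
    dirs = [posixpath.dirname(p) for p in entry_points]
    # Assumed default: the lowest common ancestor directory of all entry points.
    base = outbase if outbase is not None else posixpath.commonpath(dirs)
    values = {}
    for entry, directory in zip(entry_points, dirs):
        rel = posixpath.relpath(directory, base)
        # An entry point directly inside the outbase gets an empty [dir].
        values[entry] = "" if rel == "." else rel
    return values


entries = ["src/pages/home/index.ts", "src/pages/about/index.ts"]
# With the default outbase ("src/pages" here), [dir] is "home" and "about".
print(dir_placeholder_values(entries))
# With an explicit outbase of "src", [dir] is "pages/home" and "pages/about".
print(dir_placeholder_values(entries, outbase="src"))
```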

Implementing this feature required overhauling how hashes are calculated to prevent the chicken-and-egg hashing problem due to dynamic imports, which can cause cycles in the import graph of the resulting output files when code splitting is enabled. Since generating a hash involved first hashing all of your dependencies, you could end up in a situation where you needed to know the hash to calculate the hash (if a file was a dependency of itself).
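
The problem can be reproduced with a toy version of the old approach. The following Python sketch is an assumption-laden illustration (the function `naive_hash`, the chunk names, and the use of SHA-256 are all invented for this example): hashing a chunk requires the hashes of its dependencies first, which cannot terminate when two chunks import each other.

```python
import hashlib


def naive_hash(name, contents, deps, in_progress=None):
    """Hash a chunk by first hashing its dependencies (breaks on cycles)."""
    in_progress = in_progress or set()
    if name in in_progress:
        # We need this chunk's hash in order to compute this chunk's hash.
        raise ValueError(f"cycle: the hash of {name!r} depends on itself")
    in_progress = in_progress | {name}
    h = hashlib.sha256(contents[name].encode())
    for dep in sorted(deps[name]):
        h.update(naive_hash(dep, contents, deps, in_progress).encode())
    return h.hexdigest()


contents = {"a": "console.log('a')", "b": "console.log('b')"}

# Without a cycle the recursion bottoms out and a hash is produced.
print(naive_hash("a", contents, {"a": ["b"], "b": []}))

# With dynamic imports in both directions there is no valid starting point.
try:
    naive_hash("a", contents, {"a": ["b"], "b": ["a"]})
except ValueError as err:
    print(err)
```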

The hashing algorithm now works in three steps (potentially subject to change in the future):

  1. The initial versions of all output files are generated in parallel, with temporary paths used for any imports of other output files. Each temporary path is a randomly-generated string that is unique for each output file. An initial source map is also generated at this step if source maps are enabled.

    The hash for the first step includes: the raw content of the output file excluding the temporary paths, the relative file paths of all input files present in that output file, the relative output path for the resulting output file (with [hash] standing in for the hash that hasn't been computed yet), and the contents of the initial source map.

  2. After the initial versions of all output files have been generated, calculate the final hash and final output path for each output file. Calculating the final output path involves substituting the final hash for the [hash] placeholder in the entry name template.

    The hash for the second step includes: the hash from the first step for this file and all of its transitive dependencies.

  3. After all output files have a final output path, the import paths in each output file for importing other output files are substituted. Source map offsets also have to be adjusted because the final output path is likely a different length than the temporary path used in the first step. This is also done in parallel for each output file.

    This whole algorithm roughly means the hash of a given output file should change if and only if any input file in that output file or any output file it depends on is changed. So the output path, and therefore the browser's cache key, should not change for a given output file between builds if none of the relevant input files were changed.
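
The three steps above can be sketched in Python as follows. This is a simplified model under several assumptions that are not from the PR: chunks are described as lists of literal code strings and `("import", name)` references, SHA-256 truncated to 8 characters stands in for the real hash, a `.js` suffix is appended, source maps are omitted entirely, and everything runs sequentially rather than in parallel. All names (`build`, `reachable`, `example_chunks`) are hypothetical.

```python
import hashlib
import uuid


def reachable(name, deps):
    """Return `name` plus every chunk it transitively imports (cycles are fine)."""
    seen, stack = set(), [name]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(deps[current])
    return seen


def build(chunks, template="[name]-[hash]"):
    deps = {n: [p[1] for p in c["parts"] if not isinstance(p, str)]
            for n, c in chunks.items()}

    # Step 1: generate each chunk with random temporary import paths and hash
    # everything except those temporary paths.
    temp_path = {n: uuid.uuid4().hex for n in chunks}
    code, first_hash = {}, {}
    for n, c in chunks.items():
        h, rendered = hashlib.sha256(), []
        for part in c["parts"]:
            if isinstance(part, str):
                rendered.append(part)
                h.update(part.encode())
            else:
                rendered.append(temp_path[part[1]])
                h.update(b"\0")  # mark the import position, not its temp path
        for input_path in c["inputs"]:
            h.update(input_path.encode())
        h.update(template.replace("[name]", n).encode())  # [hash] still unresolved
        code[n], first_hash[n] = "".join(rendered), h.hexdigest()

    # Step 2: final hash = step-1 hashes of the chunk and its transitive deps.
    final_path = {}
    for n in chunks:
        h = hashlib.sha256()
        for dep in sorted(reachable(n, deps)):
            h.update(first_hash[dep].encode())
        short = h.hexdigest()[:8]
        final_path[n] = template.replace("[name]", n).replace("[hash]", short) + ".js"

    # Step 3: swap the temporary import paths for the final output paths.
    outputs = {}
    for n in chunks:
        text = code[n]
        for dep in deps[n]:
            text = text.replace(temp_path[dep], "./" + final_path[dep])
        outputs[final_path[n]] = text
    return outputs


def example_chunks(b_body="console.log('b')"):
    # "a" and "b" import each other dynamically; "c" is unrelated.
    return {
        "a": {"inputs": ["src/a.js"], "parts": ["import('", ("import", "b"), "'); console.log('a')"]},
        "b": {"inputs": ["src/b.js"], "parts": ["import('", ("import", "a"), "'); " + b_body]},
        "c": {"inputs": ["src/c.js"], "parts": ["console.log('c')"]},
    }


for path, text in sorted(build(example_chunks()).items()):
    print(path, "=>", text)
```

In this model, editing the body of "b" changes the output paths of both "a" and "b" (each depends on the other) but leaves the path of "c" alone, and two builds of unchanged inputs produce identical paths even though the temporary paths differ on every run.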

@lukeed
Contributor

lukeed commented Mar 19, 2021

It's here! It's here!! 🎉🎉

Congratulations! I know this was a big effort.
Can't wait to check this out locally and start playing with it.

@evanw
Owner Author

evanw commented Mar 19, 2021

Yeah it's finally here! Sorry it took so long. This feature has been extracted from my ongoing linker rewrite so it can be shipped independently, since it ended up more complicated than I had anticipated. Let me know what you think if you end up trying it out.

@evanw evanw merged commit 356ea17 into master Mar 19, 2021
@evanw evanw deleted the unique-key branch March 19, 2021 04:31
@somebee

somebee commented Mar 19, 2021

This is incredible! Thank you so much for your work on esbuild.

@lukeed
Contributor

lukeed commented Mar 19, 2021

> Sorry it took so long.

There is absolutely zero reason to apologize. This is fantastic.

I'm excited to see the remaining/non-ported linker changes you still have lurking somewhere haha

Successfully merging this pull request may close these issues.

[Feature] Toggle contenthash for all output filenames
3 participants