Guides - Explain Hash Changes in Caching Guide #652

Closed
jouni-kantola opened this issue Jan 16, 2017 · 19 comments

@jouni-kantola
Contributor

jouni-kantola commented Jan 16, 2017

I have a scenario that I would like to discuss and, hopefully, turn into a source for docs on what a developer should think about to avoid causing unnecessary hash changes.
I'm making what might seem like a very small change in my code, yet it has quite a large effect on long-term caching.

I'm filing this issue in the docs repo because I'm sure the behavior is expected; I just want to understand why it happens, and hopefully help others as well.

Example repo is here (using webpack@2.2.0-rc.7 and webpack-chunk-hash@0.3.0):
https://github.com/jouni-kantola/webpack-output-by-build-type/tree/why-hash-change

Here goes the scenario:

  1. I build:
     Asset                                   Size       Chunks
     0.9bbf83cf4b0bde5cdd21.dev.js.map       512 bytes  0, 6    [emitted]
     0.9bbf83cf4b0bde5cdd21.dev.js           756 bytes  0, 6    [emitted]
     2.a01031186175de379984.dev.js           455 bytes  2, 6    [emitted]
     3.64a0d1a6257d07bf9c8e.dev.js           455 bytes  3, 6    [emitted]
     vendor.72eb1607924636054b21.dev.js      254 kB     4, 6    [emitted]
     app.d67c37259cfc27adf14e.dev.js         16.9 kB    5, 6    [emitted]
     1.b255a8ecec8dd054c7f3.dev.js           756 bytes  1, 6    [emitted]
     1.b255a8ecec8dd054c7f3.dev.js.map       512 bytes  1, 6    [emitted]
     2.a01031186175de379984.dev.js.map       424 bytes  2, 6    [emitted]
     3.64a0d1a6257d07bf9c8e.dev.js.map       430 bytes  3, 6    [emitted]
     vendor.72eb1607924636054b21.dev.js.map  335 kB     4, 6    [emitted]
     app.d67c37259cfc27adf14e.dev.js.map     18.9 kB    5, 6    [emitted]
     index.html                              6.35 kB            [emitted]
  2. I change module-a.js from:
     import is from 'is-thirteen';

     export const a = () => console.log('module a says 12 + 1 === 13', is(12 + 1).thirteen());

     to:

     export const a = () => console.log('hello world');
  3. I build again:
     Asset                                   Size       Chunks
     0.3f19b2e7372b486647de.dev.js.map       512 bytes  0, 6    [emitted]
     0.3f19b2e7372b486647de.dev.js           756 bytes  0, 6    [emitted]
     2.a01031186175de379984.dev.js           455 bytes  2, 6    [emitted]
     3.64a0d1a6257d07bf9c8e.dev.js           455 bytes  3, 6    [emitted]
     vendor.503164abc295ee5b7ca1.dev.js      254 kB     4, 6    [emitted]
     app.928d94b1c06014c2234b.dev.js         16.9 kB    5, 6    [emitted]
     1.84b1d26493dcab2a97e9.dev.js           429 bytes  1, 6    [emitted]
     1.84b1d26493dcab2a97e9.dev.js.map       370 bytes  1, 6    [emitted]
     2.a01031186175de379984.dev.js.map       424 bytes  2, 6    [emitted]
     3.64a0d1a6257d07bf9c8e.dev.js.map       430 bytes  3, 6    [emitted]
     vendor.503164abc295ee5b7ca1.dev.js.map  335 kB     4, 6    [emitted]
     app.928d94b1c06014c2234b.dev.js.map     18.9 kB    5, 6    [emitted]
     index.html                              6.35 kB            [emitted]

So, here is what I think needs to be documented:

  1. What types of changes a developer should really be aware of.
  2. In this build, is-thirteen changed ID from 59 to 85, and that caused three bundles to be cache busted. In a larger app that could be quite a lot of code that needs to be shipped again to the end user. Is there any way to prevent this small change from having such a large effect?
  3. Is there any way to prevent the vendor bundle's hash from changing when the consuming code changes?

I think I've done the most important bit, extracting the manifest, but it's still very easy to get chunks cache busted. There are loads of blogs discussing this, but I still believe the truly nitty-gritty details are yet to be documented.

@jouni-kantola
Contributor Author

jouni-kantola commented Jan 16, 2017

If I replace require('webpack-chunk-hash') with webpack.HashedModuleIdsPlugin(), everything seems to play nicely.
The only bundle hash that changes after that is 1.f948e0e64b866b43bc18.dev.js to 1.41c92261bb86681d218a.dev.js. The rest stays the same.

So I removed the inlining of the manifest to generate a manifest file, and it seems like it updates fine.
1.41c92261bb86681d218a.dev.js -> 1.f948e0e64b866b43bc18.dev.js
manifest.9262854228a18bb6d1d6.dev.js -> manifest.ff60e3d222a7b3d2143b.dev.js
Rest stays the same.

All bundles grew a bit, but in the long run I think that is definitely worth it.
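
To illustrate why the ID shift busts caches and why path-derived IDs avoid it, here is a minimal sketch. It is not webpack's actual implementation: `numericIds` and `hashedIds` are hypothetical helpers, and the module lists are made up to mimic is-thirteen being reached later in the build after the import is removed from module-a.js.

```javascript
const crypto = require('crypto');

// Hypothetical helper: numeric ids follow the order in which modules are
// encountered, so any reordering shifts them.
function numericIds(modulePaths) {
  const ids = {};
  modulePaths.forEach((modulePath, index) => {
    ids[modulePath] = index;
  });
  return ids;
}

// Hypothetical helper: derive each id from the module path only, so the id
// does not depend on what else is in the build.
function hashedIds(modulePaths) {
  const ids = {};
  modulePaths.forEach((modulePath) => {
    ids[modulePath] = crypto
      .createHash('md5')
      .update(modulePath)
      .digest('hex')
      .slice(0, 4);
  });
  return ids;
}

// Made-up module lists: after the edit, is-thirteen is encountered later.
const isThirteen = './node_modules/is-thirteen/index.js';
const before = ['./src/module-a.js', isThirteen, './src/module-b.js'];
const after = ['./src/module-a.js', './src/module-b.js', isThirteen];

console.log(numericIds(before)[isThirteen], numericIds(after)[isThirteen]); // 1 2
console.log(hashedIds(before)[isThirteen] === hashedIds(after)[isThirteen]); // true
```

Any chunk that embeds a shifted numeric ID gets different contents and therefore a different hash, even though its source did not change. Path-derived IDs are longer than small integers, which is presumably why the bundles grew slightly.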

Is there anything else I need to be aware of, or is this the change that should be recommended in https://webpack.js.org/guides/caching/#deterministic-hashes? That is, moving from webpack-chunk-hash/webpack-md5-hash to webpack.HashedModuleIdsPlugin.

@bebraw and @okonet, do you have anything you want to add?

@jouni-kantola
Contributor Author

Btw, the reason I extracted the manifest is this issue, which occurs when the manifest has not been inlined:
erm0l0v/webpack-md5-hash#9

@bebraw
Contributor

bebraw commented Jan 17, 2017

@jouni-kantola HashedModuleIdsPlugin does most of the work (easier than module ids). I would recommend using NamedModulesPlugin (same but without hashing) during development (better output). Extracting the manifest one way or another becomes crucial when you split bundles (otherwise manifest changes can invalidate something you don't want invalidated).

There's one more option - records. Storing records across builds is apparently the ultimate option. You get an extra file to track in your repository then. The nice thing is that this works reliably with code splitting.

I think webpack should come with stronger defaults and treat numbered module ids as a special case that's enabled through a plugin rather than vice versa. I would default to NamedModulesPlugin myself. If webpack knew something about build targets, I would pick the hashed variant for production. In fact, I might go as far as to merge these plugins into one and expose hashing through a plugin option. They are essentially the same after all, apart from that tiny bit. Less to remember.
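
The development/production split suggested above could look roughly like this. It is only a sketch: the `env.production` flag (passed as `webpack --env.production`), the entry path, and the output directory are assumptions for illustration, not something taken from the example repo.

```javascript
const path = require('path');
const webpack = require('webpack');

// Exporting a function lets the CLI pass in an env object.
// `env.production` is an assumed flag name.
module.exports = (env = {}) => ({
  entry: './src/index.js', // assumed entry point
  output: {
    path: path.join(__dirname, 'build'), // assumed output directory
    filename: '[name].[chunkhash].js'
  },
  plugins: [
    env.production
      // Short hashed ids for production builds
      ? new webpack.HashedModuleIdsPlugin()
      // Readable path-based ids during development
      : new webpack.NamedModulesPlugin()
  ]
});
```

Both plugins key module ids on the module path rather than on resolution order, so either one avoids the numeric ID shift; the choice only affects how readable the output is.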

@jouni-kantola
Contributor Author

I'm working on updating the docs at the moment. I realised I got things a bit backwards yesterday after all.

Yes, use HashedModuleIdsPlugin to generate IDs that are preserved across builds. However, webpack-chunk-hash/webpack-md5-hash needs to be used to base the file hashes on file contents.

So this is the simplest config I'm suggesting:

var path = require('path');
var webpack = require('webpack');
var ChunkManifestPlugin = require('chunk-manifest-webpack-plugin');
var WebpackChunkHash = require('webpack-chunk-hash');

module.exports = {
  entry: {
    vendor: './src/vendor.js',
    main: './src/index.js'
  },
  output: {
    path: path.join(__dirname, 'build'),
    // Include the chunk hash in entry and code-split file names
    filename: '[name].[chunkhash].js',
    chunkFilename: '[name].[chunkhash].js'
  },
  plugins: [
    // Vendor chunk plus a separate manifest chunk for the webpack runtime
    new webpack.optimize.CommonsChunkPlugin({
      name: ['vendor', 'manifest'],
      minChunks: Infinity
    }),
    // Module IDs that are preserved across builds
    new webpack.HashedModuleIdsPlugin(),
    // File hashes based on file contents
    new WebpackChunkHash(),
    // Write the chunk manifest to a separate JSON file
    new ChunkManifestPlugin({
      filename: 'chunk-manifest.json',
      manifestVariable: 'webpackManifest'
    })
  ]
};
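
The idea of basing a file name on file contents can be sketched outside webpack as follows. `hashedFilename` is a hypothetical helper written for illustration; it is not an API of webpack or webpack-chunk-hash, and the 20-character truncation is only chosen to resemble the file names in the build output above.

```javascript
const crypto = require('crypto');

// Hypothetical helper: build a file name from a hash of the content, so the
// name changes only when the content changes.
function hashedFilename(name, content) {
  const hash = crypto
    .createHash('md5')
    .update(content)
    .digest('hex')
    .slice(0, 20); // truncated for readability (assumption)
  return `${name}.${hash}.js`;
}

const first = hashedFilename('vendor', 'console.log("a");');
const same = hashedFilename('vendor', 'console.log("a");');
const changed = hashedFilename('vendor', 'console.log("b");');

console.log(first === same); // true
console.log(first === changed); // false
```

With content-based names, a browser can cache a file indefinitely, because any change to the content produces a new URL.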

@jouni-kantola
Contributor Author

Directly related issue: webpack/webpack#1479
(which in-turn links to the resource: https://github.com/dmitry/webpack-hash-test)

@jouni-kantola
Contributor Author

All right. I went through a whole bunch of scenarios to check what causes the hashes to change.

I use:

  • HashedModuleIdsPlugin
  • WebpackChunkHash
  • CommonsChunkPlugin for vendor chunk
  • code splitting with import()

This is what I came up with:
https://gist.github.com/jouni-kantola/1c1e2bfaebf30de50d1b6a71b869da13

I'll try to document it in the docs, in some kind of overview. I'll probably leave out the files and only list the actions.

@jouni-kantola
Contributor Author

I also tried adding recordsPath (recordsPath: path.resolve(__dirname, './recordsPath.json')). Even with records, the following still have large effects on long-term caching:

  • reordering code splits
  • renaming sub modules dir names

@jouni-kantola
Contributor Author

jouni-kantola commented Jan 24, 2017

@sokra: Wouldn't long-term caching at least be fixed for code-split chunks if they weren't prefixed with the ordering number?

@simon04
Collaborator

simon04 commented Feb 11, 2017

Related: #131.

@skipjack
Collaborator

I'll try to document it in the docs, in some kind of overview. I'll probably leave out the files and only list the actions.

@jouni-kantola are you still interested in pr-ing some updates or a new article? It seems this is somewhat covered in the caching guide but could probably be improved.

@goesbysteve

goesbysteve commented May 30, 2017

@jouni-kantola I found your Gist very helpful, thanks for that. Can I ask why you added WebpackChunkHash to your set? Is HashedModuleIdsPlugin documented anywhere? Is it considered best practice to use name in chunkFilename: 'js/[name]-[chunkhash].js' or not?

skipjack changed the title from "Reasons for hash changes explained" to "Guides - Explain Hash Changes in Caching Guide" on Jun 12, 2017
@skipjack
Collaborator

skipjack commented Jun 12, 2017

I think between @timse's post and the caching guide we can probably push this past the finish line. We should definitely link to @timse's post, and maybe this tool for testing, from that guide and make any other changes necessary to bring it up to date. Once that happens, I think this issue can be closed.

@jouni-kantola
Contributor Author

jouni-kantola commented Jun 12, 2017

Hello! Sorry for not getting back to you guys in quite some time. The reason I created this issue and went through the steps in the gist is to make developers aware that many things affect long-term caching.

If @timse's post is the baseline for creating long-term cacheable assets, I think that should be the caching docs from now on. On top of that, I think many steps could be elaborated a bit more, just to show the effect of, e.g., an extra module dependency.

Basically I just repeated what @skipjack had already renamed the issue to 😉

@jouni-kantola
Contributor Author

Regarding official webpack blog posts that are basically written to be docs, I think they should be created in the docs repo directly. Redundant information is one of the reasons webpack/webpack#1315 is what it is. Many sources of truth just add to the confusion (/cc @TheLarkInn & @sokra).

@skipjack
Collaborator

@jouni-kantola no worries, and I'm glad we agree on how it should be updated. I've been cleaning up issues and trying to turn the remaining ones into actionable items that we can continue knocking off the list.

Regarding official webpack blog posts that are basically written to be docs, I think they should be created in the docs repo directly. Redundant information is one of the reasons webpack/webpack#1315 is what it is. Many sources of truth just add to the confusion.

I agree, although I do think the blog serves a slightly different purpose: looser discussion and walkthroughs. However, we need to be better about, first, finishing the backlog and, second, keeping things up to date and staying on top of it so we don't find ourselves facing another huge backlog of things that need to be documented or resolved. The problem is, webpack is a big tool with a lot of things that need doing and, as with all open source work, people only have so much time to spare. I'm hopeful that we'll get there, but it is definitely taking some time.

Re this issue though, yeah I think we basically need to just review the caching guide, see what's missing, and add that. Then this issue, and probably a whole list of other ones on the main repo, can be closed out.

skipjack self-assigned this on Jul 15, 2017
@skipjack
Collaborator

Going to try to tackle this now...

@skipjack
Collaborator

@jouni-kantola so I think I've pretty much covered everything in #1436. I didn't mention webpack-chunk-hash within the guide as it seems we can use [chunkhash] within chunkFilename without any extra plugins in webpack 3. Could you remind me why that plugin was/is necessary?

@jouni-kantola
Contributor Author

@skipjack: When I was testing, the hashes were more consistent with the extra hashing plugins. However, after a while I saw this result in runtime errors. Better to leave the hashing to webpack's own substitutions.

@skipjack
Collaborator

@jouni-kantola yeah I agree -- I think the simpler we can keep that guide the better. That's why I started from scratch with TheDutchCoder/webpack-guides-code-examples#17 and tried to only use the plugins that seemed necessary while testing. I think if any issues arise with the current guide we should first try to reproduce them in the examples repo, and then decide if it's a common enough issue to mention in the guide.
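
For reference, a configuration along the lines the thread converges on, relying only on webpack's built-in [chunkhash] substitution and plugins that ship with webpack, might look like the sketch below. It assumes webpack 3 and reuses the entry and output paths from the earlier config in this thread; it has not been checked against the examples repo.

```javascript
const path = require('path');
const webpack = require('webpack');

module.exports = {
  entry: {
    vendor: './src/vendor.js', // paths reused from the earlier config
    main: './src/index.js'
  },
  output: {
    path: path.join(__dirname, 'build'),
    // Built-in substitution, no extra hashing plugin
    filename: '[name].[chunkhash].js',
    chunkFilename: '[name].[chunkhash].js'
  },
  plugins: [
    // Keep module ids stable across builds
    new webpack.HashedModuleIdsPlugin(),
    // Keep third-party code in its own chunk
    new webpack.optimize.CommonsChunkPlugin({
      name: 'vendor',
      minChunks: Infinity
    }),
    // Extract the webpack runtime so its changes do not touch vendor
    new webpack.optimize.CommonsChunkPlugin({
      name: 'manifest'
    })
  ]
};
```

Compared with the earlier config, this drops webpack-chunk-hash and chunk-manifest-webpack-plugin, in line with the suggestion to leave hashing to webpack's substitutions.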
