How to disable __uncache parameter? #127

Open
jampy opened this issue Oct 31, 2016 · 24 comments

Comments

@jampy

jampy commented Oct 31, 2016

offline-plugin always adds a __uncache parameter to the URLs it loads from the server.

My assets already all contain the hash value of the contents in the file name itself, so __uncache seems useless to me (or, even worse, probably causes unnecessary invalidation):

http GET /webapp/7cb0d2482ecadc1b80eb0abe457371b6.png?__uncache=21a99d000ee055d2579e
http GET /webapp/324b4fcaf164735c627269504b7bc28e.png?__uncache=21a99d000ee055d2579e
http GET /webapp/2-chunk-480c0256dd051a966906-en.js?__uncache=21a99d000ee055d2579e
http GET /webapp/3-chunk-573b4ab141a3dab33cd7-en.js?__uncache=21a99d000ee055d2579e
http GET /webapp/4-chunk-ae79661295afed3d3543-en.js?__uncache=21a99d000ee055d2579e
http GET /webapp/5-chunk-01dbb996411a6dbe1eac-en.js?__uncache=21a99d000ee055d2579e
http GET /webapp/6-chunk-ff86e13721e2469ca83b-en.js?__uncache=21a99d000ee055d2579e
http GET /webapp/7-chunk-b3666d1292d69a0b0831-en.js?__uncache=21a99d000ee055d2579e

The only way I found to stop offline-plugin from doing this is to supply a version: () => null option, but that seems slightly hackish to me.

Is there a sane way to disable __uncache or am I trying to do something bad here?

@NekR
Owner

NekR commented Oct 31, 2016

Hey @jampy! Indeed, offline-plugin always adds the __uncache param to bust browser and proxy caches when it's installing/updating and downloading new assets. This is a reasonable (safe) default behavior and unfortunately there is no way to prevent it at the moment, but I understand that it might be needed.
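
To make the effect concrete, here is a minimal sketch of how a version-based cache-busting parameter changes a request URL. The helper name `addUncache` and the `example.com` origin are made up for illustration; this is not offline-plugin's actual code, only the general technique.

```javascript
// Hypothetical helper: append a cache-busting parameter to an asset URL.
// Not taken from offline-plugin; it only illustrates the idea.
function addUncache(assetUrl, version) {
  const url = new URL(assetUrl);
  // Keeps any existing query parameters and adds (or overwrites) __uncache.
  url.searchParams.set('__uncache', version);
  return url.toString();
}

const busted = addUncache(
  'https://example.com/webapp/7cb0d2482ecadc1b80eb0abe457371b6.png',
  '21a99d000ee055d2579e'
);
console.log(busted);
```

The path stays the same, but HTTP caches key on the full URL, so each new version value is a cache miss even for a file whose content did not change.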

If we are speaking about its disadvantages, then yes, it causes unnecessary invalidation for assets which already have unique names and are "forever cached" via HTTP.

The only way I found to stop offline-plugin from doing this is to supply a version: () => null option, but that seems slightly hackish to me.

It's indeed hackish and, more than that, it's dangerous. Please don't do this. By not providing a version you are actually marking your SW as unupdatable until you start specifying it again.

Though, while there is currently no way to remove __uncache from the requests, we can actually remove the requests themselves. With the updateStrategy: 'changed' option you can tell offline-plugin not to re-download assets that haven't changed. This is done by comparing file hashes, which offline-plugin stores separately. This way, assets are requested only if they changed, so there won't be unnecessary invalidation: each one is requested exactly once by your ServiceWorker.
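
A rough sketch of what such a hash comparison could look like. The `{ url: hash }` map shape, the helper name and the sample values are assumptions for illustration and not offline-plugin's internal format.

```javascript
// Illustrative only: decide which assets need downloading by comparing
// per-file hashes from the previous build with those of the new build.
// The map shape { url: hash } is an assumption, not offline-plugin's format.
function changedAssets(previousHashes, nextHashes) {
  return Object.keys(nextHashes).filter(
    (url) => previousHashes[url] !== nextHashes[url]
  );
}

const previous = { '/webapp/main.js': 'aaa111', '/webapp/logo.png': 'bbb222' };
const next = {
  '/webapp/main.js': 'ccc333',  // content changed
  '/webapp/logo.png': 'bbb222', // unchanged, can be reused from the old cache
  '/webapp/extra.js': 'ddd444', // new file
};

console.log(changedAssets(previous, next));
```

Only the changed and the new file end up in the download list; the unchanged file is never requested, so its __uncache value does not matter.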

Anyway, I see your concerns here and it would indeed be good to have an option to disable the __uncache behavior. Thanks for reporting this! 👍

@NekR
Owner

NekR commented Nov 1, 2016

Also, take a look at this issue: webpack/webpack#1315
As stated there, you cannot simply rely on a chunk's hash, because the content of a chunk may change while its hash does not. This is why you need invalidation: some proxy may have cached the old content, and you will get yourself into trouble with incompatible code versions between chunks. So maybe one shouldn't disable it at all but just use updateStrategy: 'changed' here.

@NekR
Owner

NekR commented Nov 4, 2016

Hey @jampy, have my answers helped you?

@jampy
Author

jampy commented Nov 8, 2016

Hi @NekR

The only way I found to stop offline-plugin from doing this is to supply a version: () => null option, but that seems slightly hackish to me.

It's indeed hackish and, more than that, it's dangerous. Please don't do this. By not providing a version you are actually marking your SW as unupdatable until you start specifying it again.

I don't understand - why? Since the filenames are already unique themselves, what's the use for __uncache? I mean, why is it "dangerous"?

Having unique file names (not just unique URLs) is an essential part of my deployment strategy. The app will be served from multiple servers (load balancing). Since deployment can take a few minutes, I must guarantee that each version (file bundles) of the app remains available from all servers.

When a client requests sw.js in the middle of a deployment process, it will request Webpack chunks of the newer or older version, depending on which sw.js version the client caught.

Therefore, I plan to upload all chunks to Amazon S3 or something like that - and leave old versions there for a relatively long time. It's just the sw.js version deciding the "current" version of the app.

For this to work, I must absolutely avoid that an already-deployed chunk (for example, 5-chunk-5e41a8ff6b1b1778bb3a-en.js) gets replaced with new content without changing the name. That would be dangerous.

With these prerequisites, having an additional __uncache parameter seems redundant at best.

With the updateStrategy: 'changed' option you can tell offline-plugin not to re-download assets that haven't changed.

I was and I still am using that option. Last time I checked it seemed that the service worker always requests all URLs. I saw a lot of HTTP 304 responses.

However, I tried to reproduce it right now and I was not able to. In fact now it seems that the service worker really only requests files that changed. Maybe this is due to updated NPM packages, I don't know (my package.json is still the same, but I do reinstall all packages often).

Also, take a look at this issue: webpack/webpack#1315
As stated there, you cannot simply rely on a chunk's hash, because the content of a chunk may change while its hash does not.

I think that issue tells the exact opposite: The hash may change even if the contents remain the same - one more source of invalidation.

As a side note, I wonder why the service worker itself (sw.js) does not have that __uncache dummy parameter. Isn't that the one URL that would need an anti-cache mechanism the most, since it contains all the filenames and hashes, AFAIK?

@NekR
Owner

NekR commented Nov 8, 2016

I don't understand - why? Since the filenames are already unique themselves, what's the use for __uncache? I mean, why is it "dangerous"?

What's dangerous here is that you deploy the SW without a version and that you rely on an undocumented feature (probably a bug). You may do that, but I don't provide support for such cases.

With these prerequisites, having an additional __uncache parameter seems redundant at best.

It's indeed redundant in your case, but I don't see how it harms you. I already said that it probably makes sense to have a way to disable __uncache, but it makes 0 sense to me to just stop doing what I do now and start implementing this feature. After all, any PR is welcome.

However, I tried to reproduce it right now and I was not able to. In fact now it seems that the service worker really only requests files that changed. Maybe this is due to updated NPM packages, I don't know (my package.json is still the same, but I do reinstall all packages often).

There are enough reasons why hash of a file may change. See this: https://github.com/jhnns/webpack-hash-test

I think that issue tells the exact opposite: The hash may change even if the contents remain the same - one more source of invalidation.

This too. You can experience both problems. Do you use webpack generated hashes in your file names? If so, then you have that revalidation anyway. It's not related to offline-plugin or SW at all. But offline-plugin will solve this for you, see this issue: #129

As a side note, I wonder why the service worker itself (sw.js) does not have that __uncache dummy parameter. Isn't that the one URL that would need an anti-cache mechanism the most, since it contains all the filenames and hashes, AFAIK?

  • You cannot change URL of your ServiceWorker and URL's search (?) part is used in comparison. Only hash (#) is dropped from comparison.
  • You should control SW's file caching on the server and better set it to not cache. If you don't, you probably already have problems with it.
  • New spec dictates to load SW's file with bypassing the cache by default.
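
The first point can be sketched in a few lines: a comparison that ignores the fragment but not the query string. `sameWorkerScript` is a made-up helper name and `example.com` a placeholder origin; this only mirrors the rule as stated above and is not browser code.

```javascript
// Sketch of the comparison described above: the fragment is ignored,
// the query string is not. `sameWorkerScript` is a made-up helper name.
function sameWorkerScript(a, b) {
  const stripFragment = (href) => {
    const url = new URL(href);
    url.hash = '';
    return url.toString();
  };
  return stripFragment(a) === stripFragment(b);
}

const base = 'https://example.com/webapp/sw.js';
console.log(sameWorkerScript(base, base + '#anything'));      // true
console.log(sameWorkerScript(base, base + '?__uncache=123')); // false
```

Under that rule, a cache-busting query parameter on sw.js would make every build look like a different ServiceWorker script.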

You may also want to look at issue #128 to get some info about when and why relying on webpack-generated hashes can cause problems. The only thing that solves all those problems completely is using recordsPath + generating all hashes in offline-plugin without relying on webpack's hashes.

@jampy
Author

jampy commented Nov 9, 2016

It's indeed redundant in your case, but I don't see how it harms you. I already said that it probably makes sense to have a way to disable __uncache, but it makes 0 sense to me to just stop doing what I do now and start implementing this feature. After all, any PR is welcome.

I never said you should stop what you're doing. You sound upset - if so, my apologies, I'm just trying to understand if offline-plugin is usable in my scenario or if I need to search for alternatives.

My problem with __uncache is a fundamental one. If that trick is really necessary, then my deployment strategy explained earlier is not applicable because it assumes that a filename (URL path, if you prefer) is really unique and that at least different contents will lead to different names.

Maybe I'm wrong, but this means there are only two possibilities in that scenario:

  • If it is unique, then this means that the __uncache parameter is absolutely unnecessary.
  • If it is not unique, then my deployment strategy is badly broken and will suffer from race conditions (meaning that I will need to find a solution to that problem, which is not related to offline-plugin). It also means that the __uncache parameter won't give any benefit because the URL path already differs.

You see, in either case, __uncache won't help anything. Again, I'm talking about a specific scenario.
I could simply live with that parameter, at least if the parameter does not force my users to re-download unchanged assets (since they are big), which is a problem.

It seems that offline-plugin avoids re-downloading unchanged assets even if a new app "version" has been deployed. Is that correct/guaranteed?

There are enough reasons why hash of a file may change. See this: https://github.com/jhnns/webpack-hash-test

Thanks for the link. None of these situations apply to my situation, though.

Do you use webpack generated hashes in your file names? If so, then you have that revalidation anyway. It's not related to offline-plugin or SW at all.

Yes, I'm using [chunkhash]. It seems to produce pretty stable hashes for me. I did a few tests and hashes don't change unless the chunk changed (I'm aware of edge cases where module IDs change, but there are fixes for that).

But offline-plugin will solve this for you, see this issue: #129

You mean sometime in the future, or it will do that now?

  • You cannot change URL of your ServiceWorker and URL's search (?) part is used in comparison. Only hash (#) is dropped from comparison.

What comparison are you talking about?

  • You should control SW's file caching on the server and better set it to not cache. If you don't, you probably already have problems with it.

Ah! That is new to me. I couldn't find that detail anywhere in the docs. :-o

Currently, I'm simply using express.static() to serve the assets. I think it sends a Cache-Control: public, max-age=0 by default. This will change when I use S3 to serve the assets.

Thanks for the link to #128 - very interesting!

@NekR
Owner

NekR commented Nov 9, 2016

I never said you should stop what you're doing. You sound upset - if so, my apologies, I'm just trying to understand if offline-plugin is usable in my scenario or if I need to search for alternatives.

Just a bit, maybe because you sounded upset too at that moment.

My problem with __uncache is a fundamental one. If that trick is really necessary, then my deployment strategy explained earlier is not applicable because it assumes that a filename (URL path, if you prefer) is really unique and that at least different contents will lead to different names.

It's necessary for cases when file names don't change. Since they change in your case, it isn't necessary, but there is no way to disable it right now.

  • If it is unique, then this means that the __uncache parameter is absolutely unnecessary.
  • If it is not unique, then my deployment strategy is badly broken and will suffer from race conditions (meaning that I will need to find a solution to that problem, which is not related to offline-plugin). It also means that the __uncache parameter won't give any benefit because the URL path already differs.

webpack's hash problem is totally unrelated to __uncache; I just brought it into the discussion to point you to possible problems in your build process. __uncache is necessary to bust HTTP and CDN caches when the same file names are used between builds. That's not your case and it's great :-)

It seems that offline-plugin avoids re-downloading unchanged assets even if a new app "version" has been deployed. Is that correct/guaranteed?

Yes, it's correct and guaranteed as long as you use updateStrategy: 'changed'. It downloads only changed assets, and only once, during the install of the updated ServiceWorker. The only case when it may need to re-download files is when the install fails.

For example: 2 files changed. The first one downloaded successfully, but the second one failed due to a network problem, so the SW install failed too. In this situation, when the user visits your site again, the SW will try to download those 2 files again, i.e. repeat the install process.

You mean sometime in the future, or it will do that now?

There is no error, it will. Right now that issue is open and the feature is in development.

What comparison are you talking about?

When you register a SW, the browser checks the registration URL for the scope; if there is already a SW and it has a different URL, then registration fails. In other words, to change the SW file you need to unregister the current one first. Changing the SW name is a bad-bad practice.

Ah! That is new to me. I couldn't find that detail anywhere in the docs. :-o

Unfortunately SW docs are weak. That's general suggestion from SW authors.

Currently, I'm simply using express.static() to serve the assets. I think it sends a Cache-Control: public, max-age=0 by default. This will change when I use S3 to serve the assets.

This should be Cache-Control: no-store or at least no-cache. This is only for the ServiceWorker file itself, not any of your other assets. Serve other assets as you want. For development it's probably fine to serve the SW file with caching, but in production it's recommended not to.
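
A minimal sketch of how that could be wired into the express.static() setup mentioned above, assuming the worker file is named sw.js as in this thread. The function name and the `dist` directory are made up; `setHeaders` is the hook express.static() accepts in its options.

```javascript
const path = require('path');

// Hypothetical header hook: forbid HTTP caching for the service worker file
// only; every other asset keeps whatever caching the server applies.
// The file name 'sw.js' matches the one used in this thread.
function setServiceWorkerHeaders(res, filePath) {
  if (path.basename(filePath) === 'sw.js') {
    res.setHeader('Cache-Control', 'no-store');
  }
}

// Assumed wiring with Express (not run here, 'dist' is a placeholder):
//   app.use('/webapp', express.static('dist', { setHeaders: setServiceWorkerHeaders }));
```

Hashed assets are left alone on purpose, so they can still be served with long-lived caching.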

@jampy
Author

jampy commented Nov 10, 2016

In other words, to change the SW file you need to unregister the current one first.

Ah, I see. So, using a cache-busting parameter for sw.js would tell the browser, that it is a "different" SW - causing problems. Makes sense.

Ah! That is new to me. I couldn't find that detail anywhere in the docs. :-o

Unfortunately SW docs are weak. That's general suggestion from SW authors.

Actually I meant the offline-plugin docs. ;-) I think a sentence in the README would be well worth it (for those that are no SW experts).

Anyway, I'll change my code accordingly and will make sure that I take that detail into account once I start using S3 / the cluster.


Please allow me one more question about version: Where does it come from? The README says Default: Current date.. I thought it is the current date at the time of the request, which makes sense in terms of cache busting. But it seems rather to be the timestamp of the build, right?

Since the SW already knows the correct hashes for all assets, wouldn't it be ideal to use the hash value as __uncache value?

Thanks :-)

@NekR
Owner

NekR commented Nov 10, 2016

I think a sentence in the README would be well worth it (for those that are no SW experts).

That's good idea. Can you open an issue about it or maybe even send a PR?

Please allow me one more question about version: Where does it come from?
...
But it seems rather to be the timestamp of the build, right?

It's a build version. It is indeed generated at build time and is used to tell SW/AppCache to perform an update (in case you don't change any file names).

Since the SW already knows the correct hashes for all assets, wouldn't it be ideal to use the hash value as __uncache value?

Probably. The __uncache thing was added before the SW knew about asset hashes, so __uncache used the build version instead.

@jampy
Author

jampy commented Dec 6, 2016

When you register a SW, the browser checks the registration URL for the scope; if there is already a SW and it has a different URL, then registration fails. In other words, to change the SW file you need to unregister the current one first. Changing the SW name is a bad-bad practice.

Could you please elaborate on why it is bad-bad practice? I couldn't find any hints as to why that could cause problems.

@NekR
Owner

NekR commented Dec 6, 2016

@NekR
Owner

NekR commented Dec 6, 2016

Another reason: unregistering the SW (which is required to change its name) makes you lose all your Push Subscriptions. In theory they could be restored after the new SW is installed, but maybe not if the user changed permissions or if those permissions expired. Anyway, that's just the wrong way to handle SW updates.

@jampy
Author

jampy commented Dec 7, 2016

Thanks for the link and the explanation - that helped a lot! :-)

I completely overlooked that the "index" page is cached, too.

I feared a race condition that clearly doesn't exist.

So the single source of truth (for a cached application) is sw.js - which is awesome :-)

The only little race condition perhaps is the index file itself (the HTML URL). All my assets have a unique name (contain a hash), except for the index file. It is theoretically possible that the index file gets updated after a new sw.js is fetched from the server but before the SW is able to load the index file.

However, being only the initial HTML I think I can circumvent this issue.

@NekR
Owner

NekR commented Dec 7, 2016

It is theoretically possible that the index file gets updated after a new sw.js is fetched from the server but before the SW is able to load the index file.

If I understand you correctly, then that isn't an issue. When a new SW arrives and installs, it downloads assets into a completely separate cache not related to the previous (current) SW. So until the new SW is activated, all assets are still served from the old cache, even though the new ones are already downloaded. Once the new SW is activated, it removes all old caches and uses the new one which it downloaded during the install phase.

No race there :-)
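
A sketch of that install/activate flow, written as plain functions so it can be read outside a worker. `caches` is any object with the CacheStorage shape (open/keys/delete) and `fetchFn` stands in for fetch(); the function names are illustrative and this is not offline-plugin's code.

```javascript
// Sketch of the install/activate flow described above. `caches` is any object
// with the CacheStorage shape (open/keys/delete); `fetchFn` stands in for
// fetch(). Function and cache names are illustrative, not offline-plugin's.
async function installVersion(caches, cacheName, urls, fetchFn) {
  // New assets go into their own cache; the active worker's cache is untouched.
  const cache = await caches.open(cacheName);
  await Promise.all(
    urls.map(async (url) => cache.put(url, await fetchFn(url)))
  );
}

async function activateVersion(caches, cacheName) {
  // Only now are the caches of previous versions dropped.
  const names = await caches.keys();
  await Promise.all(
    names
      .filter((name) => name !== cacheName)
      .map((name) => caches.delete(name))
  );
}
```

In a real ServiceWorker these would presumably be called from the install and activate handlers inside event.waitUntil(...), passing the global caches and fetch.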

@jampy
Author

jampy commented Dec 7, 2016

I see, but for the sake of completeness:

I had something like this in my index HTML:

  <script type="text/javascript">var ASSET_BUNDLES={"main":"main-bundle-c706cd4dca14133426b3208a452387-en.js","db-engine-worker":"db-engine-worker-bundle-c706cd4dca14133426b3208a452387-en.js","pdfjs-worker":"pdfjs-worker-bundle-c706cd4dca14133426b3208a452387-en.js"};</script>
  <script type="text/javascript" src="main-bundle-c706cd4dca14133426b3208a452387-en.js"></script>

The ASSET_BUNDLES is important because the name of the asset bundles is hashed and I need a way to know the correct URLs at runtime. These are web Worker source codes, so have nothing to do with Webpack require().

Let me illustrate a potential race condition:

  • T+0: user X returns to the PWA at /webapp
  • T+1: service worker of user X starts to grab /webapp/sw.js and gets version 100, it starts to download changed assets from a slow connection, including /webapp/sw.js and some JPG files. It will also download /webapp but it is currently busy with other URLs.
  • T+2: I deploy a new version of the application which also has different ASSET_BUNDLES hashes (takes a fraction of a second)
  • T+3: service worker of user X (still within the update process) finally grabs current index file (URL /webapp) from the server, which however now is already at version 101

You see, in such a situation the user gets all assets matching version 100 (since they have unique URLs) but with the index file of version 101 (the only asset that has not a hashed URL). In the end ASSET_BUNDLES mismatches the rest of the application and so the app won't work.

I solved this today by putting ASSET_BUNDLES into a dedicated, hashed asset. Now the HTML code looks like this:

  <script type="text/javascript" src="asset-bundles-45900c92ee5e488af14ea4dd02f693f3.js"></script>
  <script type="text/javascript" src="main-bundle-c706cd4dca14133426b3208a452387-en.js"></script>

...with the ASSET_BUNDLES variable being now in asset-bundles-45900c92ee5e488af14ea4dd02f693f3.js.

This will make sure that both <script> tags will always be "active" together, and thus match.
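
For illustration, a minimal sketch of how such a content-hashed file could be produced at build time. The helper name and the choice of MD5 are assumptions made for this example and are not necessarily what the build described here uses.

```javascript
const crypto = require('crypto');

// Sketch: emit the ASSET_BUNDLES map as its own file, named after a hash of
// its content. MD5 and the helper name are assumptions for illustration.
function buildAssetBundlesFile(bundles) {
  const source = 'var ASSET_BUNDLES=' + JSON.stringify(bundles) + ';';
  const hash = crypto.createHash('md5').update(source).digest('hex');
  return { name: 'asset-bundles-' + hash + '.js', source };
}

const file = buildAssetBundlesFile({
  main: 'main-bundle-c706cd4dca14133426b3208a452387-en.js',
});
console.log(file.name);
```

Because the name is derived from the content, a changed bundle map always yields a new URL, which is what lets the two script tags stay in step.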

Of course this causes an additional request. It's not ideal but I can live with that.

The race condition still remains for the rest of the HTML file, but that does not contain anything critical.

It also remains sort of a race condition because it may still happen that /webapp/sw.js and /webapp are out of sync for the same reason, causing the offline cache to hold the wrong assets. But at least the right requests will be made and the app will still work online.

@jampy
Author

jampy commented Dec 7, 2016

Of course, there would be no problem at all if ASSET_BUNDLES could be contained in main-bundle-xxx.js itself. But it's an asset automatically generated by a Webpack plugin and merging it into the main-bundle so that I can require() it would probably cause a cyclic dependency, but I'm not a Webpack plugin expert.

@NekR
Owner

NekR commented Dec 7, 2016

You see, in such a situation the user gets all assets matching version 100 (since they have unique URLs) but with the index file of version 101 (the only asset that has not a hashed URL). In the end ASSET_BUNDLES mismatches the rest of the application and so the app won't work.

Yes, I see the race here. It won't however break the app completely, but rather just make it not operational in offline. Also on next web page navigation new SW will be downloaded which will recover the situation. But yeah, I see what you mean here.

But it's an asset automatically generated by a Webpack plugin and merging it into the main-bundle so that I can require() it would probably cause a cyclic dependency

Yes, hard to do that right.

@NekR
Owner

NekR commented Dec 7, 2016

The only true way to solve this and #142 is by verifying assets hashes at runtime, i.e. generating hash of received content which is kinda slow. Might not be a problem on install phase though.

@jampy
Author

jampy commented Dec 7, 2016

generating hash of received content which is kinda slow

FYI, I use client-side hashing a lot with megabyte-sized data and it's quite fast even on mobile. Also, most browsers have native (asynchronous) support for cryptographic calculations like SHA-1.

But yeah, every millisecond counts, that's right.
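
A small sketch of such a native, asynchronous SHA-1 check using the Web Crypto API. The helper names are made up, and how the expected hash reaches the client is left open.

```javascript
// Works in browsers (crypto.subtle) and in Node.js (crypto.webcrypto.subtle).
const subtle =
  (globalThis.crypto && globalThis.crypto.subtle) ||
  require('crypto').webcrypto.subtle;

// Hash an ArrayBuffer or typed array with SHA-1 and return lowercase hex.
async function sha1Hex(data) {
  const digest = await subtle.digest('SHA-1', data);
  return Array.from(new Uint8Array(digest))
    .map((byte) => byte.toString(16).padStart(2, '0'))
    .join('');
}

// Compare downloaded bytes against the hash the build recorded for them.
async function matchesExpectedHash(data, expectedHex) {
  return (await sha1Hex(data)) === expectedHex.toLowerCase();
}
```

With a fetch response, the bytes could come from `await response.arrayBuffer()` before the response is put into the cache.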

@jampy
Author

jampy commented Dec 7, 2016

If you're interested, for a fast pure-JavaScript SHA-1 implementation see rusha

I'm focused on SHA-1 because I need strong hashes. MD5 or similar would probably be good enough for asset versioning and it's faster.

I'm not asking you to integrate hashes, though (but it would be awesome ;-)

@NekR
Owner

NekR commented Dec 7, 2016

Also, most browsers have native (asynchronous) support for cryptographic calculations like SHA-1.

Yeah, I know, this is why I made it generate SHA-1 hashes and not MD5 in #129

I'm focused on SHA-1 because I need strong hashes. MD5 or similar would probably be good enough for asset versioning and it's faster.

As you said, there is native support for SHA-1 in browsers, so there's no need to calculate MD5 manually.

I'm not asking you to integrate hashes, though (but it would be awesome ;-)

I'll think about it :-)

@Zyclotrop-j

Has there been progress on this issue?
I actually ran into a problem when trying to load from the google-sheets api.

What I want to do is fetch data from a Google spreadsheet and have it updated every 24h.
To do so, I use the offline plugin with autoUpdate: 1000 * 60 * 60 * 24, externals: [https://sheets.googleapis.com/v4/spreadsheets/...?key=...&fields=sheets.data.rowData.values.effectiveValue,sheets.properties(sheetId,title,index)]
This translates into a sw-request https://sheets.googleapis.com/v4/spreadsheets/...?key=...&fields=sheets.da...dex)&__uncache=2017-9-4%2007%3A35%3A40.
The response is 400, status: "INVALID_ARGUMENT", Unknown name "__uncache"!

So the __uncache param actually breaks the api-call! It would be awesome to be able to turn it off.

@NekR
Owner

NekR commented Sep 5, 2017

Hi, no progress so far.

@ghost

ghost commented Jun 15, 2018

Hi, this is something I'm interested in too- I'd love a straightforward way to simply disable the extra URL parameter. In my app, this is causing an avoidable duplicate fetch of a fairly large file every time the app upgrades. (The page itself requests it slightly before the SW gets around to it, and the SW can't rely on the browser's cache because of this parameter.) These resources are always versioned with a filename hash and I use Cache-Control:immutable on them anyway.
