Expiration changes hanging deploys #2224
Comments
We have the same problem with this version.
We encountered the same problem and reverted to the previous version.
We investigated a bit and have an idea of what the cause might be. Another way to improve this would be to catch the error and stop the GH Action to prevent it from hanging 💰
Same here; we are stuck at the moment, all of our deploys are failing. Please advise.
We worked around the problem by downgrading the wrangler version in our Bitbucket pipeline.
This sucks, sorry about this y'all. We'll have a look on Monday. The current workaround is to use 1.19.8.
This switches how we expire static assets with `[site]` uploads to use `expiration_ttl` instead of `expiration`. This is because we can't trust the time that a deploy target may provide (like in cloudflare/wrangler-legacy#2224).
Deployments are failing due to: cloudflare/wrangler-legacy#2224
* Add offensive 360 sponsoring offensive360.com joined as our sponsor. I added logo in aside and readme. * Add logo asset * Update wrangler action * Use wrangler 1.19.8 Deployments are failing due to: cloudflare/wrangler-legacy#2224 * Fix lighthouse * Use actions/cache@v2 * Remove cache
Urgh! So we can't rely on the current time given in a Docker container to be accurate!
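To make the `expiration` vs `expiration_ttl` distinction concrete, here is a small TypeScript sketch. The `KvBulkItem` shape is modeled on the KV bulk-write request body; the key names and TTL value are illustrative only, not Wrangler's actual internals.

```typescript
// Sketch: why an absolute expiration breaks when the deploy machine's clock is wrong,
// and why a relative TTL does not. Field names follow the KV bulk-write body.
interface KvBulkItem {
  key: string;
  value: string;
  expiration?: number;     // absolute Unix timestamp (seconds) -- depends on the local clock
  expiration_ttl?: number; // relative seconds, computed by the API from its own clock
}

const STALE_AFTER_SECONDS = 300; // illustrative value

// Old approach: compute an absolute expiry from Date.now(). If the deploy
// container's clock is skewed, this timestamp can already be in the past,
// so the write is rejected or the asset expires immediately.
const withExpiration: KvBulkItem = {
  key: "old-asset.abc123.js",
  value: "",
  expiration: Math.floor(Date.now() / 1000) + STALE_AFTER_SECONDS,
};

// New approach: let the API compute the expiry relative to when it receives
// the write, so a skewed local clock cannot produce an already-expired key.
const withTtl: KvBulkItem = {
  key: "old-asset.abc123.js",
  value: "",
  expiration_ttl: STALE_AFTER_SECONDS,
};
```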
This should be fixed with the latest release, v1.19.10. Can anyone confirm whether this works for them or if it is still broken?
Reopening until we have confirmation that this is fixed |
@caass I have a worker with ~3900 files in the KV store. These built up over the last few days while deleting files was failing. I don't think it's necessarily indicative of the patch not working, but KV stores of this size are a problem in themselves.
@fwwieffering I see. An immediate fix that comes to mind would be to implement automatic HTTP retries for requests that time out, so we get 3 or 5 goes at downloading the files before giving up and throwing an error. I can work on this in wrangler 1 and raise an issue for wrangler 2.
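A minimal sketch of that retry idea, in TypeScript for readability (Wrangler 1 itself is written in Rust; the attempt count, timeout, and backoff here are illustrative, not the actual implementation):

```typescript
// Retry a request a few times on timeouts or 5xx responses before giving up.
// Assumes a runtime with global fetch and AbortSignal.timeout (e.g. Node 18+).
async function fetchWithRetries(
  url: string,
  attempts = 3,
  timeoutMs = 30_000
): Promise<Response> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const response = await fetch(url, {
        signal: AbortSignal.timeout(timeoutMs),
      });
      // Retry only on server errors; anything else is returned to the caller.
      if (response.status < 500) return response;
      lastError = new Error(`HTTP ${response.status}`);
    } catch (err) {
      lastError = err; // timeout or network failure
    }
    // Simple linear backoff between attempts: 1s, 2s, 3s, ...
    await new Promise((resolve) => setTimeout(resolve, attempt * 1000));
  }
  throw lastError;
}
```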
I was more worried about the number of files than their size. Running that resolved the issue.
Ah, I see. Did running that help?
I'm also concerned about the number of existing files in the KV store; we've been deploying headless applications for over a year now. I am testing this now.
Seeing the same thing here, but with a KV namespace of ~12k items.
I believe that the Cloudflare API that Wrangler is using to expire these items is rate-limited to 1200 requests per 5 minutes.
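For illustration, a client-side throttle to stay under such a limit might look like the sketch below. The 1200-per-5-minutes figure is taken from the comment above; the pacing strategy and function names are hypothetical, not what Wrangler actually does.

```typescript
// Pace requests so that no more than MAX_REQUESTS are sent per WINDOW_MS.
const MAX_REQUESTS = 1200;
const WINDOW_MS = 5 * 60 * 1000;
const MIN_GAP_MS = Math.ceil(WINDOW_MS / MAX_REQUESTS); // ~250ms between requests

async function sendThrottled(requests: Array<() => Promise<void>>): Promise<void> {
  for (const send of requests) {
    const started = Date.now();
    await send();
    const elapsed = Date.now() - started;
    // If the request finished faster than the minimum gap, wait out the remainder.
    if (elapsed < MIN_GAP_MS) {
      await new Promise((resolve) => setTimeout(resolve, MIN_GAP_MS - elapsed));
    }
  }
}
```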
TL;DR: This should be fixed with the latest release, v1.19.11. Upgrade, and if you're still experiencing problems, please report them here <3.

We've decided to revert the behavior of workers-sites to what it was before -- deleting unused files in the KV namespace when new files are uploaded. We had complaints both about KV namespaces filling up with stale assets and about the expiration changes breaking publishes.

We remedied the second issue by setting a time-to-live instead of an expiration timestamp, but by this point the first issue had become so exacerbated that publishes were simply failing to download (or maybe re-upload, we're still not sure exactly which API requests were failing) the files from their KV namespaces. We could maybe solve our original problem by introducing an artificial delay of ~60s in between publishing new assets and deleting old ones, but if you publish a new version of your site more often than once per minute, you would still run into the "my KV namespace is blowing up" issue.

We (@petebacondarwin, @threepointone and myself) chatted about this and think we're going to leave this behavior as-is in wrangler 1 for the time being -- unless of course the "fix" (reversion) introduced in 1.19.11 doesn't work -- and focus further work on this issue in wrangler 2. Additionally, since we offer a more complete site hosting solution in Cloudflare Pages (docs), once the dust settles here the original complaint is likely to become lower priority. For those interested in moving their projects to Pages, we do have a migration guide that could be helpful. Wrangler 2 will also have dedicated Pages support.

Thank you so much to @ryan-moonwell for filing this issue, and to everyone who posted in the thread for providing valuable feedback and insight, which allowed us to iterate on this rather quickly and reach what is (hopefully!) a working solution. I'll leave this open for now until we've either had reports of successful publishes or the issue goes stale.
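For readers skimming, here is a rough TypeScript sketch of the reverted "delete unused assets" flow described above. The function names and manifest shape are hypothetical stand-ins, not Wrangler's actual internals.

```typescript
// Upload the new deploy's assets, then delete every key that is no longer referenced.
async function syncSiteAssets(
  listExistingKeys: () => Promise<string[]>,
  uploadAssets: (manifest: Map<string, string>) => Promise<void>,
  bulkDelete: (keys: string[]) => Promise<void>,
  manifest: Map<string, string> // filename -> hashed KV key for the new deploy
): Promise<void> {
  const existing = await listExistingKeys();

  // Upload new assets first so the live site never references missing keys.
  await uploadAssets(manifest);

  // Then remove every key that is not part of the new manifest.
  const wanted = new Set(manifest.values());
  const stale = existing.filter((key) => !wanted.has(key));
  if (stale.length > 0) {
    await bulkDelete(stale);
  }
}
```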
We discovered critical issues with the way we expire unused assets with `[site]` (see #666, cloudflare/wrangler-legacy#2224), so we're going back to the legacy manner of handling unused assets, i.e. deleting unused assets. Fixes #666
I don't understand that explanation. The expiry approach should have at most tripled the work (download X old files, upload to expire X old files, upload Y new files) versus the approach of just uploading Y new files, and I'd expect X to be proportional to Y. To put it another way, if this approach doesn't work for 1,000 files, I'd expect the old approach to stop working at 3,000 files, which is still a problem. It'd be a problem even if there were an API to set expiration on a key without uploading a value. If X is much larger than Y because people were using versions of wrangler that didn't delete anything for a long time, and rate limiting isn't handled well enough to get a slow but eventually successful first publish on a namespace that's become bloated with old files, then the easiest way for a customer to deal with that is just to rename the project and delete the old KV namespace.
We'll have to revisit the expiration logic; it's not clear what we were doing wrong. It also stands that we'll have to add rate-limiting logic on our side at least, in addition to whatever else we were missing. That said, I'll assume this issue is fixed and folks are unblocked from making Sites deploys. Closing this issue.
I'm still having problems with wrangler v1.19.11 that might be related to this. It throws an error when deleting stale files, with no error message:

```
💁 Deleting stale files...
Error:
```

Deleting stale files eventually works if I run the wrangler deploy command once again after it has failed. It was working OK soon after v1.19.11 was released; however, the above error popped up today (or I only noticed it today). I'm not sure if it is related, but we deploy around 20K files (for around 5000 pages).
@jdddog - deleting 20K assets could be problematic in itself, with or without this fix... The API can only handle a few thousand assets at a time, so Wrangler should be batching them up. It is possible that an individual batch is failing for some reason. This would explain why the deletion finally works after a number of attempts, since some of the batches might be succeeding. 🤔
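As a rough illustration of that batching idea (the batch size and helper names are hypothetical, not Wrangler's actual code):

```typescript
// Split a large key list into fixed-size batches before calling a bulk-delete endpoint.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

async function deleteInBatches(
  keys: string[],
  bulkDelete: (batch: string[]) => Promise<void>,
  batchSize = 5000 // illustrative; keep each request under the API's per-call limit
): Promise<void> {
  for (const batch of chunk(keys, batchSize)) {
    // If one batch fails, earlier batches have already succeeded, which matches
    // the "works after re-running the deploy" behavior described in this thread.
    await bulkDelete(batch);
  }
}
```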
Thanks @petebacondarwin. I see, that makes sense. It used to work OK without hanging or giving errors; it has only recently become a problem. It is probably more like 10K assets that get updated each time (as some files stay the same and don't get deleted).
We need to add a bit more sophistication in how we do this upload/sync/delete of assets. If you are currently unblocked, then this is likely to be work targeting Wrangler v2.1. |
🐛 Bug report
Describe the bug
Hi there, we use wrangler via https://github.com/cloudflare/wrangler-action to deploy a workers-site, and our builds stopped working within the past few hours.
Diving into why, I noticed that wrangler 1.19.9 was just cut a few hours ago.
This makes me think that the recent change is related. When I downgrade by setting

```yaml
wranglerVersion: '1.19.8'
```

in my GitHub Actions config, things work as expected.

Reproduce the bug
I suspect that if you deploy a site with other assets already uploaded, it will choke; I'm not sure we do anything special that would elicit this behavior.
Expected behavior
I expect the API not to throw an error when running wrangler to deploy my site.
Environment and versions

Fill out the following information about your environment.

- `wrangler -V`: v1.19.9
- `node -v`: https://github.com/cloudflare/wrangler-action
- `wrangler.toml`: happy to share privately