michaelblob had tons of tasks fail very early because its IP was not in the Wasabi whitelist.
We should not let a failing worker (one whose tasks are all failing early) continue to get new tasks.
We should harden the IP change process.
On that second point:
We had an issue in the past, after the change to Postgres, which was linked to the new IP not being recorded in the DB at the time the policy was generated.
michaelblob's IP seems dynamic. 4 days ago it was already failing with 173.73.128.55 because that IP was not in the whitelist. Now it's failing for the same reason with IP 70.108.9.176.
New IP was already recorded in DB when I checked (17:30) but not in whitelist.
Whitelist had been modified on Oct 25, 2023, 2:09 AM. Don't know which worker triggered it. We should log that.
We limit each worker to 4 IP changes per day, but I can't find any trace of this limit being hit in Grafana.
The only call to policy update was this one, so it succeeded.
From this initial information it looks like a similar bug to the previous one (assuming michaelblob triggered that Oct 25 change): we saw the IP change and recorded it, but for some reason the IP is not in the updated list.
This matches with my record_ip_change("michaelblob") fixing it.
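To cover the "we should log that" point, here is a minimal sketch of a structured audit record emitted on every whitelist update. The function name `audit_whitelist_change`, the logger name, and the record fields are all hypothetical; nothing here reflects the existing code.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical logger name; use whatever the project already ships to Grafana.
logger = logging.getLogger("whitelist_audit")


def audit_whitelist_change(worker, old_ip, new_ip, whitelist):
    """Log and return a record of who triggered a whitelist update and with which IPs."""
    record = {
        "event": "whitelist_update",
        "worker": worker,
        "old_ip": old_ip,
        "new_ip": new_ip,
        # The key signal for this bug: was the new IP actually in the list we sent?
        "new_ip_in_whitelist": new_ip in whitelist,
        "whitelist_size": len(whitelist),
        "at": datetime.now(timezone.utc).isoformat(),
    }
    # Escalate to ERROR when the update does not contain the IP that caused it.
    level = logging.INFO if record["new_ip_in_whitelist"] else logging.ERROR
    logger.log(level, json.dumps(record, sort_keys=True))
    return record
```

With something like this, the Oct 25 change would have carried the worker name, and a missing new IP would show up as an error line instead of being found by hand.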
So it looks like the michaelblob worker triggered the policy update.
However, the new IP is NOT in the CreatePolicyVersion operation. Previous IP (I checked, it was still 173.73.128.55) is in the policy. So previous fix was not correct / sufficient.
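One possible way to harden this, sketched under assumptions: pass the new IP explicitly into policy generation instead of relying on a DB read that may still return the previous value, and check the document before it is sent. The helper names, the shape of `recorded_ips`, the bucket ARN, and the action list are hypothetical; the only thing taken from standard IAM policy syntax is the `IpAddress` / `aws:SourceIp` condition.

```python
def build_whitelist(recorded_ips, worker, new_ip):
    """Return the sorted, de-duplicated IP list with `worker` forced to `new_ip`.

    `recorded_ips` is assumed to be a {worker_name: ip} mapping read from the DB.
    """
    ips = dict(recorded_ips)
    ips[worker] = new_ip  # override whatever (possibly stale) value the DB returned
    return sorted(set(ips.values()))


def build_policy_document(whitelist, resource="arn:aws:s3:::example-bucket/*"):
    """Build a policy dict restricting access to the whitelisted source IPs."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],  # placeholder actions
            "Resource": resource,
            "Condition": {"IpAddress": {"aws:SourceIp": whitelist}},
        }],
    }


def check_policy_contains(policy_document, ip):
    """Raise before submitting a policy version that lacks the IP that triggered it."""
    allowed = policy_document["Statement"][0]["Condition"]["IpAddress"]["aws:SourceIp"]
    if ip not in allowed:
        raise ValueError(f"{ip} missing from policy whitelist, refusing to submit")
```

Usage would be along the lines of `doc = build_policy_document(build_whitelist(rows, "michaelblob", "70.108.9.176"))` followed by `check_policy_contains(doc, "70.108.9.176")` just before the CreatePolicyVersion call. The check alone would not fix the root cause, but it would turn a silent bad policy into a visible failure.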
The first part, "we should not let a failing worker (one whose tasks are all failing early) continue to get new tasks", should be moved to its own issue from my PoV. It has been like this for a long time and is probably not a small change, since we currently have many failing tasks for reasons not linked to worker setup.
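For whenever that separate issue gets picked up, a rough sketch of what such a gate could look like: stop handing tasks to a worker only when its last N tasks all failed early, so that ordinary task failures (long-running ones, unrelated to worker setup) do not count. The class name, the window size, and the "early" threshold are hypothetical and would need tuning against real data.

```python
from collections import deque


class EarlyFailureGate:
    """Track recent task outcomes per worker and flag workers that only fail early."""

    def __init__(self, window=20, early_seconds=30.0):
        self.window = window                # how many recent tasks to consider
        self.early_seconds = early_seconds  # a failure faster than this counts as "early"
        self._recent = {}                   # worker -> deque of bools (True = early failure)

    def record(self, worker, succeeded, runtime_seconds):
        """Store whether the finished task was an early failure."""
        early_failure = (not succeeded) and runtime_seconds < self.early_seconds
        history = self._recent.setdefault(worker, deque(maxlen=self.window))
        history.append(early_failure)

    def may_receive_tasks(self, worker):
        """Return False only when the whole window consists of early failures."""
        history = self._recent.get(worker)
        if history is None or len(history) < self.window:
            return True  # not enough evidence yet
        return not all(history)
```

A single success or a single late failure reopens the gate, which keeps this conservative; a real version would probably also need a way to re-probe a blocked worker after its IP is fixed.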