Downtime On initial deployment using Istio destination rule #2507
Comments
@zachaller please take a look, thanks.
Another user had issues: https://cloud-native.slack.com/archives/C01U781DW2E/p1674472008986509
+1, we are also experiencing this issue. Is there an estimate for when it will be fixed? @zachaller
We have a workaround to mitigate this downtime for the initial deployment: after the initial deployment, we removed …
We just saw this happen in our infrastructure. I think this happens on every deployment, not just on the initial creation of the Rollout resource. I don't see any "wait for the canary RS to be available" step before updating the Istio VirtualService, which means traffic gets routed to the canary pods before they have spun up. In our case (gRPC traffic) this causes a "no healthy upstream" error. Can we please add a similar check here? As a workaround, we have added a first step that pauses for a configured duration so that the canary ReplicaSet can come up before it starts serving traffic.
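The pause workaround described above can be sketched as the first canary step. A minimal sketch, assuming an Istio subset-level setup; the resource names and the 60s duration are illustrative, not from the thread:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-rollout                  # illustrative name
spec:
  strategy:
    canary:
      trafficRouting:
        istio:
          destinationRule:
            name: my-destination-rule   # illustrative name
            canarySubsetName: canary
            stableSubsetName: stable
      steps:
      - pause: {duration: 60s}      # give the canary RS time to become available
      - setWeight: 20
      # ... remaining steps
```

This does not fix the race; it only makes it unlikely that traffic is shifted before the canary pods are ready.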
@zachaller We also came across this bug while implementing the Argo Rollouts process for Istio. Is there any chance this problem will be fixed?
From my extensive testing while creating a fix for this, I noticed that on the final step of a canary release the DestinationRule is also updated to 100% before the canary ReplicaSet is at full desired capacity, which can limit availability as well. My PR above fixes that issue too.
…estinationRule Subsets. Fixes argoproj#2507 Signed-off-by: Wietse Muizelaar <wmuizelaar@bol.com>
We are also experiencing this error. I noticed it happened even if I create the … first; I created a new … and saw the same behavior.
We're looking forward to this fix as well.
Describe the bug
When we initialized a rollout for a deployment with Istio traffic routing, we found that the rollout controller created a new RS from the current RS template with a new label key, rollouts-pod-template-hash, and updated the Istio DR with that label immediately, without considering whether the new RS was available. This leads to HTTP 503 errors ("no healthy upstream").
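For context, the DestinationRule managed by the controller looks roughly like this after the hash label is injected into the subsets. A sketch only; the host name and hash values are illustrative:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-service                  # illustrative name
spec:
  host: my-service
  subsets:
  - name: stable
    labels:
      rollouts-pod-template-hash: 5f7b8c9d4   # existing RS
  - name: canary
    labels:
      rollouts-pod-template-hash: 6c8d9e0f5   # new RS; written before the RS is Available
```

Once the canary subset label points at the new RS, Istio routes matching traffic there even if that RS has zero ready pods, which is what produces "no healthy upstream".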
To Reproduce
Create a rollout for a deployment with Istio traffic routing. Immediately after the new RS is created, requests fail with a "no healthy upstream" error because the RS is not yet ready for traffic.

Expected behavior
The rollout controller should only shift traffic to the rollout's RS once the RS is available.
Version
argo-rollouts: v1.2.2
Could we add checks to ensure the new RS is available before updating the Istio DR, the same way ensureSVCTarget does?
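The kind of gate being requested can be sketched as follows. This is a hypothetical standalone sketch, not argo-rollouts code; the type and function names are illustrative, and the real controller would read these values from the ReplicaSet status:

```go
package main

import "fmt"

// ReplicaSetInfo is a hypothetical stand-in for the RS fields the
// controller would inspect (spec.replicas and status.availableReplicas).
type ReplicaSetInfo struct {
	Desired   int32
	Available int32
}

// canShiftTraffic reports whether the DR update should proceed:
// only shift traffic once the new RS has all desired replicas available.
func canShiftTraffic(rs ReplicaSetInfo) bool {
	return rs.Desired > 0 && rs.Available >= rs.Desired
}

func main() {
	fmt.Println(canShiftTraffic(ReplicaSetInfo{Desired: 3, Available: 0})) // false
	fmt.Println(canShiftTraffic(ReplicaSetInfo{Desired: 3, Available: 3})) // true
}
```

A real fix would requeue the rollout and retry the DR update until this condition holds, rather than blocking.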
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.