Helm default flag --atomic leads to error when pulling latest #922

Closed
renedupont opened this issue Jul 20, 2022 · 7 comments
Labels
bug Something isn't working

Comments

@renedupont (Member) commented Jul 20, 2022:

Describe the bug
We are using Helm charts, and in them we configured the deployment to pull the latest image, which we built earlier in the pipeline run:

image:
  ...
  tag: "latest"

As you can see here, the latest tag only gets set AFTER helm upgrade has run in the rollout stage. Before that, we only have the commit hash tag that was created in the image build stage.
It is also important to know that we set the Helm default flags --install and --atomic here, where --atomic does the following:

--atomic       if set, upgrade process rolls back changes made in case of failed upgrade. The --wait flag will be set automatically if --atomic is used
--wait         if set, will wait until all Pods, PVCs, Services, and minimum number of Pods of a Deployment, StatefulSet, or ReplicaSet are in a ready state before marking the release as successful. It will wait for as long as --timeout
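
For illustration, with those default flags the rollout stage effectively runs something along these lines (release name, chart path and timeout value are placeholders, not taken from our setup):

  # release name, chart path and timeout are placeholders for illustration
  helm upgrade my-release ./chart --install --atomic --wait --timeout 5m0s

Because --atomic implies --wait, Helm waits for the new Pods to become ready and rolls the release back if they are not ready within the timeout.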

The following happened: we did a pipeline run that successfully built an image which was not able to run due to some misconfiguration. As a consequence, every pipeline run afterwards failed during helm upgrade because --atomic could not complete successfully, and hence the latest tag was never set even though the image was successfully built. Since we are pulling latest, we never pulled the fixed image with the correct configuration but rather the one that had failed in the past, and we were basically stuck.

We "fixed" it by uninstalling the helm release in openshift and then provide via the rollout stage new helm default flags where we just pass --install without --atomic.

...
  odsComponentStageRolloutOpenShiftDeployment(context,
    [helmDefaultFlags: ['--install']])
}

Also, we had to remove the DeploymentConfig that was created by the provisioning process and which we don't use, since our Helm chart uses a Deployment.
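
For reference, that cleanup boils down to something like the following (the resource name is a placeholder):

  # <component-name> is a placeholder for the provisioned component
  oc delete deploymentconfig <component-name>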

Another "workaround" that we discussed was to not pull latest and provide the commit sha tag to helm (via helmValues in jenkinsfile rolloutstage) and just pull that.

@michaelsauter @kuebler

@renedupont added the bug label Jul 20, 2022
@clemensutschig (Member) commented Jul 20, 2022:

@renedupont - #opendevstack/ods-jenkins-shared-library/916
we are currently revamping the whole thing - so if you want to try - enjoy :)

amongst many other things we already pass the image tag as a helm value - and also the registry

@michaelsauter (Member) commented:

Another "workaround" that we discussed was to not pull latest and provide the commit sha tag to helm

I think that would be the best solution long term. Using the latest tag approach is prone to problems.

BTW, when we discussed this yesterday I think we also concluded that the "atomic" behaviour might be applied to the old state, not the state after resources have been updated. It looks to me like #916 might fix that problem because it does not pause Helm deployments (but in turn it doesn't seem to auto-add the RM metadata labels anymore?).

@clemensutschig (Member) commented:

@michaelsauter - feel free to help... baking this was already kind of a hell of a ride :)

@michaelsauter (Member) commented:

@clemensutschig not sure I can be of help at this stage, since you already made a lot of changes and I would need to play a lot of catch-up. But maybe you want to walk me through the PR when it is closer to being done and we can cross-check a few things? There are also a few learnings about Helm that we had in the ODS pipeline ods-deploy-helm task that may be worthwhile to carry over.

BTW:

> in turn it doesn't seem to auto-add the RM metadata labels anymore?

That wasn't meant to indicate that this should be put back in, it was more of a question. As noted in #860 (which is actually somewhat related to what Rene encountered), I consider pausing/unpausing to be problematic and would rather avoid it if possible.

@metmajer (Member) commented:

@serverhorror did you test your PR against this behavior? Shall we close this one?

@serverhorror (Contributor) commented:

@metmajer

#916 uses the git commit tag; there's no latest in the PR that would cause this behaviour. I did not explicitly test this.

jafarre-bi pushed a commit that referenced this issue Jan 27, 2023
Add Helm to possible methods for Component and Release Manager rollouts

* Fixes #916, #866
* Relates #922

Co-authored-by: Martin Marcher <serverhorror@users.noreply.github.com>
Co-authored-by: Martin Marcher <martin.marcher@boehringer-ingelheim.com>
@renedupont (Member, Author) commented:

Closing this, as using the git commit tag as the image tag is by now the default way when using Helm in ODS.
See related issue #1027
