Skip to content
This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Sync chart mirror on chart spec change to prevent incorrect reconciliation #573

Merged
merged 1 commit into from
Dec 10, 2020

Conversation

amit-handda
Copy link
Contributor

@amit-handda amit-handda commented Dec 4, 2020

This PR fixes the issue observed in #562
if there is a change in the chartsource (ref change, eg), its possible that the mirror's ref-sha be obsolete w.r.t. upstream repo's ref-sha. To avoid spurious deploy, mirror is synced before doing helm reconciliation.

CHANGELOG :
Bug fixes:
operator: incase of a helmrelease's chartsource update, to prevent incorrect helm reconciliation, sync chart mirror beforehand

// repo's ref-sha. To avoid spurious deploy, mirror is synced before
// doing helm reconciliation
if csDiff := cmp.Diff(oldHr.Spec.ChartSource,newHr.Spec.ChartSource); csDiff != "" {
c.apiServer.SyncMirrors()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see how this is different from what the controller already does, as during the preparation stage GetMirrorCopy is called, which automatically results in an additional sync:

s, ok, err := c.sync(hr, mirror, repo)

Copy link
Contributor Author

@amit-handda amit-handda Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for your kind review. IIUC, this sync ^ is between helmrelease sourceref and helm. It assumes Mirror is uptodate and works to reconcile helmrelease.
while c.apiServer.SyncMirrors() <- this one is basically git fetch --tags origin for each Mirror.
this ensures the Mirror is uptodate before doing helmrelease sync. I trigger it only if there is a change in helmrelease.spec.chartsource
Please let me know if I am wrong. I tested this behaviour locally, before creating a PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are indeed correct, what worries me about using SyncMirrors it that it triggers it for each mirror, while the task is just for a single repository. This is depending on the amount of repositories it mirrors, either inefficient or really inefficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it would be inefficient. For us, we are only using single repo.
Before doing that optimization, wanted a goahead from the project authors.
I can try my hand at it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have my blessings to go ahead and change it, but be aware that the efforts will likely have a short lifetime, as we are slowly moving users to the helm-controller.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thanks. aware of the short lifetime. this fix is important to us since we are using it in production and its impacting our operations. we want to fix it before migrating to v2.

pkg/operator/operator.go Outdated Show resolved Hide resolved
pkg/api/api.go Outdated Show resolved Hide resolved
Copy link
Member

@hiddeco hiddeco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @amit-handda, thank you for following up 🥇

Before I merge, can you please smash your commits into one?

Signed-off-by: amit handa <amit.handa@doordash.com>
@amit-handda
Copy link
Contributor Author

Thanks @hiddeco, its a pleasure. merged the commits.

@hiddeco hiddeco changed the title to prevent incorrect helm reconciliation, sync chart mirror beforehand Sync chart mirror on chart spec change to prevent incorrect reconciliation Dec 10, 2020
@hiddeco hiddeco merged commit 797fe26 into fluxcd:master Dec 10, 2020
@amit-handda
Copy link
Contributor Author

@hiddeco morning ! may I know when a release will be cut from the master ? we would like to use the bug fix in our clusters ASAP. if the release is far away, please let me know if there is any way we could use a release based on master. thanks

@hiddeco
Copy link
Member

hiddeco commented Dec 14, 2020

@amit-handda there are pre-release builds available at https://hub.docker.com/r/fluxcd/helm-operator-prerelease/tags

@amit-handda
Copy link
Contributor Author

thanks @hiddeco. I am running and testing the prerelease into our cluster. I faced repeated upgrade helm deploys for a release using that image. not sure about the reason for the same. when I compare the manifest diff (between revision 43-48), I get nothing.

I reverted to helm-operator v1.1.0, it upgraded as well (revision 48) once but stopped thereafter.

41      	Tue Dec 15 00:05:32 2020	superseded	coredns-ds-0.1.0	1.1.3      	Upgrade complete
42      	Tue Dec 15 00:14:05 2020	superseded	coredns-ds-0.1.0	1.1.3      	Upgrade complete
43      	Tue Dec 15 00:22:18 2020	superseded	coredns-ds-0.1.0	1.1.3      	Upgrade complete
44      	Tue Dec 15 00:31:34 2020	superseded	coredns-ds-0.1.0	1.1.3      	Upgrade complete
45      	Tue Dec 15 00:40:57 2020	superseded	coredns-ds-0.1.0	1.1.3      	Upgrade complete
46      	Tue Dec 15 00:50:19 2020	superseded	coredns-ds-0.1.0	1.1.3      	Upgrade complete
47      	Tue Dec 15 00:58:24 2020	superseded	coredns-ds-0.1.0	1.1.3      	Upgrade complete
48      	Tue Dec 15 01:04:46 2020	deployed  	coredns-ds-0.1.0	1.1.3      	Upgrade complete

helm-operator v1.1.0 logs

ts=2020-12-15T01:04:35.025574026Z caller=release.go:75 component=release release=coredns-ds targetNamespace=infra-system resource=infra-system:helmrelease/coredns-ds helmVersion=v3 info="starting sync run"
ts=2020-12-15T01:04:44.324291244Z caller=release.go:289 component=release release=coredns-ds targetNamespace=infra-system resource=infra-system:helmrelease/coredns-ds helmVersion=v3 info="running upgrade" action=upgrade
ts=2020-12-15T01:04:44.420580682Z caller=helm.go:69 component=helm version=v3 info="preparing upgrade for coredns-ds" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T01:04:44.428575277Z caller=helm.go:69 component=helm version=v3 info="resetting values to the chart's original version" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T01:04:50.819186524Z caller=helm.go:69 component=helm version=v3 info="performing update for coredns-ds" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T01:04:51.120086301Z caller=helm.go:69 component=helm version=v3 info="creating upgraded release for coredns-ds" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T01:04:51.319565818Z caller=helm.go:69 component=helm version=v3 info="checking 4 resources for changes" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T01:04:51.325245619Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ServiceAccount \"coredns-ds\"" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T01:04:51.331345804Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ConfigMap \"coredns-ds\"" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T01:04:51.421809622Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"system:coredns-ds-infra-system\"" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T01:04:51.722961744Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for DaemonSet \"coredns-ds\"" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T01:04:51.7441015Z caller=helm.go:69 component=helm version=v3 info="updating status for upgraded release for coredns-ds" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T01:04:51.919348374Z caller=release.go:309 component=release release=coredns-ds targetNamespace=infra-system resource=infra-system:helmrelease/coredns-ds helmVersion=v3 info="upgrade succeeded" revision=2e248a4f7af84808dca04ea36ac087e63a07a0f1 phase=upgrade

helm-release prerelease logs are similar. but hard to pinpoint what triggers the upgrade.

ts=2020-12-15T00:31:20.921104181Z caller=release.go:79 component=release release=coredns-ds targetNamespace=infra-system resource=infra-system:helmrelease/coredns-ds helmVersion=v3 info="starting sync run"
ts=2020-12-15T00:31:30.21873566Z caller=release.go:353 component=release release=coredns-ds targetNamespace=infra-system resource=infra-system:helmrelease/coredns-ds helmVersion=v3 info="running upgrade" action=upgrade
ts=2020-12-15T00:31:30.327099447Z caller=helm.go:69 component=helm version=v3 info="preparing upgrade for coredns-ds" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T00:31:30.419331277Z caller=helm.go:69 component=helm version=v3 info="resetting values to the chart's original version" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T00:31:37.925599868Z caller=helm.go:69 component=helm version=v3 info="performing update for coredns-ds" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T00:31:38.220623061Z caller=helm.go:69 component=helm version=v3 info="creating upgraded release for coredns-ds" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T00:31:38.337627676Z caller=helm.go:69 component=helm version=v3 info="checking 4 resources for changes" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T00:31:38.422926519Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ServiceAccount \"coredns-ds\"" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T00:31:38.522672542Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ConfigMap \"coredns-ds\"" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T00:31:38.537033747Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"system:coredns-ds-infra-system\"" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T00:31:38.627546945Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for DaemonSet \"coredns-ds\"" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T00:31:38.818835129Z caller=helm.go:69 component=helm version=v3 info="updating status for upgraded release for coredns-ds" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T00:31:39.219042077Z caller=release.go:364 component=release release=coredns-ds targetNamespace=infra-system resource=infra-system:helmrelease/coredns-ds helmVersion=v3 info="upgrade succeeded" revision=2e248a4f7af84808dca04ea36ac087e63a07a0f1 phase=upgrade

I am digging into it.

@amit-handda
Copy link
Contributor Author

from logs, it seems

changed = func() bool {

it decides the files are changed and upgrades the deploy. but why repeatedly ?

@amit-handda
Copy link
Contributor Author

@hiddeco I think the bug is the following. lastAttemptedRevision is not updated to revision.
but I cant figure out the why, from reading the code :(

  lastAttemptedRevision: 0ae69df149b7fd691e2370a6f3989a35d8a3279c
  observedGeneration: 20
  phase: Succeeded
  releaseName: coredns-ds
  releaseStatus: deployed
  revision: c2f674826e963aa5eaf92f5f968c1ce0bf6168fa

which leads to repeated upgrades as following:

ts=2020-12-15T19:03:58.421169076Z caller=release.go:79 component=release release=coredns-ds targetNamespace=infra-system resource=infra-system:helmrelease/coredns-ds helmVersion=v3 info="starting sync run"
ts=2020-12-15T19:04:09.818923841Z caller=release.go:353 component=release release=coredns-ds targetNamespace=infra-system resource=infra-system:helmrelease/coredns-ds helmVersion=v3 info="running upgrade" action=upgrade
ts=2020-12-15T19:04:11.020610242Z caller=helm.go:69 component=helm version=v3 info="preparing upgrade for coredns-ds" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T19:04:11.030409308Z caller=helm.go:69 component=helm version=v3 info="resetting values to the chart's original version" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T19:04:16.12112867Z caller=helm.go:69 component=helm version=v3 info="performing update for coredns-ds" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T19:04:16.421382615Z caller=helm.go:69 component=helm version=v3 info="creating upgraded release for coredns-ds" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T19:04:17.119197762Z caller=helm.go:69 component=helm version=v3 info="checking 4 resources for changes" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T19:04:17.223713507Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ServiceAccount \"coredns-ds\"" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T19:04:17.22996025Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ConfigMap \"coredns-ds\"" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T19:04:17.235995814Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"system:coredns-ds-infra-system\"" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T19:04:17.324402092Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for DaemonSet \"coredns-ds\"" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T19:04:17.430592934Z caller=helm.go:69 component=helm version=v3 info="updating status for upgraded release for coredns-ds" targetNamespace=infra-system release=coredns-ds
ts=2020-12-15T19:04:17.618977793Z caller=release.go:364 component=release release=coredns-ds targetNamespace=infra-system resource=infra-system:helmrelease/coredns-ds helmVersion=v3 info="upgrade succeeded" revision=c2f674826e963aa5eaf92f5f968c1ce0bf6168fa phase=upgrade

@amit-handda
Copy link
Contributor Author

amit-handda commented Dec 15, 2020

should I create an issue to track it ?
Also, I used log-release-diffs flag. but its not useful, it logs only dry-run release diffs, here, the chart deploy goes straight to upgradeaction.

UPDATE:
current helm-operator version in prod (1.1.0) doesnt show same behaviour. lastAttemptedRevision is updated correctly.

  lastAttemptedRevision: c2f674826e963aa5eaf92f5f968c1ce0bf6168fa
  observedGeneration: 20
  phase: Succeeded
  releaseName: coredns-ds
  releaseStatus: deployed
  revision: c2f674826e963aa5eaf92f5f968c1ce0bf6168fa

@hiddeco
Copy link
Member

hiddeco commented Feb 11, 2021

@amit-handda were you able to debug this further? I have been looking into it as I am planning to prepare a release, but have been unable to reproduce the issue.

One thing that comes to mind is that you may have an older version of the CRD applied? This has been the cause of spurious upgrades before...

@fredr
Copy link
Contributor

fredr commented May 19, 2021

@hiddeco @amit-handda I can confirm that if you upgrade from 1.1.0 to 1.2.0 without updating your CRDs, and then update a chart, it will update forever without being able to update lastAttemptedRevision. I guess it is somehow related to the new Phases that where introduced in 1.2.0

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants