This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

Update prometheus operator 0.43.2 #1162

Merged
merged 5 commits into from
Nov 26, 2020

Conversation

ipochi
Member

@ipochi ipochi commented Nov 4, 2020

This PR updates the Prometheus operator component to the latest upstream chart that is now renamed from prometheus-operator to kube-prometheus-stack.

Release notes:

Steps 1-4 must be performed before updating the cluster, i.e. before running lokoctl cluster apply.
Steps 5 and 6 must be performed after updating the cluster, i.e. after running lokoctl cluster apply.

Upgrading this component requires downtime, and some steps must be executed manually before upgrading:

  1. Patch the PersistentVolumes created/used by the prometheus-operator chart to the Retain reclaim policy:

    kubectl patch pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-prometheus-prometheus-operator-prometheus-0")].metadata.name}')
    
    kubectl patch pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-alertmanager-prometheus-operator-alertmanager-0")].metadata.name}')

    Note: To execute the above commands, the user must have cluster-wide permissions.

  2. Uninstall the prometheus-operator release, delete the existing PersistentVolumeClaims, and verify that the PVs become Released:

    lokoctl component delete prometheus-operator
    kubectl delete pvc/data-prometheus-prometheus-operator-prometheus-0 -n monitoring
    kubectl delete pvc/data-alertmanager-prometheus-operator-alertmanager-0 -n monitoring

    You can also choose to remove all your existing custom resources (ServiceMonitors, PodMonitors, etc.) if you want to.

  3. Remove the current spec.claimRef values to change the PVs' status from Released to Available:

    kubectl patch pv --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-prometheus-prometheus-operator-prometheus-0")].metadata.name}')
    
    kubectl patch pv --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-alertmanager-prometheus-operator-alertmanager-0")].metadata.name}')

    Note: To execute the above commands, the user must have cluster-wide permissions.

  4. Make sure that the storage_class and prometheus.storage_size are unchanged during the upgrade process.

  5. If monitoring was enabled for the rook, contour or metallb components, then the user needs to update those components as well, otherwise metrics from them won't be collected. Backwards compatibility exists, so it is not mandatory to update them before installing the updated prometheus-operator component.

  6. Proceed with a fresh prometheus-operator component installation. The new release should re-attach your previously released PVs with their contents.
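For readers who want the manual part in one place, steps 1-3 can be sketched as a loop over the two default PVC names. This is a hypothetical sketch: the kubectl/lokoctl calls are shown as comments, and only a small helper that builds the jsonpath query is executable here.

```shell
# Hypothetical sketch of steps 1-3, assuming the default PVC names created
# by the prometheus-operator chart in the "monitoring" namespace.
# The helper builds the jsonpath query that resolves a PVC name to its
# backing PV, so the expression is not repeated for every command.
pv_jsonpath() {
  printf '{.items[?(@.spec.claimRef.name=="%s")].metadata.name}' "$1"
}

for pvc in data-prometheus-prometheus-operator-prometheus-0 \
           data-alertmanager-prometheus-operator-alertmanager-0; do
  # Step 1 (requires cluster-wide permissions): retain the backing PV.
  #   pv=$(kubectl get pv -o jsonpath="$(pv_jsonpath "$pvc")")
  #   kubectl patch pv "$pv" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
  # Step 2: after `lokoctl component delete prometheus-operator`, delete the PVC.
  #   kubectl delete "pvc/$pvc" -n monitoring
  # Step 3: clear claimRef so the PV moves from Released to Available.
  #   kubectl patch pv "$pv" --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]'
  pv_jsonpath "$pvc"; echo
done
```

The helper is only for readability; the expanded commands are identical to the ones listed in the steps above.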

Closes #1079

@ipochi ipochi requested review from surajssd and invidian November 4, 2020 09:44
@ipochi ipochi force-pushed the imran/update-prometheus-operator-0.43.0 branch from c669822 to ba2b390 Compare November 4, 2020 10:32
Member

@invidian invidian left a comment


Just one question, otherwise LGTM.

@ipochi ipochi force-pushed the imran/update-prometheus-operator-0.43.0 branch 4 times, most recently from 7c95444 to 7e68844 Compare November 5, 2020 14:20
@ipochi ipochi requested a review from invidian November 5, 2020 16:40
invidian
invidian previously approved these changes Nov 5, 2020
Member

@invidian invidian left a comment


LGTM

In step 5 of the release notes, I would add "... otherwise metrics from those components won't be collected."

@ipochi
Member Author

ipochi commented Nov 5, 2020

LGTM

In step 5 of the release notes, I would add "... otherwise metrics from those components won't be collected."

done.

@ipochi ipochi requested a review from iaguis November 5, 2020 16:50
Member

@surajssd surajssd left a comment


In step 2 of the PR description:

kubectl delete pvc/<PersistenceVolumeClaim name> -n monitoring

Since we know what the PVCs are called in the monitoring namespace, we should list them explicitly. If users have their own resources there, we just add a caveat that they handle those on their own.

In step 1 & step 3:

Since we already know the names of the PVCs, we can extract the PV names from them; the user does not have to bother with finding names.

The names we found in step 1 can be used in step 3.

@ipochi
Member Author

ipochi commented Nov 9, 2020

In step 2 of the PR description:

kubectl delete pvc/<PersistenceVolumeClaim name> -n monitoring

Since we know what the PVCs are called in the monitoring namespace, we should list them explicitly. If users have their own resources there, we just add a caveat that they handle those on their own.

In step 1 & step 3:

Since we already know the names of the PVCs, we can extract the PV names from them; the user does not have to bother with finding names.

The names we found in step 1 can be used in step 3.

@surajssd For step 1, would this be legible?

kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-alertmanager-prometheus-operator-alertmanager-0")].metadata.name}' | xargs kubectl patch pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-prometheus-prometheus-operator-prometheus-0")].metadata.name}' | xargs kubectl patch pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

and similar for step 3.

@surajssd
Member

surajssd commented Nov 9, 2020

@ipochi yes, for step 1 it works fine. But then in step 3 we won't have those PVs, so the above steps won't work. We might need to save them in intermediate vars.
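The intermediate-variable approach could look like this (a hypothetical sketch: the kubectl lookups are shown as comments, with placeholder values standing in for the real PV names):

```shell
# Hypothetical sketch: resolve the PV names once in step 1 and keep them in
# shell variables, so step 3 can reuse them after the PVCs are deleted.
# In a real cluster the variables would come from the jsonpath lookups:
#   PROM_PV=$(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-prometheus-prometheus-operator-prometheus-0")].metadata.name}')
#   ALERT_PV=$(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-alertmanager-prometheus-operator-alertmanager-0")].metadata.name}')
PROM_PV=pvc-1111   # placeholder PV name
ALERT_PV=pvc-2222  # placeholder PV name

# Step 1: both PVs can be patched in a single call, and the names are kept.
#   kubectl patch pv "$PROM_PV" "$ALERT_PV" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

# Step 3: the saved names remain valid even after the PVCs are deleted.
#   kubectl patch pv "$PROM_PV" "$ALERT_PV" --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]'
echo "$PROM_PV $ALERT_PV"
```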

@ipochi
Member Author

ipochi commented Nov 9, 2020

@ipochi yes, for step 1 it works fine. But then in step 3 we won't have those PVs, so the above steps won't work. We might need to save them in intermediate vars.

@surajssd For step 3, we need to change the name of the PVCs.

 kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-alertmanager-prometheus-operator-kube-p-alertmanager-0")].metadata.name}' | xargs kubectl patch pv --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]'

@invidian
Member

invidian commented Nov 9, 2020

Why use xargs instead of $()?

@ipochi
Member Author

ipochi commented Nov 9, 2020

Why use xargs instead of $()?

That works too. My question is "Is it still legible?"

@invidian
Member

invidian commented Nov 9, 2020

That works too. My question is "Is it still legible?"

Seems like a common pattern to me, so I don't mind. $() is more common than xargs, IMO.

@ipochi ipochi force-pushed the imran/update-prometheus-operator-0.43.0 branch from 7e68844 to ed46b72 Compare November 9, 2020 10:53
@ipochi
Member Author

ipochi commented Nov 9, 2020

@surajssd @invidian Updated release notes with the desired changes.

@ipochi ipochi dismissed surajssd’s stale review November 9, 2020 10:54

Addressed. Please re-review.

@ipochi ipochi requested review from surajssd and invidian November 9, 2020 10:54
@invidian
Member

invidian commented Nov 9, 2020

I think the changelog can still be simplified by using an OR operator to run just one command instead of two.

@ipochi
Member Author

ipochi commented Nov 9, 2020

@surajssd @invidian

This step doesn't work if the PVC name is used to get the PV.

  1. Remove current spec.claimRef values to change the PV's status from Released to Available.
    kubectl patch pv --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-prometheus-prometheus-operator-kube-p-prometheus-0")].metadata.name}')
    
    kubectl patch pv --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-alertmanager-prometheus-operator-kube-p-alertmanager-0")].metadata.name}')
    

Not sure what the problem is here. The command works, but the status of the second PVC changes to Bound if the first command is applied, and vice versa.

Changing the claimRef with the PV name works fine. The same problem occurs with xargs.

@invidian
Member

invidian commented Nov 9, 2020

Not sure what the problem is here. The command works, but the status of the second PVC changes to Bound if the first command is applied, and vice versa.

I'll have a look.

@invidian
Member

invidian commented Nov 9, 2020

We could do the following to perform just one patch operation 😄 But it seems less readable to me:

kubectl patch pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' $(for i in data-prometheus-prometheus-operator-prometheus-0 data-alertmanager-prometheus-operator-alertmanager-0; do kubectl get pv -o jsonpath="{.items[?(@.spec.claimRef.name=='$i')].metadata.name}"; echo -n " "; done; echo)

@invidian
Member

invidian commented Nov 9, 2020

The names in step 3 are definitely wrong:

-kubectl patch pv --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-prometheus-prometheus-operator-kube-p-prometheus-0")].metadata.name}')
-kubectl patch pv --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-alertmanager-prometheus-operator-kube-p-alertmanager-0")].metadata.name}')
+kubectl patch pv --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-prometheus-prometheus-operator-prometheus-0")].metadata.name}')
+kubectl patch pv --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-alertmanager-prometheus-operator-alertmanager-0")].metadata.name}')

@ipochi ipochi force-pushed the imran/update-prometheus-operator-0.43.0 branch 2 times, most recently from 0cbc489 to 7645a64 Compare November 24, 2020 09:14
This commit also updates sub-charts: grafana, kube-state-metrics and
node-exporter

New versions:
  * Grafana to 7.2.1
  * Prometheus Operator to v0.43.2
  * kube-state-metrics to v1.9.7
  * node_exporter to v1.0.1

Signed-off-by: Imran Pochi <imran@kinvolk.io>
This commit updates the names of resources such as Deployments,
StatefulSets, DaemonSets and Pods wherever needed to reflect the new
names in the Helm chart.

Signed-off-by: Imran Pochi <imran@kinvolk.io>
@ipochi
Member Author

ipochi commented Nov 24, 2020

@invidian @iaguis
The build is passing. Feedback received on metrics backward compatibility is retained, as mentioned here.

Please review.

@ipochi ipochi requested review from johananl and invidian November 24, 2020 13:26
invidian
invidian previously approved these changes Nov 25, 2020
Member

@surajssd surajssd left a comment


small nits otherwise LGTM

@ipochi ipochi force-pushed the imran/update-prometheus-operator-0.43.0 branch 2 times, most recently from 7fb6924 to d788908 Compare November 25, 2020 11:10
@ipochi ipochi requested review from invidian and surajssd November 25, 2020 11:11
surajssd
surajssd previously approved these changes Nov 25, 2020
Member

@surajssd surajssd left a comment


LGTM

invidian
invidian previously approved these changes Nov 25, 2020
Member

@invidian invidian left a comment


LGTM

I added a comment in #1212 that maybe we can structure the commits here so that for the next release we only need to revert this commit, without manual changes.

This commit updates the labels used by dependent components and etcd for
scraping metrics.

This change is needed because the Prometheus operator Helm chart is now
renamed to kube-prometheus-stack and as a result uses a new label to
pick up custom alerts.

old label:
```
app: prometheus-operator
```

new label:
```
app: kube-prometheus-stack
```

Signed-off-by: Imran Pochi <imran@kinvolk.io>
NOTE - This commit is to be removed from the branch once a newer version
of Lokomotive post v0.6.0 is released. It is added to ensure backwards
compatibility with the older prometheus-operator component by undoing
the changes made in the previous commit; hence it is a temporary commit
in the branch.

This commit creates the following additional resources:

- Endpoint and Service objects for etcd.
- PrometheusRule object for Contour.
- PrometheusRule object for Rook.
- PrometheusRule object for Metallb.

The new resources created above differ only in name and label, offering
compatibility with the updated Prometheus Operator. The old resources
offer backward compatibility, meaning that if the cluster is updated,
etcd metrics collection won't be broken, and if the prometheus-operator
component is updated, the metrics collection of contour, rook and
metallb won't be broken even if those components are updated before
updating prometheus-operator.

Old label:
```
app: prometheus-operator
```

New label:
```
app: kube-prometheus-stack
```

Signed-off-by: Imran Pochi <imran@kinvolk.io>
Signed-off-by: Imran Pochi <imran@kinvolk.io>
@ipochi ipochi dismissed stale reviews from invidian and surajssd via 63fa436 November 25, 2020 15:36
@ipochi ipochi force-pushed the imran/update-prometheus-operator-0.43.0 branch from d788908 to 63fa436 Compare November 25, 2020 15:36
@ipochi
Member Author

ipochi commented Nov 25, 2020

LGTM

I added a comment in #1212 that maybe we can structure the commits here so that for the next release we only need to revert this commit, without manual changes.

@invidian Just did that :)

Now the commit should be removed from the branch once we are ready to roll out a newer version of Lokomotive > 0.6.0

Member

@invidian invidian left a comment


LGTM

@invidian Just did that :)

Thanks! It's looking good as far as I see 👍

@ipochi ipochi requested a review from surajssd November 25, 2020 15:46
@ipochi ipochi merged commit 2e2e430 into master Nov 26, 2020
@ipochi ipochi deleted the imran/update-prometheus-operator-0.43.0 branch November 26, 2020 08:07
@invidian invidian mentioned this pull request Dec 7, 2020