Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add machine images info to labels #689

Merged
merged 3 commits into from
Mar 6, 2024

Conversation

tedteng
Copy link
Contributor

@tedteng tedteng commented Jan 29, 2024

How to categorize this PR?

/area ops-productivity
/kind enhancement
/platform gcp

What this PR does / why we need it:
add os information as labels in machine class objects
image

Which issue(s) this PR fixes:
Fixes gardener/machine-controller-manager#880

Special notes for your reviewer:

Release note:

add os information as labels in machine class objects.

/invite @rishabh-11

@tedteng tedteng requested review from a team as code owners January 29, 2024 02:29
@gardener-robot gardener-robot added the needs/review Needs review label Jan 29, 2024
@gardener-robot gardener-robot added area/ops-productivity Operator productivity related (how to improve operations) kind/enhancement Enhancement, improvement, extension platform/gcp Google cloud platform/infrastructure size/xs Size of pull request is tiny (see gardener-robot robot/bots/size.py) labels Jan 29, 2024
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 29, 2024
@gardener-robot-ci-2 gardener-robot-ci-2 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 29, 2024
@tedteng
Copy link
Contributor Author

tedteng commented Jan 29, 2024

Hey, @rishabh-11 Take GCP as the first sample, if everything is all good, will open the rest of the others. (AWS,Azure,Alicloud,CCloud)

Comment on lines 181 to 183
if pool.MachineImage.Name != "" && pool.MachineImage.Version != "" {
gceInstanceLabels["machine_image_name"] = pool.MachineImage.Name
gceInstanceLabels["machine_image_version"] = pool.MachineImage.Version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is an example error when trying to run the change as is:

failed: googleapi: Error 400: Invalid value for field ''resource.labels'': ''''.
      Label value ''934.10.0'' violates format constraints. The value can only contain
      lowercase letters, numeric characters, underscores and dashes. The value can
      be at most 63 characters long. International characters are allowed., invalid]'

The issue is that both key and values need to adhere to gcp conventions and semver that is used in gardenlinux for example doesn't adhere to that (dots being the issue here). You need to sanitize them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ff28099
btw how did you get this error return? is it from the integration test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You get this error if you try to create a machine with invalid GCP labels. The error is from GCP API itself

@gardener-robot gardener-robot added the needs/changes Needs (more) changes label Jan 29, 2024
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 30, 2024
@@ -177,6 +177,12 @@ func (w *workerDelegate) generateMachineConfig(_ context.Context) error {
}

gceInstanceLabels := getGceInstanceLabels(w.worker.Name, pool)

if pool.MachineImage.Name != "" && pool.MachineImage.Version != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not move this code into function getGceInstanceLabels itself ? anyways MachineImage/Version fields are part of pool obj which is passed to getGceInstanceLabels

@kon-angelo
Copy link
Contributor

Actually after reading the issue that this PR tries to solve I am not convinced about the approach here. What is exactly the end-goal that you are trying to achieve ? Should the annotations be in the machine, machineclass etc. or in the actual virtual machine or the node object?

The place you are changing now would add labels to the node object and the VM in gcp itself, which the original ticket does not mention as a solution. I would suggest that before considering this change, maybe you add a comment in the initial ticket with the exact solution you are trying to achieve, specifying all the places that you want to add these labels. We can then verify the approach first and review the subsequent PRs accordingly.

/hold

@gardener-robot gardener-robot added the reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies label Jan 30, 2024
@tedteng
Copy link
Contributor Author

tedteng commented Jan 31, 2024

What is exactly the end-goal that you are trying to achieve?

The main idea from our SRE colleague is to extract the os name and os version data from any one of them (machine, machineDeployment, machineclass, machinesets, node) to generate our report. The specific location of the data, whether in labels, annotations or the spec of these objects, is not a concern.

Should the annotations be in the machine, machineclass etc., or the actual virtual machine or the node object?

per sync with MCM colleague, The MCM is more independent and cannot directly retrieve OS name and version information from Gardener's cloudprofile or worker objects (machine image information) in the source code. gardener/machine-controller-manager#880 (comment) Consequently, we propose adding data as labels in machineclass object, extending the labels in generatemachineconfig.

Please feel free to suggest any alternative solutions if available.

cc @rishabh-11 @etiennnr

@kon-angelo
Copy link
Contributor

The issue does entirely lie with this PR. Let's look at the machineclass:

https://github.com/gardener/gardener-extension-provider-gcp/blob/6c91b945ead507f2b38d0b2aa614f2f1b64b34e0/charts/internal/machineclass/templates/machineclass.yaml#L21C1-L24C11 and https://github.com/gardener/gardener-extension-provider-gcp/blob/6c91b945ead507f2b38d0b2aa614f2f1b64b34e0/charts/internal/machineclass/templates/machineclass.yaml#L43C1-L46C11

You see the labels are used both in the k8s object as well as the VM creation call. Now you try to sanitize the labels to adhere to the restrictions of GCP API, but the OS image name will not match what is in the cloudprofile (due to the transformation).

Different providers have different standards regarding the allowed characters. In the end the SREs trying to retrieve the information of the OS, would still need to know how the provider does the transformation to valid labels to match the name as it is in the cloudprofile.

Maybe we should split the labels for the machineclass with the labels that we use for the machine itself. In provider-azure for example (https://github.com/gardener/gardener-extension-provider-azure/blob/master/charts/internal/machineclass/templates/machineclass.yaml) we don't use any labels for the machineclass itself. I don't think that the labels as they are offer much value as the information are basically duplicated inside it's spec in the machine. We could therefore either decide to use a separate set of labels for the machineclass or use them unsanizited.

@tedteng tedteng marked this pull request as draft January 31, 2024 13:56
@gardener-robot gardener-robot added the needs/rebase Needs git rebase label Feb 5, 2024
@gardener-robot
Copy link

@tedteng You need rebase this pull request with latest master branch. Please check.

@gardener-robot gardener-robot added size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) and removed size/xs Size of pull request is tiny (see gardener-robot robot/bots/size.py) labels Feb 5, 2024
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Feb 5, 2024
@tedteng tedteng force-pushed the add_machine_image_label branch from a4bd901 to 61b7011 Compare February 5, 2024 07:21
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 5, 2024
Copy link
Contributor

@kon-angelo kon-angelo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are not labels of the machineclass. These are properties of the machine class's provider spec as defined here: https://github.com/gardener/machine-controller-manager-provider-gcp/blob/8e6f154abbace7c6fb4a85051191de8211739677/pkg/api/v1alpha1/provider_spec.go#L207

This just adds labels on the provider side for disks but it doesnt change anything on the k8s side

@rishabh-11
Copy link
Contributor

Maybe we should split the labels for the machineclass with the labels that we use for the machine itself

Yes, we should do this.

I have one question. I looked at the code and am unable to understand how some of the fields go under providerSpec for the machine class and the rest do not. And how are the labels for the machine class and providerSpec in the machine class kept the same? Is it applied in a way where it looks for the fields and simply replaces them when found and does not care about anything else?

@tedteng tedteng marked this pull request as draft February 19, 2024 08:47
@tedteng
Copy link
Contributor Author

tedteng commented Feb 19, 2024

The default label managed in https://github.com/gardener/machine-controller-manager/blob/7f7878e3176f495a1d8fb031f0105486f0cc2937/pkg/apis/machine/v1alpha1/machineclass_types.go#L41, it also passed to GCP API call.
if I add a new field called machineClassLables then I need to extend a new property in https://github.com/gardener/machine-controller-manager/blob/7f7878e3176f495a1d8fb031f0105486f0cc2937/pkg/apis/machine/v1alpha1/machineclass_types.go#L37C1-L46C60. However I don't want to change the machineclass API type.

but after a new PR merged https://github.com/gardener/gardener-extension-provider-gcp/pull/669 . it seems the labels are separated. I will try it add new labels in, "labels": machineClassLabelsonly and keep poolLabels in the rest of the places(disk, instance) to check if blocked by the GCP API call.

@tedteng tedteng force-pushed the add_machine_image_label branch from 4a90b9c to 782294a Compare February 29, 2024 08:35
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 29, 2024
@tedteng tedteng force-pushed the add_machine_image_label branch from 782294a to a104548 Compare February 29, 2024 11:19
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Feb 29, 2024
@tedteng
Copy link
Contributor Author

tedteng commented Feb 29, 2024

Hi @kon-angelo Finally I built a test environment, commit 8b80530 and tested.
image
image
but I have no idea how to fix the make test in there, Do you have any idea? Thanks.

@tedteng tedteng force-pushed the add_machine_image_label branch from a104548 to 6908337 Compare March 4, 2024 05:05
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Mar 4, 2024
@tedteng tedteng force-pushed the add_machine_image_label branch from 6908337 to 8b80530 Compare March 4, 2024 05:38
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Mar 4, 2024
@tedteng tedteng marked this pull request as ready for review March 4, 2024 05:41
@tedteng tedteng requested a review from kon-angelo March 4, 2024 05:41
Copy link
Contributor

@kon-angelo kon-angelo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@gardener-robot gardener-robot added reviewed/lgtm Has approval for merging and removed needs/changes Needs (more) changes needs/rebase Needs git rebase needs/review Needs review labels Mar 6, 2024
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Mar 6, 2024
@kon-angelo kon-angelo merged commit eecee67 into gardener:master Mar 6, 2024
14 checks passed
@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Mar 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ops-productivity Operator productivity related (how to improve operations) kind/enhancement Enhancement, improvement, extension needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) platform/gcp Google cloud platform/infrastructure reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies reviewed/lgtm Has approval for merging reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) size/xs Size of pull request is tiny (see gardener-robot robot/bots/size.py) status/closed Issue is closed (either delivered or triaged)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add OS details in machine spec
8 participants