Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove linkage to cloud providers in persistent volume label controller #51629

Closed
rrati opened this issue Aug 30, 2017 · 14 comments · Fixed by #52169
Closed

Remove linkage to cloud providers in persistent volume label controller #51629

rrati opened this issue Aug 30, 2017 · 14 comments · Fixed by #52169
Assignees
Labels
area/cloudprovider kind/bug Categorizes issue or PR as related to a bug. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/storage Categorizes an issue or PR as relevant to SIG Storage.
Milestone

Comments

@rrati
Copy link

rrati commented Aug 30, 2017

/kind feature

#44680 introduced the persistent volume label controller to the cloud-controller-manager. The first revision of the controller was basically a transplant of the admission controller, but to truly function as is intended the linkage to the EC2 and GCE volumes needs to be removed.

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Aug 30, 2017
@k8s-github-robot
Copy link

@rrati
There are no sig labels on this issue. Please add a sig label by:

  1. mentioning a sig: @kubernetes/sig-<group-name>-<group-suffix>
    e.g., @kubernetes/sig-contributor-experience-<group-suffix> to notify the contributor experience sig, OR

  2. specifying the label manually: /sig <label>
    e.g., /sig scalability to apply the sig/scalability label

Note: Method 1 will trigger an email to the group. You can find the group list here and label list here.
The <group-suffix> in the method 1 has to be replaced with one of these: bugs, feature-requests, pr-reviews, test-failures, proposals

@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Aug 30, 2017
k8s-github-robot pushed a commit that referenced this issue Sep 1, 2017
Automatic merge from submit-queue (batch tested with PRs 51574, 51534, 49257, 44680, 48836)

Add a persistent volume label controller to the cloud-controller-manager

Part of kubernetes/enhancements#88

Outstanding concerns needing input:
- [x] Why 5 threads for controller processing?
- [x] Remove direct linkage to aws/gce cloud providers [#51629]
- [x] Modify shared informers to allow added event handlers ability to include uninitialized objects/using unshared informer #48893
- [x] Use cache.MetaNamespaceKeyFunc in event handler?

I'm willing to work on addressing the removal of the direct linkage to aws/gce after this PR gets in.
@luxas luxas added this to the v1.8 milestone Sep 1, 2017
@luxas luxas added area/cloudprovider sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 1, 2017
@luxas
Copy link
Member

luxas commented Sep 1, 2017

@cheftako @rrati @wlan0 Should we fix this in time for v1.8?

That would probably make sense to me; current state is kind of a bug, but let's ask @jdumars as well if he thinks that would be a bug fix or not

@jdumars
Copy link
Member

jdumars commented Sep 1, 2017

I would lean toward this being a bug as well. ETA on possibly fixing this?

@cheftako
Copy link
Member

cheftako commented Sep 1, 2017

1.8 code freezes today. I'll defer to @wlan0 but my guess is we aren't getting a fix that quickly. We definitely should fix it for 1.9.

@wlan0
Copy link
Member

wlan0 commented Sep 1, 2017

Please punt it to 1.9.

How do we approach this @rrati? Would we add an interface in the cloudprovider for volume info?

Should this volume info interface, if we implement, pave the way for creating out-of-tree EBS/PD controllers in the future? Please advise @thockin @luxas

@luxas luxas added kind/bug Categorizes issue or PR as related to a bug. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed kind/feature Categorizes issue or PR as related to a new feature. labels Sep 1, 2017
@luxas
Copy link
Member

luxas commented Sep 1, 2017

@wlan0 I think doing this in the most simple way right now is just fine; i.e. moving the imports from the generic package into an other cloud-specific package; and then passing a cloud implementation struct to the generic code taking a generic interface makes a lot of sense, and is very low-volume code-wise and risk-wise. This would remove the hard-coding we have right now

That is; move this code

func (pvlc *PersistentVolumeLabelController) findAWSEBSLabels(volume *v1.PersistentVolume) (map[string]string, error) {
	// Ignore any volumes that are being provisioned
	if volume.Spec.AWSElasticBlockStore.VolumeID == vol.ProvisionedVolumeName {
		return nil, nil
	}
	ebsVolumes, err := pvlc.getEBSVolumes()
	if err != nil {
		return nil, err
	}

	// TODO: GetVolumeLabels is actually a method on the Volumes interface
	// If that gets standardized we can refactor to reduce code duplication
	spec := aws.KubernetesVolumeID(volume.Spec.AWSElasticBlockStore.VolumeID)
	labels, err := ebsVolumes.GetVolumeLabels(spec)
	if err != nil {
		return nil, err
	}

	return labels, nil
}

// getEBSVolumes returns the AWS Volumes interface for ebs
func (pvlc *PersistentVolumeLabelController) getEBSVolumes() (aws.Volumes, error) {
	pvlc.mutex.Lock()
	defer pvlc.mutex.Unlock()

	if pvlc.ebsVolumes == nil {
		awsCloudProvider := pvlc.cloud.(*aws.Cloud)
		awsCloudProvider, ok := pvlc.cloud.(*aws.Cloud)
		if !ok {
			// GetCloudProvider has gone very wrong
			return nil, fmt.Errorf("error retrieving AWS cloud provider")
		}
		pvlc.ebsVolumes = awsCloudProvider
	}
	return pvlc.ebsVolumes, nil
}

func (pvlc *PersistentVolumeLabelController) findGCEPDLabels(volume *v1.PersistentVolume) (map[string]string, error) {
	// Ignore any volumes that are being provisioned
	if volume.Spec.GCEPersistentDisk.PDName == vol.ProvisionedVolumeName {
		return nil, nil
	}

	provider, err := pvlc.getGCECloudProvider()
	if err != nil {
		return nil, err
	}

	// If the zone is already labeled, honor the hint
	zone := volume.Labels[kubeletapis.LabelZoneFailureDomain]

	labels, err := provider.GetAutoLabelsForPD(volume.Spec.GCEPersistentDisk.PDName, zone)
	if err != nil {
		return nil, err
	}

	return labels, nil
}

// getGCECloudProvider returns the GCE cloud provider, for use for querying volume labels
func (pvlc *PersistentVolumeLabelController) getGCECloudProvider() (*gce.GCECloud, error) {
	pvlc.mutex.Lock()
	defer pvlc.mutex.Unlock()

	if pvlc.gceCloudProvider == nil {
		gceCloudProvider, ok := pvlc.cloud.(*gce.GCECloud)
		if !ok {
			// GetCloudProvider has gone very wrong
			return nil, fmt.Errorf("error retrieving GCE cloud provider")
		}
		pvlc.gceCloudProvider = gceCloudProvider
	}
	return pvlc.gceCloudProvider, nil
}

into a sub-package or something specific for aws and gce (in separate dirs, but)

Create an interface something like this:

type PVLabeler interface {
    GetLabelsForVolume(pv *v1.PersistentVolume) (map[string]string, error)
}

that the generic controller then consumes; instead of

func (pvlc *PersistentVolumeLabelController) addLabelsToVolume(vol *v1.PersistentVolume) error {
	var volumeLabels map[string]string

	// Only add labels if in the list of initializers
	if needsInitialization(vol.Initializers, initializerName) {
		if vol.Spec.AWSElasticBlockStore != nil {
			labels, err := pvlc.findAWSEBSLabels(vol)
			if err != nil {
				return fmt.Errorf("error querying AWS EBS volume %s: %v", vol.Spec.AWSElasticBlockStore.VolumeID, err)
			}
			volumeLabels = labels
		}
		if vol.Spec.GCEPersistentDisk != nil {
			labels, err := pvlc.findGCEPDLabels(vol)
			if err != nil {
				return fmt.Errorf("error querying GCE PD volume %s: %v", vol.Spec.GCEPersistentDisk.PDName, err)
			}
			volumeLabels = labels
		}
		return pvlc.updateVolume(vol, volumeLabels)
	}

	return nil
}

we'd have something like

func (pvlc *PersistentVolumeLabelController) addLabelsToVolume(vol *v1.PersistentVolume) error {

	// Only add labels if in the list of initializers
	if needsInitialization(vol.Initializers, initializerName) {
		volumeLabels, err := pvlc.GetLabelsForVolume(vol)
		if err != nil {
			return fmt.Errorf("error querying cloud provider for volume labels: %v", err)
		}

		return pvlc.updateVolume(vol, volumeLabels)
	}

	return nil
}

which would be way cleaner.

@rrati could you please send a PR for this quickly as a bugfix?
As a side effect, any external cloud provider can easily make their own impl of the PVLabeler interface and pass that to the new controller.

@k8s-github-robot
Copy link

[MILESTONENOTIFIER] Milestone Labels Complete

@rrati

Issue label settings:

  • sig/cluster-lifecycle: Issue will be escalated to these SIGs if needed.
  • priority/important-longterm: Escalate to the issue owners; move out of the milestone after 1 attempt.
  • kind/bug: Fixes a bug discovered during the current release.
Additional instructions available here The commands available for adding these labels are documented here

@karataliu
Copy link
Contributor

+1 for fixing this.

Verified against v1.9.0-alpha.0, standalone cloud controller manager built for any cloud will have to depend on aws/gce sdk, which does not make sense.

@dims
Copy link
Member

dims commented Sep 8, 2017

Will help fix one of the problems mentioned in #49402

@luxas
Copy link
Member

luxas commented Sep 8, 2017

@dims can I assign you this and so we can get this bugfix into v1.8?
Otherwise all ccms will depend on GCE and AWS sdks as @karataliu said...

@dims
Copy link
Member

dims commented Sep 8, 2017

@luxas, am travelling all week next week (OpenStack meeting in Denver), so i won't be able to work this problem. If i get some time to look, i will assign it to myself.

@luxas
Copy link
Member

luxas commented Sep 8, 2017

ok, thanks anyway

@dims
Copy link
Member

dims commented Sep 8, 2017

@luxas trying a tactical fix, dunno if it will work, please take a look

@rrati
Copy link
Author

rrati commented Sep 8, 2017

@luxas Sorry, I seem to have missed the updates to the this issue. I agree with your proposed implementation. That was along the lines I was thinking of too.

@roberthbailey roberthbailey added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Sep 8, 2017
dims added a commit to dims/kubernetes that referenced this issue Sep 8, 2017
We should be able to build a cloud-controller-manager without having to
pull in code specific to GCE and AWS clouds. Note that this is a tactical
fix for now, we should have allow PVLabeler to be passed into the
PersistentVolumeController, maybe come up with better interfaces etc. Since
it is too late to do all that for 1.8, we just move cloud specific code
to where they belong and we check for PVLabeler method and use it where
needed.

Fixes kubernetes#51629
k8s-github-robot pushed a commit that referenced this issue Sep 12, 2017
…oviders

Automatic merge from submit-queue (batch tested with PRs 52007, 52196, 52169, 52263, 52291)

Remove links to GCE/AWS cloud providers from PersistentVolumeCo…

…ntroller




**What this PR does / why we need it**:

We should be able to build a cloud-controller-manager without having to
pull in code specific to GCE and AWS clouds. Note that this is a tactical
fix for now, we should have allow PVLabeler to be passed into the
PersistentVolumeController, maybe come up with better interfaces etc. Since
it is too late to do all that for 1.8, we just move cloud specific code
to where they belong and we check for PVLabeler method and use it where
needed.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

Fixes #51629

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cloudprovider kind/bug Categorizes issue or PR as related to a bug. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/storage Categorizes an issue or PR as relevant to SIG Storage.
Projects
None yet
Development

Successfully merging a pull request may close this issue.