Switch calico to be deployed with the Tigera operator #1297

Merged
merged 3 commits into from
Dec 16, 2020

@tmjd tmjd commented Nov 25, 2020

This switches the Calico install to be done using the Tigera operator. This PR includes a manifest to install the operator which will install Calico v3.17.

What type of PR is this?
update config?

Which issue does this PR fix:

What does this PR do / Why do we need it:

If an issue # is not available please add repro steps and logs from IPAMD/CNI showing the issue:

Testing done on this change:
I have tested upgrading a cluster that had a previous version of Calico (with Amazon VPC CNI) installed and also installed on a cluster that only had the Amazon VPC CNI plugin installed (with no existing Calico).

Automation added to e2e:

Will this break upgrades or downgrades. Has updating a running cluster been tested?:
Because of how the operator performs the upgrade, there is a problem upgrading small clusters of 3 nodes or fewer. On clusters of that size the operator tries to deploy one Typha per node, while the existing Calico install already runs at least one Typha, and multiple Typhas cannot run on a single node.
Once a cluster is upgraded with these changes, it will not be simple to downgrade to a version of Calico that was installed without the operator.
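
As a rough pre-upgrade check for the small-cluster caveat above, something like the following could flag clusters at or below the three-node threshold. This is a sketch only: the `check_cluster_size` helper and the `KUBECTL` override variable are hypothetical names that are not part of this PR, and the threshold is simply taken from the description above.

```shell
# KUBECTL is a hypothetical override so the check can be dry-run with a stub.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: returns non-zero when the cluster has 3 nodes or fewer.
check_cluster_size() {
  # One output line per node; --no-headers drops the column header row.
  nodes=$($KUBECTL get nodes --no-headers | wc -l | tr -d ' ')
  if [ "$nodes" -le 3 ]; then
    echo "warning: $nodes node(s); the operator upgrade may stall on Typha scheduling" >&2
    return 1
  fi
  echo "$nodes nodes"
}
```

Running `check_cluster_size` before applying the new manifest would only warn; it does not work around the scheduling conflict itself.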

Does this change require updates to the CNI daemonset config files to work?:

This updates the daemonset config files for Calico. A 'kubectl patch' of the image tag does not work because the change switches Calico to be installed by an operator.

Does this PR introduce any user-facing change?:

Users of Calico can now run kubectl get tigerastatus to get an overview of the status of the Calico installation. The installations.operator.tigera.io default resource is also available for making configuration changes.
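
For illustration, the two resources mentioned above could be inspected roughly as follows. The `calico_status` wrapper and the `KUBECTL` override variable are hypothetical conveniences added for this sketch; the resource names are the ones given in the description.

```shell
# KUBECTL is a hypothetical override so the helper can be dry-run (e.g. KUBECTL=echo).
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical wrapper around the two commands described above.
calico_status() {
  # Overview status of the components managed by the operator.
  $KUBECTL get tigerastatus
  # The default Installation resource, where configuration changes are made.
  $KUBECTL get installations.operator.tigera.io default -o yaml
}
```

Configuration changes would then be made against the default Installation resource rather than by patching the Calico daemonset directly.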

Switch Calico to be installed and managed by the Tigera Operator.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@tmjd tmjd force-pushed the use-operator branch 2 times, most recently from 91e5de7 to 4a32b96 Compare December 4, 2020 18:32
@jayanthvn

/cc @caseydavenport - can you please review this?

format: int32
type: integer
keepOriginalNextHop:
default: false

@tmjd could you remove this default: false here?

It doesn't add any value since it matches the code default, and can cause issues applying the CRD as per: projectcalico/calico#4237

@jayanthvn

Thanks @caseydavenport :)

@jayanthvn jayanthvn merged commit 01ce017 into aws:master Dec 16, 2020
@jayanthvn jayanthvn added this to the v1.7.9 milestone Dec 16, 2020
@tmjd tmjd deleted the use-operator branch December 17, 2020 22:07
@jayanthvn jayanthvn modified the milestones: v1.7.9, v1.8.0 Jan 19, 2021
couralex6 pushed a commit to couralex6/amazon-vpc-cni-k8s that referenced this pull request Jan 29, 2021
* Switch calico to be deployed with the operator

* Update operator update from v3.17.0 to v3.17.1

* Review update
jayanthvn pushed a commit that referenced this pull request Feb 1, 2021
* Switch calico to be deployed with the operator

* Update operator update from v3.17.0 to v3.17.1

* Review update
@jayanthvn jayanthvn removed this from the v1.8.0 milestone Mar 17, 2021
@couralex6

Hey @tmjd,

We noticed an issue while trying to apply the latest calico config file. It seems like we cannot install Calico with the newest version of the yaml file. However, we are able to install it on a cluster after installing the old version of the file, removing it, and then applying the latest version.

This is the error on a new cluster (no prior Calico installation):

$ k apply -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/master/config/master/calico.yaml
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/installations.operator.tigera.io created
customresourcedefinition.apiextensions.k8s.io/tigerastatuses.operator.tigera.io created
namespace/tigera-operator created
podsecuritypolicy.policy/tigera-operator created
serviceaccount/tigera-operator created
clusterrole.rbac.authorization.k8s.io/tigera-operator created
clusterrolebinding.rbac.authorization.k8s.io/tigera-operator created
deployment.apps/tigera-operator created
error: unable to recognize "https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/master/config/master/calico.yaml": no matches for kind "Installation" in version "operator.tigera.io/v1"

In order to install the latest version, we have to follow this path:

  • New cluster
  • install calico with this command: kubectl apply -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/v1.7.8/config/v1.7/calico.yaml
  • Remove it: kubectl delete -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/v1.7.8/config/v1.7/calico.yaml
  • Install the latest: k apply -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/master/config/master/calico.yaml
$ kgp -n calico-system
NAME                                       READY   STATUS    RESTARTS   AGE
calico-kube-controllers-57b4f8758f-mj75g   1/1     Running   1          9m48s
calico-node-p65hf                          1/1     Running   0          9m48s
calico-node-rhfgx                          1/1     Running   0          9m48s
calico-typha-844fd97d84-49949              1/1     Running   0          9m48s
calico-typha-844fd97d84-trrtr              1/1     Running   0          7m49s

Do you know what could be causing this?

cc @caseydavenport @jayanthvn

@tmjd

tmjd commented Mar 23, 2021

There seems to be an intermittent issue where it is necessary to apply the new calico.yaml twice. I don't believe removing the old install should be necessary.

@caseydavenport

@couralex6 I am working on a fix for that at the moment. You should be able to work around it by simply applying the calico.yaml twice for now.

Here's the first PR in my fix: #1410
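
The apply-twice workaround could be scripted along these lines. The error above is consistent with the Installation resource being submitted before the API server has finished registering its newly created CRD, but that reading is an assumption and is not confirmed in this thread. The `retry` helper is a hypothetical name, and the attempt count and delay are arbitrary.

```shell
# Hypothetical helper: run a command up to N times, pausing between failed attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i of $attempts failed" >&2
    i=$((i + 1))
    # Only sleep if another attempt remains.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (requires a configured kubectl, so it is left commented out):
# retry 2 5 kubectl apply -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/master/config/master/calico.yaml
```

Because `kubectl apply` is idempotent for resources that were already created, the second attempt should only need to create the Installation resource that failed the first time.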
