Added ADR for the provisioning functionality in KIM #202

Merged on May 21, 2024 (50 commits)

Commits
b1d72f2
Added ADR for the provisioning functionality in KIM
akgalwas May 8, 2024
d1e330e
Diagram updated
akgalwas May 8, 2024
32e7b22
Update provisioning.md
akgalwas May 8, 2024
6db8a66
Added labels to the examples
akgalwas May 8, 2024
e9ac688
Added example with additional oidc provider, and ingress filtering
akgalwas May 8, 2024
8ae664e
Minor updates
akgalwas May 8, 2024
1e975a7
Minor diagram update
akgalwas May 8, 2024
3ec8899
Examples updated
akgalwas May 8, 2024
a317f5e
Folder name changed
akgalwas May 8, 2024
68b441c
Minor fixes
akgalwas May 8, 2024
033249b
Minor fix
akgalwas May 8, 2024
754b5d4
Update provisioning.md
akgalwas May 8, 2024
d56059a
Update provisioning.md
akgalwas May 8, 2024
93ca887
Minor fix
akgalwas May 8, 2024
7b68956
Update provisioning.md
akgalwas May 8, 2024
adf0029
shoot-name label removed
akgalwas May 8, 2024
c47deac
Revert "shoot-name label removed"
akgalwas May 8, 2024
20029d5
last changes reverted
akgalwas May 8, 2024
e63d364
Update provisioning.md
akgalwas May 8, 2024
5125080
Update provisioning.md
akgalwas May 8, 2024
01c53d7
Update provisioning.md
akgalwas May 8, 2024
eca112a
Update provisioning.md
akgalwas May 8, 2024
ff8a2f0
Update provisioning.md
akgalwas May 8, 2024
c7468f0
README in adr folder added
akgalwas May 8, 2024
3d91dcb
README in adr folder added
akgalwas May 8, 2024
fa9269b
Update provisioning.md
akgalwas May 8, 2024
35f2134
Update provisioning.md
akgalwas May 8, 2024
8c65e3a
Review remarks applied
akgalwas May 9, 2024
a3147e6
Update provisioning.md
akgalwas May 9, 2024
ce71c36
Update provisioning.md
akgalwas May 9, 2024
506e938
Apply suggestions from code review
akgalwas May 9, 2024
a00237d
Minor refactoring
akgalwas May 9, 2024
af9a219
Added Provider Specific Config to the examples
akgalwas May 10, 2024
b379f63
Licence type added
akgalwas May 10, 2024
a3267a8
Optional seedName added
akgalwas May 10, 2024
5e598bd
Adjusted to ADR format
akgalwas May 10, 2024
c12ce30
Added information on the additional fields
akgalwas May 10, 2024
2a54485
File renamed
akgalwas May 10, 2024
5a52cb3
Mentioned creating cluster role bindings
akgalwas May 10, 2024
52234ed
Update docs/adr/001-provisioning.md
akgalwas May 10, 2024
aba0080
Removed seed name
akgalwas May 10, 2024
7dc7e90
Fixed hierarchy to have the same as in the shoot
akgalwas May 10, 2024
18b165c
Update 001-provisioning.md
akgalwas May 10, 2024
2de7101
Update 001-provisioning.md
akgalwas May 10, 2024
ccdd130
KIM is responsible for provider specific config.
akgalwas May 14, 2024
06630bb
Added code for provider specific config
akgalwas May 14, 2024
d8707aa
Added spec.shoot.platformRegion
akgalwas May 16, 2024
7b07657
spec.shoot.provider.zones removed
akgalwas May 16, 2024
53aa78b
Add spec.shoot.enforceSeedLocation property added
akgalwas May 17, 2024
8dbbca7
Merge branch 'main' into adr-for-provisioning
akgalwas May 21, 2024
354 changes: 354 additions & 0 deletions docs/adr/001-provisioning.md
@@ -0,0 +1,354 @@
# Context
This document defines the architecture and API for the Gardener cluster provisioning functionality.

# Status
Proposed

# Decision

The following diagram shows the proposed architecture:
![](./assets/keb-kim-target-arch.drawio.svg)

> Note: At the time of writing, the GardenerCluster CR was used to generate kubeconfig. The [workplan](https://github.com/kyma-project/infrastructure-manager/issues/112) for delivering provisioning functionality includes renaming the CR to maintain consistency.

The following assumptions were made:
- Kyma Environment Broker must not contain all the details of the cluster infrastructure.
- Kyma Infrastructure Manager's API must expose properties that:
  - can be set in the BTP cockpit by the user
  - are directly related to plans in the KEB
- Kyma Infrastructure Manager's API must not expose properties that are:
  - hardcoded in the Provisioner or the KEB
  - statically configured in the management-plane-config

Kyma Environment Broker has the following responsibilities:
- Create Runtime CR containing the following data:
  - Provider config (type, region, and secret with credentials for hyperscaler)
  - Worker pool specification
  - Cluster networking settings (nodes, pods, and services IP ranges)
  - OIDC settings
  - Cluster administrators list
  - Egress network filter settings
  - Control Plane failure tolerance config
- Observe the status of the CR to determine whether provisioning succeeded

Kyma Infrastructure Manager has the following responsibilities:
- Create shoots based on:
  - Corresponding `Runtime` CR properties
  - Corresponding `Runtime` CR labels:
    - `kyma-project.io/platform-region` for determining if the cluster is located in EU
  - Predefined defaults for the optional properties:
    - Kubernetes version
    - Machine image version
  - Predefined configuration for the following functionalities:
    - configuring DNS extension
    - configuring Certificates extension
    - providing maintenance settings (Kubernetes and image auto-updates)
    - creating provider specific config
- Upgrade and delete shoots for the corresponding `Runtime` CRs
- Apply the audit log configuration on the shoot resource
- Create cluster role bindings for administrators
- Generate the kubeconfig
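
The defaulting of optional properties listed above could be sketched as follows. This is only an illustration: the `KIMDefaults` struct, the `MachineImage` type, and both helper functions are hypothetical names, and the ADR does not specify how KIM stores or reads its configuration.

```go
package main

import "fmt"

// KIMDefaults is a hypothetical configuration struct. The ADR only states that
// defaults are predefined in KIM, not how they are stored.
type KIMDefaults struct {
	KubernetesVersion string
	ImageName         string
	ImageVersion      string
}

// MachineImage mirrors the name/version pair used for worker machine images.
type MachineImage struct {
	Name    string
	Version string
}

// resolveKubernetesVersion returns the version requested in the Runtime CR,
// or the configured default when the CR omits it.
func resolveKubernetesVersion(requested *string, d KIMDefaults) string {
	if requested != nil && *requested != "" {
		return *requested
	}
	return d.KubernetesVersion
}

// resolveMachineImage returns the image requested in the Runtime CR,
// or the configured default image when none is given.
func resolveMachineImage(requested *MachineImage, d KIMDefaults) MachineImage {
	if requested != nil {
		return *requested
	}
	return MachineImage{Name: d.ImageName, Version: d.ImageVersion}
}

func main() {
	defaults := KIMDefaults{
		KubernetesVersion: "1.28.7",
		ImageName:         "gardenlinux",
		ImageVersion:      "1312.3.0",
	}

	// No version in the CR: the default applies.
	fmt.Println(resolveKubernetesVersion(nil, defaults))

	// A version pinned in the CR takes precedence.
	pinned := "1.29.1"
	fmt.Println(resolveKubernetesVersion(&pinned, defaults))

	fmt.Println(resolveMachineImage(nil, defaults))
}
```

Using pointers for the requested values keeps "not provided" distinguishable from an explicitly set value, which is what makes the fallback unambiguous.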

## API proposal

### CR examples

Note that the Runtime CR must be labeled so that a particular instance is easy to find.
The proposed list of labels to be added to the Runtime CR:
```yaml
kyma-project.io/instance-id: instance-id
kyma-project.io/runtime-id: runtime-id
kyma-project.io/broker-plan-id: plan-id
kyma-project.io/broker-plan-name: plan-name
kyma-project.io/global-account-id: global-account-id
kyma-project.io/subaccount-id: subAccount-id
kyma-project.io/shoot-name: shoot-name
kyma-project.io/region: region
operator.kyma-project.io/kyma-name: kymaName
```
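
As a rough sketch of how such labels make instances searchable, the snippet below builds a subset of the proposed labels and renders them as an equality-based Kubernetes label selector. The helper names `runtimeLabels` and `selector` are hypothetical and not part of the proposed API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// runtimeLabels builds a subset of the labels proposed for the Runtime CR.
// The function itself is illustrative; only the label keys come from the ADR.
func runtimeLabels(instanceID, runtimeID, shootName, region string) map[string]string {
	return map[string]string{
		"kyma-project.io/instance-id": instanceID,
		"kyma-project.io/runtime-id":  runtimeID,
		"kyma-project.io/shoot-name":  shootName,
		"kyma-project.io/region":      region,
	}
}

// selector renders labels as an equality-based label selector
// ("key1=value1,key2=value2"), sorted by key for deterministic output.
func selector(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+labels[k])
	}
	return strings.Join(parts, ",")
}

func main() {
	labels := runtimeLabels("instance-id", "runtime-id", "shoot-name", "eu-central-1")
	fmt.Println(selector(labels))
}
```

The resulting string can be passed wherever Kubernetes accepts a label selector, for example to narrow a list request to a single shoot.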

The labels are omitted from the following examples for clarity.

The example below shows the CR that the KEB must create to provision an AWS production cluster:
```yaml
apiVersion: infrastructuremanager.kyma-project.io/v1alpha1
kind: Runtime
metadata:
  name: runtime-id
  namespace: kcp-system
spec:
  shoot:
    # spec.shoot.name is required
    name: shoot-name
    # spec.shoot.purpose is required
    purpose: production
    # spec.shoot.region is required
    region: eu-central-1
    # spec.shoot.platformRegion is required
    platformRegion: "cf-eu11"
    # spec.shoot.secretBindingName is required
    secretBindingName: "hyperscaler secret"
    kubernetes:
      kubeAPIServer:
        # spec.shoot.kubernetes.kubeAPIServer.oidcConfig is required
        oidcConfig:
          clientID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
          groupsClaim: groups
          issuerURL: https://my.cool.tokens.com
          signingAlgs:
            - RS256
          usernameClaim: sub
    provider:
      # spec.shoot.provider.type is required
      type: aws
      # spec.shoot.provider.workers is required
      workers:
        - machine:
            # spec.shoot.workers.machine.type is required
            type: m6i.large
          # spec.shoot.workers.zones is required
          zones:
            - eu-central-1a
            - eu-central-1b
            - eu-central-1c
          # spec.shoot.workers.minimum is required
          minimum: 3
          # spec.shoot.workers.maximum is required
          maximum: 20
          # spec.shoot.workers.maxSurge is required in the first release.
          # It can be optional in the future, as it equals the zone count
          maxSurge: 3
          # spec.shoot.workers.maxUnavailable is required in the first release.
          # It can be optional in the future, as it is always set to 0
          maxUnavailable: 0
    # spec.shoot.networking is required
    networking:
      pods: 100.64.0.0/12
      nodes: 10.250.0.0/16
      services: 100.104.0.0/13
    # spec.shoot.controlPlane is required
    controlPlane:
      highAvailability:
        failureTolerance:
          type: node
  security:
    networking:
      filter:
        # spec.security.networking.filter.egress.enabled is required
        egress:
          enabled: false
    # spec.security.administrators is required
    administrators:
      - admin@myorg.com

There are some additional optional fields that could be specified:
Review comment from @ebensom (Member), May 10, 2024:

Please also add support for these workers[*].machineControllerManager fields:
https://gardener.cloud/docs/gardener/api-reference/core/#core.gardener.cloud/v1beta1.Worker

SREs sometimes tweak these settings manually, knowing that the Provisioner doesn't revert the change; however, these are important details to control. Especially with KIM, we would like to provide default values for the machineDrainTimeout and maxEvictRetries attributes that differ from the Gardener defaults.

Reply from the PR author (Contributor):

It's not included in the examples, but will be supported out of the box (we use full Gardener type for workers definition).

- `spec.shoot.enforceSeedLocation` ; if not provided `false` value will be used
- `spec.shoot.licenceType` ; if not provided `nil` value will be used
- `spec.shoot.kubernetes.version` ; if not provided, the default value will be read by the KIM from the configuration
- `spec.shoot.kubernetes.kubeAPIServer.additionalOidcConfig` ; if not provided, no additional OIDC provider will be configured
- `spec.shoot.workers.name` ; if not provided, a Gardener default will be used
- `spec.shoot.workers.machine.image` ; if not provided, the default value will be read by the KIM from the configuration
- `spec.security.networking.filter.ingress.enabled` ; if not provided, the `false` value will be used
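
A minimal sketch of how the "if not provided" rules above can be honoured when decoding a CR: optional fields are modelled as pointers, so an absent field stays `nil` and the documented default can be applied afterwards. The `shootOptions` type and the helper functions are hypothetical and cover only two of the optional fields; the licence type value in `main` is a made-up placeholder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// shootOptions is a trimmed-down, hypothetical view of spec.shoot that
// contains only two of the optional fields.
type shootOptions struct {
	EnforceSeedLocation *bool   `json:"enforceSeedLocation,omitempty"`
	LicenceType         *string `json:"licenceType,omitempty"`
}

// parseShootOptions decodes a JSON fragment into shootOptions.
func parseShootOptions(raw string) (shootOptions, error) {
	var o shootOptions
	err := json.Unmarshal([]byte(raw), &o)
	return o, err
}

// enforceSeedLocation applies the documented default: false when the field is absent.
func enforceSeedLocation(o shootOptions) bool {
	return o.EnforceSeedLocation != nil && *o.EnforceSeedLocation
}

func main() {
	inputs := []string{
		`{}`, // nothing provided: defaults apply
		`{"enforceSeedLocation": true, "licenceType": "example-licence"}`,
		`{"enforceSeedLocation": "true"}`, // a quoted string is not a boolean
	}
	for _, raw := range inputs {
		o, err := parseShootOptions(raw)
		if err != nil {
			fmt.Println("invalid input:", err)
			continue
		}
		fmt.Println("enforceSeedLocation:", enforceSeedLocation(o), "licenceType set:", o.LicenceType != nil)
	}
}
```

Note that a boolean field rejects the quoted string `"true"`, so the value must be written as a plain boolean in the CR.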

The following example shows the Runtime CR that must be created to provision a cluster with an additional OIDC provider and to enable ingress network filtering:
```yaml
apiVersion: infrastructuremanager.kyma-project.io/v1alpha1
kind: Runtime
metadata:
  name: runtime-id
  namespace: kcp-system
spec:
  shoot:
    # spec.shoot.name is required
    name: shoot-name
    # spec.shoot.purpose is required
    purpose: production
    # spec.shoot.region is required
    region: eu-central-1
    # spec.shoot.platformRegion is required
    platformRegion: "cf-eu11"
    # spec.shoot.secretBindingName is required
    secretBindingName: "hyperscaler secret"
    # spec.shoot.enforceSeedLocation is optional; it ensures that the seed cluster is located in the same region as the shoot cluster
    enforceSeedLocation: true
    kubernetes:
      # spec.shoot.kubernetes.version is optional, when not provided default will be used
      # Will be modified by the SRE
      version: "1.28.7"
      kubeAPIServer:
        # spec.shoot.kubernetes.kubeAPIServer.oidcConfig is required
        oidcConfig:
          clientID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
          groupsClaim: groups
          issuerURL: https://my.cool.tokens.com
          signingAlgs:
            - RS256
          usernameClaim: sub
        # spec.shoot.kubernetes.kubeAPIServer.additionalOidcConfig is optional, not implemented in the first KIM release
        additionalOidcConfig:
          - clientID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
            groupsClaim: groups
            issuerURL: https://some.others.tokens.com
            signingAlgs:
              - RS256
            usernameClaim: sub
            usernamePrefix: 'someother'
    provider:
      # spec.shoot.provider.type is required
      type: aws
      # spec.shoot.provider.workers is required
      workers:
        - machine:
            # spec.shoot.workers.machine.type is required
            type: m6i.large
            # spec.shoot.workers.machine.image is optional, when not provided default will be used
            # Will be modified by the SRE
            image:
              name: gardenlinux
              version: 1312.3.0
          # spec.shoot.workers.volume is required for the first release
          # Probably can be moved into KIM, as it is hardcoded in KEB, and not dependent on plan
          volume:
            type: gp2
            size: 50Gi
          # spec.shoot.workers.zones is required
          zones:
            - eu-central-1a
            - eu-central-1b
            - eu-central-1c
          # spec.shoot.workers.name is optional, if not provided default will be used
          name: cpu-worker-0
          # spec.shoot.workers.minimum is required
          minimum: 3
          # spec.shoot.workers.maximum is required
          maximum: 20
          # spec.shoot.workers.maxSurge is required in the first release.
          # It can be optional in the future, as it equals the zone count
          maxSurge: 3
          # spec.shoot.workers.maxUnavailable is required in the first release.
          # It can be optional in the future, as it is always set to 0
          maxUnavailable: 0
    # spec.shoot.networking is required
    networking:
      pods: 100.64.0.0/12
      nodes: 10.250.0.0/16
      services: 100.104.0.0/13
    # spec.shoot.controlPlane is required
    controlPlane:
      highAvailability:
        failureTolerance:
          type: zone
  security:
    networking:
      filter:
        # spec.security.networking.filter.egress.enabled is required
        egress:
          enabled: false
        # spec.security.networking.filter.ingress.enabled is optional (default=false), not implemented in the first KIM release
        ingress:
          enabled: true
    # spec.security.administrators is required
    administrators:
      - admin@myorg.com
```
> Note: Additional OIDC providers and ingress network filtering will not be implemented in the first release.

Please see the following examples to understand what CRs must be created for particular KEB plans:
- [AWS trial plan](assets/runtime-examples/aws-trial.yaml)
- [Azure](assets/runtime-examples/azure.yaml)
- [Azure lite](assets/runtime-examples/azure-lite.yaml)
- [GCP](assets/runtime-examples/gcp.yaml)
- [SAP Converged Cloud](assets/runtime-examples/sap-converged-cloud.yaml)

## API structures

```go
package v1

import (
	gardener "github.com/gardener/gardener/pkg/apis/core/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Runtime is the Schema for the runtimes API
type Runtime struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   RuntimeSpec   `json:"spec,omitempty"`
	Status RuntimeStatus `json:"status,omitempty"`
}

// RuntimeSpec defines the desired state of Runtime
type RuntimeSpec struct {
	Shoot    RuntimeShoot `json:"shoot"`
	Security Security     `json:"security"`
}

// State is referenced by RuntimeStatus but not defined in this excerpt;
// a plain string type is assumed here.
type State string

// RuntimeStatus defines the observed state of Runtime
type RuntimeStatus struct {
	// State signifies current state of Runtime
	State State `json:"state,omitempty"`
	// List of status conditions to indicate the status of a Runtime.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}

// RuntimeShoot describes the shoot cluster to be created for the Runtime
type RuntimeShoot struct {
	Name                string                `json:"name"`
	Purpose             gardener.ShootPurpose `json:"purpose"`
	PlatformRegion      string                `json:"platformRegion"`
	Region              string                `json:"region"`
	LicenceType         *string               `json:"licenceType,omitempty"`
	SecretBindingName   string                `json:"secretBindingName"`
	EnforceSeedLocation *bool                 `json:"enforceSeedLocation,omitempty"`
	Kubernetes          Kubernetes            `json:"kubernetes"`
	Provider            Provider              `json:"provider"`
	Networking          Networking            `json:"networking"`
	ControlPlane        gardener.ControlPlane `json:"controlPlane"`
}

type Kubernetes struct {
	Version       *string   `json:"version,omitempty"`
	KubeAPIServer APIServer `json:"kubeAPIServer,omitempty"`
}

type APIServer struct {
	OidcConfig           gardener.OIDCConfig    `json:"oidcConfig"`
	AdditionalOidcConfig *[]gardener.OIDCConfig `json:"additionalOidcConfig,omitempty"`
}

type Provider struct {
	Type    string            `json:"type"`
	Workers []gardener.Worker `json:"workers"`
}

// Networking holds the CIDR ranges for pods, nodes, and services
type Networking struct {
	Pods     string `json:"pods"`
	Nodes    string `json:"nodes"`
	Services string `json:"services"`
}

type Security struct {
	Administrators []string           `json:"administrators"`
	Networking     NetworkingSecurity `json:"networking"`
}

type NetworkingSecurity struct {
	Filter Filter `json:"filter"`
}

// Filter configures network filtering; Ingress is optional, Egress is required
type Filter struct {
	Ingress *Ingress `json:"ingress,omitempty"`
	Egress  Egress   `json:"egress"`
}

type Ingress struct {
	Enabled bool `json:"enabled"`
}

type Egress struct {
	Enabled bool `json:"enabled"`
}

```
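
Because the `Networking` struct carries its ranges as plain strings, a consumer may want to check them before creating a shoot. The following is a minimal sketch under the assumption that the three ranges must be valid CIDRs and must not overlap. The ADR does not state that KIM performs this check, and `validateNetworking` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net"
)

// Networking mirrors the struct from the API proposal.
type Networking struct {
	Pods     string `json:"pods"`
	Nodes    string `json:"nodes"`
	Services string `json:"services"`
}

// overlaps reports whether two CIDR blocks share any address. For CIDR blocks
// this is the case exactly when one block contains the other's network address.
func overlaps(a, b *net.IPNet) bool {
	return a.Contains(b.IP) || b.Contains(a.IP)
}

// validateNetworking parses the three ranges and rejects malformed or
// overlapping CIDRs. It is an illustrative check, not part of the proposed API.
func validateNetworking(n Networking) error {
	names := []string{"pods", "nodes", "services"}
	values := []string{n.Pods, n.Nodes, n.Services}

	nets := make([]*net.IPNet, len(values))
	for i, v := range values {
		_, ipNet, err := net.ParseCIDR(v)
		if err != nil {
			return fmt.Errorf("%s: %w", names[i], err)
		}
		nets[i] = ipNet
	}

	for i := range nets {
		for j := i + 1; j < len(nets); j++ {
			if overlaps(nets[i], nets[j]) {
				return fmt.Errorf("%s range %s overlaps %s range %s",
					names[i], nets[i], names[j], nets[j])
			}
		}
	}
	return nil
}

func main() {
	// Ranges taken from the CR examples.
	fromExample := Networking{
		Pods:     "100.64.0.0/12",
		Nodes:    "10.250.0.0/16",
		Services: "100.104.0.0/13",
	}
	fmt.Println(validateNetworking(fromExample))

	// A services range that falls inside the nodes range.
	conflicting := Networking{
		Pods:     "100.64.0.0/12",
		Nodes:    "10.250.0.0/16",
		Services: "10.250.1.0/24",
	}
	fmt.Println(validateNetworking(conflicting))
}
```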
8 changes: 8 additions & 0 deletions docs/adr/README.md
@@ -0,0 +1,8 @@
# Overview

This folder contains architecture decision records.

# Documents

- [Provisioning functionality](./001-provisioning.md)

4 changes: 4 additions & 0 deletions docs/adr/assets/keb-kim-target-arch.drawio.svg