Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set Up Publishing-Bot #520

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
85 changes: 85 additions & 0 deletions publishing-bot/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
# Copyright 2020 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

all: build
.PHONY: all

-include $(CONFIG)
-include $(CONFIG)-token

DOCKER_REPO ?= k8s-publishing-bot
NAMESPACE ?=
KUBECTL ?= kubectl
SCHEDULE ?= 0 5 * * *
INTERVAL ?= 86400
MEMORY_REQUESTS ?= 200Mi
MEMORY_LIMITS ?= 1.6Gi
GOOS ?= linux

build_cmd = GO111MODULE=on mkdir -p _output && GOOS=$(GOOS) go build -o _output/$(1) ./cmd/$(1)
prepare_spec = sed 's,DOCKER_IMAGE,$(DOCKER_REPO),g;s,MEMORY_REQUESTS,$(MEMORY_REQUESTS),g;s,MEMORY_LIMITS,$(MEMORY_LIMITS),g'

SHELL := /bin/bash

build:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to apply at all in this directory? The code is not here

$(call build_cmd,collapsed-kube-commit-mapper)
$(call build_cmd,publishing-bot)
$(call build_cmd,sync-tags)
$(call build_cmd,init-repo)
$(call build_cmd,godeps-gen)
$(call build_cmd,gomod-zip)
.PHONY: build

build-image: build
docker build -t $(DOCKER_REPO) .
.PHONY: build-image

push-image:
docker push $(DOCKER_REPO):latest

clean:
rm -rf _output
.PHONY: clean

update-deps:
glide update --strip-vendor
.PHONY: update-deps

validate:
if [ -f $(CONFIG)-rules-configmap.yaml ]; then \
go run ./cmd/validate-rules <(sed '1,/config: /d;s/^ //' $(CONFIG)-rules-configmap.yaml); \
else \
go run ./cmd/validate-rules $$(grep "rules-file: " $(CONFIG)-configmap.yaml | sed 's/.*rules-file: //'); \
fi
.PHONY: validate

init-deploy: validate
$(KUBECTL) delete -n "$(NAMESPACE)" --ignore-not-found=true replicaset publisher
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a little baroque - why are we trying to delete things?

$(KUBECTL) delete -n "$(NAMESPACE)" --ignore-not-found=true pod publisher
while $(KUBECTL) get pod -n "$(NAMESPACE)" publisher -a &>/dev/null; do echo -n .; sleep 1; done
$(KUBECTL) apply -n "$(NAMESPACE)" -f manifests/storage-class.yaml || true
$(KUBECTL) get StorageClass ssd
$(KUBECTL) apply -n "$(NAMESPACE)" -f $(CONFIG)-configmap.yaml
$(KUBECTL) apply -n "$(NAMESPACE)" -f $(CONFIG)-rules-configmap.yaml; \
$(KUBECTL) apply -n "$(NAMESPACE)" -f manifests/pvc.yaml

run: init-deploy
{ cat manifests/pod.yaml && sed 's/^/ /' manifests/podspec.yaml; } | \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really weird way of splitting config across 2 files. Can we at least comment these rules and the underlying files, so it's more obvious WHY?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original script had a mode to run a simple fire-and-forget pod for testing. Hence, the podspec was in an extra files, used by a pod manifest (like here) and for a replicaset.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured that out. I can't say it is NOT useful, so I deleted my request to un-do this split, but I am asking for comments :)

$(call prepare_spec) | $(KUBECTL) apply -n "$(NAMESPACE)" -f -

deploy: init-deploy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document assumptions - like the current context

$(KUBECTL) apply -n "$(NAMESPACE)" -f manifests/service.yaml
{ cat manifests/rs.yaml && sed 's/^/ /' manifests/podspec.yaml; } | \
$(call prepare_spec) | sed 's/-interval=0/-interval=$(INTERVAL)/g' | \
$(KUBECTL) apply -n "$(NAMESPACE)" -f -
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we duplicating deployment code from the publishing bot? 😱 This repo is invisible to anybody doing changes in the bot repository, which will eventually break this copy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the migration of the publishing bot to a new community-owned GKE cluster (aaa), we want to migrate some parts of the deployment code. What's the right approach in this scenario ?

Copy link
Contributor

@sttts sttts Feb 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code in the bot repo is used by a number of parties. So we won't remove it. Hence, you are creating a copy here which will most certainly divert.

Where is the problem of keeping the deployment code where it is? The people deploying the publishing bot (which is @nikhita, @dims and myself only; nobody without deep knowledge of what the bot does should do this) all are used to the bot repo. Which pain points should this duplication solve?

I would also be fine to do some git clone magic here checking out the bot repo and call the Makefile in there. Then it's well documented what is going on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine to leave the config in the bot repo, I guess. This repo/dir could be reduced to a README explaining how we deploy it in "aaa"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But all my comments here apply - especially around CPu and memory limits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ameukam can you move the fixes to the publishing-bot repo instead? @nikhita and I are happy to review over there. Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sttts @thockin @ameukam @nikhita I have added to todays agenda for k8s-infra-wg bi-weekly meeting discussion about this topic. Please attend if you want to discuss about it in syncronous manner. http://bit.ly/wg-k8s-infra-notes

36 changes: 36 additions & 0 deletions publishing-bot/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
# Kubernetes Publishing Bot

The publishing-bot for the Kubernetes project is running on a CNCF sponsored
GKE cluster `aaa` in the `kubernetes-public` project.

The bot is responsible for updating `go.mod`/`Godeps` and `vendor` for target repos.
To support both godeps for releases <= v1.14 and go modules for the master branch
and releases >= v1.14, we run two instances of the bot today.

The instance of the bot responsible for syncing releases <= v1.14 runs in the
`k8s-publishing-bot-godeps` namespace and the instance of the bot responsible
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This NS does not exist yet - should this PR add it to infra/gcp/namespaces/ensure-namespaces.sh ?

for syncing the master branch and releases >= v1.14 runs in the `k8s-publishing-bot`
namespace.

The code for the former can be found in the [godeps branch] and the latter in the master
branch of the publishing-bot repo.

## Permissions

The cluster can be accessed by members of the [k8s-infra-cluster-admins] and the [k8s-infra-rbac-publishing-bot]
google groups. Members can be added to the groups by updating [groups.yaml].

## Running the bot

Make sure you are at the root of the publishing-bot repo before running these commands.

### Deploying the bot

```sh
make deploy CONFIG=configs/kubernetes
```

[godeps branch]: https://github.com/kubernetes/publishing-bot/tree/godeps
[k8s-infra-cluster-admins]: https://groups.google.com/forum/#!forum/k8s-infra-cluster-admins
[k8s-infra-rbac-publishing-bot]: https://groups.google.com/forum/#!forum/k8s-infra-rbac-publishing-bot
[groups.yaml]: https://git.k8s.io/k8s.io/groups/groups.yaml
6 changes: 6 additions & 0 deletions publishing-bot/configs/kubernetes
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
DOCKER_REPO = gcr.io/k8s-staging-publishing-bot/k8s-publishing-bot
NAMESPACE = publishing-bot
SCHEDULE = * */4 * * *
INTERVAL = 14400
MEMORY_REQUESTS = 2Gi
MEMORY_LIMITS = 3Gi
12 changes: 12 additions & 0 deletions publishing-bot/configs/kubernetes-configmap.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: publisher-config
data:
config: |
source-org: kubernetes
source-repo: kubernetes
target-org: kubernetes
rules-file: https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/publishing/rules.yaml
github-issue: 56876
thockin marked this conversation as resolved.
Show resolved Hide resolved
dry-run: false
6 changes: 6 additions & 0 deletions publishing-bot/configs/kubernetes-nightly
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
DOCKER_REPO = gcr.io/k8s-staging-publishing-bot/k8s-publishing-bot-nightly
NAMESPACE = k8s-publishing-bot-nightly
ameukam marked this conversation as resolved.
Show resolved Hide resolved
SCHEDULE = * */4 * * *
INTERVAL = 14400
MEMORY_REQUESTS = 2Gi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory request should == memory limit. Anything else is madness in prod.

MEMORY_LIMITS = 3Gi
12 changes: 12 additions & 0 deletions publishing-bot/configs/kubernetes-nightly-configmap.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: publisher-config
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have 2 subdirs, "kubernetes" and "kubernetes-nightly" to isolate the files better? would make diffing them easier, too

data:
config: |
source-org: kubernetes
source-repo: kubernetes
target-org: kubernetes-nightly
rules-file: https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/publishing/rules.yaml
# github-issue: 56876
dry-run: false
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: publisher-rules
data:
config: |
ameukam marked this conversation as resolved.
Show resolved Hide resolved
6 changes: 6 additions & 0 deletions publishing-bot/configs/kubernetes-rules-configmap.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: publisher-rules
data:
config: |
ameukam marked this conversation as resolved.
Show resolved Hide resolved
1 change: 1 addition & 0 deletions publishing-bot/manifests/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
local.*
18 changes: 18 additions & 0 deletions publishing-bot/manifests/cronjob.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: publisher
namespace: publishing-bot
spec:
schedule: "SCHEDULE"
startingDeadlineSeconds: 36000
concurrencyPolicy: Forbid
successfulJobsHistoryLimit: 1
failedJobsHistoryLimit: 14
jobTemplate:
metadata:
labels:
app: publisher
template:
spec:
restartPolicy: Never
7 changes: 7 additions & 0 deletions publishing-bot/manifests/pod.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
apiVersion: v1
kind: Pod
metadata:
name: publisher
namespace: publishing-bot
spec:
restartPolicy: Never
84 changes: 84 additions & 0 deletions publishing-bot/manifests/podspec.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,84 @@
initContainers:
- name: initialize-repos
command:
- /init-repo
- --alsologtostderr
- --config=/etc/munge-config/config
- --rules-file=/etc/publisher-rules/config
- 2>&1
image: DOCKER_IMAGE
imagePullPolicy: Always
resources:
requests:
cpu: 300m
memory: 200Mi
limits:
cpu: 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't set CPU limits

memory: 1.6Gi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory request should == memory limit. Anything else is madness in prod.

volumeMounts:
- mountPath: /etc/munge-config
name: munge-config
- mountPath: /go-workspace
name: publisher-gopath
- mountPath: /etc/publisher-rules
name: publisher-rules
- mountPath: /.cache
name: cache
containers:
- name: publisher
command:
- /publishing-bot
- --alsologtostderr
- --config=/etc/munge-config/config
- --token-file=/etc/secret-volume/token
- --interval=0
- --server-port=8080
- 2>&1
image: DOCKER_IMAGE
imagePullPolicy: Always
livenessProbe:
httpGet:
path: /healthz
port: 80
resources:
requests:
cpu: 300m
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is CPU hardcoded but memory is parameterized?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, .3 cores is a LOT of CPU

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have client-go smoke tests in there, which require a certain amount of CPU in order to be stable.

memory: MEMORY_REQUESTS
limits:
cpu: 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't set CPU limits. There's no point to it.

memory: MEMORY_LIMITS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory request should == memory limit. Anything else is madness in prod.

volumeMounts:
- mountPath: /etc/munge-config
name: munge-config
- mountPath: /etc/publisher-rules
name: publisher-rules
- mountPath: /etc/secret-volume
name: secret-volume
- mountPath: /netrc
name: netrc
- mountPath: /gitrepos
name: repo
- mountPath: /go-workspace
name: publisher-gopath
- mountPath: /.cache
name: cache
volumes:
- name: munge-config
configMap:
name: publisher-config
- name: publisher-rules
configMap:
name: publisher-rules
- name: secret-volume
secret:
secretName: github-token
- name: repo
emptyDir: {}
- name: cache
emptyDir: {}
- name: netrc
emptyDir:
medium: Memory
- name: publisher-gopath
persistentVolumeClaim:
claimName: publisher-gopath
14 changes: 14 additions & 0 deletions publishing-bot/manifests/pvc.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
labels:
app: publisher
name: publisher-gopath
namespace: publishing-bot
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
storageClassName: ssd
15 changes: 15 additions & 0 deletions publishing-bot/manifests/rs.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
apiVersion: apps/v1
kind: Replicaset
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not using Deployment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nikhita asked to keep it as Replicaset in a previous conversation: #520 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why this is needed, though. Is it JUSt to avoid changing 2 things at once?

metadata:
name: publisher
namespace: publishing-bot
spec:
replicas: 1
selector:
matchLabels:
name: publisher
template:
metadata:
labels:
name: publisher
spec:
13 changes: 13 additions & 0 deletions publishing-bot/manifests/service.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
kind: Service
apiVersion: v1
metadata:
name: health
namespace: publishing-bot
spec:
selector:
name: publisher
ports:
- name: http
protocol: TCP
port: 80
targetPort: 8080
7 changes: 7 additions & 0 deletions publishing-bot/manifests/storage-class.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
kind: StorageClass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should go ... somewhere else. Probably in clusters/kubernetes-public/aaa - see https://www.terraform.io/docs/providers/kubernetes/r/storage_class.html

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a PR of its own

apiVersion: storage.k8s.io/v1beta1
metadata:
name: ssd
provisioner: kubernetes.io/gce-pd
parameters:
type: pd-ssd