Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prow presubmit job to build mlkube.io container. #4951

Merged
merged 22 commits into from
Oct 13, 2017

Conversation

jlewi
Copy link
Contributor

@jlewi jlewi commented Oct 10, 2017

  • This job is supposed to build a Docker container with the CRD. Once
    this works we will add another job to actually run the tests.
  • We create a new image to be used by this job.
    • The image needs docker so that we can do a Docker build.

@jlewi jlewi requested a review from krzyzacy as a code owner October 10, 2017 13:19
@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 10, 2017
@jlewi
Copy link
Contributor Author

jlewi commented Oct 10, 2017

@krzyzacy @foxish PTAL.

* This job is supposed to build a Docker container with the CRD. Once
  this works we will add another job to actually run the tests.
* We create a new image to be used by this job.
  * The image needs docker so that we can do a Docker build.
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Oct 10, 2017
prow/config.yaml Outdated
ports:
# TODO(jlewi): Do we need this port?
- containerPort: 9999
hostPort: 9999
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the port is for node affinity since they shares a git cache, you probably don't need it here.

prow/config.yaml Outdated
env:
- name: GOOGLE_APPLICATION_CREDENTIALS
value: /etc/service-account/service-account.json
# TODO(jlewi): Do we need privileged mode in order to run Docker?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you probably need to figure out kubernetes/kubernetes#1806, have you tried locally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried running my container locally using docker run but not running on K8s. Going to try it on my test prow cluster soon.

@krzyzacy
Copy link
Member

/assign

prow/config.yaml Outdated
secretName: service-account
- name: docker
hostPath:
path: /var/run/docker.sock
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @ixdy, I think we don't intend to allow jobs access to the host docker socket.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenTheElder Can you suggest an alternative approach?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jlewi what I've been doing is using bazel to build/push the image instead, not sure if that's feasible to you

* Need to specify namespace, resync_period and some other values or
  the config won't be parseable.
* Add pre and postsubmit jobs for mlkube.
* Remove the periodic job; I don't think we want that.
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 12, 2017
@jlewi
Copy link
Contributor Author

jlewi commented Oct 12, 2017

@krzyzacy This is ready for another look but there are are some test failures I could use help with.

verify_test is complaining that there is no test-grid associated with my presubmit job.

//jobs:config_test is failing complaining about duplicate projects. I added an entry here because otherwise the //prow/config:go_default_test failed complaining

jobs_test.go:152: Cannot find test-infra/jobs/mlkube-build-presubmit.sh for mlkube-build-presubmit

but it looks like my fix might have caused other problems.

jobs/config.json Outdated
@@ -9907,6 +9907,27 @@
"UNKNOWN"
]
},
"mlkube-build-presubmit": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you'll need an equivalent one for mlkube-build-postsubmit

jobs/config.json Outdated
"--extract=gke",
"--gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging",
"--gcp-node-image=gci",
"--gcp-project=mlkube-testing",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the presubmit error you need to add something to https://github.com/kubernetes/test-infra/blob/master/jobs/config_test.py#L681-L791

@krzyzacy
Copy link
Member

I think we don't want to allow sock mapping on prow pods. We should either use bazel to build docker images, or we need to implement some docker-in-docker service in prow.

This reverts commit 7507edb.

If we load all presubmits we end up with failures due to other presubmits
with no testgroup.
@krzyzacy
Copy link
Member

hit by #4989

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 13, 2017
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 13, 2017
…ubmit

tests.

* All tests should be Prow jobs; delete jobs/config.json entry for the periodic
  job.
* All jobs should use the same container image for the tests.
* Create dashboard tabs on the bigdata dashboard. Remove tabs from the
  apps dashboard.
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 13, 2017
@jlewi
Copy link
Contributor Author

jlewi commented Oct 13, 2017

@krzyzacy This is ready for another review.

  • Only test failure appears to be due to pull-test-infra-bazel flakes upon installing gubernator requirements.txt #4989
  • I made the existing periodic test that @foxish had added consistent with the new pre/post submit tests
    • I created dashboards for each test on the bigdata dashboard group
    • Before it was using both big-data and apps
  • I removed the use of docker build and used GCR instead to build the image
    • As discussed offline building the images with bazel might be better but I didn't want to introduce
      bazel in the mlkube toolchain just for this.

# using this project there will just be one entry.
'mlkube-build-presubmit': 'mlkube-*',
'mlkube-build-postsubmit': 'mlkube-*',
'ci-kubernetes-e2e-mlkube-gke': 'mlkube-*',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now you probably don't need this block. The test basically scans duplicated projects from config.json.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

test_group_name: mlkube-build-postsubmit
- name: mlkube-build-presubmit
description: Presubmit tests of TfJob CRD running on stable GKE.
test_group_name: mlkube-build-presubmit
notifications:
- summary: Please configure this skeleton dashboard and remove this notification
context_link: https://github.com/kubernetes/test-infra/tree/master/testgrid/config
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(probably can delete this notification now)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@jlewi
Copy link
Contributor Author

jlewi commented Oct 13, 2017

Thanks.

prow/config.yaml Outdated
secretName: service-account
- name: docker
hostPath:
path: /var/run/docker.sock
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the sock map here as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -2489,6 +2489,31 @@ presubmits:
hostPath:
path: /mnt/disks/ssd0

jlewi/mlkube.io:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hummm, I feel like our bot might not have access to your repo, you might want to configure somewhere around https://github.com/kubernetes/test-infra/blob/master/prow/plugins.yaml#L5-L11

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I also think I need to configure webhooks on my repo but I'll do that after this PR is done.

* Configure trigger plugin for mlkube.io repo.
@krzyzacy
Copy link
Member

it's a good start and probably we have more stuff need to configure. we'll see how this goes :-)

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 13, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi, krzyzacy

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 13, 2017
@k8s-ci-robot k8s-ci-robot merged commit ad548b4 into kubernetes:master Oct 13, 2017
@k8s-ci-robot
Copy link
Contributor

@jlewi: I updated Prow config for you!

I updated Prow plugins config for you!

In response to this:

  • This job is supposed to build a Docker container with the CRD. Once
    this works we will add another job to actually run the tests.
  • We create a new image to be used by this job.
  • The image needs docker so that we can do a Docker build.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jlewi
Copy link
Contributor Author

jlewi commented Oct 13, 2017

Thanks

@krzyzacy
Copy link
Member

(whoops, forgot to ask you to squash 😂 )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants