Fix computation of changed files in generate-changed-only rule #665

jlewi · 2019-12-12T06:54:48Z

Fix computation of changed files in generate-changed-only rule

Use git diff --name-only @{upstream} to compute the diff against the
upstream branch. This should be better than the current approach,
which makes assumptions about the remote repo names.
Furthermore we need to make sure that when the base kustomization package
changes that we also regenerate the tests for the overlay packages.
- to support that we replace gen-test-targets.sh with a python script.
- The bash scripts are pretty impenetrable; migrating to python should
  make the code easier to maintain.
The name of the go test files generated is slightly different from what
the shell scripts were generating.
- This is intended to make the naming more consistent
- specifically a/b/c/kustomization.yaml results in
  tests/a-b-c_test.go
- It looks like the shell script was sometimes not including a in the name.
The python script also checks that for each _test.go file there is a
corresponding kustomization.yaml file; otherwise it removes the test. This
ensures that if we move or remove a kustomize package, the matching test
is removed as well.
Fix: make generate - assumptions about remote names #509
Update the pull request template with the command to only generate
tests for changed files
Fix gen-test-targets should function regardless of checkout name #171 - gen-test-target.sh should function regardless
of checked out name for the repository. We can use git to get the
base directory and then do an appropriate string replace.
Need to update the test target generation
to not assume the repository is named manifests

jlewi · 2019-12-12T06:54:58Z

/assign @kkasravi

jlewi · 2019-12-12T07:52:10Z

/hold

@kkasravi make generate-changed-only doesn't seem to correctly update the tests. Specifically I did the following

Modify profiles/base/kustomization.yaml - change the image
Run make generate-changed-only
Run make test

The tests fail

I think the problem is that make generate-changed-only ends up only changing the test for the profiles base package and not any of the test files corresponding to overlays.

Is that a problem with my PR or is that an existing issue with make generate-changed-only ?
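The failure described above suggests that a change to a base package has to fan out to its overlays. A minimal sketch of that expansion follows. It assumes a hypothetical layout in which overlays live under `<component>/overlays/<name>` next to `<component>/base`; the function name `packages_to_regenerate` is also an assumption and is not taken from the PR.

```python
import posixpath


def packages_to_regenerate(changed_files, package_dirs):
  """Return the package dirs whose tests should be regenerated.

  changed_files: repo-relative paths as printed by git (forward slashes).
  package_dirs:  repo-relative dirs that contain a kustomization.yaml.
  """
  targets = set()
  for path in changed_files:
    for pkg in package_dirs:
      if path.startswith(pkg + "/"):
        targets.add(pkg)

  # Assumed layout: a change to <component>/base also affects every
  # package under <component>/overlays/, so add those as well.
  for pkg in list(targets):
    if posixpath.basename(pkg) == "base":
      overlay_prefix = posixpath.dirname(pkg) + "/overlays/"
      targets.update(p for p in package_dirs if p.startswith(overlay_prefix))
  return sorted(targets)
```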

jlewi · 2019-12-12T23:13:22Z

/assign @gabrielwen
/hold cancel

@gabrielwen @kkasravi This is ready for review.

krishnadurai

Thanks, @jlewi ! This has been a low hanging fruit for quite some time.

I've left a few comments to consider. PTAL.

krishnadurai · 2019-12-12T23:43:43Z

hack/generate_tests.py

+import os
+import subprocess
+
+TOP_LEVEL_EXCLUDES = ["docs"]


Could we include 'hack' and 'tests'?

.github/pull_request_template.md

krishnadurai · 2019-12-13T00:24:47Z

hack/generate_tests.py

+  # Generate a list of the files which have changed with respect to the upstream
+  # branch
+  modified_files = subprocess.check_output(
+    ["git", "diff", "--name-only", "@{upstream}"])


Would it be better to have a config file that can be populated with the upstream repo's name? If the config is empty, we could prompt the user to set it.

I just tried this out with my workflow:

git checkout -b branchname
git commit
git push personal_fork branch-name

There's no default upstream value set and I run into this error:

fatal: no upstream configured for branch 'current_branch'

How would prompting the user be different from what happens now? i.e. you run it, see an error about no upstream branch; you set the upstream and then rerun it.

Wouldn't having a config file be duplicating what git is doing? i.e. git is already giving you a way to configure the upstream branch.

To clarify here: is the aim to generate the diff from the last upstream commit in the working branch?
If this is the case, a developer could forget to generate tests and push their changes upstream. The diffs wouldn't reflect the committed changes and the required tests might not get generated.

Alternatively, should we always compare against upstream/master assuming it is almost always updated?
This way, we avoid the branch locality of 'upstream' configuration changing and we can set our local configuration to upstream/master which will be independent of the branch which we are working on.

Please correct me if I'm mistaken.

If this is the case, a developer could forget to generate tests and push their changes upstream. The diffs wouldn't reflect the committed changes and the required tests might not get generated.

upstream should be kubeflow/manifests; users should not be able to push changes to it by mistake.

Alternatively, should we always compare against upstream/master assuming it is almost always updated?

Isn't that what the code is doing? The code diffs against whatever the user set as the upstream branch to determine which files changed in the PR, and then regenerates the tests for those files.

My expectation is users set the upstream branch to kubeflow/manifests; not their personal fork of the kubeflow/manifests.

See for example
https://hackernoon.com/sync-a-fork-from-upstream-repo-in-github-c2c29c8eca3b

Got it. Thanks.
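For reference, the "no upstream configured" case discussed in this thread could be reported with a hint instead of a raw git error. The sketch below is an assumption about how that might look, not the code in the PR. The helper names `parse_name_only` and `changed_files` are hypothetical; the git commands used (`git diff --name-only` and `git branch --set-upstream-to`) are standard.

```python
import subprocess


def parse_name_only(output):
  """Turn the bytes printed by `git diff --name-only` into a list of paths."""
  return [line for line in output.decode("utf-8").splitlines() if line.strip()]


def changed_files(ref="@{upstream}"):
  """List files that differ from ref, with a hint if the diff fails."""
  result = subprocess.run(
    ["git", "diff", "--name-only", ref],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  if result.returncode != 0:
    # git exits non-zero when, for example, the branch has no upstream.
    raise RuntimeError(
      "git diff against %s failed: %s\n"
      "If no upstream is configured, set one with "
      "`git branch --set-upstream-to=<remote>/<branch>`."
      % (ref, result.stderr.decode("utf-8", "replace").strip()))
  return parse_name_only(result.stdout)
```

Note that `subprocess` returns bytes under Python 3, so the output has to be decoded before it is split into paths.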

krishnadurai · 2019-12-13T01:01:58Z

hack/generate_tests.py

+  changed_dirs = set()
+  for top in os.listdir(root):
+    if top.startswith("docs"):
+      print("donotsubmit")


Could we log instead?

Thanks for catching this. This is debug code I'll remove in my next commit.

side bar: I'm really hoping there is a github action that we can use to automatically catch donotsubmit in code.
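Until such an action exists, a small check could be run in CI or as a pre-commit step. This is only a sketch under assumptions: the marker pattern and the function name `find_markers` are made up for illustration, and how files are collected and fed in is left to the caller.

```python
import re

# Matches "donotsubmit", "DO NOT SUBMIT", "do_not_submit" is NOT matched;
# only optional whitespace between the words is allowed.
MARKER = re.compile(r"do\s*not\s*submit", re.IGNORECASE)


def find_markers(text):
  """Return (line_number, line) pairs for lines that contain the marker."""
  return [(number, line)
          for number, line in enumerate(text.splitlines(), start=1)
          if MARKER.search(line)]
```

A caller would read each changed file, pass its contents to `find_markers`, and fail the check if any result is non-empty.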

krishnadurai · 2019-12-13T01:41:05Z

hack/generate_tests.py

+    test_target_name = test_target_name.replace("-", "")
+    with open(test_path, "w") as test_file:
+      subprocess.check_call(["./hack/gen-test-target.sh", full_dir,
+                             test_target_name],


I'm not able to understand the use of this variable 'test_target_name'. Could you please explain?

gen-test-target uses only 1 argument as of now, if I'm not wrong.

Thanks for catching this. It's a bug; will revert in my next commit.

krishnadurai · 2019-12-13T01:56:37Z

Unrelated to this PR:

Another thing which I'm considering is to change gen-test-target.sh. Right now gen-test-target.sh interprets the resources from kustomization.yaml before copying over the files to a test folder.

Should we interpret kustomization.yaml using this line?

manifests/hack/gen-test-target.sh

Line 46 in 67eabbf

    
           for i in $(echo $(cat $directory/kustomization.yaml | grep '^- .*yaml$' | sed 's/^- //') $(cat $directory/kustomization.yaml | grep '  path: ' | sed 's/^.*: \(.*\)$/\1/') $(cat $directory/kustomization.yaml | sed '1,/^[ \t]*files:/d;/^[^ \t]/,$d' | sed 's/^[ \t]*- //') params.env secrets.env kustomization.yaml | sed 's/ /\\n/g' | sort | uniq | awk '{gsub(/\\n/,"\n")}1'); do

An alternative would be to generate outputs from the existing kustomize folder and compare newer changes to kustomize against the older outputs.

Could you please explain if the current method has more benefits than the alternative suggested?

/cc @kkasravi @jlewi

k8s-ci-robot · 2019-12-13T01:56:39Z

@krishnadurai: GitHub didn't allow me to request PR reviews from the following users: jlewi.

Note that only kubeflow members and repo collaborators can review this PR, and authors cannot review their own PRs.


jlewi · 2019-12-13T02:35:02Z

@krishnadurai whatever you do; please; please no more bash :)

The line of code

manifests/hack/gen-test-target.sh

Line 46 in 67eabbf

    
           for i in $(echo $(cat $directory/kustomization.yaml | grep '^- .*yaml$' | sed 's/^- //') $(cat $directory/kustomization.yaml | grep '  path: ' | sed 's/^.*: \(.*\)$/\1/') $(cat $directory/kustomization.yaml | sed '1,/^[ \t]*files:/d;/^[^ \t]/,$d' | sed 's/^[ \t]*- //') params.env secrets.env kustomization.yaml | sed 's/ /\\n/g' | sort | uniq | awk '{gsub(/\\n/,"\n")}1'); do

is pretty unreadable/impenetrable; @krishnadurai kudos to you for figuring it out.

See also #306
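To make the intent of that shell line easier to follow, here is a rough Python approximation. It is a sketch, not a drop-in replacement: it assumes the kustomization.yaml has already been parsed into a dict (for example with a YAML library, which is not shown so the example stays standard-library only), and it collects list entries ending in yaml, every `path:` value, and every entry under a `files:` list, plus the three names the shell line always appends. The function name `referenced_files` is hypothetical, and edge cases such as `key=file` entries are not handled.

```python
ALWAYS_COPIED = ["params.env", "secrets.env", "kustomization.yaml"]


def referenced_files(kustomization):
  """Collect file names referenced by a parsed kustomization.yaml (a dict)."""
  found = set(ALWAYS_COPIED)

  def walk(node, key=None, in_list=False):
    if isinstance(node, dict):
      for child_key, child in node.items():
        walk(child, child_key, False)
    elif isinstance(node, list):
      for item in node:
        walk(item, key, True)
    elif isinstance(node, str):
      is_path_value = key == "path"
      is_files_entry = in_list and key == "files"
      is_yaml_entry = in_list and node.endswith("yaml")
      if is_path_value or is_files_entry or is_yaml_entry:
        found.add(node)

  walk(kustomization)
  # The shell line sorts and de-duplicates; sorted() over a set does the same.
  return sorted(found)
```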

kkasravi · 2019-12-14T16:27:00Z

hack/generate_tests.py

+def remove_unmatched_tests(repo_root, package_dirs):
+  """Remove any tests that don't map to a kustomization.yaml file.
+
+  This ensures tests don't linger if a pakage is deleted.


pakage -> package

kkasravi · 2019-12-14T16:29:23Z

hack/generate_tests.py

+      logging.info("Ignoring %s", name)
+      continue
+
+    if not name in expected_tests:


do we handle the case where a manifest has been deleted so the unit tests no longer apply?

@kkasravi This code should ensure that if a/b/c/kustomization.yaml doesn't exist then the test a-b-c_test.go should not exist. So I believe that covers the use case you described or did I miss something?
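The invariant described in this reply can be sketched as follows. This is an illustration under assumptions, not the `remove_unmatched_tests` function from the PR: the helper names are hypothetical, package directories are taken to be repo-relative with forward slashes, and the sketch only reports stale tests rather than deleting them.

```python
def expected_test_names(package_dirs):
  """Map each package dir (e.g. "a/b/c") to its test name ("a-b-c_test.go")."""
  return {d.strip("/").replace("/", "-") + "_test.go" for d in package_dirs}


def stale_tests(test_files, package_dirs):
  """Return test files that no longer correspond to any package dir.

  Files that are not Go tests (for example a README) are ignored.
  """
  expected = expected_test_names(package_dirs)
  return sorted(name for name in test_files
                if name.endswith("_test.go") and name not in expected)
```

A caller could pass `os.listdir("tests")` as `test_files` and remove whatever the function returns.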

* Use git diff --name-only @{upstream} to compute the diff against the upstream branch. This should be better than the current approach, which makes assumptions about the remote repo names.
* Furthermore we need to make sure that when the base kustomization package changes that we also regenerate the tests for the overlay packages.
  * to support that we replace gen-test-targets.sh with a python script.
  * The bash scripts are pretty impenetrable; migrating to python should make the code easier to maintain.
* The name of the go test files generated is slightly different from what the shell scripts were generating.
  * This is intended to make the naming more consistent
  * specifically a/b/c/kustomization.yaml results in tests/a-b-c_test.go
  * It looks like the shell script was sometimes not including a in the name.
* The python script also checks that for each _test.go file there is a corresponding kustomization.yaml file; otherwise it will remove the test. This ensures if we move or remove a kustomize package we will end up removing the test.
* Fix: kubeflow#509
* Update the pull request template with the command to only generate tests for changed files
* Fix kubeflow#171 - gen-test-target.sh should function regardless of checked out name for the repository. We can use git to get the base directory and then do an appropriate string replace.
  * Need to update the test target generation to not assume the repository is named manifests
* Update the github pull_request template to tell users to run `make generate-changed-only`

krishnadurai

/lgtm

krishnadurai · 2019-12-16T20:47:35Z

hack/generate_tests.py

+  # Generate a list of the files which have changed with respect to the upstream
+  # branch
+  modified_files = subprocess.check_output(
+    ["git", "diff", "--name-only", "@{upstream}"])


Got it. Thanks.

jlewi · 2019-12-16T21:04:14Z

/approve

k8s-ci-robot · 2019-12-16T21:04:21Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jlewi]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

googlebot added the cla: yes label Dec 12, 2019

k8s-ci-robot requested review from IronPan and krishnadurai December 12, 2019 06:54

k8s-ci-robot assigned kkasravi Dec 12, 2019

k8s-ci-robot added size/XS size/S and removed size/XS labels Dec 12, 2019

jlewi force-pushed the changed branch from 227c616 to 7bbd59c Compare December 12, 2019 07:31

k8s-ci-robot added size/M and removed size/S labels Dec 12, 2019

k8s-ci-robot added do-not-merge/hold size/XXL size/XL and removed size/M size/XXL labels Dec 12, 2019

jlewi force-pushed the changed branch 3 times, most recently from 7feb746 to 5b5dd9e Compare December 12, 2019 23:12

k8s-ci-robot removed the do-not-merge/hold label Dec 12, 2019

k8s-ci-robot assigned gabrielwen Dec 12, 2019

jlewi mentioned this pull request Dec 13, 2019

Continuous build of docker images and updating kustomize manifests kubeflow/testing#450

Closed

krishnadurai suggested changes Dec 13, 2019

k8s-ci-robot requested a review from kkasravi December 13, 2019 01:56

jlewi mentioned this pull request Dec 13, 2019

How do we manually verify that the expected value in the unittests is correct? #306

Closed

kkasravi reviewed Dec 14, 2019

Jeremy Lewi added 3 commits December 16, 2019 11:59

Address comments.

1ea09ec

Latest.

ea4efa1

jlewi force-pushed the changed branch from ed9d202 to ea4efa1 Compare December 16, 2019 19:59

jlewi mentioned this pull request Dec 16, 2019

Add permissions to JWA for SubjectAccessReviews #655

Merged

1 task

krishnadurai approved these changes Dec 16, 2019

k8s-ci-robot assigned krishnadurai Dec 16, 2019

k8s-ci-robot added the lgtm label Dec 16, 2019

k8s-ci-robot added the approved label Dec 16, 2019

k8s-ci-robot merged commit 980d9e6 into kubeflow:master Dec 16, 2019

jlewi mentioned this pull request Dec 17, 2019

manifests test structure should mirror manifest directory tree #683

Closed
