-
Notifications
You must be signed in to change notification settings - Fork 874
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix computation of changed files in generate-changed-only rule #665
Conversation
/assign @kkasravi |
/hold @kkasravi
The tests fail I think the problem is that Is that a problem with my PR or is that an existing issue with |
7feb746
to
5b5dd9e
Compare
/assign @gabrielwen @gabrielwen @kkasravi This is ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @jlewi ! This has been a low hanging fruit for quite some time.
I've left a few comments to consider. PTAL.
hack/generate_tests.py
Outdated
import os | ||
import subprocess | ||
|
||
TOP_LEVEL_EXCLUDES = ["docs"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we include 'hack' and 'tests'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
# Generate a list of the files which have changed with respect to the upstream | ||
# branch | ||
modified_files = subprocess.check_output( | ||
["git", "diff", "--name-only", "@{upstream}"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking if it would be better to have a config file which can be populated with the upstream repo's name? If the config is empty - we could prompt the user to set it.
I just tried this out with my workflow:
git checkout -b branchname
git commit
git push personal_fork branch-name
There's no default upstream value set and I run into this error:
fatal: no upstream configured for branch 'current_branch'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would prompting the user be different from what happens now? i.e. you run it, see an error about no upstream branch; you set the upstream and then rerun it.
Wouldn't having a config file be duplicating what git is doing? i.e. git is already giving you a way to configure the upstream branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify here: is the aim to generate the diff from the last upstream commit in the working branch?
If this is the case, a developer could forget to generate tests and push their changes upstream. The diffs wouldn't reflect the committed changes and the required tests might not get generated.
Alternatively, should we always compare against upstream/master assuming it is almost always updated?
This way, we avoid the branch locality of 'upstream' configuration changing and we can set our local configuration to upstream/master which will be independent of the branch which we are working on.
Please correct me if I'm mistaken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is the case, a developer could forget to generate tests and push their changes upstream. The diffs wouldn't reflect the committed changes and the required tests might not get generated.
upstream should be kubeflow/manifests users should not be able to push changes to that commit because of an error.
Alternatively, should we always compare against upstream/master assuming it is almost always updated?
Isn't that what the code is doing? The code is using a diff whatever the user set as the upstream branch to determine what files are changed in the PR and if so generate the changed files.
My expectation is users set the upstream branch to kubeflow/manifests
; not their personal fork of the kubeflow/manifests.
See for example
https://hackernoon.com/sync-a-fork-from-upstream-repo-in-github-c2c29c8eca3b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Thanks.
hack/generate_tests.py
Outdated
changed_dirs = set() | ||
for top in os.listdir(root): | ||
if top.startswith("docs"): | ||
print("donotsubmit") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we log instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this. This is debug code I'll remove in my next commit.
side bar: I'm really hoping there is a github action that we can use to automatically catch donotsubmit in code.
hack/generate_tests.py
Outdated
test_target_name = test_target_name.replace("-", "") | ||
with open(test_path, "w") as test_file: | ||
subprocess.check_call(["./hack/gen-test-target.sh", full_dir, | ||
test_target_name], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not able to understand the use of this variable 'test_target_name'. Could you please explain?
gen-test-target
uses only 1 argument as of now, if I'm not wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this. Its a bug; will revert in my next commit.
Unrelated to this PR: Another thing which I'm considering is to change Should we interpret kustomization.yaml using this line? manifests/hack/gen-test-target.sh Line 46 in 67eabbf
An alternative would be to generate outputs from the existing kustomize folder and comparing newer changes to kustomize against the older outputs. Could you please explain if the current method has more benefits than the alternative suggested? |
@krishnadurai: GitHub didn't allow me to request PR reviews from the following users: jlewi. Note that only kubeflow members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@krishnadurai whatever you do; please; please no more bash :) The line of code manifests/hack/gen-test-target.sh Line 46 in 67eabbf
is pretty unreadable/impenetrable; @krishnadurai kudos to for figuring it out. See also #306 |
hack/generate_tests.py
Outdated
def remove_unmatched_tests(repo_root, package_dirs): | ||
"""Remove any tests that don't map to a kustomization.yaml file. | ||
|
||
This ensures tests don't linger if a pakage is deleted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pakage -> package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
logging.info("Ignoring %s", name) | ||
continue | ||
|
||
if not name in expected_tests: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we handle the case where a manifest has been deleted so the unit tests no longer apply?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kkasravi This code should ensure that if a/b/c/kustomization.yaml doesn't exist then the test a-b-c_test.go should not exist. So I believe that covers the use case you described or did I miss something?
* use git diff --name-only @{upstream} to compute the diff against the upstream branch. This should be better then the current branch which is making assumptions about the remote repo names. * Furthermore we need to make sure that when the base kustomization package changes that we also regenerate the tests for the overlay packages. * to support that we replace gen-test-targets.sh with a python script. * The bash scripts are pretty impenetrable; migrating to python should make the code easier to maintain. * The name of the go test files generated is slightly different from what the shell scripts were generating. * This is intended to make the naming more consistent * specifically a/b/c/kustimazation.yaml results in tests/a-b-c_test.go * It looks like the shell script was sometimes not including a in the name. * The python script also checks that for _test.go file there is a corresponding kustomization.yaml file; otherwise it will remove the test. This ensures if we move or remove a kustomize package we will end up removing the test. * Fix: kubeflow#509 * Update the pull request template with the command to only generate tests for changed files * Fix kubeflow#171 - gen-test-target.sh should function regardless of checked out name for the repository. We can use git to get the base directory and then do an appropriate string replace. * Need to update the test target generation to not assume the repository is named manifests * Update the github pull_request template to tell users to run `make generate-changed-only`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
# Generate a list of the files which have changed with respect to the upstream | ||
# branch | ||
modified_files = subprocess.check_output( | ||
["git", "diff", "--name-only", "@{upstream}"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Thanks.
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jlewi The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Fix computation of changed files in generate-changed-only rule
use git diff --name-only @{upstream} to compute the diff against the
upstream branch. This should be better then the current branch
which is making assumptions about the remote repo names.
Furthermore we need to make sure that when the base kustomization package
changes that we also regenerate the tests for the overlay packages.
to support that we replace gen-test-targets.sh with a python script.
The bash scripts are pretty impenetrable; migrating to python should
make the code easier to maintain.
The name of the go test files generated is slightly different from what
the shell scripts were generating.
This is intended to make the naming more consistent
specifically a/b/c/kustimazation.yaml results in
tests/a-b-c_test.go
It looks like the shell script was sometimes not including a in the name.
The python script also checks that for _test.go file there is a corresponding
kustomization.yaml file; otherwise it will remove the test. This
ensures if we move or remove a kustomize package we will end up
removing the test.
Fix: make generate - assumptions about remote names #509
Update the pull request template with the command to only generate
tests for changed files
Fix gen-test-targets should function regardless of checkout name #171 - gen-test-target.sh should function regardless
of checked out name for the repository. We can use git to get the
base directory and then do an appropriate string replace.
Need to update the test target generation
to not assume the repository is named manifests
This change is