Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug where dirty submodules broke hash generation #711

Merged

Conversation

mattkelly
Copy link

This is a proposal for a fix for #685. Please let me know what you think of the approach! I'm new to the Skaffold code, so hopefully I'm not misunderstanding how this should work.

Regarding splitting the status lines: ?? isn't the only special case (two characters). MM is also a valid status, for example. If my understanding of the purpose of that code was correct, then my new implementation should be more robust. Ideally I would have split this out as a separate commit, but it's pretty tangled in there.


Instead of appending the entire contents of modified files to the sha256
input, we should only append the diff as computed by git diff in order
to handle all modifications properly, including submodules.

This also fixes a bug where the processing of the git status output was
too fragile. It makes it more generic by splitting the status string on
fields instead.

Signed-off-by: Matt Kelly matt.kelly@containership.io

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

@mattkelly mattkelly force-pushed the bugfix/dirty-submodule branch from 8517821 to 35375dc Compare June 20, 2018 01:40
@r2d4 r2d4 added the kokoro:run runs the kokoro jobs on a PR label Jun 20, 2018
@kokoro-team kokoro-team removed the kokoro:run runs the kokoro jobs on a PR label Jun 20, 2018
@mattkelly
Copy link
Author

mattkelly commented Jun 20, 2018

Whoops, totally missed that there are tests that need fixing now.

Along those lines, this approach needs some tweaking if it's going to have reproducible results. This is because the default git diff output includes some index-dependent headers. For example:

diff --git a/pkg/skaffold/build/tag/git_commit.go b/pkg/skaffold/build/tag/git_commit.go
index ad5d08c..4bbed3f 100644 // this is a problem
--- a/pkg/skaffold/build/tag/git_commit.go
+++ b/pkg/skaffold/build/tag/git_commit.go
<actual diff contents follow>

We should really only hash the diff content itself. That still might be a problem for git diff on submodules, though, because the diff content looks like this:

-Subproject commit e82a80da031152a55fed6fec2697c6ab2961f27b
+Subproject commit b75d42f81c22009fb9194592c87e6c606fb7a7c6

Before I go off and change the diff command and fix the tests (for non-submodules at least), what do people think of this approach in general given these considerations?

@dlorenc
Copy link
Contributor

dlorenc commented Jun 22, 2018

This makes sense overall to me. Will you be able to sign the CLA?

@mattkelly
Copy link
Author

@dlorenc yup, sorry - I started getting that sorted out but now I'm traveling for a few days. I'll have the CLA signed and the tests fixed by Monday. Thanks!

@mattkelly mattkelly force-pushed the bugfix/dirty-submodule branch from 35375dc to 2a86e0c Compare June 24, 2018 19:45
@googlebot
Copy link

CLAs look good, thanks!

@mattkelly mattkelly force-pushed the bugfix/dirty-submodule branch from 2a86e0c to cd11ef6 Compare June 24, 2018 19:47
Instead of appending the entire contents of modified files to the sha256
input, we should only append the diff as computed by `git diff` in order
to handle all modifications properly, including submodules.

Signed-off-by: Matt Kelly <matt.kelly@containership.io>
@mattkelly mattkelly force-pushed the bugfix/dirty-submodule branch from cd11ef6 to f06a691 Compare June 24, 2018 19:57
@mattkelly
Copy link
Author

mattkelly commented Jun 24, 2018

Fixed tests and rebased / fixed conflicts to incorporate b8cfc37#diff-7b47f8007b648de24c6b5c948a869dd6R171 properly. Two notes:

  1. I decided to leave the git command git diff as it is. Now that I understand how the tests work (and commit hash generation) at a deeper level, I realized that having specific hashes in the diff is fine.

  2. I removed the change to use strings.Fields() to separate the status code from the file path. The reason is that the XY form of git status codes can actually start with a blank (for example M). strings.Fields() would miss the blank in this case. That said, I do still think that the existing implementation is fragile/broken because there are many possible two-character status codes - none of which are checked for except for ??. A bigger problem is that if we change the code to properly support XY codes, then we break a lot of unit tests because go-git (AFAICT) only has support for single character status codes. So in general I think that this should be solved under a new issue because it's a nontrivial change.

@r2d4 r2d4 added the kokoro:run runs the kokoro jobs on a PR label Jun 24, 2018
@kokoro-team kokoro-team removed the kokoro:run runs the kokoro jobs on a PR label Jun 24, 2018
@dgageot
Copy link
Contributor

dgageot commented Jun 25, 2018

Thanks @mattkelly I'm going to merge your fix. However, in #714, I stop computing a sha256 of file changes because of the reasons you mention. Instead I use the imageID when the repo is dirty.

@dgageot dgageot merged commit be7329c into GoogleContainerTools:master Jun 25, 2018
@bhack bhack mentioned this pull request Jun 25, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants