Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in git_diff_find_similar. #5839

Merged
merged 2 commits into from
May 13, 2021
Merged

Conversation

staktrace
Copy link
Contributor

When a split src gets turned into a rename, it should
also lose the IS_RENAME_TARGET flag, so that it doesn't
get processed in a subsequent iteration as a rename target.
Doing so can cause an assertion failure because it no
longer has the SPLIT flag.

Fixes #5811.

@staktrace
Copy link
Contributor Author

Looking for feedback on this patch. I spent a long time trying to wrap my head around this find_similar code and I'm not really confident I understood it properly. But assuming my understanding is correct, this seems like the right fix. If so I can write up tests and so on, but I'd rather get the proper fix first. Thanks!

@ethomson
Copy link
Member

I spent a long time trying to wrap my head around this find_similar code and I'm not really confident I understood it properly.

Indeed! I didn't write this code and I haven't played with rename detection / rewrite splitting in a loooong time, so I'll need to get up to speed, too. Any chance that you have a minimum reproducible example that we could use for testing?

@staktrace
Copy link
Contributor Author

There's a fairly reduced example in #5811. I'm working on reducing it further to include as a testcase for this patch but I have other time commitments so progress is a bit slow. Hopefully I'll get it done this coming weekend (or sooner).

@staktrace
Copy link
Contributor Author

@ethomson Ok, I further reduced the testcase and added it to the clar suite in the second commit in this PR. Please take a look when you get a chance!

@staktrace
Copy link
Contributor Author

Please let me know if there's anything else I can do to help move this along!

When a split src gets turned into a rename, it should
also lose the IS_RENAME_TARGET flag, so that it doesn't
get processed in a subsequent iteration as a rename target.
Doing so can cause an assertion failure because it no
longer has the SPLIT flag.

Fixes libgit2#5811.
@staktrace staktrace changed the title [WIP] Fix bug in git_diff_find_similar. Fix bug in git_diff_find_similar. May 13, 2021
@staktrace
Copy link
Contributor Author

Rebased to main and dropped [wip] status. I have used this patch on a bunch of repos and it seems to be working well as far as I can tell.

@staktrace
Copy link
Contributor Author

@arrbee as it seems you have touched/written most of this file, could you do a review?

@arrbee
Copy link
Member

arrbee commented May 13, 2021

@arrbee as it seems you have touched/written most of this file, could you do a review?

I have not touched libgit2 code in about 6 years, so I feel like it would be inappropriate for me to approve this PR. It would be stepping on the toes of the active maintainers. That being said, I've read through the reported issue and traced through the code, at this does seem like a good fix to me. Once the file in question has been labeled as a successfully matched rename, it seems correct to clear the flag and remove it from the target list so as the iteration continues it is not reconsidered and does not cause the assertion to fail.

This isn't fresh enough in my mind to say if there is a different underlying issue where the assertion is actually incorrect and a file should be allowed to be the result of a rewrite of 2 sources (e.g. a hypothetical where two almost identical files are merged into a single resulting file that is labeled as the rewritten result of both), but I don't think that would be expected behavior in the first place. Therefore I expect this fix is good.

This is me trying to dredge up 6 year old recollection of the code, so I apologize that I'm not super confident in giving feedback here. I just felt I should comment in case it has any value to the active maintainers. 👋 😍

@staktrace
Copy link
Contributor Author

Thank you, that's all I was looking for - just some extra confidence that the patch seems reasonable. @ethomson I realize you're busy but if there's anything else I can do to get this merged please let me know. If not I'll probably have to end up maintaining a fork which is quite inconvenient just for a two line fix :/

@ethomson
Copy link
Member

Thanks for the fix and the very helpful test. This is great!

And thanks @arrbee, and welcome back. Any time you want to take back over as maintainer, we'd be happy to have you. 😁

@ethomson ethomson merged commit be95f68 into libgit2:main May 13, 2021
@staktrace
Copy link
Contributor Author

Great, thank you both!

@staktrace staktrace deleted the find_similar branch May 13, 2021 19:42
staktrace added a commit to staktrace/git2-rs that referenced this pull request May 15, 2021
This is specifically to pick up the fix
at libgit2/libgit2#5839
staktrace added a commit to staktrace/git2-rs that referenced this pull request May 18, 2021
This is specifically to pick up the fix
at libgit2/libgit2#5839
alexcrichton pushed a commit to rust-lang/git2-rs that referenced this pull request May 18, 2021
This is specifically to pick up the fix
at libgit2/libgit2#5839
@ethomson ethomson mentioned this pull request Jun 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reproducible "unrecoverable internal error: 'delta_is_split(tgt)'" from git_diff_find_similar
3 participants