-
Notifications
You must be signed in to change notification settings - Fork 402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Within-line edit highlighting is unpredictable #140
Comments
FWICT this only seems to happen for some inter-line changes. However I haven't figured out the pattern yet. Considering this line as the base:
changing it to:
Correctly highlights the diff: However changing the line to:
Fails: Related: #119 |
Hi @jakabk @0xC0FFEE thanks for this feedback. @0xC0FFEE as you've noted this is the same issue as #119. So, your example starts to be highlighted with Basically, this needs to be improved. The issue here is that in both examples (@jakabk's and @0xC0FFEE's) the changed section is a fairly large proportion of the line length, and delta is being (too) conservative and refusing to treat them as homologous lines. But that doesn't make much sense seeing as there is one removed line and one added line. It starts to make a bit more sense when there are, say, 1 removed line and 2 added lines, because then delta needs to make a decision about which of the 2 added lines is the replacement. If you have any thoughts about how we should go about improving this then please let me know! My initial thought is that the algorithm should perhaps be different in the two cases: (1) same number of red and green lines vs (2) different numbers of red and green lines. GitHub/GitLab do not output the within-line highlights at all in case (2), and delta is trying to go one better here, and I think needs to do some work to stop making a mess of the common case. |
Hi @dandavison, the approach to differentiate the two cases (1) and (2) seems to make sense to me. From my experience diff-so-fancy does a good job with (1). As I've never had tooling for (2), I currently have no educated thoughts on this, except that it sounds fun and I've seen diffs today, where it really helped 👍. My initial take was, that a generic distance is not enough and instead (or in addition) something like the longest common sub string could be utilized. IMHO most of the time we want to highlight within-line changes when they happen to be contiguous and this is what I would explore. However I'd have a look at diff-so-fancy (and possibly other tools as well e.g. vim, gitlab, ...) before solving this from scratch. This sounds like a really common problem to me and maybe even writing this in library form might be beneficial. |
Hi, @dandavison I'm joining to @0xC0FFEE I'd like to support the idea handling the two cases separated as well. In the first case (one + and one -) I'd like to see the change highlighted. The recommended switch: --max-line-distance, makes much more usable the diff highlighting but it's not perfect. In the second case, I find @0xC0FFEE idea good: using the longest common substring to choose the appropriate lines and showing the diff between the two chosen lines. |
Thanks @0xC0FFEE @jakabk @igordovgaluk. I definitely want to get an improved v2 within-line edit inference algorithm out. It would be really helpful if you could carry on submitting more examples that you encounter of bad (or good) within-line highlighting (as plain text, just as comments in this issue is fine, or as a PR against the Here's a counter-example: diff --git a/file b/file
index 1dc5384..fa4f6e9 100644
--- a/file
+++ b/file
@@ -1,3 +1,3 @@
-aaaa a aaa
-bbbb b bbb
-cccc c ccc
+bbbb ! bbb
+dddd d ddd
+cccc ! ccc Current delta does this one correctly (and this is what I was going for with the original algorithm): Do you see value in that sort of behavior? I don't think any of diff-so-fancy, GitLab, GitHub etc get that one right. If we switch to "naive line matching" when there are equal numbers of plus and minus lines (i.e. always assume that the i-th green evolved from the i-th red line), then we of course will "fail" on this example: I'm not saying that it's essential to do this example correctly: it's obviously important to do the common cases well, and I think the equal-line-numbers that you're highlighting are common cases. So, this needs more thought but off the top of my head here are some considerations:
|
I definitely think that this is can be valuable (even if it is still somewhat hard for me to comprehend and understand what changed). Also I guess you are right, as I've not seen cases like this handled before in other tools 👍. I think gathering (real world) examples were |
Here's something to play with if you can build from source: I've pushed 5162437 to master. This allows you to activate an experimental feature by setting an environment variable:
Here's what that does:
"distance" is a number between 0 (identical lines) and 1 (totally different lines). It is basically defined as
See edits.rs for exact calculation. So, for example, if the env var has a value > 0.32 then we get In practice I'm thinking that values somewhere above 0.6 will be appropriate but it would be great to hear what you find works with this version of the algorithm. |
Thanks for providing a usable version of your idea so promptly. I'll test-drive this next week and report interesting results back. Kudos! |
Just to be clear, I put this up simply because it was very easy to do from what we have already; there could well be merit in more involved changes. |
If anyone tries out This aspect of delta development (improving the within-line highlight algorithms and options) is one that could benefit a lot from having multiple people's thoughts (since we use different languages, work in code bases with different styles, etc). |
With PR #1244 this is now highlighted correctly without having to tweak |
I wonder if it would be practical to use the patience diff algorithm to pair up lines based on unique tokens within them and then use the current LCS calculation to highlight them once they're paired. That would avoid a quadratic search to pair the lines up while hopefully avoiding the suboptimal matches from the greedy matching that delta currently uses. |
Delta version: delta 0.1.1
I'm using mate-terminal (1.20.2) on Debian Buster.
The 24-bit color test seems to work, but the diff highlighting does not appear,
What do I wrong? Is it some extra config needed? I attach a screenshot about my terminal.
The text was updated successfully, but these errors were encountered: