Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check fails after Apply using PaddedCell #105

Closed
fvgh opened this issue Apr 29, 2017 · 13 comments
Closed

Check fails after Apply using PaddedCell #105

fvgh opened this issue Apr 29, 2017 · 13 comments
Labels

Comments

@fvgh
Copy link
Member

fvgh commented Apr 29, 2017

When providing the Groovy support (#13), I found that the grecipse formatter formats certain code with an ambiguity.
We had a misunderstanding about the purpose of paddedCell.
Due to the description and the corresponding review comment, I got the impression that in case paddedCell is set, spotlessCheck should only log a warning.
By adapting the existing unit test I am able to reproduce the problem.

@fvgh
Copy link
Member Author

fvgh commented Apr 29, 2017

I actually use the greclipse formatter in Eclipse as well as with spotless. The greclipse formatter has a slight cycle problem when it comes to closures, but works fine otherwise. So in the editor I basically always selecting one of the cycle solutions I consider best. I was never happy with the custom rule I introduced for #13 to overcome the apply/check problem I had with paddedCell. But I am also not fond of the strict spotless approach to select the canonical minimum from the cycle solutions offered by the formatter. So I took this opportunity to propose a way forward which solves both of my problems:

  1. The spotless groovy step check fails after I did an apply though I am using paddedCell
  2. The paddedCell forces me to use a particular solution I am not always pleased with

@fvgh
Copy link
Member Author

fvgh commented Apr 29, 2017

Let me explain my proposed fix

Let's assume we have a cycle behavior of the formatter like this one Aa, Bb, Aa, Bb
Currently spotless decides that I should use Aa . Maybe I am not the user who maintained the build script, so I am not aware of the cycle problems and what paddedCell is all about. So when I use the editor, I decide to use Bb and will be surprised that I find spotless to use Aa instead.

Hence I thought that paddedCell should better check whether the original file already matches one of the formatter cycle solutions. That approach solves automatically the check/apply problem which was the first reason for me to open this issue.

But this leads to the problem currently solved by the canonical approach.
Imagine I have already adapted the sequence Aa to Bb, and no add a further invalid sequence, which would look like BbBB. The valid sequence should be BbBb, so spotlessApply has to correct my mistake. Currently it would change my file to AaAa. To overcome the problem I select form all cycle solutions proposed by the formatter the one with the minimum edit distance.

To avoid external dependencies of the lib, I provided an implementation of the O(NP) algorithm.

@fvgh
Copy link
Member Author

fvgh commented Apr 29, 2017

I am aware that my current solution changes the behavior and has furthermore a performance impact.
The performance impact should be alright, since it only affect files where incorrect (not compliant to format) modifications have been made and if we have detected cycle problems.
So if you principally agree to my proposal, let me know what you propose to do about the behavior change. Currently I just left the old two methods and marked them as deprecated, but they are not used anymore by paddedCell. It should be easy to let both solutions coexist and choose by paddedCell configuration between them.

My proposal is basically driven by my experience using the GrEclipse formatter side by side with the IDE and spotless. So in my use-case I think it gives you a much better feeling, since you practically can be unaware of paddedCell. Let me know if there are other use-case wher the propose approach would fail.

@nedtwigg
Copy link
Member

Check fails after Apply using PaddedCell

So you have paddedCell(), and spotlessCheck is failing after you have called spotlessApply? That is definitely a bug - it should not fail. I thought we had integration tests for this.

I select from all cycle solutions ... the one with the minimum edit distance.

Given the problem of a cycle, I think there are two reasonable solutions. One is define a repeatable canonical resolution (current behavior), and another is to find the closest match to the current text using edit distance.

Let's go back to your Aa Bb cycle. My preference is Aa. Your preference is Bb. We keep getting merge conflicts caused by simple formatting disagreements. In the edit-distance case, we continue to get merge conflicts. In the consistent-canonical case, the merge conflicts are gone forever.

The point of Spotless is that formatting doesn't matter, so we should have an automatic canonical authority. The current behavior is that Spotless remains an automatic canonical authority, even in the face of a bug in the formatter. The edit distance behavior means that Spotless will give up on being an automatic canonical authority, relying instead on the current user's preferences, as distinct from the team's preferences.

In the case that Spotless is picking "the wrong" element in the cycle, you can do a quick patch with a "replaceRegex" rule. Older versions of the eclipse formatter did a bad job with Java 8 lambdas, so we had to use regexes to fix it.

Possible resolution

Right now paddedCell is either off, in which case a bad formatter can cause spotlessCheck to fail right after spotlessApply, or it is on, in which case an arbitrary solution is chosen.

It could instead be an enum: OFF, CANONICAL, NEAREST

It would be important for this change to be backward-compatible, and there's a fair amount of docs to update. If you really want NEAREST, I'm happy to merge it. But I think OFF/CANONICAL in conjunction with regex fixes should remain the recommended fixes.

@fvgh
Copy link
Member Author

fvgh commented Apr 30, 2017

Thanks @nedtwigg for the quick response. I think the regexes to fix it come close to what we have currently in the spotlessSelf.gradle and at least for my use case I think the NEAREST approach makes life easier.
The remaining code change for the enum is quickly done. I of-course will provide the "draft" documentation, but I am afraid I need your help/corrections.

So I will do the following:

  1. For spotlessSelf.gradle we use CANONICAL with greclipse, since this is the recommended solution, and remove the custom rule I just introduced it as a work-around for the described apply/check bug.
  2. I will add the changes for enum solution you proposed as a new commit on the current branch, and later on you can do a squash merge. Not sure about your work-flow is in that respect. Or shall I do some cleanup on the current branch and leave the separate commits for prove-bug/fix-bug/apply-fix-in-spotlessSelf.gradle?

@fvgh
Copy link
Member Author

fvgh commented Apr 30, 2017

Just to ensure that we do not misunderstand each other.

  • OFF: Means that an error should be raised in case the formatter does not CONVERGE shouldn't it? Maybe STRICT would be a better name for this ambiguity handling, wouldn't it, since it does not really allow the formatter to have a "real" ambiguity.
  • CANONICAL, NEAREST will be the option solve the ambiguity in case a CYCLE occurs. Further more for both configuration it is accepted in case the formatter solution does DIVERGE.

@nedtwigg
Copy link
Member

nedtwigg commented Apr 30, 2017

Before we add any new features, I want to make sure we've fixed any bugs that we have. I still don't understand if we have one or not.

I got the impression that in case paddedCell is set, spotlessCheck should only log a warning.

That is correct, that is the intended behavior, and I think it's the actual behavior too. We have a unit test that ensures that this is the case. Unless the test is broken in some subtle way...

So you have paddedCell(), and spotlessCheck is failing after you have called spotlessApply?

Is this happening? Can we reproduce it with a testcase? If the feature we have is broken, we need to fix that first before we add new things.

for my use case I think the NEAREST approach makes life easier

I want to understand why NEAREST is easier than STRICT + regex. The regex fix seems very easy, and it keeps Spotless as a source of canonical truth. NEAREST will inevitably lead to formatter-preventable merge conflicts. If we find a usecase that is hard to fix with regex, then NEAREST might make sense. If our only usecase is fixed with a trivial regex, I'm not convinced that this feature is worth the combined cost of maintenance + loss of repeatability.

@fvgh
Copy link
Member Author

fvgh commented May 1, 2017

Seems we have still quite some misunderstandings.

Prove of bug

We have a unit test that ensures that this is the case.

As I stated initially:

By adapting the existing unit test I am able to reproduce the problem.

Do you agree that my extensions to the unit tests should pass with the existing implementation? They do not, that's why I provided them as a initial commit, to prove the problem.

Detection of bug

So you have paddedCell(), and spotlessCheck is failing after you have called spotlessApply?

We had both a Travis failure for PR 94. You added paddedCell initially, and thought the failure might have something to do with Travis, and I had then another failure when trying to squash the commits. At that point I realized that you added paddedCell to the groovy rules and tried to learn what it does for me.
But I found when I issue spotlessCheck --rerun-tasks or build --rerun-tasks that I still end up with failures due to the spotless checks.
As a work-around, I replaced the paddedCell by a custom rule. That's what finally got committed.

Can we reproduce it with a testcase?

As I stated above, I added a basic prove. But for the final check I extended also a for-loop. This basically tests that --rerun-tasks will now work.

Proposal

If the feature we have is broken, we need to fix that first before we add new things.

Yes I agree. My first proposal was more a result of a learning curve I went through myself when looking at the code and the problems I had when using GrEclipse in the IDE in parallel with spotless. Furthermore I did not fully realize why the CANONICAL approach had been taken, but I think I understand your explanations. But I am also not sure that we are talking about the same thing when we speak about OFF / STRICT.

Let me:

  1. Replace the current branch with one that only fixes and tests the bug
  2. Make a proposal where I introduce STRICT, which should ensure that we really got a common understanding about the work-flow (when somebody adds custom rules and when paddedCell becomes obsolete due to custom rules, ...).
  3. After that I will try again to convince you to take NEAREST on board 😄

Sounds good?

@nedtwigg nedtwigg added the bug label May 2, 2017
@nedtwigg
Copy link
Member

nedtwigg commented May 2, 2017

Very sorry, I missed that link! You've had the whole thing figured out for a while now, and I've been stumbling in the dark :)

Great fix to the test! That's a subtle bug - I was so sure it was being tested that I was blind to the bug! I pushed up a quick fix on top of your test:

#106

I will try again to convince you to take NEAREST on board

My ears are open - I'm most convinced by an example. Here we have an example where the "minimum" isn't the best resolution, but it's easy to fix with a regex.

Right now, the model is very simple. Steps are supposed to be idempotent. If they aren't, turn on paddedCell and it makes the idempotent.

What's an example that justifies breaking this simplicity?

@fvgh
Copy link
Member Author

fvgh commented May 3, 2017

Where I thought the CANONICAL would not cope with the GrEclipse cycles is where I needed a line break (java-publish.gradle in my initial example). But testing it again, also in Eclipse, I found that the GrEclipse actually respects in that scenario a line break inserted by the user. So the only remaining flaws in GrEclipse I am currently aware of are the places where it inserts/removes in a cycle line breaks (like build.gradle in my initial example). In theses cases the minimum lines are the desired solution, hence the CANONICAL approach is of course working well.
But I think we agree that insertion of indentation and line break with custom rules is difficult when you have no access to the abstract syntax tree, don't we?
So I take your point that we should not increase the complexity unless there is a current demand.

Last thing that stroke me as odd when I tried to understand the usage of paddedCell.
If the user has a formatter which has various CYCLE problems, the user can eventually fix them with custom rules.
PaddedCellGradle then reminds the user that he can remove paddedCell again.
For CONVERGE problems this reminder will also appear, wouldn't it? But since the users' formatter has still the problem, just paddedCell fixed it for the user on the current files, paddedCell is still needed, isn't it?
Don't get me wrong, this is nothing where I would see need for a change, but just form my understanding how paddedCell shall be used.

Sorry for the time you spend on this issue. I should have provided a fix first and then the new proposal (in two separate issues 😞 ).

I had no time for spotless lately and was worried that somebody might have issues with the GrEclipse instabilities, but since #106 now fixes paddedCell for all GrEclipse problems I am aware of, I am happy to consider this issue as closed and remove branch padded_cell_check_fails_after_apply.

If you agree, I will provide on a different branch modified files for spotless gradle, where I replace the groovy custom rule by paddedCell, together with a documentation change where I will recommend the usage of paddedCell with GrEclipse.

@nedtwigg
Copy link
Member

nedtwigg commented May 3, 2017

Sorry for the time you spend on this issue.

I'm sorry for the time you spent! The root cause here was I muffed a test way-back-when and there's been an underlying uncaught bug, and then when you found it, I didn't read carefully enough to realize you had isolated it so cleanly.

the user can ... fix them with custom rules ... then reminds the user that he can remove paddedCell again

When there's a formatter bug, give the user tools to fix and/or workaround. Once they have fixed it, let them know they can make the build fast again if they'd like.

For CONVERGE, it turns out that paddedCell is always "on":

// f(f(input) == f(input) for an idempotent function
if (!onceMore.equals(unixResultIfDirty)) {
// it's not idempotent. but, if it converges, then it's likely a glitch that won't reoccur,
// so there's no need to make a bunch of noise for the user
PaddedCell result = PaddedCell.check(formatter, file, onceMore);
if (result.type() == PaddedCell.Type.CONVERGE) {
String finalResult = formatter.computeLineEndings(result.canonical(), file);
Files.write(file.toPath(), finalResult.getBytes(formatter.getEncoding()), StandardOpenOption.TRUNCATE_EXISTING);
} else {

The reason is to fix the exact scenario you're talking about. Users will only get the "turn PaddedCell on" mode when there's a real problem - CYCLE or DIVERGE. Then we can pass the buck and show the user that it's the rule's fault, and even save the day by offering some workaround.

I will provide on a different branch modified files for spotless gradle ... documentation change

You know greclipse & spotless better than anyone on the whole planet - whatever you think sounds good to me 👍 I'm gonna merge and release our bugfix. This issue is closed to my satisfaction, I'll let you close it when/if you agree :)

@nedtwigg
Copy link
Member

nedtwigg commented May 4, 2017

Published in libs 1.3.2 and plugin-gradle 3.3.2

@fvgh
Copy link
Member Author

fvgh commented May 4, 2017

Thanks for your time and explanation. #106 fixes the current problems for me. If I find problems with the GrEclipse, where I think it's worth using the minimum edit distance instead of canonical, I am afraid I will reissue the subject 😄

@fvgh fvgh closed this as completed May 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants