Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: write file using existing newlines (fixes #115) #117

Merged
merged 2 commits into from
Jun 18, 2022

Conversation

rasa
Copy link

@rasa rasa commented Jun 13, 2022

Here's my proposed solution. It should work in all cases, as it will default to the system default, if the file has different line endings (file.newlines returns a tuple).

I manually tested it with Unix, Dos, and Mac formatted files, and they all retained their line endings.

Fixes #115.

Copy link
Owner

@DanielNoord DanielNoord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this fully solves the issue. If a repo contains files that are explicitly kept with CRLF and the tool is run on Linux in a CI it would still try to convert them based on the os newline.

Could you also add a test for this? I think you can copy one of the normal functional test but only need to make sure the newline is not LF. That should trigger a failure on the Windows or Linux tests.

@rasa
Copy link
Author

rasa commented Jun 13, 2022

Yes, I will add a test. Do you want me to add MacOS to the test matrix? A simple example is at
https://github.com/psf/black/blob/6140f04fe17ae7c542a29c207c9a45f6200f4e2c/.github/workflows/test.yml
which has evolved into
https://github.com/psf/black/blob/HEAD/.github/workflows/test.yml

@DanielNoord
Copy link
Owner

Yes, I will add a test. Do you want me to add MacOS to the test matrix? A simple example is at
https://github.com/psf/black/blob/6140f04fe17ae7c542a29c207c9a45f6200f4e2c/.github/workflows/test.yml
which has evolved into
https://github.com/psf/black/blob/HEAD/.github/workflows/test.yml

I think for this particular use case testing against Windows and Linux is enough, as they have different standard line endings and thus test the behaviour.

At some point I'll probably add macOS, but that's likely a task for another PR!

@codecov
Copy link

codecov bot commented Jun 13, 2022

Codecov Report

Merging #117 (acc1440) into main (77b36a2) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##              main      #117   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           20        20           
  Lines          471       475    +4     
=========================================
+ Hits           471       475    +4     
Impacted Files Coverage Δ
pydocstringformatter/run.py 100.00% <100.00%> (ø)

@rasa
Copy link
Author

rasa commented Jun 13, 2022

So this pre-commit hook is futzing my test files. I had committed these files in tests/data/newlines with the following newline formats:

dos.py       : dos
mac.py       : mac
mixed.py     : dos/unix
none.py      : none
unix.py      : unix

but the pre-commit hook rewrote them using these line endings:

dos.py       : dos
mac.py       : mac/unix
mixed.py     : dos/unix
none.py      : unix
unix.py      : unix

All the tests will still succeed, but this breaks the tests, imo. Should I dynamically write these files out in the test's setup?

.gitignore Outdated Show resolved Hide resolved
@@ -0,0 +1,6 @@
var = """A multi-line
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually thinking of just copying one of the existing tests with an .out file and adding some of the new lines in both the input and output files.
That would remove the need for the extra test file.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me better explain myself.

By using the existing testing framework of input and .out files with a docstring that requires minor formatting (for example moving the quotes of a multi-line docstring to a new line we can fully check if we are inserting all new lines of the correct type.
Everything under https://github.com/DanielNoord/pydocstringformatter/tree/main/tests/data/format is ran automatically.
So tests/data/format/newline/... would be a good place to put these.

Then copying and combining https://github.com/DanielNoord/pydocstringformatter/blob/main/tests/data/format/multi_line/module_docstring.py and https://github.com/DanielNoord/pydocstringformatter/blob/main/tests/data/format/ending_quote/module_docstring.py but changing the newlines would be a good fit as it is a very simple but clear change which would test all forms of new line insertions/removal.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes! That sounds good, but perhaps you could implement? I'm a little lost, as I've only just started with this code base.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be okay for you if I set up the files like you currently did but then not do the line endings? I must confess I have no idea how to replicate those on my own Mac 😅 😄

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rasa I have added the pre-commit ignore and added the test files like you had them previously.

The only thing necessary for these tests to make sense is to add the new lines like you explained in #117 (comment).

@DanielNoord
Copy link
Owner

So this pre-commit hook is futzing my test files. I had committed these files in tests/data/newlines with the following newline formats:

dos.py       : dos
mac.py       : mac
mixed.py     : dos/unix
none.py      : none
unix.py      : unix

but the pre-commit hook rewrote them using these line endings:

dos.py       : dos
mac.py       : mac/unix
mixed.py     : dos/unix
none.py      : unix
unix.py      : unix

All the tests will still succeed, but this breaks the tests, imo. Should I dynamically write these files out in the test's setup?

You can add the directory with the tests for this specific issue into the ignore of the pre-commit hook.

exclude: "tests/data/format/whitespace_stripper|tests/data/format/quotes_type|tests/test_config.py"

It's a regular expresion so adding |dir/name/file/ should work!

Copy link
Author

@rasa rasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DanielNoord Are you requesting this section be changed? All the tests pass, so I'm not sure what changes are needed.

@DanielNoord
Copy link
Owner

@DanielNoord Are you requesting this section be changed? All the tests pass, so I'm not sure what changes are needed.

Sorry, this is probably turning out more difficult than you had imagined.

I have pushed a commit to your branch which includes correct tests for all 5 cases. You don't need to make any changes there. However, the mixed test case doesn't pass as the current solution is unable to determine the right newline. This means we probably need to rethink the solution.

@rasa
Copy link
Author

rasa commented Jun 14, 2022

However, the mixed test case doesn't pass as the current solution is unable to determine the right newline. This means we probably need to rethink the solution.

Thanks for the help. Your solution is much simpler.

Re the mixed use case, files with non-standard newlines (\n\r, \r\n\n, etc.) are either text files that got corrupted in some way (such as older versions of Perforce) or they're not UTF-8 text files at all, but some other encoding, or even binary files. So let's just delete the use case.

Of course, a better solution would be to completely ignore processing these files. That way, non-UTF-8 files would never get corrupted. Thoughts?

@rasa
Copy link
Author

rasa commented Jun 14, 2022

Actually, I don't think this test framework is going to work for the "none" case, as the system will write the file using the system default newline character, so on Linux, \n, and on Windows, \r\n, and none.py.out has \ns hardcoded.

@rasa
Copy link
Author

rasa commented Jun 14, 2022

Ok, it's a little hacky, but all tests pass on Windows now. I think we can put this to bed. Agreed?

@DanielNoord
Copy link
Owner

Ok, it's a little hacky, but all tests pass on Windows now. I think we can put this to bed. Agreed?

Hmm, yeah I'm not sure about the mixed and none examples. I think you might have been right before: they are so specific (and broken) that perhaps we shouldn't support them. Handling the different platforms by itself is already good enough for most users I think?

@rasa
Copy link
Author

rasa commented Jun 15, 2022

Handling the different platforms by itself is already good enough for most users I think?

I'd leave the none/mixed tests in, even though they are edge cases, because you can never have too many tests (at least those that run instantaneously), and they do work. If we wanted, we could even add more extreme edge cases, such as utf8-bom, utf16, and non-text files with \rs and \ns in them.

Co-authored-by: Ross Smith II <ross@smithii.com>
@DanielNoord
Copy link
Owner

Thanks a lot @rasa!

I cleaned up the commit history and made some final changes. I'm going to make an immediate patch release with this included :)

@DanielNoord DanielNoord merged commit 18b3a22 into DanielNoord:main Jun 18, 2022
@DanielNoord DanielNoord added the bug Something isn't working label Jun 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

On Windows, spurious ^M characters are added to Unix formatted files
2 participants