Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid deprecation warning from Tomli #10238

Merged
merged 1 commit into from
Jan 26, 2022
Merged

Avoid deprecation warning from Tomli #10238

merged 1 commit into from
Jan 26, 2022

Conversation

hukkin
Copy link
Contributor

@hukkin hukkin commented Jul 30, 2021

Universal newlines are not TOML compatible.

See e.g. toml-lang/toml#837 for context. EDIT: this is no longer relevant to this PR given the new title and current changes.

@uranusjr
Copy link
Member

So if I understand this correctly, this only affect string literals? If that’s the case, this can probably not too urgent and can wait until 21.3 (since long string literals are not at all common in pyproject.toml right now). Otherwise we should backport this to 21.2.

@hukkin
Copy link
Contributor Author

hukkin commented Jul 30, 2021

This mostly just makes the parser throw an exception if CR characters are used as line endings, anywhere in the document, as that is invalid TOML. I don't think this is in any way urgent at all but is more technically correct, 😄 FWIW.

@uranusjr
Copy link
Member

Oh, so I’m thinking bckwards; the current implementation silently coerce CR (due to universal newlines), which is technically incorrect? If that’s the case, maybe we should keep doing that since banning CR all the sudden can be too disruptive to Windows users.

@hukkin
Copy link
Contributor Author

hukkin commented Jul 30, 2021

Yeah current implementation converts CRs (and CRLFs) to LFs. The correct behavior is to error on bare CR, because it is not an allowed line ending in TOML.

If that’s the case, maybe we should keep doing that

That should also work. Note that I have deprecated text file objects as tomli.load input (in favor of binary file objects) because TOML spec is not nearly as liberal about encoding and newlines as Python text file objects. For 100% compliancy TOML requires UTF-8 and disabling universal newlines, which are incredibly easy to forget to configure. Thus, if you want to avoid a deprecation warning in tomli v1.2.0+ (while still allowing spec incompatible CR line endings) you should move from

with open(pyproject_toml, encoding="utf-8") as f:
    pp_toml = tomli.load(f)

to something like

with open(pyproject_toml, encoding="utf-8") as f:
    pp_toml = tomli.loads(f.read())

since banning CR all the sudden can be too disruptive to Windows users.

Banning CR should not disrupt Windows users (Windows uses CRLF which will still work fine). My understanding is that CR as line ending is used by very old Mac OSes (pre Mac OS X (released 2001)) and other exotic legacy systems.

@pfmoore
Copy link
Member

pfmoore commented Jul 30, 2021

So I'm confused. On Windows, text files (which TOML files need to be considered as, because a huge part of the reason for using TOML as the format for Python packaging is that it's human readable/editable) can use LF or CRLF line endings, as text editors typically produce either. And of course projects developed on Windows being built on Unix could have pyproject.toml with either line ending.

I'd say that pyproject.toml must allow LF and CRLF interchangeably as line endings on all platforms. If that means we're now technically using a variant of TOML, then I think we need to explicitly note that fact in PEP 518.

If the prohibition is just about bare CR, not CRs that are part of a CRLF pair, then the above is irrelevant and you can ignore me (except as a warning to be careful how you present this information, if you want to avoid scaring people!)

@hukkin
Copy link
Contributor Author

hukkin commented Jul 30, 2021

If the prohibition is just about bare CR, not CRs that are part of a CRLF pair, then the above is irrelevant and you can ignore me (except as a warning to be careful how you present this information, if you want to avoid scaring people!)

This is only about bare CR. Sorry, didn't mean to scare anyone 😄

@uranusjr uranusjr added this to the 21.3 milestone Jul 30, 2021
@@ -58,7 +58,7 @@ def load_pyproject_toml(
has_setup = os.path.isfile(setup_py)

if has_pyproject:
with open(pyproject_toml, encoding="utf-8") as f:
with open(pyproject_toml, "rb") as f:
Copy link
Member

@pradyunsg pradyunsg Jul 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid merging this without an OK from me -- I haven't had time to take a look at the issue on the TOML repository, but this is definitely not the direction to take IMO.

Comment on lines 61 to 62
with open(pyproject_toml, "rb") as f:
pp_toml = tomli.load(f)
Copy link
Contributor Author

@hukkin hukkin Aug 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
with open(pyproject_toml, "rb") as f:
pp_toml = tomli.load(f)
with open(pyproject_toml, encoding="utf-8") as f:
pp_toml = tomli.loads(f.read())

BTW, if the "rb" change is rejected, I think we should commit this change to keep existing behavior while avoiding deprecation warning from Tomli.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Pushed this change.

@pradyunsg pradyunsg removed this from the 21.3 milestone Oct 2, 2021
@pradyunsg
Copy link
Member

To communicate with sufficient clarity: I do not intend to block the 21.3 release on this PR, since this doesn't seem to be a show-stopper issue for the release. To that end, I've gone ahead and dropped this from the release milestone. :)

@hukkin hukkin changed the title Disable universal newlines when reading TOML Avoid deprecation warning from Tomli Dec 5, 2021
@hukkin
Copy link
Contributor Author

hukkin commented Dec 5, 2021

@pradyunsg this PR in its current state no longer needs to update vendored Tomli so I've reverted that change. Am I right that we can skip the news entry with this one?

Also, I changed the title of this PR to reflect current changes.

@pradyunsg pradyunsg added the skip news Does not need a NEWS file entry (eg: trivial changes) label Dec 5, 2021
@pradyunsg
Copy link
Member

Yea, this works! :)

@hukkin hukkin mentioned this pull request Jan 26, 2022
@pradyunsg pradyunsg merged commit bbcbfc6 into pypa:main Jan 26, 2022
@hukkin hukkin deleted the toml-universal-newlines branch January 26, 2022 21:13
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 11, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
skip news Does not need a NEWS file entry (eg: trivial changes)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants