3.10+ support #285
Comments
I haven't discussed with the other maintainers yet, but here are some thoughts:
|
Oops, looks like I mis-read the PEP. It seems the old parser will be removed in 3.10 and non-LL(1) constructs may be added then. So we will need to address this in order to support Python 3.10. |
Do you know a good PEG parser for Python that also constructs a CST? I'm not very familiar with PEG :/ |
No, I doubt any such thing exists. We, or someone, would have to build it. I just sent a message to the python-dev thread about PEP 617 inquiring about the potential for reusing some of pegen (the new parser in PEP 617) for this: https://mail.python.org/archives/list/python-dev@python.org/thread/HOZ2RI3FXUEMAT4XAX4UHFN4PKG5J5GR/ |
It seems like one option would be to use https://github.com/gvanrossum/pegen as a starting point for a new Python 3.10 compatible parser. |
Yes, that would be a great idea. If there is anything I can help you folks with, just let me know. |
I'm thinking about augmenting the stdlib ast module to have some lib2to3 features (because lib2to3 is going away). If this is of interest to you: |
This issue is a blocker for us:
Are there any updates? Is there anything I can help with? |
No updates yet apart from some of us hacking around in our spare time to get LibCST onto a PEG parser. I'll update here as soon as work starts on this in earnest. I'm hoping for good news in a month or so. |
Good news! :) I'm working on this as part of my main job now. The current status is: in my branch there's a Rust implementation of a PEG parser for a small part of the Python grammar, but complete with whitespace handling (thanks to the awesome work on whitespace parsing and tokenization by @bgw). It can successfully roundtrip simple functions like this without any extra string allocations (the built CST shares memory with the input source string), and it seems to be real fast (it will be interesting to see if this holds as I add more and more of the grammar). I've opted for implementing the heavy lifting in Rust, and then exposing the resulting CST as Python objects as part of a relatively high level interface that will be compatible with https://github.com/Instagram/LibCST/blob/master/libcst/_parser/entrypoints.py. There's still lots of work to do, for example:
I'll keep updating this issue as I go along |
Here's the current status of the rewrite: I think I have 99% of the Python 3.8 grammar implemented, and the parser can roundtrip (parse -> serialize losslessly; i.e. input bytes are the same as output bytes) all of the LibCST Python implementation (i.e. this repo). This is a big milestone but there's still some work to be done. But first, some details in case you want to play around: The code still lives in https://github.com/zsol/LibCST/tree/parser/native, you should be able to
Here's what I know still needs to be done:
And then on top of this it'd be nice to look into:
|
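The roundtrip property mentioned above (parse, then serialize, and get the input back unchanged) can be sketched as a small check. This is only an illustration, not the test harness used in the branch: the `roundtrips` helper is a hypothetical name, the stdlib `ast` comparison assumes Python 3.9+ for `ast.unparse`, and the LibCST part only runs if `libcst` happens to be installed.

```python
import ast

try:
    import libcst  # optional third-party dependency; skipped if missing
except ImportError:
    libcst = None


def roundtrips(source, parse, serialize):
    """Return True if serialize(parse(source)) reproduces source exactly."""
    return serialize(parse(source)) == source


source = "x = (1 +  2)  # keep me\n"

# The stdlib AST drops comments, parentheses and original spacing,
# so unparsing it does not give back the original text.
ast_ok = roundtrips(source, ast.parse, lambda tree: ast.unparse(tree) + "\n")

# A CST keeps every character, so the same check is expected to pass.
cst_ok = None
if libcst is not None:
    cst_ok = roundtrips(source, libcst.parse_module, lambda module: module.code)

print("ast roundtrip:", ast_ok)
print("libcst roundtrip:", cst_ok)
```

Running the same helper over every file in a repository is one plausible way to get a "roundtrips the whole codebase" signal like the one described in the comment.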
I've set up an example using GitHub Actions here: lpetre@1089e28 Sample run: https://github.com/lpetre/LibCST/runs/3293881094 |
Hi, what is the advice for libraries that depend on this project? Can we expect 3.10 support in the near future or should people start looking for alternatives? It seems like Black has solved this with their lib2to3 fork. |
I plan to release a version with opt-in 3.10 support by the end of the year. If that goes well, a new release in January will have 3.10 support by default. |
Progress update: I have a working CI job that produces binary wheels for LibCST with the new Rust-based parser for any combination of Python (3.6, 3.7, 3.8, 3.9, 3.10) & platform (macOS x86_64, macOS arm64, 32-bit Linux, 64-bit Linux, 32-bit Windows, 64-bit Windows). Example artifacts at https://github.com/zsol/LibCST/suites/4682738347/artifacts/127571774 (it's a zipfile with wheels in it).
To see the new parser in action. Note: match statement is not implemented yet in these. |
#566 proposes to merge the Rust-based parser. After merging I'll follow up with (much simpler) PRs to implement match statement, and parenthesized context managers. |
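For readers wondering why the match statement needs the new parser at all: `match` is a soft keyword (PEP 634), so the same token can start either an ordinary assignment or a match statement. The sketch below uses only the stdlib `ast` module to show both readings; the `parses` helper and the sample snippets are made up for illustration, and the second snippet is only expected to parse on Python 3.10 or newer.

```python
import ast
import sys


def parses(source):
    """Return True if the running interpreter accepts source as valid syntax."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# "match" used as a plain identifier: valid on every Python 3 version.
as_identifier = "match = [1, 2]\nprint(match)\n"

# "match" used as a statement: valid only from Python 3.10 on.
as_statement = "match point:\n    case (0, 0):\n        origin = True\n"

identifier_ok = parses(as_identifier)
statement_ok = parses(as_statement)

print(sys.version_info[:2], identifier_ok, statement_ok)
```

Because both snippets begin with the same token, a parser has to look further ahead before deciding which rule applies, which is the kind of construct the LL(1) restriction ruled out.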
Seems like a README update is in order! 🎉 |
Note that for 3.10 (and 3.11) support, you still need to set |
#929 is landing in a moment, after which I'll release it with a major version bump 🎉 |
PEP 617 is out (if it is accepted, and it probably will be), and I'm wondering whether LibCST's underlying parser is capable of parsing a PEG grammar. With 3.10, the LL(1) restriction on the grammar will be dropped, which means that lib2to3.pgen2 won't be able to parse new changes to the Python grammar. I'm not sure about the internals of LibCST, but from what I have seen in the README, it uses something based on lib2to3.pgen2. Will LibCST continue to support newer Python versions and their grammar? (We are currently using lib2to3 as our refactoring tool in unimport, but we might need to migrate to another tool to support 3.10+, which is why I am asking.)