-
Notifications
You must be signed in to change notification settings - Fork 195
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Implement a Python PEG parser in Rust (#566)
This massive PR implements an alternative Python parser that will allow LibCST to parse Python 3.10's new grammar features. The parser is implemented in Rust, but it's turned off by default through the `LIBCST_PARSER_TYPE` environment variable. Set it to `native` to enable. The PR also enables new CI steps that test just the Rust parser, as well as steps that produce binary wheels for a variety of CPython versions and platforms. Note: this PR aims to be roughly feature-equivalent to the main branch, so it doesn't include new 3.10 syntax features. That will be addressed as a follow-up PR. The new parser is implemented in the `native/` directory, and is organized into two rust crates: `libcst_derive` contains some macros to facilitate various features of CST nodes, and `libcst` contains the `parser` itself (including the Python grammar), a `tokenizer` implementation by @bgw, and a very basic representation of CST `nodes`. Parsing is done by 1. **tokenizing** the input utf-8 string (bytes are not supported at the Rust layer, they are converted to utf-8 strings by the python wrapper) 2. running the **PEG parser** on the tokenized input, which also captures certain anchor tokens in the resulting syntax tree 3. using the anchor tokens to **inflate** the syntax tree into a proper CST Co-authored-by: Benjamin Woodruff <github@benjam.info>
- Loading branch information
Showing
120 changed files
with
17,117 additions
and
477 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
[target.x86_64-apple-darwin] | ||
rustflags = [ | ||
"-C", "link-arg=-undefined", | ||
"-C", "link-arg=dynamic_lookup", | ||
] | ||
|
||
[target.aarch64-apple-darwin] | ||
rustflags = [ | ||
"-C", "link-arg=-undefined", | ||
"-C", "link-arg=dynamic_lookup", | ||
] |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,14 @@ | ||
root = true | ||
|
||
[*.{py,pyi,toml,md}] | ||
[*.{py,pyi,rs,toml,md}] | ||
charset = "utf-8" | ||
end_of_line = lf | ||
indent_size = 4 | ||
indent_style = space | ||
insert_final_newline = true | ||
trim_trailing_whitespace = true | ||
max_line_length = 88 | ||
|
||
[*.rs] | ||
# https://github.com/rust-dev-tools/fmt-rfcs/blob/master/guide/guide.md | ||
max_line_length = 100 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -106,6 +106,7 @@ exclude = | |
.pyre, | ||
__pycache__, | ||
.tox, | ||
native, | ||
|
||
max-complexity = 12 | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,7 @@ | ||
{ | ||
"exclude": [ | ||
".*\/native\/.*" | ||
], | ||
"source_directories": [ | ||
"." | ||
], | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,3 +16,4 @@ libcst/_version.py | |
.hypothesis/ | ||
.pyre_configuration | ||
.python-version | ||
target/ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,4 @@ | ||
include README.rst LICENSE CODE_OF_CONDUCT.md CONTRIBUTING.md requirements.txt requirements-dev.txt docs/source/*.rst libcst/py.typed | ||
|
||
include native/Cargo.toml | ||
recursive-include native * |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.