[python] The Python grammars in this repo are in a terrible state! #3539
Comments
The python/python3 grammar has a JavaScript port, but it is ripping slow in comparison with the CSharp and Java ports.
The JavaScript port is typically slower than CSharp by a factor of 5, not 100x. For example, abb:
The JavaScript port should not be this slow in comparison to the CSharp port. The parser traces of the closure computations are identical. I do notice several issues:
The perf data was mislabeled. Here's what it should be. (trperf needs to be fixed.)
What I'm thinking is that the parser reads too far ahead per
* Fixes for #3539 Typescript cannot work because the declaration of the constructor for CommonToken() is wrong.
* Fix python/python3/TypeScript
* Remove grammars that only add to confusion.
* Update Dart port, but this cannot work because Antlr 4.13.0 Dart runtime missing types.
* Add Dart port for python3 grammar. This base class for Dart works, but requires changes to Antlr. antlr/antlr4#4321
* Updates, but incomplete, for Cpp.
* Additional changes.
* Updates to get Cpp target to link.
* Changes for working Cpp target.
* Add in Cpp to workflow.
* Updates.
* Adjust to for rebuild.
* Remove bothersome tabs from source code.
* Rename tests into "small" and "large" tests to reflect that "slow parsers work on the small test suite".
* Fix names of test directories in desc.xml.
* Update for Go target. Remove python3-cpp.
* Getting Go target compiling--does not work yet.
* Fixes for Go target of python3 grammar. This port works but only if the Go runtime is fixed, and "Virt" is assigned in the driver. See antlr/antlr4#4343 antlr/antlr4#4342
* Fix desc.xml
I have two suggestions:
I'm going to close the issue I opened because the python grammars are actually in an "okay state", and they are getting better with @RobEin on top of it.
This cannot work given the constraints. The grammar has to be split because the lexer has modes. Antlr does not accept a combined grammar with lexer modes. Further, the lexer needs to be in "target agnostic format" because if it were not, there would be forking of the .g4s for every target type. This is exactly how we got into a bad state to start: people would modify one target and not maintain the other targets. With target agnostic, there is one grammar for all targets.
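To illustrate the split described above, here is a minimal sketch of a lexer grammar with a mode and a separate parser grammar that imports its tokens. All grammar, rule, and token names are hypothetical and are not taken from the python grammars in this repo; the base-class comment only indicates where target-specific code would usually live.

```antlr
// ExampleLexer.g4 (hypothetical): modes are only allowed in a lexer grammar.
lexer grammar ExampleLexer;

// Target-specific logic would normally live in a hand-written base class per
// target, referenced with: options { superClass = ExampleLexerBase; }
// It is left out here so the sketch has no external dependency.

OPEN_STRING : '"' -> pushMode(STRING_MODE) ;
NAME        : [a-zA-Z_]+ ;
WS          : [ \t\r\n]+ -> skip ;

mode STRING_MODE;
CLOSE_STRING : '"' -> popMode ;
TEXT         : ~["]+ ;
```

```antlr
// ExampleParser.g4 (hypothetical): the parser pulls token types from the lexer.
parser grammar ExampleParser;

options { tokenVocab = ExampleLexer; }

file_  : (NAME | string)* EOF ;
string : OPEN_STRING TEXT? CLOSE_STRING ;
```

Because neither file contains embedded target-language code, the same pair of .g4 files can be fed to the Antlr tool for any target, which is the point of keeping one grammar for all targets.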
That is the plan.
There are 10 python grammars in this repo. I don't know where to even begin on what to pick and maintain because they are all terrible.
The plan to reorganize