Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: update discheck #305

Merged
merged 2 commits into from
Jan 18, 2023
Merged

fix: update discheck #305

merged 2 commits into from
Jan 18, 2023

Conversation

Saransh-cpp
Copy link
Member

Description

Kindly take a look at CONTRIBUTING.md.

Please describe the purpose of this pull request. Reference and link to any relevant issues or pull requests.

Checklist

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't any other open Pull Requests for the required change?
  • Does your submission pass pre-commit? ($ pre-commit run --all-files or $ nox -s lint)
  • Does your submission pass tests? ($ pytest or $ nox -s tests)
  • Does the documentation build with your changes? ($ cd docs; make clean; make html or $ nox -s docs)
  • Does your submission pass the doctests? ($ xdoctest ./src/vector or $ nox -s doctests)

Before Merging

  • Summarize the commit messages into a brief review of the Pull request.

@codecov-commenter
Copy link

Codecov Report

Base: 83.73% // Head: 83.73% // No change to project coverage 👍

Coverage data is based on head (39f68cd) compared to base (182743d).
Patch has no changes to coverable lines.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #305   +/-   ##
=======================================
  Coverage   83.73%   83.73%           
=======================================
  Files          96       96           
  Lines       10682    10682           
=======================================
  Hits         8945     8945           
  Misses       1737     1737           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@jpivarski jpivarski linked an issue Jan 18, 2023 that may be closed by this pull request
@Saransh-cpp
Copy link
Member Author

Thank you so much, @jpivarski! :)

@Saransh-cpp Saransh-cpp merged commit 6a87f9f into scikit-hep:main Jan 18, 2023
@Saransh-cpp Saransh-cpp deleted the fix-dis branch January 18, 2023 17:22
@jpivarski
Copy link
Member

jpivarski commented Jan 18, 2023

Uncompyle6 just made a new release: 3.9.0 (December 22, 2022). The most recent version before that was more than a year ago: 3.8.0 (October 29, 2021).

This release contained two trivial changes to the way bytecode is presented (without any changes to the bytecode itself because it's all Python 3.8, anyway).

You found the first one: rocky/python-uncompyle6#381 changed the spelling of "ret_expr" to "return_expr".

The second one changed the structure of bytecode parsing trees, but only by removing a wrapping layer: "pos_arg" → "expr" → ... became just "expr" → .... The commit said that it's addressing PyPy support: rocky/python-uncompyle6@2ed211e but the change came in the non-PyPy else clause:

- # number of apply equiv arguments:
- nak = (len(opname_base) - len("CALL_METHOD")) // 3
- rule = (
-     "call ::= expr "
-     + ("pos_arg " * args_pos)
-     + ("kwarg " * args_kw)
-     + "expr " * nak
-     + opname
- )
+ if opname == "CALL_METHOD_KW":
+     args_kw = token.attr
+     rules_str = """
+          expr ::= call_kw_pypy37
+          pypy_kw_keys ::= LOAD_CONST
+     """
+     self.add_unique_doc_rules(rules_str, customize)
+     rule = (
+         "call_kw_pypy37 ::= expr "
+         + ("expr " * args_kw)
+         + " pypy_kw_keys "
+         + opname
+     )
+ else:
+     args_pos, args_kw = self.get_pos_kw(token)
+     # number of apply equiv arguments:
+     nak = (len(opname_base) - len("CALL_METHOD")) // 3
+     rule = (
+         "call ::= expr "
+         + ("expr " * args_pos)
+         + ("kwarg " * args_kw)
+         + "expr " * nak
+         + opname
+     )

(See that the ("pos_arg " * args_pos) has become ("expr " * args_pos).) This change was probably unintentional, but all that we care about is that we can distinguish between positional arguments and keyword arguments, and that distinction is preserved.

For a sample function (vector._compute.planar.add.xy_xy), the difference that this makes is

With uncompyle6 3.8.0:

call (7)
     0. expr
         L.  39         0  LOAD_GLOBAL              xy_xy
     1. pos_arg
        expr
                            2  LOAD_FAST                'lib'
     2. pos_arg
        expr
                            4  LOAD_FAST                'x1'
     3. pos_arg
        expr
                            6  LOAD_FAST                'y1'
     4. pos_arg
        expr
            call (5)
                 0. expr
                    attribute37 (2)
                         0. expr
                                            8  LOAD_GLOBAL              x
                         1.                10  LOAD_METHOD              rhophi
                 1. pos_arg
                    expr
                                       12  LOAD_FAST                'lib'
                 2. pos_arg
                    expr
                                       14  LOAD_FAST                'rho2'
                 3. pos_arg
                    expr
                                       16  LOAD_FAST                'phi2'
                 4.                18  CALL_METHOD_3         3  ''
     5. pos_arg
        expr
            call (5)
                 0. expr
                    attribute37 (2)
                         0. expr
                                           20  LOAD_GLOBAL              y
                         1.                22  LOAD_METHOD              rhophi
                 1. pos_arg
                    expr
                                       24  LOAD_FAST                'lib'
                 2. pos_arg
                    expr
                                       26  LOAD_FAST                'rho2'
                 3. pos_arg
                    expr
                                       28  LOAD_FAST                'phi2'
                 4.                30  CALL_METHOD_3         3  ''
     6.                32  CALL_FUNCTION_5       5  ''

With uncompyle6 3.9.0:

call (7)
     0. expr
         L.  39         0  LOAD_GLOBAL              xy_xy
     1. pos_arg
        expr
                            2  LOAD_FAST                'lib'
     2. pos_arg
        expr
                            4  LOAD_FAST                'x1'
     3. pos_arg
        expr
                            6  LOAD_FAST                'y1'
     4. pos_arg
        expr
            call (5)
                 0. expr
                    attribute37 (2)
                         0. expr
                                            8  LOAD_GLOBAL              x
                         1.                10  LOAD_METHOD              rhophi
                 1. expr
                                   12  LOAD_FAST                'lib'
                 2. expr
                                   14  LOAD_FAST                'rho2'
                 3. expr
                                   16  LOAD_FAST                'phi2'
                 4.                18  CALL_METHOD_3         3  ''
     5. pos_arg
        expr
            call (5)
                 0. expr
                    attribute37 (2)
                         0. expr
                                           20  LOAD_GLOBAL              y
                         1.                22  LOAD_METHOD              rhophi
                 1. expr
                                   24  LOAD_FAST                'lib'
                 2. expr
                                   26  LOAD_FAST                'rho2'
                 3. expr
                                   28  LOAD_FAST                'phi2'
                 4.                30  CALL_METHOD_3         3  ''
     6.                32  CALL_FUNCTION_5       5  ''

(The expr is no longer wrapped by a single-argument pos_arg.)

The fix that I added makes our test insensitive to these two changes in uncompyle6. Now that they have a new version out, it might become possible to use a later version of Python 3.8 or even a later version of Python, but I haven't tested it. When Python 3.8 gets end-of-lifed, I'll revisit this.

It doesn't look like a lot of new _compute functions are being added. This disassemble test is to guard against _compute functions being added in such a way that would limit our ability to add backends in the future. If no one is adding _compute functions or modifying them, that's not much of a danger.

It took me so long to write this note that the PR is already merged now. I hadn't noticed that auto-merge was on. You merged it manually. Thanks!

@rocky
Copy link

rocky commented Jan 19, 2023

Uncompyle6 just made a new release: ...
You found the first one: rocky/python-uncompyle6#381 changed the spelling of "ret_expr" to "return_expr".

Let me explain. That change was to improve correspondence with the term Python's AST uses "Return". Note that a decompilation grammar the Python grammar and AST will always be different, but to the extent they can be similar, that is probably a good thing. The lower-casing and underscores is intentional to indicate that these two things are different.

It is possible that "return" may get used in the future to conform to Python's terminology better.

The second one changed the structure of bytecode parsing trees...

(See that the ("pos_arg " * args_pos) has become ("expr " * args_pos).) This change was probably unintentional,

You are correct. It was unintentional. Thanks for pointing this out.

The fix that I added makes our test insensitive to these two changes in uncompyle6.

Although in the current master, the parse tree is now more like the way it was in in 3.8.0, I am not planning on a release of uncompyle6 in the near future.

@Saransh-cpp
Copy link
Member Author

Oops, I merged too quickly. Thank you for the amazingly detailed explanations! I don't think any more changes are required here? Unless we want to switch to the master branch of uncompyle6 that adds back "pos_args"?

@rocky
Copy link

rocky commented Jan 19, 2023

You all ultimately should decide what's best for you. I will however try to give you input data, hopefully for you to figure out what is best for you.

For Python 3.7 and Python 3.8, there are in fact two decompilers: uncompyle6 which you are using, and decompyle3 which tries to clean up problems with uncompyle6 by focusing on those two specific versions. There isn't a PyPI release of decompyle3, and there isn't likely to be one. See the paragraph after next for why.

So what happens is changes sometimes get made in one repository and then get ported to the other. The decompyle3 to uncompyle3 direction is often because that I tend to develop new stuff there because it is smaller and it is easier to make cleaner. The uncompyle6 to decompyle3 direction is often because people use uncompyle6 more - possibly because there is a PyPI package for it and it covers more versions of Python and is older. So bugs are reported there and are sometimes fixed there first. But in truth, if you are only interested in Python 3.7 or 3.8 decompyle3 does a better job at decompilation currently. (When decompyle3 was started, that was not true).

If this isn't complicated enough, there is yet another decompiler version which focuses on 3.8 and now 3.9 a little. In that, I am using up front more sophisticated control flow detection via https://github.com/rocky/python-control-flow. The 3.8-3.9 decompiler however is private and is likely to remain that way for a while.

As I was trying to sync up decompyle3 with the with the changes to uncompyle6, I now have come to understand that possibly the removal of pos_args was probably again motivated by a slight desire to match Python's AST closer. Python's AST does not have a notion of "positional args", and does have a base class called ast.expr .

In sum, from the decompiler development side, as with so many things, there is no clear winner here - pos-args is nicer from the standpoint of reading the parse tree. And as I said, it is possible that return_expr will get renamed to return to better match Python's AST, since I have a desire to try to make this simpler and clearer in the hope that people might understand this mirky stuff more clearly.

@jpivarski
Copy link
Member

For @Saransh-cpp: there's no need to reopen or unmerge this PR: it's done. I was just in the process of documenting the explanations of the changes.

For @rocky: thanks for the heads-up about upcoming changes. When they come, we'll adjust. This test asserts on just about everything, so if the kind of return_expr becomes return, that will be easy to recognize and we'll loosen our test. I like to try to find the change in uncompyle6 to justify the change, though; that's why I reported PR and commit numbers here.

Just to be clear, this project only uses uncompyle6 in a test, since we have a suite of functions that need to remain "simple" (simplicity is defined by the tests) so that we don't prevent ourselves from being able to add new array backends in the future. That test can run in any one version of Python that isn't end-of-lifed. Vector can run in Python versions that uncompyle6 doesn't support. (For Python 3.8, we have until October 14, 2024!)

(I had another potential project that would have used uncompyle6 in a more extensive way, but that one never became a finished product.)

@Saransh-cpp Saransh-cpp added this to the v1.0.0 milestone Feb 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failing Disassemble check
4 participants