-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-98831: Use opcode metadata for stack_effect() #101704
Conversation
This halves the time to run the cases generator (most of the time goes into parsing the input).
8d5a6c4
to
bd19d79
Compare
Python/compile.c
Outdated
|
||
case LOAD_BUILD_CLASS: | ||
return 1; | ||
if (0 <= opcode && opcode < 256) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have these:
#define MAX_REAL_OPCODE 254
#define IS_WITHIN_OPCODE_RANGE(opcode) \
(((opcode) >= 0 && (opcode) <= MAX_REAL_OPCODE) || \
IS_PSEUDO_OPCODE(opcode))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'll use opcode <= MAX_REAL_OPCODE
. I feel we don't need IS_PSEUDO_OPCODE()
because the switch will take care of that.
if (0 <= opcode && opcode < 256) { | ||
if (_PyOpcode_Deopt[opcode] != opcode) { | ||
// Specialized instructions are not supported. | ||
return PY_INVALID_STACK_EFFECT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be correct to return the stack effect of _PyOpcode_Deopt[opcode]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be correct to return the stack effect of
_PyOpcode_Deopt[opcode]
?
I thought so, but there are unit tests that insist that this is an error, and I didn't feel like changing the tests:
# All not defined opcodes
for code in set(range(256)) - set(dis.opmap.values()):
with self.subTest(opcode=code):
self.assertRaises(ValueError, stack_effect, code)
self.assertRaises(ValueError, stack_effect, code, 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, those tests probably predate specialization. We don't need to change this now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
opcode.py goes to great lengths to hide the existence of specialized instructions -- they don't show in either opname
or opmap
, even though pseudo ops do exist there. So I think it's by design. Though we could argue for changing that design.
I found the algorithm in dis.py on L46-52. I think it's duplicated in Tools/build/generate_opcode_h.py on L145-151.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think dis.py:46-52 is actually allocating unused opcodes for the specialised instructions, while generate_opcode_h.py:145-151 is building the full deopt lookup table from opcode["_specializations"], plus mapping each normal opcode to itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, in generate_opcode_h.py it's L101-107. These two algorithms ought to match, otherwise results will be hilarious. :-)
Anyway, I'm going to merge now. On to figuring out whether mark_stacks() is in reach yet (I doubt it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes. Wow, we need to fix that.
* main: (82 commits) pythongh-101670: typo fix in PyImport_ExtendInittab() (python#101723) pythonGH-99293: Document that `Py_TPFLAGS_VALID_VERSION_TAG` shouldn't be used. (#pythonGH-101736) no-issue: Add Dong-hee Na as the cjkcodecs codeowner (pythongh-101731) pythongh-101678: Merge math_1_to_whatever() and math_1() (python#101730) pythongh-101678: refactor the math module to use special functions from c11 (pythonGH-101679) pythongh-85984: Remove legacy Lib/pty.py code. (python#92365) pythongh-98831: Use opcode metadata for stack_effect() (python#101704) pythongh-101283: Version was just released, so should be changed in 3.11.3 (pythonGH-101719) pythongh-101283: Fix use of unbound variable (pythonGH-101712) pythongh-101283: Improved fallback logic for subprocess with shell=True on Windows (pythonGH-101286) pythongh-101277: Port more itertools static types to heap types (python#101304) pythongh-98831: Modernize CALL and family (python#101508) pythonGH-101696: invalidate type version tag in `_PyStaticType_Dealloc` (python#101697) pythongh-100221: Fix creating dirs in `make sharedinstall` (pythonGH-100329) pythongh-101670: typo fix in PyImport_AppendInittab() (pythonGH-101672) pythongh-101196: Make isdir/isfile/exists faster on Windows (pythonGH-101324) pythongh-101614: Don't treat python3_d.dll as a Python DLL when checking extension modules for incompatibility (pythonGH-101615) pythongh-100933: Improve `check_element` helper in `test_xml_etree` (python#100934) pythonGH-101578: Normalize the current exception (pythonGH-101607) pythongh-47937: Note that Popen attributes are read-only (python#93070) ...
In order to do this, I had to change the generated opcode_metadata.h file to be sensitive to whether
NEED_OPCODE_TABLES
is defined -- if it is, it defines the functions and data with public linkage, if it isn't it only declares them. That way the file can be included in multiple places without worry about duplicating static data or functions.Also, the pushed/popped functions will now accept any integer opcode argument and return -1 for unknown opcodes.
This also changes the command line options for generate_cases.py to write all output files in a single run, and updates Makefile.pre.in accordingly. This halves the time to run the cases generator (since most of the time goes into parsing bytecodes.c).