Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-93382: Cache result of PyCode_GetCode in codeobject #93383

Merged
merged 7 commits into from
Jun 3, 2022

Conversation

Fidget-Spinner
Copy link
Member

@Fidget-Spinner Fidget-Spinner commented May 31, 2022

Fixes gh-93382.

@Fidget-Spinner
Copy link
Member Author

Fidget-Spinner commented May 31, 2022

@markshannon do tell me if you feel that it isn't worth making PyCodeObject larger for this use case. I think it's worth it because it restores O(1) co_code access. Currently it's O(kn).

Benchmarks aren't available yet. I'll try to run it by the end of this week if no one beats me to it.

@AA-Turner AA-Turner added performance Performance or resource usage interpreter-core (Objects, Python, Grammar, and Parser dirs) labels May 31, 2022
@Fidget-Spinner
Copy link
Member Author

Fidget-Spinner commented Jun 1, 2022

Quick n dirty microbenchmark:

py -3.10 -m timeit -s "import typing; c = typing.get_type_hints.__code__" "c.co_code"
10000000 loops, best of 5: 27.4 nsec per loop

py -3.11 -m timeit -s "import typing; c = typing.get_type_hints.__code__" "c.co_code"
500000 loops, best of 5: 555 nsec per loop

# This branch, release build without PGO
 .\PCbuild\amd64\python.exe -m timeit -s "import typing; c = typing.get_type_hints.__code__" "c.co_code"
10000000 loops, best of 5: 28.6 nsec per loop

@Fidget-Spinner
Copy link
Member Author

No real change in pyperformance https://gist.github.com/Fidget-Spinner/55cd7ee60faaa403ddc970ec31401d35. Seems like weird build differences/noise only.

@Fidget-Spinner Fidget-Spinner requested a review from sweeneyde June 3, 2022 08:33
…e-93382.Jf6gAj.rst

Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
@Fidget-Spinner
Copy link
Member Author

Thanks for the review Kumar!

@Fidget-Spinner
Copy link
Member Author

I'm completely confused. How are tests relating to the peepholer now failing after we updated the news entry...?

@sweeneyde
Copy link
Member

Without looking at it too carefully yet, I'm guessing the cache needs to be invalidated when the bytecode changes here.

@Fidget-Spinner
Copy link
Member Author

Without looking at it too carefully yet, I'm guessing the cache needs to be invalidated when the bytecode changes here.

I think you're spot on! Thank you.

@Fidget-Spinner
Copy link
Member Author

Since there are no perf regressions, I'm merging this.

@Fidget-Spinner Fidget-Spinner merged commit d52ffc1 into python:main Jun 3, 2022
@Fidget-Spinner Fidget-Spinner deleted the cache_co_code branch June 3, 2022 16:41
@Fidget-Spinner
Copy link
Member Author

Fidget-Spinner commented Jun 3, 2022

@pablogsal can I backport this PR to 3.11 please?

In essence, 3.11 beta 1 made accessing co_code O(kn) where it was previously O(1) in prior versions. I want to fix that performance regression because it affects anything that touches co_code (e.g coverage.py).

See e.g #93383 (comment) for microbenchmark. For even just a medium-sized code object, there's a nearly 20x slowdown from 3.10 to 3.11. I can't imagine how bad it would be for larger code in the wild.

This change affects the code object struct. I'm not sure if that will affect your decision.

@Fidget-Spinner
Copy link
Member Author

@kumaraditya303 I have one question about test_frozenmain and I think you have more expertise in this area than I do:

Seems like test_frozenmain was regenerated for this PR. Does this mean or indicate anything? I'm afraid that I inadvertently changed bytecode representation.

@kumaraditya303
Copy link
Contributor

Marshal format is not deterministic and it seems to be a form of #73894 issue.

Here is the diffoscope diff:

@@ -1,10 +1,10 @@
 00000000: ae0d 0d0a 0000 0000 4c71 5d62 bc01 0000  ........Lq]b....
 00000010: e300 0000 0000 0000 0000 0000 0008 0000  ................
-00000020: 0000 0000 0073 a000 0000 9700 6400 6401  .....s......d.d.
+00000020: 0000 0000 00f3 a000 0000 9700 6400 6401  ............d.d.
 00000030: 6c00 5a00 6400 6401 6c01 5a01 0200 6502  l.Z.d.d.l.Z...e.
 00000040: 6402 ab01 0000 0000 0000 0000 0100 0200  d...............
 00000050: 6502 6403 6500 6a03 0000 0000 0000 0000  e.d.e.j.........
 00000060: ab02 0000 0000 0000 0000 0100 0200 6501  ..............e.
 00000070: 6a04 0000 0000 0000 0000 ab00 0000 0000  j...............
 00000080: 0000 0000 6404 1900 0000 0000 0000 0000  ....d...........
 00000090: 5a05 6405 4400 5d17 5a06 0200 6502 6406  Z.d.D.].Z...e.d.
@@ -18,21 +18,21 @@
 00000110: 6162 6c65 da0f 7573 655f 656e 7669 726f  able..use_enviro
 00000120: 6e6d 656e 74da 1163 6f6e 6669 6775 7265  nment..configure
 00000130: 5f63 5f73 7464 696f da0e 6275 6666 6572  _c_stdio..buffer
 00000140: 6564 5f73 7464 696f 7a07 636f 6e66 6967  ed_stdioz.config
 00000150: 207a 023a 2029 07da 0373 7973 da11 5f74   z.: )...sys.._t
 00000160: 6573 7469 6e74 6572 6e61 6c63 6170 69da  estinternalcapi.
 00000170: 0570 7269 6e74 da04 6172 6776 da0b 6765  .print..argv..ge
-00000180: 745f 636f 6e66 6967 7372 0200 0000 da03  t_configsr......
+00000180: 745f 636f 6e66 6967 7372 0300 0000 da03  t_configsr......
 00000190: 6b65 79a9 00f3 0000 0000 fa1b 5072 6f67  key.........Prog
 000001a0: 7261 6d73 2f74 6573 745f 6672 6f7a 656e  rams/test_frozen
 000001b0: 6d61 696e 2e70 79fa 083c 6d6f 6475 6c65  main.py..<module
-000001c0: 3e72 1100 0000 0100 0000 738c 0000 00f8  >r........s.....
+000001c0: 3e72 1200 0000 0100 0000 738c 0000 00f8  >r........s.....
 000001d0: f006 0001 0b80 0a80 0a80 0ad8 0018 d000  ................
 000001e0: 18d0 0018 d000 18e0 0005 8005 d006 1ad4  ................
 000001f0: 001b d000 1bd8 0005 8005 806a 9023 9428  ...........j.#.(
 00000200: d400 1bd0 001b d809 26d0 091a d409 26d4  ........&.....&.
 00000210: 0928 a818 d409 3280 06f0 0206 0c02 f000  .(....2.........
 00000220: 0701 2af0 0007 012a 8043 f00e 0005 0a80  ..*....*.C......
 00000230: 45d0 0a28 9043 d00a 28d0 0a28 9836 a023  E..(.C..(..(.6.#
 00000240: 9c3b d00a 28d0 0a28 d404 29d0 0429 d004  .;..(..(..)..)..
-00000250: 29f0 0f07 012a f000 0701 2a72 0f00 0000  )....*....*r....
+00000250: 29f0 0f07 012a f000 0701 2a72 1000 0000  )....*....*r....

cc @methane

@kumaraditya303
Copy link
Contributor

I'm afraid that I inadvertently changed bytecode representation.

No, I don't think so because I tried comparing deepfreeze output of test_frozenmain and it did not show any difference.

@pablogsal
Copy link
Member

@pablogsal can I backport this PR to 3.11 please?

In essence, 3.11 beta 1 made accessing co_code O(kn) where it was previously O(1) in prior versions. I want to fix that performance regression because it affects anything that touches co_code (e.g coverage.py).

See e.g #93383 (comment) for microbenchmark. For even just a medium-sized code object, there's a nearly 20x slowdown from 3.10 to 3.11. I can't imagine how bad it would be for larger code in the wild.

This change affects the code object struct. I'm not sure if that will affect your decision.

Can you please check if this fixes this issue:

nedbat/coveragepy#1287

If so, I am happy backporting it as long as two more core dev other than myself review it and approve it.

CC: @markshannon @brandtbucher @nedbat

@Fidget-Spinner
Copy link
Member Author

Fidget-Spinner commented Jun 4, 2022

Can you please check if this fixes this issue:

It should help but I doubt it will fix all of it. My own research suggests that profiling has slowed down by >50% in 3.11, even without the co_code issue.

Fidget-Spinner added a commit to Fidget-Spinner/cpython that referenced this pull request Jun 4, 2022
…nGH-93383)

Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Dennis Sweeney <36520290+sweeneyde@users.noreply.github.com>
@Fidget-Spinner
Copy link
Member Author

Fidget-Spinner commented Jun 4, 2022

I opened the backport for 3.11 since it's simpler than the 3.12 code. I hope the discussion can happen there or on the issue instead #93493. Thank you!

pablogsal pushed a commit that referenced this pull request Jun 23, 2022
…3383) (#93493)

Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Dennis Sweeney <36520290+sweeneyde@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PyCode_GetCode could be faster
6 participants