Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-118761: improve import time for pickle #128732

Merged
merged 5 commits into from
Jan 14, 2025

Conversation

picnixz
Copy link
Member

@picnixz picnixz commented Jan 11, 2025

We can remove the re import which takes quite a long time. Benchmarks were performed on a RELEASE build (no PGO, no LTO). It's a bit hard to have stable numbers with -X importtime, so I'm only using the hyperfine benchmarks.

PR

$ hyperfine --warmup 8 "./python -c 'import pickle'"
Benchmark 1: ./python -c 'import pickle'
  Time (mean ± σ):       7.8 ms ±   0.3 ms    [User: 6.6 ms, System: 1.3 ms]
  Range (min … max):     7.4 ms …   9.8 ms    337 runs

Main

$ hyperfine --warmup 8 "./python -c 'import pickle'"
Benchmark 1: ./python -c 'import pickle'
  Time (mean ± σ):       9.7 ms ±   0.3 ms    [User: 8.3 ms, System: 1.4 ms]
  Range (min … max):     9.3 ms …  12.6 ms    282 runs

Since something that is no more present in the global namespace is removed, I've added a NEWS entry and a detailed changelog.

Importing `pickle` is now roughly 25% faster.

Importing the `re` module is no longer needed and
thus is no more implicitly exposed as `pickle.re`.
@picnixz picnixz force-pushed the perf/import/pickle-118761 branch from 0c5d01a to 6ce7785 Compare January 11, 2025 12:03
@picnixz picnixz added the performance Performance or resource usage label Jan 11, 2025
@picnixz picnixz changed the title gh-118761: improve import time of pickle gh-118761: improve import time for pickle Jan 11, 2025
Lib/pickle.py Outdated Show resolved Hide resolved
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
@picnixz
Copy link
Member Author

picnixz commented Jan 12, 2025

I'll merge this one tomorrow (and will check if removing isidentifier() improves a bit performances). I want to keep the first commit message as it indicates that re is now removed from the global namespace (it should never have been accessed from outside but we never know; maybe someone has been patching that attribute for whatever reason in their test suite).

EDIT: no micro-optimization so leaving the numbers as is

@picnixz
Copy link
Member Author

picnixz commented Jan 14, 2025

As a follow-up, I can also improve the import time of pickletools once I've merged this one (reason is that pickletools imports re and pickle so if I just change pickletools now, without this PR, then we won't see any improvements at all).

@picnixz
Copy link
Member Author

picnixz commented Jan 14, 2025

@vstinner I plan to merge this one with the following commit message:

Importing `pickle` is now roughly 25% faster.

Importing the `re` module is no longer needed and
thus `re` is no more implicitly exposed as `pickle.re`.

and following title:

Improve import time of the `pickle` module.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@picnixz picnixz merged commit ff3e145 into python:main Jan 14, 2025
38 checks passed
@picnixz picnixz deleted the perf/import/pickle-118761 branch January 14, 2025 11:27
@AA-Turner
Copy link
Member

@picnixz tiny note for the future -- when merging, try not to end the titles of commit messages with full stops ('.'), as it doesn't work well with the auto-appended PR reference: ff3e145

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants