Perf: optimize postal_parse_address implementation #49
base: main
Conversation
@lmores here's what I see when I run those benchmarks on my system (specs in the session header below):
============================= test session starts ==============================
platform linux -- Python 3.12.4, pytest-8.2.2, pluggy-1.5.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/jimmy/Work/mismo
configfile: pyproject.toml
plugins: anyio-4.4.0, cov-5.0.0, benchmark-4.0.0
collected 24 items / 18 deselected / 6 selected
mismo/lib/geo/tests/test_postal_benchmark.py ...... [100%]
---------------------------------------------------------------------------------------------- benchmark 'nrows=1000000': 6 tests ---------------------------------------------------------------------------------------------
Name (time in s) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_parse[1m-noop] 12.8445 (1.0) 12.9714 (1.0) 12.9009 (1.0) 0.0570 (1.0) 12.9015 (1.0) 0.1046 (1.44) 1;0 0.0775 (1.0) 5 1
test_benchmark_parse[1m-python_only] 13.5212 (1.05) 13.7194 (1.06) 13.5844 (1.05) 0.0782 (1.37) 13.5659 (1.05) 0.0728 (1.0) 1;1 0.0736 (0.95) 5 1
test_benchmark_parse[1m-postal_only] 51.1747 (3.98) 51.4688 (3.97) 51.2672 (3.97) 0.1179 (2.07) 51.2128 (3.97) 0.1199 (1.65) 1;0 0.0195 (0.25) 5 1
test_benchmark_parse[1m-postal_parse_address] 56.0817 (4.37) 56.8488 (4.38) 56.2741 (4.36) 0.3264 (5.73) 56.1196 (4.35) 0.2928 (4.02) 1;1 0.0178 (0.23) 5 1
test_benchmark_parse[1m-postal_parse_address__direct_import] 56.2151 (4.38) 56.4120 (4.35) 56.3279 (4.37) 0.0975 (1.71) 56.3729 (4.37) 0.1837 (2.52) 2;0 0.0178 (0.23) 5 1
test_benchmark_parse[1m-postal_parse_address__initial_impl] 58.3636 (4.54) 58.5571 (4.51) 58.4671 (4.53) 0.0819 (1.44) 58.4389 (4.53) 0.1328 (1.82) 2;0 0.0171 (0.22) 5 1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Legend:
Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
OPS: Operations Per Second, computed as 1 / Mean
================ 6 passed, 18 deselected in 1769.89s (0:29:29) =================

Like you, I also see a significant improvement in the runtime performance when using pytest-benchmark.
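For reference, the benchmark tests above have roughly this shape (a hedged sketch: the addresses fixture and the noop baseline below are placeholders, not the actual contents of test_postal_benchmark.py):

```python
# Hypothetical shape of such a benchmark test; the `addresses` fixture and
# the functions under test are placeholders, not the real module contents.
import pytest


@pytest.fixture
def addresses():
    # Stand-in input; the real benchmark reads the 1M-row data set.
    return ["123 Main St, Anytown, AK 99501"] * 1000


def noop(rows):
    # Baseline that only materializes the input.
    return list(rows)


@pytest.mark.parametrize("func", [noop], ids=["noop"])
def test_benchmark_parse(benchmark, addresses, func):
    # pytest-benchmark calls `func` repeatedly (5 rounds in the table above)
    # and reports min/max/mean/stddev over those rounds.
    result = benchmark(func, addresses)
    assert result is not None
```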
Hmm, really curious about this one, I'll dig in once I'm at a computer. Thanks for the effort. First thought: all the timings seem to be cut almost exactly in half, maybe we were accidentally doing the computation twice in the non-pytest harness?
Quite sure this is not the case (also, my timings are far from being exactly halved). The difference may be due to the fact that in the non-pytest benchmarks I was using DuckDB writing to disk, rather than in memory as pytest-benchmark does... however, this is what I get when running the non-pytest benchmarks with a memtable (it doesn't change much, still far from the others):
At first I thought that the pytest-benchmark library did something as simple as
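For comparison, a one-shot manual timing (a hypothetical sketch, not necessarily what the benchmarks directory script does) is just:

```python
# Hypothetical one-shot manual timing; `func` stands in for whichever
# parsing pipeline is being measured.
import time


def time_once(func, *args):
    start = time.perf_counter()
    result = func(*args)
    return time.perf_counter() - start, result
```

pytest-benchmark instead repeats the call over several rounds (5 in the table above) and reports statistics over them, so a single manual measurement and a multi-round mean are only directly comparable if nothing (the OS page cache, DuckDB buffers, or libpostal's lazily loaded data files) gets warmed up by the earlier rounds.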
Intro
In this PR:

- mismo.lib.geo.postal_parse_address has been rewritten using my implementation
- mismo.lib.geo.tests.test_postal_benchmark.postal_parse_address__direct_import is the same as the above but avoids the optional import of postal inside its body (@NickCrews: don't worry, I was just curious whether that import and the inner function declaration generated any relevant overhead... it seems irrelevant, I will remove it before merge; see the sketch after this list)
- mismo.lib.geo.tests.test_postal_benchmark.postal_parse_address__initial_impl is the original implementation by @jstammers

In this branch I also temporarily restored the top-level benchmarks directory to manually gather execution times.
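To make the second bullet concrete, here is a sketch of the two import styles being compared (hedged: the real mismo function bodies differ, only the placement of the postal import matters here):

```python
# Sketch contrasting the two import styles benchmarked above; the real
# mismo implementation does more work, this only shows where `postal`
# gets imported.
from postal.parser import parse_address  # module-level: resolved once


def parse_module_level_import(address: str):
    # `postal` is a hard dependency of this module.
    return parse_address(address)  # list of (value, label) tuples


def parse_inner_import(address: str):
    # Optional dependency resolved lazily, on every call; keeps `postal`
    # out of the module's import-time requirements.
    from postal.parser import parse_address as _parse

    return _parse(address)
```

Per the table above, the two variants have essentially identical means (56.27s vs 56.33s), so the per-call import lookup is lost in libpostal's own parsing cost.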
All benchmarks have been run on the full data set of 1M rows using duckdb 0.10.0, ibis-framework 9.1.0 and python 3.12.3 on the same laptop, with no other programs running (except GNOME, etc.).

Results
Results obtained with manual timings are as follows.
On the other hand, pytest-benchmark reports the following.
Values in the Mean column are remarkably lower than those reported manually (which lie way outside the StdDev or the IQR).
Isn't that quite strange? As far as I can tell pytest-benchmark runs each function 5 times: are we caching something that we should not, so that subsequent executions run faster than a single one (as in the manual timings)?

Also, in the pytest benchmarks the difference between postal_parse_address and postal_only is more than 5s, which is much bigger than the one from the manual timings (<1s).
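One way to probe the caching hypothesis (a hedged sketch: pipeline_under_test and the addresses fixture are placeholders) is pytest-benchmark's pedantic mode, which can force a single round with no warmup so later rounds cannot prime any cache:

```python
# Hypothetical single-round variant: one round, one iteration, no warmup,
# so the reported time cannot benefit from earlier rounds priming caches.
def test_benchmark_parse_single_round(benchmark, addresses):
    result = benchmark.pedantic(
        pipeline_under_test,  # placeholder for e.g. postal_parse_address
        args=(addresses,),
        rounds=1,
        iterations=1,
        warmup_rounds=0,
    )
    assert result is not None
```

If this single-round time lands near the manual measurements while the 5-round mean stays lower, something really is being warmed up between rounds (DuckDB buffers, the OS page cache, or libpostal's lazily loaded data files).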
Additionally, before producing the results above I recreated my virtualenv from scratch to match mismo's new requirements (python>=3.10, ibis-framework>=9.1.0), and the results I get are also quite different from those I reported in PR #48, where I was using ibis-framework 9.0.0 and duckdb 1.0.0 (not sure about which version of python, though).

Questions