-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor code for performance #46
Conversation
@@ -61,7 +61,7 @@ def __init__( | |||
): | |||
self.optimizer = optimizer | |||
self.lr_dict = lr_dict | |||
self.group_names = list(self.lr_dict.keys()) | |||
self.group_names = [* self.lr_dict] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This way compromises the readability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there is a tradeoff between readability and performance. It is up to the team to decide which to maximize. You may argue that I would prefer readability for maintainability over that little performance gain, and that's totally fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no practical performance benefit from doing it this way.
It's trivial to benchmark the difference:
In [4]: def test1():
...: return list(a.keys())
...:
In [5]: def test2():
...: return [*a]
...:
In [6]: %timeit -n 10_000_000 test2
23.1 ns ± 0.214 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [7]: %timeit -n 10_000_000 test1
23.1 ns ± 0.246 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
using ipython
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>>> from timeit import timeit
>>> timeit("[]")
0.09845466999831842
>>> timeit("list()")
0.22419986899876676
Also, less memory footprint:
>>> import dis
>>> dis.dis("[]")
1 0 BUILD_LIST 0
2 RETURN_VALUE
>>> dis.dis("list()")
1 0 LOAD_NAME 0 (list)
2 CALL_FUNCTION 0
4 RETURN_VALUE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is simply not the same code at all, you're comparing apples to oranges.
My benchmark shows the exact change you did, and shows no performance improvement in the slightest.
In fact, if you run the benchmark using Python 3.11 the older method is consistently faster.
Python 3.11.2 (main, Feb 8 2023, 14:49:25) [GCC 11.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.12.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: a = {"a": 5, "k": 3, "asdf": 53}
In [2]: def test1():
...: return list(a.keys())
...:
In [3]: def test2():
...: return [*a]
...:
In [4]: %timeit -n 10_000_000 test2
13.7 ns ± 1.16 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
In [5]: %timeit -n 10_000_000 test1
13 ns ± 1.15 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
In [6]: %timeit -n 10_000_000 test1
13 ns ± 1.22 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
In [7]: %timeit -n 10_000_000 test2
14.1 ns ± 1.15 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
In [12]: dis.dis(test1)
1 0 RESUME 0
2 2 LOAD_GLOBAL 1 (NULL + list)
14 LOAD_GLOBAL 2 (a)
26 LOAD_METHOD 2 (keys)
48 PRECALL 0
52 CALL 0
62 PRECALL 1
66 CALL 1
76 RETURN_VALUE
In [13]: dis.dis(test2)
1 0 RESUME 0
2 2 BUILD_LIST 0
4 LOAD_GLOBAL 0 (a)
16 LIST_EXTEND 1
18 RETURN_VALUE
While test2 has a smaller bytecode output, it is still slower than test1. Not all opcodes are equally fast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like python 3.11 introduced a lot of optimizations out of the box, and it is the default behavior under the hood. On python 3.9, I am getting these results:
>>> timeit('[* {"a": 5, "k": 3, "asdf": 53}]')
0.7558078520014533
>>> timeit('list({"a": 5, "k": 3, "asdf": 53}.keys())')
1.142674779999652
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if the change would have a minor performance benefit, none of the code paths seem to be hot, so these nano optimizations gain nothing in practice.
Appreciate the interest and contribution. As we are in the process of moving to py 3.11, I don't think this will bring any performance benefits so I'm closing. |
keys
,list
, anddict
calls.Note: I haven't tested these changes, but things should work as expected.