Find and fix possible performance bottlenecks #16
Comments
Nice to know, thanks for keeping me posted. I already noticed that the performance was not great the last time I casually tried it on my machine with |
Yeah, it's definitely not. I guess at least partially because of needless copying. Would be nice to have a tool to visualize allocations and memory usage. I looked around but couldn't find any... I guess I would start by optimizing this BTreeMap now.
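Lacking a ready-made visualizer, one rough way to at least count allocations is to wrap the system allocator. The following is only a sketch: `CountingAllocator`, `ALLOCATIONS`, and `count_allocations` are hypothetical names that are not part of this project, and the counter is process-wide, so allocations from other threads are included.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Process-wide counter of calls to `alloc` (hypothetical helper, not project code).
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

// Thin wrapper that forwards to the system allocator and counts each allocation.
struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

// Runs `f` and returns its result together with the number of allocations
// observed while it ran (including any made concurrently by other threads).
fn count_allocations<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    let result = f();
    let after = ALLOCATIONS.load(Ordering::Relaxed);
    (result, after - before)
}
```

Wrapping the code that builds the BTreeMap in `count_allocations(|| ...)` before and after a change gives a crude number to compare. It says nothing about where the allocations come from, so it complements a profiler rather than replacing one.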
Running this from a Mac. |
Oh yeah, and if you profile, don't forget to do that on a release build with debug info enabled. In Cargo.toml:
[profile.release]
debug = true
More info: |
This profiling run is outdated by now. Also, the issue is not really actionable as profiling and performance improvements will always be an ongoing effort. Therefore I'm closing this to keep the issue tracker clean. |
Yesterday I did some profiling using the setup described here.
The resulting callgrind file is attached. This can be opened with qcachegrind on Mac or kcachegrind on Linux.
callgrind.out.35583.zip
If you don't have either of those programs handy, I've added a screenshot of the two main bottlenecks that I can see. I'm not an expert, but it looks like we spend a lot of time allocating, converting, and dropping the BTreeMap, which is converted to a dictionary and returned to Python in the end.
I guess we could save a lot of time by making this part more efficient. E.g. by copying less and instead working on references. Might be mistaken, though. Help and pull requests are very welcome.
😊
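To make the "copy less, work on references" idea concrete, here is a minimal sketch. It assumes the data ends up as string key/value pairs; the function names and the `(String, String)` input are hypothetical and do not mirror the project's actual types.

```rust
use std::collections::BTreeMap;

/// Copying variant: every key and value is cloned into a new owned String,
/// which costs two heap allocations per entry on top of the map nodes.
fn collect_owned(pairs: &[(String, String)]) -> BTreeMap<String, String> {
    pairs.iter().map(|(k, v)| (k.clone(), v.clone())).collect()
}

/// Borrowing variant: the map stores string slices that point into `pairs`,
/// so no string data is copied. The map cannot outlive `pairs`.
fn collect_borrowed(pairs: &[(String, String)]) -> BTreeMap<&str, &str> {
    pairs.iter().map(|(k, v)| (k.as_str(), v.as_str())).collect()
}

/// Stand-in for the final conversion step (for example, building the Python
/// dict): it only reads the map by reference, so it needs no owned copies.
fn total_value_len(map: &BTreeMap<&str, &str>) -> usize {
    map.values().map(|v| v.len()).sum()
}
```

The borrowed map still allocates its own tree nodes, so this only removes the per-string copies. Whether that is the dominant cost here would need to be confirmed with another profiling run.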