Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance #103

Closed
wants to merge 5 commits into from
Closed

Conversation

lehins
Copy link
Collaborator

@lehins lehins commented Apr 21, 2020

This PR fixes:

  • Performance issue that I discovered when benchmarking generation of integral values in a range
  • Order of arguments in uniformR function was inconsistent with other range functions.

@curiousleo
Copy link
Collaborator

* Performance issue that I discovered when benchmarking generation of integral values in a range

Nice! How big is the performance boost?

@lehins
Copy link
Collaborator Author

lehins commented Apr 21, 2020

@curiousleo Pretty significant ~x20. Problem is when a function is recursive, then it cannot be inlined, which can hurt the performance really bad when that function is being called many many times.
In case of [un]signedBitmaskWithRejectionRM there was no need for them to be recursive.

Haven't tried the internal benchmarks, but running with criterion difference in performance was obvious.

@lehins lehins mentioned this pull request Apr 21, 2020
@lehins
Copy link
Collaborator Author

lehins commented Apr 21, 2020

@curiousleo Applied the same fix of removing recursion to Integer generation, got quite a bit of performance boost as well ;)

Benchmarks for ranges are available in the same place: https://github.com/lehins/haskell-benchmarks/tree/new-random/new-random-benchmarks

@curiousleo curiousleo force-pushed the improve-performance branch from 4055a8e to bb7b99a Compare April 23, 2020 11:34
@curiousleo
Copy link
Collaborator

Fixed a semantic merge conflict: 068aa36#diff-5b60458897b8b814de6d596bdc71e75cL210-R210

@curiousleo
Copy link
Collaborator

curiousleo commented Apr 23, 2020

According to the new benchmarks, this is an improvement for some types (e.g. Word64) and a performance regression for others (e.g. Word16).

                                         mean BEFORE    AFTER     error BEFORE     AFTER
pure/uniformR/full/Word16                mean 28.35 μs  17.96 ms  ( +- 1.383 μs  ) ( +- 1.369 ms  )
pure/uniformR/full/Word64                mean 1.552 ms  131.3 μs  ( +- 103.9 μs  ) ( +- 3.496 μs  )

Full comparison in compare.txt.

improve-performance-bb7b99a.txt
interface-to-performance-d03e325.txt

Copy link
Collaborator

@curiousleo curiousleo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amongst performance improvements, this also contains performance regressions: #103 (comment)

@curiousleo
Copy link
Collaborator

Ping @lehins - I've cherry-picked most performance improvements from here onto #116, which is now merged. I've played around with the optimisation you made to uniformIntegerM (making it non-recursive). This has strange side-effects: in my benchmarks, it makes other generators run slower. I won't pursue this particular optimisation further for now.

@lehins
Copy link
Collaborator Author

lehins commented May 8, 2020

I am closing this, since most of improvements were cherry picked

@lehins lehins closed this May 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants