-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test lucene FixedBitSet - Experiment #4384
Conversation
public int next() { | ||
int index = nextIndex; | ||
nextIndex = | ||
index == set.length() - 1 ? DocIdSetIterator.NO_MORE_DOCS : set.nextSetBit(index + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you check the performance with SparseFixedBitSet
, In that case the nextSetBit
could be faster, since it checks each 4096-bit block only if at least one value is set, and it knows directly in which 64-bit segment. the regular FixedBitSet
needs to check each block. It might be that linear scanning is still faster, as most of the data is already in the cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to be better than org.apache.lucene.util.FixedBitSet
, but still worse than java.util.BitSet
java.util.BitSet
Worker:
==> mc_destination : [ 464, 441, 440, 444 ] Avg: 447.3
Total:
==> mc_destination : [ 772, 705, 709, 711 ] Avg: 724.3
org.apache.lucene.util.FixedBitSet
Worker:
==> mc_destination : [ 480, 459, 465, 466 ] Avg: 467.5
Total:
==> mc_destination : [ 779, 741, 750, 750 ] Avg: 755.0
org.apache.lucene.util.SparseFixedBitSet
Worker:
==> mc_destination : [ 480, 447, 449, 452 ] Avg: 457.0
Total:
==> mc_destination : [ 788, 726, 722, 720 ] Avg: 739.0
Right now, I don't see a noticeable speed up in the speed test in Germany either: https://tinyurl.com/2nhldzku |
@@ -66,21 +67,21 @@ public int transitArrivalTime(int stop) { | |||
* @return true if at least one stop arrival was reached last round (best overall). | |||
*/ | |||
public boolean isCurrentRoundUpdated() { | |||
return !reachedCurrentRound.isEmpty(); | |||
return reachedCurrentRound.cardinality() > 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is slow, but should not have a huge impact. We can look for the first bit set or keep a flag in the BestTimes to improve this.
In the profiling @leonardehrenfried did on the German data a while a go the |
Did I say that? In Grafana I could not see a difference. |
I got a profiling jfr file from you a while back witch showed this - at least I though I got it from you. |
Ah, I misunderstood. I thought that you referred to the speed test on CI which doesn't run on the data set for entire Germany. I will check that. |
It might be that the whole of Germany exceeds a threshold, hence trigger the |
I'm currently setting up a test scenario for CI where I load all of Germany's transit data. |
I'm adding the Germany-wide data set to the CI speed test in this PR: #4408 Once that is merged, we can run the numbers with this branch and check the performance. |
I striped the Java BitSet for all checks and made a simpler version, it would be interesting to test that as well. For Norway it performs exactly as the Java BitSet. The The code is in the otp2_use_stripped_java_bitset branch. |
Ok, if you could give #4408 a review then we have some baseline data to compare it to. The PR also contains the generation of JFR files which IntelliJ can use to show flame graphs. |
The results for all of Germany are in. dev-2.xCI run: https://github.com/opentripplanner/OpenTripPlanner/actions/runs/2917645778
Stripped BitSetCI run: https://github.com/opentripplanner/OpenTripPlanner/actions/runs/2918381468
You can indeed see that it's gone from the flame graph: It's definitely an improvement but not a huge one. |
In the Norwegian dataset, the BitSet in routeIndexIterator is the one consuming most time, especially the |
I will close this one now, thanks to everyone for testing this. |
Summary
Lucene
JDK
Not as good as we thought ...