Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test lucene FixedBitSet - Experiment #4384

Closed

Conversation

t2gran
Copy link
Member

@t2gran t2gran commented Aug 17, 2022

Summary

Lucene

Norway - Vincent´s machine
 Worker: 
 ==> mc_destination : [  911,  826,  884,  836 ] Avg: 864,3

Total:  
 ==> mc_destination : [ 1397, 1340, 1361, 1319 ] Avg: 1354,3

Baden-wuerttemberg, Performance test 
 Worker: 
 ==> mc_destination : [  614,  607,  604,  603 ] Avg: 607.0

Total:  
 ==> mc_destination : [  775,  765,  762,  760 ] Avg: 765.5

JDK

Norway - Vincent´s machine
 Worker: 
 ==> mc_destination : [  844,  775,  827,  856 ] Avg: 825,5

Total:  
 ==> mc_destination : [ 1321, 1225, 1300, 1292 ] Avg: 1284,5
 
 Baden-wuerttemberg, Performance test 
Worker: 
 ==> mc_destination : [  612,  602,  602,  603 ] Avg: 604.8

Total:  
 ==> mc_destination : [  769,  756,  755,  757 ] Avg: 759.3

Not as good as we thought ...

@t2gran t2gran added the Optimization The feature is to improve performance. label Aug 17, 2022
@t2gran t2gran added this to the 2.2 milestone Aug 17, 2022
@t2gran t2gran requested a review from a team as a code owner August 17, 2022 13:43
@t2gran t2gran changed the title Ese lucene FixedBitSet - Experiment Test lucene FixedBitSet - Experiment Aug 17, 2022
@t2gran t2gran marked this pull request as draft August 17, 2022 13:47
public int next() {
int index = nextIndex;
nextIndex =
index == set.length() - 1 ? DocIdSetIterator.NO_MORE_DOCS : set.nextSetBit(index + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check the performance with SparseFixedBitSet, In that case the nextSetBit could be faster, since it checks each 4096-bit block only if at least one value is set, and it knows directly in which 64-bit segment. the regular FixedBitSet needs to check each block. It might be that linear scanning is still faster, as most of the data is already in the cache.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be better than org.apache.lucene.util.FixedBitSet, but still worse than java.util.BitSet

java.util.BitSet


Worker: 
 ==> mc_destination : [  464,  441,  440,  444 ] Avg: 447.3

Total:  
 ==> mc_destination : [  772,  705,  709,  711 ] Avg: 724.3


org.apache.lucene.util.FixedBitSet


Worker: 
 ==> mc_destination : [  480,  459,  465,  466 ] Avg: 467.5

Total:  
 ==> mc_destination : [  779,  741,  750,  750 ] Avg: 755.0


org.apache.lucene.util.SparseFixedBitSet


Worker: 
 ==> mc_destination : [  480,  447,  449,  452 ] Avg: 457.0

Total:  
 ==> mc_destination : [  788,  726,  722,  720 ] Avg: 739.0

@leonardehrenfried
Copy link
Member

Right now, I don't see a noticeable speed up in the speed test in Germany either: https://tinyurl.com/2nhldzku

@@ -66,21 +67,21 @@ public int transitArrivalTime(int stop) {
* @return true if at least one stop arrival was reached last round (best overall).
*/
public boolean isCurrentRoundUpdated() {
return !reachedCurrentRound.isEmpty();
return reachedCurrentRound.cardinality() > 0;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slow, but should not have a huge impact. We can look for the first bit set or keep a flag in the BestTimes to improve this.

@t2gran
Copy link
Member Author

t2gran commented Aug 23, 2022

In the profiling @leonardehrenfried did on the German data a while a go the BestTime.setTime() -> java.util.BitSet.expandTo(int) showed up, but it does not show up in the profiling we have done on the Entur case. Is this something we should investigate more?

@leonardehrenfried
Copy link
Member

Did I say that? In Grafana I could not see a difference.

@t2gran
Copy link
Member Author

t2gran commented Aug 23, 2022

I got a profiling jfr file from you a while back witch showed this - at least I though I got it from you.
jar_2022_07_18_144732.jfr.zip

@leonardehrenfried
Copy link
Member

Ah, I misunderstood. I thought that you referred to the speed test on CI which doesn't run on the data set for entire Germany. I will check that.

@t2gran
Copy link
Member Author

t2gran commented Aug 23, 2022

It might be that the whole of Germany exceeds a threshold, hence trigger the expandTo. This can probably be avoided with a small change of the java BitSet and improve the performance for very large data sets.

@leonardehrenfried
Copy link
Member

I'm currently setting up a test scenario for CI where I load all of Germany's transit data.

@leonardehrenfried
Copy link
Member

I'm adding the Germany-wide data set to the CI speed test in this PR: #4408

Once that is merged, we can run the numbers with this branch and check the performance.

@t2gran
Copy link
Member Author

t2gran commented Aug 24, 2022

I striped the Java BitSet for all checks and made a simpler version, it would be interesting to test that as well. For Norway it performs exactly as the Java BitSet. The expandTo(int) method is gone in FBitSet (my striped version). I branched of the same commit as the otp2_use_lucene_fixedbitset.

The code is in the otp2_use_stripped_java_bitset branch.

@leonardehrenfried
Copy link
Member

Ok, if you could give #4408 a review then we have some baseline data to compare it to. The PR also contains the generation of JFR files which IntelliJ can use to show flame graphs.

@leonardehrenfried
Copy link
Member

The results for all of Germany are in.

dev-2.x

CI run: https://github.com/opentripplanner/OpenTripPlanner/actions/runs/2917645778

 Worker: 
 ==> mc_destination : [ 7331 ] Avg: 7331.0

Total:  
 ==> mc_destination : [ 7962 ] Avg: 7962.0

Screenshot from 2022-08-24 13-57-48

Stripped BitSet

CI run: https://github.com/opentripplanner/OpenTripPlanner/actions/runs/2918381468

Worker: 
 ==> mc_destination : [ 7124 ] Avg: 7124.0

Total:  
 ==> mc_destination : [ 7767 ] Avg: 7767.0

You can indeed see that it's gone from the flame graph:

Screenshot from 2022-08-24 13-57-39

It's definitely an improvement but not a huge one.

@hannesj
Copy link
Contributor

hannesj commented Aug 25, 2022

In the Norwegian dataset, the BitSet in routeIndexIterator is the one consuming most time, especially the BitSetIterator coupled to it. It might make sense to test the different BitSet implementations in it

@t2gran t2gran modified the milestones: 2.2, 2.3, Rejected Nov 1, 2022
@t2gran
Copy link
Member Author

t2gran commented Nov 4, 2022

I will close this one now, thanks to everyone for testing this.

@t2gran t2gran closed this Nov 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Optimization The feature is to improve performance.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants