Raptor Heuristics performance #3748

t2gran · 2021-11-16T17:50:23Z

In the past 6 months the focus has been on functionality, not performance. I ran the SpeedTest today, and especially the Raptor Heuristics seems to be much slower than it used to be. My benchmark uses the same 27 routing requests and run them all 8 times to get god average results:

METHOD CALLS DURATION                           |              SUCCESS               |         FAILURE         
                                                |  Min   Max  Avg     Count   Total  | Average  Count   Total  
SpeedTest Transit Data                          |   83   113   94 ms     27    2,6 s |    0 ms      0    0,0 s 
Raptor Mc-DP Minute Transfers                   |    0    59    0 ms   4618    1,3 s |    0 ms      0    0,0 s 
Raptor NoWaitBT-Rev Route                       |   19   476  316 ms     27    8,6 s |    0 ms      0    0,0 s 
Raptor Mc-DP Minute Transit                     |    0   177    1 ms   4618    6,6 s |    0 ms      0    0,0 s 
SpeedTest Route Worker                          |   44  4771  920 ms     27   24,8 s |    0 ms      0    0,0 s 
Raptor NoWaitBT Route                           |   18   435  309 ms     27    8,4 s |    0 ms      0    0,0 s 
SpeedTest Collect Results                       |    0     0    0 ms     27    0,0 s |    0 ms      0    0,0 s 
SpeedTest Street Route                          |    2    31    9 ms     27    0,3 s |    0 ms      0    0,0 s 
SpeedTest Route                                 |  142  4874 1024 ms     27   27,7 s |    0 ms      0    0,0 s 
Raptor NoWaitBT Minute Transit                  |    0    81   36 ms    183    6,6 s |    0 ms      0    0,0 s 
Raptor NoWaitBT-Rev Minute Transit              |    0    83   34 ms    183    6,4 s |    0 ms      0    0,0 s 
Raptor Mc-DP Route                              |    4  3918  292 ms     27    7,9 s |    0 ms      0    0,0 s 
Raptor NoWaitBT-Rev Minute Transfers            |    0    23   11 ms    183    2,1 s |    0 ms      0    0,0 s 
Raptor NoWaitBT Minute Transfers                |    0    20    9 ms    183    1,7 s |    0 ms      0    0,0 s 

Paths found         : 193 [5, 12, 9, 2, 3, 7, 12, 3, 10, 6, 7, 32, 5, 2, 16, 2, 4, 2, 10, 2, 2, 2, 2, 2, 14, 11, 9]
Successful searches : 27 / 27
Sample              : 8 / 8
Time total          : 27,66 seconds

Worker: 
 ==> mc_destination : [  927,  946,  947,  915,  926,  917,  934,  920 ] Avg: 929,0

Total:  
 ==> mc_destination : [ 1045, 1054, 1057, 1020, 1030, 1023, 1039, 1024 ] Avg: 1036,5

This chart show some of the key numbers compared with old versions of OTP2:

Average Response Times

Raptor request distribution

Version of OTP used (exact commit hash or JAR name)

Update all timers to micrometer instances #3744

Data sets in use (links to GTFS and OSM PBF files)

An old Entur dataset used as a benchmark - I use the same dataset every time.

Steps to reproduce the problem

Run SpeedTest with the same dataset and config with different versions of OTP.

Parameters -p md -n 8.

The text was updated successfully, but these errors were encountered:

hannesj · 2021-11-25T14:16:41Z

Might be related to #3664

leonardehrenfried · 2022-08-24T12:09:35Z

After having added the speed test for all of Germany today, I noticed that this is a critical issue for very large data sets.

You can see that in this data set (500k stops, 200k patterns) the heuristics request takes up about 60% of the routing time:

More reading: #4384 (comment)

slvlirnoff · 2022-08-24T12:35:28Z

For huge networks, such as this one it might not be possible to have an heuristic that reach most stops and patterns (and the min transfer + 5 exit condition of existing heuristic is likely to reach almost all stops)

I think that an additional precomputed heuristic to prune the graph to the relevant patterns would help. For instance, clustering connecting pattern by operator and computing best generalised score between these clusters could help. In many cases you don't need a lot more than a handful of operators out of the 100s that operates at a national level. The city/regional around the origin and destination and the national operators in between + a few connecting ones (ferry, regional trains, etc.)

The challenge is to precompute this list properly by taking into account all transit times, days of operations and RT information.

leonardehrenfried · 2022-08-24T12:41:45Z

Based on your observations of reducing the input data set for Raptor leading to a speed up, I wrote an experimental filter that uses the bounding box of start, destination coordinates and checks if a route is inside of it. The result of that filtering is then fed to Raptor.

The code is at leonardehrenfried@c973c04

If you have a huge network but only need to route a few km, it led to a dramatic speed up. Of course there is lots of ways how this could go wrong.

We discussed this in a dev meeting and @t2gran and @abyrd said that instead of this filter they think that improving the heuristics themselves would lead to better results.

mvanlaar · 2022-08-24T12:44:19Z

I'm having the same problem with country wide slowness in Colombia. I've got around 120 gtfs feeds with a lot of frequency based trips.
08:32:36.057 INFO (DirectTransferGenerator.java:173) Done connecting stops to one another. Created a total of 1380637 transfers from 25730 stops.
08:32:38.512 INFO (GraphBuilder.java:220) Graph built. |V|=1,669,562 |E|=4,274,212
08:32:38.512 INFO (GraphBuilder.java:221) Transit built. |Stops|=26,306 |Patterns|=4,683

leonardehrenfried · 2022-08-24T12:47:07Z

If you want to tackle this in a systematic manner then you could contribute a speed test data set for Columbia, like I have done for Germany: #4408

mvanlaar · 2022-08-24T12:48:40Z

@leonardehrenfried How dit you generate the csv? Cut and paste?

leonardehrenfried · 2022-08-24T12:50:12Z

I wrote a little python script that parses stops.txt and selects random-ish stop ids for start and destination.

mvanlaar · 2022-08-24T12:51:20Z

I wrote a little python script that parses stops.txt and selects random-ish stop ids for start and destination.

Can you share it?

leonardehrenfried · 2022-08-24T12:53:54Z

https://gist.github.com/leonardehrenfried/f9ba0f585676be35fb9a3fbee5d0241a

You need to remove the Germany-specific parts.

This one does stop to stop routing requests only. This is actually not a very good simulation of real workloads where you have potentially many start/end stops.

So you should remove the column fromPlace and toPlace if you want something more realistic. Also maybe move the coordinates a little bit away from the stop coordinates.

mvanlaar · 2022-08-25T13:51:01Z

                              |  Min   Avg  Max     Count   Total
Raptor Mc-DP Minute Transfers |    0    13  217 ms    21'  294.8 s
Raptor Mc-DP Minute Transit   |    0    24  282 ms    21'  519.4 s
Raptor Mc-DP Route            |    0  6927  54' ms    118  817.4 s
Routing Access                |    0    24   48 ms     40    1.0 s
Routing AccessEgress          |    0    66  127 ms     40    2.7 s
Routing DirectStreet          |    0    27  147 ms     40    1.1 s
Routing Egress                |    0    40  115 ms     40    1.6 s
Routing Filtering             |    0    12   34 ms     40    0.5 s
Routing ItineraryCreation     |    0    73  227 ms     39    2.9 s
Routing PreCalculation        |    0     2    3 ms     40    0.1 s
Routing Raptor                |    0   20'  55' ms     39  817.8 s
Routing Rendering             |    0     0    1 ms     40    0.0 s
Routing Router                |    0   20'  55' ms     40  831.4 s
Routing Total                 |    0   20'  55' ms     40  832.0 s
Routing Transit               |    0   20'  55' ms     40  830.1 s
Routing TripPatternFiltering  |    0   166  186 ms     40    6.7 s

Test case ids       : [    0   2     4    6    8    10   12     14   16    18    20    22    24    26    28   30   32   34   36   38   40   42   44   46    48   50    52   54   56   58    60   62   64   66    68    70   72   74   76   78]
Number of paths     : [    3   0     3    3    1     3    0      3    3     3     3     3     3     3     3    3    3    2    3    3    3    0    3    3     3    3     3    3    3    1     2    3    0    3     0     2    3    3    3    0]
Transit times(ms)   : [18246 252 13463 2304 6237 60099 1645 135696 1680 45397 39129 73221 45219 41393 81422 9919 4303 1752 3077 8825 1953 5168 2039 4521 52129 2286 55833 5714 1689 2393 28973 2443 2820 4065 12068 40726 1541 2795 3567 4090]
Total times(ms)     : [18361 266 13506 2487 6251 60126 1659 135751 1736 45412 39148 73238 45234 41410 81453 9934 4314 1762 3240 8839 2047 5178 2101 4722 52146 2343 55853 5754 1744 2404 28984 2532 2829 4106 12076 40737 1566 2919 3735 4099]
Successful searches : 40 / 40
Time total          : 832 seconds```

First version. Had a few test cases without routes.

leonardehrenfried · 2022-08-25T14:19:48Z

Looks good.

If you want to run this with every merge, you can submit a PR.

However, it's not a guarantee that a core developer will do anything about it.

Nevertheless it's good to track the problematic performance cases.

github-actions · 2023-01-31T07:34:19Z

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days

github-actions · 2023-07-24T07:33:44Z

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days

t2gran added the Optimization The feature is to improve performance. label Nov 16, 2021

t2gran added this to the 2.1 milestone Nov 16, 2021

t2gran mentioned this issue Nov 16, 2021

Update all timers to micrometer instances #3744

Merged

leonardehrenfried mentioned this issue Dec 14, 2021

Run automatic performance tests #3783

Merged

t2gran modified the milestones: 2.1, 2.2 Mar 15, 2022

leonardehrenfried mentioned this issue Aug 25, 2022

Remove unused heuristics fields #4410

Closed

t2gran modified the milestones: 2.2, 2.3 Nov 1, 2022

t2gran mentioned this issue Nov 5, 2022

Improve Raptor Heuristics performance - number of transfers #4577

Closed

hannesj mentioned this issue Nov 18, 2022

Use time-independent heuristic search #4616

Draft

github-actions bot added the Stale This issue is stale, no activity for 90 days. Remove stale label or comment within 30 days. label Jan 31, 2023

hannesj removed the Stale This issue is stale, no activity for 90 days. Remove stale label or comment within 30 days. label Jan 31, 2023

t2gran modified the milestones: 2.3, 2.4 Apr 24, 2023

github-actions bot added the Stale This issue is stale, no activity for 90 days. Remove stale label or comment within 30 days. label Jul 24, 2023

github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Raptor Heuristics performance #3748

Raptor Heuristics performance #3748

t2gran commented Nov 16, 2021 •

edited

Loading

hannesj commented Nov 25, 2021

leonardehrenfried commented Aug 24, 2022 •

edited

Loading

slvlirnoff commented Aug 24, 2022

leonardehrenfried commented Aug 24, 2022

mvanlaar commented Aug 24, 2022 •

edited

Loading

leonardehrenfried commented Aug 24, 2022

mvanlaar commented Aug 24, 2022

leonardehrenfried commented Aug 24, 2022

mvanlaar commented Aug 24, 2022

leonardehrenfried commented Aug 24, 2022 •

edited

Loading

mvanlaar commented Aug 25, 2022

leonardehrenfried commented Aug 25, 2022

github-actions bot commented Jan 31, 2023

github-actions bot commented Jul 24, 2023

Raptor Heuristics performance #3748

Raptor Heuristics performance #3748

Comments

t2gran commented Nov 16, 2021 • edited Loading

Version of OTP used (exact commit hash or JAR name)

Data sets in use (links to GTFS and OSM PBF files)

Steps to reproduce the problem

hannesj commented Nov 25, 2021

leonardehrenfried commented Aug 24, 2022 • edited Loading

slvlirnoff commented Aug 24, 2022

leonardehrenfried commented Aug 24, 2022

mvanlaar commented Aug 24, 2022 • edited Loading

leonardehrenfried commented Aug 24, 2022

mvanlaar commented Aug 24, 2022

leonardehrenfried commented Aug 24, 2022

mvanlaar commented Aug 24, 2022

leonardehrenfried commented Aug 24, 2022 • edited Loading

mvanlaar commented Aug 25, 2022

leonardehrenfried commented Aug 25, 2022

github-actions bot commented Jan 31, 2023

github-actions bot commented Jul 24, 2023

t2gran commented Nov 16, 2021 •

edited

Loading

leonardehrenfried commented Aug 24, 2022 •

edited

Loading

mvanlaar commented Aug 24, 2022 •

edited

Loading

leonardehrenfried commented Aug 24, 2022 •

edited

Loading