Implements Reasonable Alternative Routes for MLD #4047

Merged 1 commit on Jul 7, 2017


@daniel-j-h
Member

daniel-j-h commented May 15, 2017

For #3905. Implements reasonable alternatives using the via-method. See the ticket for details.

  • Measurements: Average query speed with 1,2,3,4 alternatives
  • Review
  • Adjust for comments

@daniel-j-h daniel-j-h added this to the 5.9.0 milestone May 15, 2017
@daniel-j-h daniel-j-h self-assigned this May 15, 2017
@daniel-j-h
Member Author

daniel-j-h commented May 15, 2017

Tasks (not in order; also I will split off into sub-tasks as I see fit):

  • Make pipeline recognize alternatives feature on MLD
  • Lift restriction to only ever have zero or one alternative (done separately in Refactors Existing Alternatives Architecture #4035 already in master)
  • Let search spaces overlap and intersect nodes as via-candidates
  • Unpack paths and rank unpacked alternatives
  • Get alternatives into the API response
  • Get multiple alternatives to show up in the frontend
  • Pre-filter and rank candidates with heuristics (requires some 🤔, will be major part of this task)

@daniel-j-h daniel-j-h force-pushed the mld/alternatives branch 4 times, most recently from f399636 to 006d2d1 Compare May 17, 2017 14:24
@daniel-j-h
Member Author

The full set of possible alternatives is now available end to end: search space overlap, candidate generation based on boundary nodes and base graph nodes at the bottom, path reconstruction, and path unpacking in the backend, all the way through the pipeline to the response. Screenshot from local frontend:

map

make -C test/data
./osrm-routed --algorithm mld ../test/data/mld/monaco.osrm

The next task will be filtering via-candidates as soon as possible (for performance reasons) based on

  • stretch - how much longer the alternative is than the primary route
  • sharing - how many ways the alternative shares with the primary route
  • local optimality - sub-paths up to a certain length are themselves shortest paths

For the stretch-test @MoKob just brought up a good point I haven't thought about.

For restricted areas (think: access=destination) we apply a "high" weight penalty to the turn onto the restricted way (idea was blogged about in this diary post).

Here is where we apply the penalty weight and here is where it is defined.

When restricted areas lie on the paths, a stretch-test based on raw weights no longer makes sense. For example, a route with a higher weight can have almost the same duration but pass through a restricted area. When start and / or target are inside a restricted area, both our primary route and all alternatives carry at least the start and / or target penalty weight. We probably also want to filter out via candidates which are inside a restricted area to begin with.

If we could access the high turn penalty from the C++ side we could simply subtract it from the route's weights. But we can't since the restricted penalty is only set in the Lua profile and is not bound in our scripting environment. Binding it would be a breaking profile API change. Not binding it means I can't do reasonable stretch-tests on weights - at least not for the case when restricted areas are involved.

@danpat
Member

danpat commented May 19, 2017

For those following along - the "stretch test" means "how much longer is the alternative than the shortest/fastest/least-weight path". A stretch factor of 1.2 means a path is 20% longer than the shortest ("longer" is in terms of the metric being used for shortest-path routing, in our case, weight).
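
As a minimal sketch of the stretch test described above (the `Weight` alias and the function name are hypothetical and not part of OSRM's API), the check is a comparison against the scaled weight of the shortest path:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a route weight; not OSRM's actual weight type.
using Weight = std::int64_t;

// Returns true if the alternative is at most `max_stretch` times as heavy as the
// shortest path. A max_stretch of 1.2 accepts alternatives up to 20% longer.
bool passesStretchTest(Weight shortest, Weight alternative, double max_stretch)
{
    assert(shortest > 0 && alternative >= shortest && max_stretch >= 1.0);
    return static_cast<double>(alternative) <= max_stretch * static_cast<double>(shortest);
}
```

With a shortest path of weight 1000 and a stretch factor of 1.2, an alternative of weight 1100 passes and one of weight 1300 is pruned.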

@daniel-j-h Given that we're applying these weights in order to avoid these restricted ways, how does this cause problems when using weight to calculate stretch? If an alternative enters a restricted way, it will have a high weight, and the stretch will be large, possibly causing it to be pruned. Is this not desirable behaviour? If we're using weights to avoid private roads, why would we want alternatives that use private roads?

It seems to me that we want all our other metrics (avoid private roads, etc) to hold for alternatives as well, and using weight to calculate stretch would generally achieve this. Am I missing something?

@MoKob

MoKob commented May 19, 2017

@danpat when travelling a-b, there are two valid locations for private roads (begin/end, if you are starting/ending on it). In our case, we can get intermediate vertices on private roads, if we aren't careful.

Right now we add 50 mins per entry/exit. Now if you have a long path that has a private path as a shortcut, you could end up taking that part (since you only find a candidate in that area and can't see whether it is private or not). If the road around the (e.g.) private bridge is long enough, suddenly we allow paying that penalty in the middle.

It's all about avoiding private areas, not allowing them. But stretch alone is not guaranteed to work out. It will in most cases, since military bases or other private roads should hopefully not be the easiest path of reaching an area, but in remote areas it could well be the case. It can, of course, already happen with the shortest path itself. By allowing stretch, we reduce the effective penalty of the already rather small penalty when it comes to remote areas.

@daniel-j-h
Member Author

For the record this is already a problem in the current impl. (see here) and not specific to this changeset.

@daniel-j-h
Member Author

@danpat here's an example from Berlin where I filter candidate routes to those at most 1.05 times as long as the shortest path, with the shortest path ending in a restricted area. This means the shortest path has a high weight applied to it, so we suddenly filter out far fewer candidate paths than we should.

stretch

@danpat
Member

danpat commented May 19, 2017

I see, so because the penalty is so large in comparison to the total route weight, and because it's mandatory to take the penalty turn to access the endpoint, doing a simple stretch comparison of all alternatives makes them appear to be quite similar (the weight penalty dominates the total route weight).
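
A small worked example may make the effect concrete. The sketch below assumes the mandatory penalty is known, which the thread notes is not the case on the C++ side today, and takes the 50 minutes mentioned above as roughly 3000 weight units. The function name and the numbers are illustrative only.

```cpp
#include <cassert>

// Stretch of an alternative relative to the shortest path after removing a penalty
// that both routes are forced to pay. Pass 0 as the penalty to get the raw stretch.
double stretch(double shortest, double alternative, double mandatory_penalty)
{
    assert(mandatory_penalty >= 0.0);
    assert(shortest > mandatory_penalty && alternative > mandatory_penalty);
    return (alternative - mandatory_penalty) / (shortest - mandatory_penalty);
}
```

For a shortest path of weight 3600 and an alternative of weight 3900 that both include a 3000 penalty, `stretch(3600, 3900, 0)` is about 1.08 while `stretch(3600, 3900, 3000)` is 1.5: the raw comparison makes a route that is 50% longer look only 8% longer.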

@daniel-j-h
Member Author

I just implemented a rough sharing-heuristic (how many ways does the alternative share with the primary route) making use of the cell structure we have at our hands from the partitioner.

In contrast to the techniques described in comments above this heuristic already needs the packed paths from the heaps. I then compute the similarity based on the cell ids for the nodes on the packed path.

sharing0
sharing1
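
One way such a cell-based similarity could look is sketched below. This is an assumption about the shape of the heuristic, not the code in this changeset: `CellID`, `cellSharing`, and the choice to normalize by the alternative's distinct cells are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical cell identifier type; the real partition types may differ.
using CellID = std::uint32_t;

// Fraction of the alternative's distinct cells that the primary route also visits.
// Inputs are the cell ids of the nodes on each packed path, in any order.
double cellSharing(std::vector<CellID> primary, std::vector<CellID> alternative)
{
    assert(!alternative.empty());

    auto sortUnique = [](std::vector<CellID> &cells) {
        std::sort(cells.begin(), cells.end());
        cells.erase(std::unique(cells.begin(), cells.end()), cells.end());
    };
    sortUnique(primary);
    sortUnique(alternative);

    std::vector<CellID> shared;
    std::set_intersection(primary.begin(), primary.end(),
                          alternative.begin(), alternative.end(),
                          std::back_inserter(shared));

    return static_cast<double>(shared.size()) / static_cast<double>(alternative.size());
}
```

Counting cells instead of edges keeps the check cheap on packed paths, at the cost of ignoring how much weight each shared cell actually contributes.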

At the moment I'm still comparing sharing(shortest path, alternative) only - what we eventually want for returning multiple alternatives is comparing the sharing between all routes.


There are a few open questions I don't have answers to yet:

  • On which level does it make sense to compute cell sharing? Should sharing be a linear combination based on level and cell sharing on each level?
  • Do we need to scale sharing with edge weights? Is sharing based on cells only good enough?

@daniel-j-h
Member Author

daniel-j-h commented May 23, 2017

Comparing sharing between all combinations of routes is implemented by now. The simple approach of taking the pre-filtered via candidates, ranking them by weight, and then filtering based on the cell structure for the packed paths on the first level already gives good-looking results:

alternatives

(click for webm video)
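
The rank-then-filter approach described above can be sketched as a greedy selection. Everything here is hypothetical (the `Candidate` struct, the `sharing` measure normalized by the smaller cell set, and the threshold parameter); it only illustrates comparing each candidate against all routes accepted so far rather than against the shortest path alone.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical candidate route: total weight plus the sorted, unique level-1 cell ids
// of the nodes on its packed path.
struct Candidate
{
    std::int64_t weight;
    std::vector<std::uint32_t> cells;
};

// Shared cells relative to the smaller of the two cell sets. Expects sorted, unique input.
double sharing(const std::vector<std::uint32_t> &a, const std::vector<std::uint32_t> &b)
{
    assert(!a.empty() && !b.empty());
    std::vector<std::uint32_t> common;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(common));
    return static_cast<double>(common.size()) /
           static_cast<double>(std::min(a.size(), b.size()));
}

// Ranks candidates by weight, then accepts a candidate only if it shares at most
// `max_sharing` of its cells with every route accepted so far.
std::vector<Candidate> selectAlternatives(std::vector<Candidate> candidates,
                                          std::size_t max_routes,
                                          double max_sharing)
{
    std::sort(candidates.begin(), candidates.end(),
              [](const Candidate &lhs, const Candidate &rhs) { return lhs.weight < rhs.weight; });

    std::vector<Candidate> accepted;
    for (const auto &candidate : candidates)
    {
        if (accepted.size() == max_routes)
            break;
        const bool distinct =
            std::all_of(accepted.begin(), accepted.end(), [&](const Candidate &route) {
                return sharing(candidate.cells, route.cells) <= max_sharing;
            });
        if (distinct)
            accepted.push_back(candidate);
    }
    return accepted;
}
```

The lightest candidate is always accepted first and acts as the primary route; later candidates have to differ from all previously accepted ones.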


What's next is local optimality, investigating cell structure sharing on multiple levels, experimenting with parameters and trying to solve the open questions from above.

@daniel-j-h daniel-j-h force-pushed the mld/alternatives branch 4 times, most recently from 0dc8e4b to d53bb80 Compare May 31, 2017 09:21
@daniel-j-h daniel-j-h force-pushed the mld/alternatives branch 2 times, most recently from 96a6249 to 5ce6f96 Compare June 1, 2017 10:30
{
    util::static_assert_iter_category<RandIt, std::random_access_iterator_tag>();
    util::static_assert_iter_value<RandIt, WeightedViaNode>();

    // Assumes weight roughly corresponds to duration-ish. If this is not the case e.g.
    // because users are setting weight to be distance in the profiles, then we might
    // either generate more candidates than we have to or not enough. But is okay.
-   const auto scaledAtMostLongerBy = scaledAtMostLongerByFactorBasedOnDuration(weight);
+   const auto scaledAtMostLongerBy =
+       scaledAtMostLongerByFactorBasedOnDuration(weight / weight_multiplier);
Member Author

@oxidase do you think this is good enough for now until we have durations in the overlay?
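
To illustrate what the change in the diff is doing, here is a hedged sketch. It assumes `weight_multiplier` is the fixed-point scale between internal integer weights and the profile's weight units, and that those units are close to seconds. The thresholds and factors are made up for illustration and are not the values used by `scaledAtMostLongerByFactorBasedOnDuration`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the duration-based factor: short routes may be stretched
// more than long ones. Thresholds and factors are illustrative only.
double atMostLongerByFactor(double duration_seconds)
{
    assert(duration_seconds >= 0.0);
    if (duration_seconds < 10 * 60)
        return 1.5;
    if (duration_seconds < 60 * 60)
        return 1.25;
    return 1.1;
}

// Scales the internal weight back to profile units before picking the factor,
// mirroring the `weight / weight_multiplier` change above.
double atMostLongerByFactorForWeight(std::int32_t weight, double weight_multiplier)
{
    assert(weight_multiplier > 0.0);
    return atMostLongerByFactor(static_cast<double>(weight) / weight_multiplier);
}
```

Without the division, an internal weight of 3000 at a multiplier of 10 would be treated as 3000 seconds instead of 300, which would select the factor for a much longer route.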

@daniel-j-h
Member Author

@oxidase just helped me out by running his plotting scripts for the osrm-runner bench runs on Bavaria. Seems like our alternatives on mld are a bit slow in the current state.

Here's ch alternatives=true versus mld alternatives=true:

plot0

And here is ch alternatives=true versus mld alternatives=5 (note the change on y axis):

plot1

I think we need to do two things here:

  • Go in with a profiler and figure out which check is expensive and if we can make it faster while still providing high-quality alternatives. Especially figure out why the slowdown is not linear.
  • See if unpacking twice as many candidate paths as requested alternatives is way too high. For five requested alternatives we unpack up to ten candidate paths in the final round of checks. We may have to implement the unpacking cache (see Implement path unpacking cache #3835) and / or a stack allocator (see Thread-local Stack Allocator for Routing Algorithms #4213).

@daniel-j-h daniel-j-h force-pushed the mld/alternatives branch 4 times, most recently from e914ea1 to 8bc33ac Compare July 5, 2017 15:29
@daniel-j-h
Member Author

Here are the callgraphs and the functions' contribution to total time, sampled via perf for alternatives=1:

perf2

after reducing the search space and only using vias with weight < shortest path * factor:

perf3

which shows generating the via candidates and unpacking paths as the most expensive operations.


With the latest performance improvements in place here are the plots for alternatives=1:

alt-1

and for alternatives=5:

alt-5

The two clusters could be from the partitions; we can check by changing cell sizes.

@daniel-j-h daniel-j-h force-pushed the mld/alternatives branch 2 times, most recently from 6890ecd to 726b4aa Compare July 6, 2017 13:57
@daniel-j-h
Member Author

Another datapoint with higher partition cell sizes,

osrm-partition bayern-latest.osrm --max-cell-sizes 4096,131072,2097152,67108864

map

@Project-OSRM Project-OSRM deleted a comment from MoKob Jul 7, 2017
@daniel-j-h
Member Author

Oops, @MoKob, I accidentally deleted your comment about the 400ms outlier on Bavaria. To be clear, that run uses a 4096 level 1 cell size on purpose to see how the behavior changes; it is not about production performance numbers. Here are some additional cell sizes:

Our default partition config values: 128 4096 65536 2097152
bench-128 4096 65536 2097152

Making the top most level smaller: 128 4096 65536 524288
bench-128 4096 65536 524288

And introducing an additional level: 128 2048 32768 524288 2097152:
bench-128 2048 32768 524288 2097152


5 participants