Geofence algorithm: Dog Park vs PDC #1775

bengtan · 2018-08-22T08:40:12Z

Description

This is so weird... I’m showing up at the dog park (marked in red), but I should be showing up at the PDC (marked in blue).

https://hippware.slack.com/archives/C2V6L53TQ/p1534893214000100

Data

These are the relevant pieces of data.

User bot events for WeHo - Dog Park
(bot id: eca2ee9e-3e9c-11e8-8a29-0a580a02050f)

2018-08-21 22:08:57 transition_in  / f80bb139-e5c7-4b40-ba1a-5234748afa52
2018-08-21 22:09:13 enter          / 30b21b58-63aa-4362-9efc-dca4135a0f55
2018-08-21 22:10:19 transition_out / 042f8742-abf7-4865-9ef7-9f8bc2191f63
2018-08-22 00:31:44 exit           / bcbcb423-cdee-4c5e-bd14-77a9c0fdca81

// user_id='55cb59a6-55e2-11e6-b457-0eea5386eb69' AND created_at <= '2018-08-22T00:32:00Z' order by created_at desc

Interpretation

(DA = 'debounce algorithm')

22:08:57 The DA first detects the user is within the bot perimeter. Begins debouncing.
22:09:13 On the next data point, debounce is passed and an entry event is (correctly) generated.

Fine so far. Then:

22:10:19 The DA detects the user is outside. Begins debouncing.
In between: There are 90+ data points but they all have accuracy range GT 30. The lowest accuracy range is 32. I think these are all discarded [1].
00:31:44 A data point outside the bot, and with accuracy range 25 (which is lte 30). The debounce passes and an exit event is generated.

At the time of the user report, around 23:13:00, the user had left the bot but because the 90+ data points were all discarded, the DA thought the user was still in the bot perimeter.

The location of the user indicated in the screenshot (ie. 'blue') is correct. The DA is incorrect.

FWIW, the most recent data point as of 23:13:00 had an accuracy range of 65 and an is_moving of false.
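
The timeline above can be replayed with a minimal sketch of an accuracy filter followed by a debounce. This is not the backend code: the `Point` type, the `replay` function, the 10-second `debounce` value, and the accuracy values on the first three points are all assumptions made for illustration. The thread mentions a 30-second debounce but does not spell out the real confirmation rules. Only the timestamps, the 30 threshold, and the 32, 65 and 25 accuracy values come from the ticket.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Point:
    at: datetime
    inside: bool     # is the reported position within the bot perimeter?
    accuracy: float  # reported accuracy range; larger means less certain


def replay(points, max_accuracy=30.0, debounce=timedelta(seconds=10)):
    """Return (time, event) tuples for one bot under the assumed rules."""
    events = []
    visiting = False      # last confirmed state
    pending_since = None  # start time of an unconfirmed transition
    for p in points:
        if p.accuracy > max_accuracy:
            continue  # discarded first; never reaches the debounce step
        if p.inside == visiting:
            pending_since = None  # agrees with confirmed state; cancel pending
            continue
        if pending_since is None:
            pending_since = p.at
            events.append((p.at, "transition_in" if p.inside else "transition_out"))
        elif p.at - pending_since >= debounce:
            visiting = p.inside
            pending_since = None
            events.append((p.at, "enter" if p.inside else "exit"))
    return events


points = [
    Point(datetime(2018, 8, 21, 22, 8, 57), True, 10.0),    # accuracy assumed
    Point(datetime(2018, 8, 21, 22, 9, 13), True, 10.0),    # accuracy assumed
    Point(datetime(2018, 8, 21, 22, 10, 19), False, 20.0),  # accuracy assumed
    Point(datetime(2018, 8, 21, 22, 40, 0), False, 32.0),   # stands in for the 90+ discarded points
    Point(datetime(2018, 8, 21, 23, 13, 0), False, 65.0),   # around the time of the user report
    Point(datetime(2018, 8, 22, 0, 31, 44), False, 25.0),   # first point accurate enough to count
]

events = replay(points)
for at, name in events:
    print(at.isoformat(sep=" "), name)
```

Under these assumptions no event is produced between 22:10:19 and 00:31:44, so at 23:13:00 the user is still counted as inside the dog park.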

Discussion

There are a number of ways to tweak the DA to correct for this. I'll leave that for another time, or for others to give their opinions.

bengtan · 2018-08-22T08:44:53Z

I know we agreed to filter out data points with accuracy range > 30 but I also expected that the 30-sec debounce algorithm (ie. previous behaviour), which is a fallback plan 'B', would be unchanged.

I think the accuracy filter is filtering out data points before they reach plan 'B', so there has been a change (intentional or otherwise).

toland · 2018-08-22T22:47:30Z

> I think the accuracy filter is filtering out data points before they reach plan 'B', so there has been a change (intentional or otherwise).

Inaccurate data points are filtered out first thing. They are not considered at all. This was very much intentional. The request in #1713 was for inaccurate data to be "thrown out."

The example case that was raised to show why we should filter inaccurate data points had Miranda in a metal building for several minutes while her phone was reporting that she was hundreds of meters away. If we accepted an equally inaccurate data point 30 seconds later then the problem would not be fixed.

As I see it, the accuracy threshold and debouncing are completely orthogonal concepts. The data's accuracy doesn't improve with age 😄
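
A small sketch may make the two readings concrete. It compares filtering before the debounce with letting every point through to a 30-second debounce, on an invented trace shaped like the metal-building case. The function `confirmed_exit`, its parameters, and the trace are hypothetical, and the unfiltered variant is only a guess at what the 'plan B' behaviour would look like.

```python
def confirmed_exit(points, max_accuracy, debounce_s, filter_first):
    """points: (seconds, inside, accuracy) tuples for a user who starts inside.

    Returns the time at which an exit is confirmed, or None if it never is.
    """
    pending_since = None
    for t, inside, accuracy in points:
        if filter_first and accuracy > max_accuracy:
            continue  # inaccurate point is dropped before debouncing
        if inside:
            pending_since = None  # still inside; cancel any pending exit
        elif pending_since is None:
            pending_since = t     # first outside point; start debouncing
        elif t - pending_since >= debounce_s:
            return t              # a later outside point confirms the exit
    return None


# Invented trace: the user never leaves, but the phone briefly reports a
# position far away with very poor accuracy.
metal_building = [
    (0, True, 10),
    (60, False, 300),
    (90, False, 300),
    (120, False, 280),
    (300, True, 10),
]

without_filter = confirmed_exit(metal_building, 30, 30, filter_first=False)
with_filter = confirmed_exit(metal_building, 30, 30, filter_first=True)
print(without_filter, with_filter)
```

In this sketch the unfiltered variant confirms a false exit as soon as a second inaccurate point arrives 30 seconds after the first, while the filter-first variant never leaves.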

bengtan · 2018-08-23T02:02:30Z

> Inaccurate data points are filtered out first thing. They are not considered at all. This was very much intentional. The request in #1713 was for inaccurate data to be "thrown out."

I guess I misunderstood/mis-interpreted your description of the changes. Fair enough. Good that it's been clarified now.

The error case observed in this ticket is going to be interesting ... in terms of how we're going to solve it.

Time to ponder.

toland · 2018-08-23T17:53:24Z

I have had a deeper look at this case, and I am not sure that there is anything that we can, or should, do about it.

Here are a few facts:

The client was sending inaccurate data, sometimes very inaccurate.
The accuracy of the backend algorithm is constrained by the accuracy of the data presented to it.
The algorithm performed exactly as expected in this case.
We have one documented case so far.
This is actually a somewhat rare case based on the existing data in the database.

I reran some of my calculations on the accuracy values in the database this morning. 87% of the data in the staging database has an accuracy of 30m or less. 92% has an accuracy of 50m or less. Many of the data points in the screenshot above have an accuracy of 65m. There isn't much difference between 50m and 65m: 92.4% of the data points have an accuracy of less than 65m. iOS seems to "prefer" certain accuracy values: about 2.6% of the location data points have an accuracy of exactly 65m.
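
For anyone wanting to repeat this kind of calculation, here is a minimal sketch of the arithmetic. The function names and the sample values are made up for illustration. The real figures came from the staging database, and the query used is not shown in this ticket.

```python
def share_at_or_below(accuracies_m, threshold_m):
    """Fraction of data points whose accuracy range is <= threshold_m."""
    if not accuracies_m:
        raise ValueError("no data points")
    return sum(1 for a in accuracies_m if a <= threshold_m) / len(accuracies_m)


def share_exactly(accuracies_m, value_m):
    """Fraction of data points reporting exactly value_m."""
    if not accuracies_m:
        raise ValueError("no data points")
    return sum(1 for a in accuracies_m if a == value_m) / len(accuracies_m)


# Made-up sample, NOT taken from the staging database.
sample = [5, 8, 10, 10, 12, 20, 25, 48, 65, 65]

for threshold in (30, 50, 65):
    print(threshold, share_at_or_below(sample, threshold))
print("exactly 65:", share_exactly(sample, 65))
```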

Based on eyeballing the data, it looks like inaccurate values are most likely to be clumped together. This makes sense since there is usually some external interference that would cause accuracy to be low.

What can we do? Not much, really. I think the algorithm itself is sound. There is only so much we can do with bad data and I am satisfied with the result in this case. It wasn't a perfect result, but it was a reasonable result given the quality of the data we had to work with.

One thing we can do is tweak the accuracy threshold. It is currently at 30m, and I think we can go as high as 50m without causing too many problems. Keep in mind that raising the threshold also raises the likelihood that we will see this same problem from the other direction. In other words, instead of an inaccurate result because we ignored inaccurate data, we could get an inaccurate result because we didn't ignore inaccurate data. I would be very hesitant to raise the threshold above 50m for that reason.

However, raising the threshold to 50m might not have mitigated this particular instance. There are a couple of data points around 22:40 with 48m accuracy that were more than 30s apart, so it might have exited the dog park, albeit late. That may not have been enough to properly enter the PDC bot, though. Eyeballing the data, it looks like it may have entered the PDC bot a few seconds earlier than it actually did.
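
One rough way to compare candidate thresholds on a trace is to measure the longest stretch with no accepted data point, since the geofence state cannot change during such a stretch. The sketch below does that for 30m, 50m and 90m. The `longest_gap` helper and the trace are invented for illustration: times are seconds after 22:10:19, the accuracy values echo ones mentioned in this ticket, and the 20m value on the first point is assumed.

```python
def longest_gap(points, max_accuracy_m):
    """points: (seconds, accuracy_m) pairs in time order.

    Returns the longest interval between consecutive accepted points,
    or 0 if fewer than two points are accepted.
    """
    accepted = [t for t, accuracy in points if accuracy <= max_accuracy_m]
    return max((b - a for a, b in zip(accepted, accepted[1:])), default=0)


# Invented trace loosely shaped after this ticket.
trace = [
    (0, 20),     # 22:10:19, accuracy assumed
    (600, 65),
    (1800, 48),  # around 22:40
    (1860, 48),
    (3761, 65),  # around 23:13:00
    (8485, 25),  # 00:31:44
]

gaps = {threshold: longest_gap(trace, threshold) for threshold in (30, 50, 90)}
for threshold, gap in gaps.items():
    print(f"{threshold}m threshold: longest gap {gap}s")
```

A shorter gap means the state can be corrected sooner, but each extra accepted point is also a less certain one, which is the trade-off described above.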

I think the thing to do in this case is to chalk it up to a regrettable but unavoidable poor result due to circumstances beyond our control. The algorithm isn't magic, and you can't get a good result from bad data.

toland · 2018-08-23T23:35:16Z

FYI

I bumped the accuracy threshold on staging to 50m and later to 90m on @thescurry's request.

bengtan · 2018-08-27T03:52:25Z

Not sure where to put this. It might result in a front-end ticket later.

Anyway, here's the iOS documentation page for accuracy levels that we can configure iOS to return.

https://developer.apple.com/documentation/corelocation/cllocationaccuracy?language=objc

We're currently using kCLLocationAccuracyBest.

There is another level kCLLocationAccuracyBestForNavigation that we could try out, but it does come with warnings:

> This level of accuracy is intended for use in navigation apps that require precise position information at all times. Because of the extra power requirements, use this level of accuracy only while the device is plugged in.

https://developer.apple.com/documentation/corelocation/kcllocationaccuracybestfornavigation?language=objc

bengtan · 2018-08-27T04:16:51Z

Been reading the internets and forums on iOS GPS accuracy. According to some people, it's been deteriorating around, or since, iOS 11. And then there are all sorts of work-arounds to 'fix it' back up.

The anecdotes are all over the place. I'm sure there are a lot of false alarms, but there's probably some truth in there too. The problem is wading through them all.

toland · 2018-08-27T18:51:05Z

Oh, boy. This sounds like a rabbit hole we could spend quite some time going down.

bengtan · 2018-08-28T02:12:43Z

> This sounds like a rabbit hole we could spend quite some time going down.

Yeah, that was my second last thought.

My last thought was ... Even if client-side GPS accuracy is deteriorating and there are special 'fixes' available, it's not feasible for us to 'fix' every user's phone so ... we'll have to do our best at the server-side and handle high accuracy uncertainties as best we can.

meta: It would be nice to be able to say to a user: 'Hey, your GPS accuracy sucks. Can you sort it out instead of sending us inaccurate data?' (Which is what prompted the suggestion of rn-chat#2720)

toland · 2018-08-28T15:51:51Z

> It would be nice to be able to say to a user: 'Hey, your GPS accuracy sucks. Can you sort it out instead of sending us inaccurate data?' (Which is what prompted the suggestion of rn-chat#2720)

I think we may have to figure out something like that. The one GPS app (other than maps) that I use a lot is Fitbit, and they have had significant problems with GPS accuracy over the last couple of years. I commented on the upstream ticket to that effect.

bengtan · 2018-08-29T03:53:31Z

@thescurry has indicated that he's happy with an accuracy threshold of 90m. I've spun out a ticket (#1799) so we remember to configure Production eventually.

Since things seem to be satisfactory for now, I'm going to consider this feedback iteration done, and I'm closing this ticket.

Until we hit the next geofence anomaly.

bengtan mentioned this issue Aug 22, 2018

Discussion: Users getting stuck at locations. hippware/rn-chat#2700

Closed

toland added the Bug and Geolocation labels Aug 23, 2018

bengtan closed this as completed Aug 29, 2018
