Geofence algorithm: Dog Park vs PDC #1775

Closed
bengtan opened this issue Aug 22, 2018 · 11 comments

Comments

@bengtan
Contributor

bengtan commented Aug 22, 2018

Description

Quoting @thescurry:

This is so weird... I’m showing up at the dog park (marked in red), but I should be showing up at the PDC (marked in blue).

https://hippware.slack.com/archives/C2V6L53TQ/p1534893214000100

image from ios

Data

These are the relevant pieces of data.

User bot events for WeHo - Dog Park
(bot id: eca2ee9e-3e9c-11e8-8a29-0a580a02050f)

2018-08-21 22:08:57 Transition in  / f80bb139-e5c7-4b40-ba1a-5234748afa52
2018-08-21 22:09:13 enter          / 30b21b58-63aa-4362-9efc-dca4135a0f55
2018-08-21 22:10:19 transition_out / 042f8742-abf7-4865-9ef7-9f8bc2191f63
2018-08-22 00:31:44 exit           / bcbcb423-cdee-4c5e-bd14-77a9c0fdca81
// user_id='55cb59a6-55e2-11e6-b457-0eea5386eb69' AND created_at <= '2018-08-22T00:32:00Z' order by created_at desc

screenshot from 2018-08-22 16-33-54

Interpretation

(DA = 'debounce algorithm')

  • 22:08:57 The DA first detects the user is within the bot perimeter. Begins debouncing.
  • 22:09:13 On the next data point, debounce is passed and an entry event is (correctly) generated.

Fine so far. Then:

  • 22:10:19 The DA detects the user is outside. Begins debouncing.
  • In between: There are 90+ data points, but they all have an accuracy range greater than 30. The lowest accuracy range is 32. I think these are all discarded [1].
  • 00:31:44 A data point outside the bot, with an accuracy range of 25 (which is within the limit of 30). The debounce passes and an exit event is generated.

At the time of the user report, around 23:13:00, the user had left the bot but because the 90+ data points were all discarded, the DA thought the user was still in the bot perimeter.
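
To make the sequence above concrete, here is a minimal sketch of a debounce loop with an accuracy filter in front of it. It is a simplified model, not the backend's code: the `Point` type, the `run_debounce` function, the 30-second debounce window and the trace values are all assumptions made for illustration, and the trace is synthetic rather than the data from this ticket.

```python
from dataclasses import dataclass


@dataclass
class Point:
    t: int            # seconds since the start of the trace
    inside: bool      # is the reported position inside the bot perimeter?
    accuracy: float   # reported accuracy range in metres (lower is better)


def run_debounce(points, max_accuracy=30.0, debounce_s=30):
    """Return (t, event) tuples for one user and one bot (hypothetical model)."""
    events = []
    inside = False        # last confirmed state
    pending_since = None  # when the current unconfirmed transition began
    for p in points:
        if p.accuracy > max_accuracy:
            continue  # inaccurate points are dropped before any other logic
        if p.inside == inside:
            pending_since = None  # point agrees with the confirmed state
            continue
        if pending_since is None:
            pending_since = p.t  # begin debouncing
            events.append((p.t, "transition_in" if p.inside else "transition_out"))
        elif p.t - pending_since >= debounce_s:
            inside = p.inside  # debounce passed, confirm the new state
            pending_since = None
            events.append((p.t, "enter" if inside else "exit"))
    return events


# Synthetic trace: a clean entry, then a long run of outside points with 65 m accuracy.
trace = (
    [Point(0, True, 10), Point(30, True, 10), Point(100, False, 20)]
    + [Point(t, False, 65) for t in range(160, 8500, 60)]
    + [Point(8500, False, 25)]
)
events = run_debounce(trace)
print(events)
```

In this model the exit is only confirmed at t=8500, when the first sufficiently accurate outside point arrives after the transition began, even though every point from t=100 onward places the user outside.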

The location of the user indicated in the screenshot (ie. 'blue') is correct. The DA is incorrect.

FWIW, the most recent data point as of 23:13:00 had an accuracy range of 65 and an is_moving of false.

Discussion

There are a number of ways to tweak the DA to correct for this. I'll leave that for another time, or for others to give their opinions.

@bengtan
Contributor Author

bengtan commented Aug 22, 2018

I know we agreed to filter out data points with accuracy range > 30 but I also expected that the 30-sec debounce algorithm (ie. previous behaviour), which is a fallback plan 'B', would be unchanged.

I think the accuracy filter is filtering out data points before they reach plan 'B', so there has been a change (intentional or otherwise).

@toland
Contributor

toland commented Aug 22, 2018

I think the accuracy filter is filtering out data points before they reach plan 'B', so there has been a change (intentional or otherwise).

Inaccurate data points are filtered out first thing. They are not considered at all. This was very much intentional. The request in #1713 was for inaccurate data to be "thrown out."

The example case that was raised to show why we should filter inaccurate data points had Miranda in a metal building for several minutes while her phone was reporting that she was hundreds of meters away. If we accepted an equally inaccurate data point 30 seconds later then the problem would not be fixed.

As I see it, the accuracy threshold and debouncing are completely orthogonal concepts. The data's accuracy doesn't improve with age 😄

@bengtan
Contributor Author

bengtan commented Aug 23, 2018

Inaccurate data points are filtered out first thing. They are not considered at all. This was very much intentional. The request in #1713 was for inaccurate data to be "thrown out."

I guess I misunderstood/mis-interpreted your description of the changes. Fair enough. Good that it's been clarified now.

The error case observed in this ticket is going to be interesting ... in terms of how we're going to solve it.

Time to ponder.

@toland
Contributor

toland commented Aug 23, 2018

I have had a deeper look at this case, and I am not sure that there is anything that we can, or should, do about it.

Here are a few facts:

  1. The client was sending inaccurate data, sometimes very inaccurate.
  2. The accuracy of the backend algorithm is constrained by the accuracy of the data presented to it.
  3. The algorithm performed exactly as expected in this case.
  4. We have one documented case so far.
  5. This is actually a somewhat rare case based on the existing data in the database.

I reran some of my calculations on the accuracy values in the database this morning. 87% of the data in the staging database has an accuracy of 30m or less. 92% has an accuracy of 50m or less. Many of the data points in the screenshot above have an accuracy of 65m. There isn't much difference between 50m and 65m: 92.4% of the data points have an accuracy of less than 65m. iOS seems to "prefer" certain accuracy values: about 2.6% of the location data points have an accuracy of exactly 65m.
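
For anyone who wants to repeat this kind of calculation, here is a rough sketch of how the cumulative shares and the "preferred" values could be computed. The function name and the sample list are hypothetical; the sample is made up and does not reproduce the staging figures quoted above.

```python
from collections import Counter


def share_at_or_below(accuracies, thresholds):
    """Fraction of data points whose accuracy range is <= each threshold (metres)."""
    if not accuracies:
        raise ValueError("no data points")
    n = len(accuracies)
    return {t: sum(a <= t for a in accuracies) / n for t in thresholds}


# Made-up sample of accuracy ranges in metres, for illustration only.
sample = [5, 10, 10, 30, 32, 48, 65, 65, 100, 165]

shares = share_at_or_below(sample, (30, 50, 65))
print(shares)

# Most frequently reported exact values, to spot values the OS seems to favour.
common = Counter(sample).most_common(2)
print(common)
```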

Based on eyeballing the data, it looks like inaccurate values are most likely to be clumped together. This makes sense since there is usually some external interference that would cause accuracy to be low.

What can we do? Not much, really. I think the algorithm itself is sound. There is only so much we can do with bad data and I am satisfied with the result in this case. It wasn't a perfect result, but it was a reasonable result given the quality of the data we had to work with.

One thing we can do is tweak the accuracy threshold. It is currently at 30m, and I think we can go as high as 50m without causing too many problems. Keep in mind that raising the threshold also raises the likelihood that we will see this same problem from the other direction. In other words, instead of an inaccurate result because we ignored inaccurate data, we could get an inaccurate result because we didn't ignore inaccurate data. I would be very hesitant to raise the threshold above 50m for that reason.
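
The trade-off can be sketched with two synthetic traces. This is a simplified model under stated assumptions: `first_exit` is a hypothetical helper, the 30-second debounce window is assumed, and neither trace is real data. Trace `left` is a user who really has left but whose phone reports 65m accuracy for an hour; trace `stayed` is a user who never left but has a short burst of 65m points that place them outside.

```python
def first_exit(points, max_accuracy, debounce_s=30):
    """points: list of (t_seconds, inside, accuracy_m); the user starts inside.

    Return the time at which an exit is confirmed, or None if it never is.
    """
    pending_since = None
    for t, inside, accuracy in points:
        if accuracy > max_accuracy:
            continue  # filtered out, never reaches the debounce logic
        if inside:
            pending_since = None  # accurate inside point cancels a pending exit
        elif pending_since is None:
            pending_since = t  # first accepted outside point, begin debouncing
        elif t - pending_since >= debounce_s:
            return t  # debounce passed
    return None


# User really left at t=0; accuracy is 65 m until two good points at the end.
left = [(t, False, 65.0) for t in range(0, 3600, 60)]
left += [(3600, False, 20.0), (3660, False, 20.0)]

# User never left; two inaccurate points wrongly report them outside.
stayed = [(0, True, 10.0), (60, False, 65.0), (120, False, 65.0), (180, True, 10.0)]

results = {th: (first_exit(left, th), first_exit(stayed, th)) for th in (30, 50, 90)}
for th, (late_exit, false_exit) in results.items():
    print(th, late_exit, false_exit)
```

With these made-up traces, thresholds of 30 and 50 both confirm the real exit an hour late (t=3660) and never produce the false exit, while a threshold of 90 confirms the real exit at t=60 but also reports a false exit at t=120 for the user who stayed.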

However, raising the threshold to 50m might not have mitigated this particular instance. There are a couple of data points around 22:40 with 48m accuracy that were more than 30s apart, so it might have exited the dog park, albeit late. That may not have been enough to properly enter the PDC bot, though. Eyeballing the data, it looks like it may have entered the PDC bot a few seconds earlier than it actually did.

I think the thing to do in this case is to chalk it up to a regrettable but unavoidable poor result due to circumstances beyond our control. The algorithm isn't magic, and you can't get a good result from bad data.

@toland
Contributor

toland commented Aug 23, 2018

FYI

I bumped the accuracy threshold on staging to 50m and later to 90m on @thescurry's request.

@bengtan
Contributor Author

bengtan commented Aug 27, 2018

Not sure where to put this. Maybe it might result in a front-end ticket later.

Anyway, here's the iOS documentation page for accuracy levels that we can configure iOS to return.

https://developer.apple.com/documentation/corelocation/cllocationaccuracy?language=objc

We're currently using kCLLocationAccuracyBest.

There is another level kCLLocationAccuracyBestForNavigation that we could try out, but it does come with warnings:

This level of accuracy is intended for use in navigation apps that require precise position information at all times. Because of the extra power requirements, use this level of accuracy only while the device is plugged in.

https://developer.apple.com/documentation/corelocation/kcllocationaccuracybestfornavigation?language=objc

@bengtan
Contributor Author

bengtan commented Aug 27, 2018

Been reading the internets and forums on iOS GPS accuracy. According to some people, it's been deteriorating around, or since, iOS 11. And then there are all sorts of work-arounds to 'fix it' back up.

The anecdotes are all over the place. I'm sure there are a lot of false alarms, but there's probably some truth in there too. The problem is wading through them all.

@toland
Contributor

toland commented Aug 27, 2018

Oh, boy. This sounds like a rabbit hole we could spend quite some time going down.

@bengtan
Contributor Author

bengtan commented Aug 28, 2018

This sounds like a rabbit hole we could spend quite some time going down.

Yeah, that was my second last thought.

My last thought was ... Even if client-side GPS accuracy is deteriorating and there are special 'fixes' available, it's not feasible for us to 'fix' every user's phone so ... we'll have to do our best at the server-side and handle high accuracy uncertainties as best we can.

meta: It would be nice to be able to say to a user: 'Hey, your GPS accuracy sucks. Can you sort it out instead of sending us inaccurate data?' (Which is what prompted the suggestion of rn-chat#2720)
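
As a very rough sketch of what a server-side check behind such a message might look like: the helper below flags a user when the median accuracy range of their recent data points exceeds a limit. Everything here is an assumption for illustration (the function name, the use of a median, the 90m default taken from the staging threshold mentioned earlier, and the minimum sample size); it is not what rn-chat#2720 proposes.

```python
from statistics import median


def has_poor_accuracy(recent_accuracies, max_accuracy=90.0, min_points=10):
    """True when the median accuracy range (metres) of recent points is too high.

    Returns False when there are too few points to judge.
    """
    if len(recent_accuracies) < min_points:
        return False
    return median(recent_accuracies) > max_accuracy


print(has_poor_accuracy([165.0] * 12))  # consistently poor readings
print(has_poor_accuracy([65.0] * 12))   # within the assumed 90 m limit
```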

@toland
Contributor

toland commented Aug 28, 2018

It would be nice to be able to say to a user: 'Hey, your GPS accuracy sucks. Can you sort it out instead of sending us inaccurate data?' (Which is what prompted the suggestion of rn-chat#2720)

I think we may have to figure out something like that. The one GPS app (other than maps) that I use a lot is Fitbit, and they have had significant problems with GPS accuracy over the last couple of years. I commented on the upstream ticket to that effect.

@bengtan
Contributor Author

bengtan commented Aug 29, 2018

@thescurry has indicated that he's happy with an accuracy threshold of 90m. I've spun out a ticket (#1799) so we remember to configure Production eventually.

Since things seem to be satisfactory for now, I'm going to consider this feedback iteration done, and I'm closing this ticket.

Until we hit the next geofence anomaly.

@bengtan bengtan closed this as completed Aug 29, 2018