Add an available vehicles endpoint to provider #310
Comments
Why not just use GBFS to serve this functionality? To me this seems like the most obvious alternative, and I believe the standard includes 3/4 pieces of information you are looking for (with the exception being the most recent event ID). I know that there is still work being done to generalize the standard for dockless (and companies have implemented their own various flavors of it for scooters / other devices), but many of them (including Lime) already have GBFS for various places. In fact, there is a field in providers.csv to provide the GBFS discovery URL. Now that you mention it, Lime's is empty....😳 I guess the next question would be: in what case would you want fleet status information that wouldn't be shown in an open, public feed? If I am understanding correctly that your focus is on availability, I'd be curious to hear how GBFS falls short. |
For use cases like parking caps, cities often want to see both available and unavailable devices to make sure they're getting a full picture of the provider's presence in the ROW. GBFS doesn't work for this use case since it only includes available devices. |
some providers also purposefully don't expose consistent/reconcilable device identifiers between MDS and GBFS. |
ah...I see. |
I don't mean to sound dismissive, but to me this proposal does not address the above concern.
Agencies need raw data to be able to do the analysis precisely because getting reliable device counts/information from companies is so problematic. This endpoint would further remove consumers of MDS Provider from the raw data and set up an implicit trust in any given company's methods of counting/aggregation. From the basic "level playing field" point of view, how is a regulatory agency supposed to know if Company A altered their counts or otherwise counted in a "unique" way, different or altogether incompatible with Company B, C, D, etc.? |
@thekaveman I don't think @johnpena was suggesting pre-aggregating or pre-counting with provider-specific methods, but rather to provide the raw state of vehicles at a given point in time similar to GBFS but including non-public info like real device_ids and the state of non-available vehicles. Since status_changes is a messy feed at this point (and probably always will be), it seems super helpful to be able to check against the current state periodically. The main problem I see with the proposal is that the nice thing about status_changes is it's a concise way to represent several months of data (basically it's a differential encoding), whereas this "current state" functionality would be quite large to represent historically with any granularity (GBFS gets around this by only giving the live state). Still it would be quite nice to have access to the full state of the system historically with something like e.g. an hourly granularity. |
We spoke briefly about this at the MDS meeting and discussed a similar solution to what @fscottfoti suggests (though nothing was decided on). Consumers of MDS want to get an accurate world state both now and at times in the past. As long as |
To be clear, I'm not proposing removing
This is the primary use case I would like to address with this change. Replaying status changes is error prone and something it seems a lot of cities would rather not do.
It's a fair point that retaining historical data might be cumbersome, but it's something we should consider. The fact that GBFS only gives you a live snapshot rules it out for a lot of interesting use cases. |
Thank you @johnpena for creating this issue. This is exactly what I was suggesting one or two MDS calls ago. I believe cities want GBFS implemented for the As a provider I may choose to implement the use cases of GBFS and MDS differently.
MDS provider APIs do not specify what counts as timely. I interpret this as allowing use of DB replicas, or cron jobs, etc. where some seconds or minutes of lag are acceptable. MDS agency attempts to alleviate this, but hasn't seen much adoption amongst cities. A realtime API would require realtime data sources and be implemented like GBFS free_bike_status but with more data allowed in an authenticated API. Implementing this feature would allow MDS to remove the dependency on the subset of GBFS outlined in the Realtime Data of the spec. Is the goal of MDS to replace GBFS for scooter and dockless vehicle shares? Or complement the GBFS spec with more data? |
Right, and I could have chosen my words more carefully... I was trying to get across that Whereas this proposal suggests a more subjective stream of data - what a given company "thinks" is out there; this is slightly removed from the ground-truth of operations that the current endpoints were designed to capture. Under this re-framing, my main point still stands:
If Company A decides "these are the devices we have available right now" - how did they make that decision? What conditions determine whether a device makes it on this list or not? Does Company B, C, D, and all others share those conditions and make those decisions the same way? Even the relatively simple description of this endpoint: From the standpoint of fairly administering a regulation, e.g. for parking or device caps: we need consistency in how these things are measured/accounted for. I'm concerned that this proposal does not address that need for consistency. |
@thekaveman you bring up good points, but a lot of these strike me as broader issues with MDS that are outside the scope of this issue on its own. With respect to determining the set of vehicles published in this endpoint, I think we could specify that the provider should include in the set of all vehicles any vehicle for which it would publish a status change or trip during that time period. |
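To make the inclusion rule above concrete, here is a minimal sketch, not a spec proposal. It assumes simplified records that carry only `device_id` plus `event_time` (status changes) or `start_time`/`end_time` (trips), all in epoch milliseconds; the function name `vehicles_in_window` is hypothetical.

```python
from typing import Any, Dict, Iterable, Set


def vehicles_in_window(
    status_changes: Iterable[Dict[str, Any]],
    trips: Iterable[Dict[str, Any]],
    start: int,
    end: int,
) -> Set[str]:
    """Return device_ids with a status change or trip in [start, end).

    Assumption: timestamps are epoch milliseconds and records are plain
    dicts with the simplified fields described above.
    """
    # Vehicles with at least one status change inside the window.
    ids = {
        e["device_id"]
        for e in status_changes
        if start <= e["event_time"] < end
    }
    # Vehicles with a trip that overlaps the window at all.
    ids |= {
        t["device_id"]
        for t in trips
        if t["start_time"] < end and t["end_time"] >= start
    }
    return ids
```

Under this reading, a vehicle that neither changes status nor takes a trip during the window would be missing from the set, which is the gap the later comments about noisy feeds are concerned with.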
We discussed this issue in the weekly Thursday meetings this week. I believe there was a consistently voiced need for a consistent, well-defined way to calculate realtime metrics for caps. My concern with a realtime-only endpoint is that if a consumer's (e.g., a city's) system goes down for some time window, they miss data with no way to recover it. |
I was thinking today it might be useful to separate this into two separate problems (which I sort of helped conflate in my comments above, sorry). The original intent of the issue as presented by @johnpena was to provide a simple mechanism to check caps for cities, inspired by GBFS and possibly only for the current state of the system (no historic data). This is mainly to allow cities to see the fleet of vehicles out there right now with some privacy protections that GBFS wouldn't provide. In the spirit of @thekaveman's comments, perhaps this is insufficient to compute a more nuanced definition of cap compliance (e.g. a rolling average of some kind) and for that we need historic data. Or like @billdirks says, maybe we want to compute cap compliance for some time before we started harvesting the feed, or the feed went down for a bit. My biggest problem is that MDS feeds are noisy - some events aren't recorded, and I think we frequently have no evidence of when a vehicle stops pinging the operator. Basically we have no representation of what's out there at a given point in time, we just have its most recent event in the status_changes feed to that point in time. It's an excellent compression scheme, but it's a lossy compression! I had proposed using this quasi-GBFS feed as a way to get the state of the system at any historic point in time, but alternatively, what if we just added a "ping" event that essentially says "still available" at some reasonable cadence, perhaps as infrequent as an hourly basis. This would solve having to create assumptions about when a vehicle disappears from the feed, at least within an hourly resolution. I can think of other ways to do this rather than a ping - operators could rigorously add service_ends after they haven't heard from a vehicle for a given period of time - but it might be nice to have the confidence of a good solid ping ;) Even if you disagree with that specific proposal, the larger point still holds.
One could have the new endpoint proposed by @johnpena to give a GBFS-like view of the current world out there and we could solve the messiness of status_changes some other way, in order to give sufficient flexibility in the definition of cap compliance. Put another way, perhaps the endpoint proposed here should give "the vehicles that are available in the city right now" rather than "cap compliance." |
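As a rough illustration of how a consumer might use the "ping" idea above: the sketch below assumes a hypothetical "ping" event (it is not part of MDS) is emitted about hourly, so a vehicle that has been silent for longer than some tolerance can no longer be assumed present. The two-hour default and the function name `still_presumed_present` are illustrative assumptions only.

```python
from typing import Dict

TWO_HOURS_MS = 2 * 60 * 60 * 1000  # assumed tolerance: two missed hourly pings


def still_presumed_present(
    last_heard_ms: Dict[str, int],
    now_ms: int,
    max_silence_ms: int = TWO_HOURS_MS,
) -> Dict[str, bool]:
    """Map each device_id to whether it was heard from recently enough.

    last_heard_ms holds the time of the most recent event OR hypothetical
    ping per device, in epoch milliseconds.
    """
    return {
        device_id: (now_ms - heard) <= max_silence_ms
        for device_id, heard in last_heard_ms.items()
    }
```

Without a ping (or rigorous service_end events), the consumer has to pick this silence threshold with no signal from the provider, which is the assumption the comment above is trying to avoid.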
Is your feature request related to a problem? Please describe.
Agencies often use our MDS status change feed to figure out which vehicles are available in their region. Many are trying to calculate parking caps, or trying to get simple counts of vehicle availability. Using status changes is problematic. Most agencies try to do this by replaying all of the status changes in their feed, in an attempt to replay all the state changes in the vehicle state machine. Most just want to answer the simple question: what's the current status of the provider's fleet?
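For readers unfamiliar with the replay approach described above, here is a minimal sketch of what agencies end up doing. It assumes simplified status change records with only `device_id`, `event_type`, and `event_time` (epoch milliseconds), and it keeps the last event per vehicle; real feeds with missing or out-of-order events need more care than this.

```python
from typing import Any, Dict, Iterable, Optional


def replay(
    events: Iterable[Dict[str, Any]],
    as_of: Optional[int] = None,
) -> Dict[str, str]:
    """Return the latest event_type per device_id, optionally as of a time.

    Assumption: the most recent event_type stands in for the vehicle's
    current status, which only holds if no events are missing.
    """
    latest: Dict[str, str] = {}
    for event in sorted(events, key=lambda e: e["event_time"]):
        if as_of is not None and event["event_time"] > as_of:
            break  # events are sorted, so everything after is too late
        latest[event["device_id"]] = event["event_type"]
    return latest


events = [
    {"device_id": "a", "event_type": "available", "event_time": 1000},
    {"device_id": "a", "event_type": "reserved", "event_time": 2000},
    {"device_id": "b", "event_type": "available", "event_time": 1500},
    {"device_id": "b", "event_type": "removed", "event_time": 3000},
]

current = replay(events)
# Count vehicles still in the right-of-way (anything not "removed").
on_street = sum(1 for status in current.values() if status != "removed")
```

Even this toy version has to see every event since the beginning of the feed to answer a question about right now, which is the cost the proposal is trying to remove.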
Describe the solution you'd like
A vehicles endpoint in MDS provider, similar to the one in MDS agency, wherein a provider publishes a list of the vehicles that are currently registered in the region. This endpoint could take as inspiration the same functionality that's provided in GBFS. It could contain the following information:
Is this a breaking change
Provider
oragency
For which API is this feature being requested:
provider
agency
Describe alternatives you've considered
The most obvious alternative is to add additional status change types, but this would still require agencies to do their own analysis to replay each vehicle's state machine. I believe this work should be done regardless. But since many agencies are trying to understand vehicle availability, I believe it would be easiest for all parties if the provider made this available through a simple API where little-to-no analytics or data processing would be necessary.