Flat file implementations of provider API #58
Comments
I can't speak to the relevance of a flat file implementation specifically, but I strongly agree that there is value to having a well-defined object format (i.e. events, trips, routes) independent of any specific interface. Maybe we can start by breaking things down into object types and interfaces? |
This makes a ton of sense to me. @hunterowens: Thoughts on this? Related discussion in #60 as well. Idea would be to:
LA would require that all providers implement the API, but other cities could adopt the data component of the spec while using something other than an API to move data. |
I think a flat file standard would be 💯 as many cities won't have the ability to consume API endpoints but will want to access the data. Main concern here is some standardized data chunking, or another way to break up what will certainly be very large data sets into manageable pieces. |
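To make the chunking concern concrete, here is a minimal sketch of splitting trips into per-month buckets, each of which could be written out as its own flat file. It assumes trips arrive as dicts with a Unix timestamp in seconds (UTC); the function name `chunk_by_month` and the `start_time` field are hypothetical illustrations, not names taken from the spec.

```python
from collections import defaultdict
from datetime import datetime, timezone


def chunk_by_month(trips, time_field="start_time"):
    """Group trip records by calendar month (UTC), keyed as 'YYYY-MM'.

    Assumption: each trip is a dict and trip[time_field] is a Unix
    timestamp in seconds. The field name is hypothetical.
    """
    chunks = defaultdict(list)
    for trip in trips:
        ts = datetime.fromtimestamp(trip[time_field], tz=timezone.utc)
        chunks[ts.strftime("%Y-%m")].append(trip)
    return dict(chunks)


if __name__ == "__main__":
    sample = [
        {"trip_id": "a", "start_time": 1537228800},  # 2018-09-18 00:00 UTC
        {"trip_id": "b", "start_time": 1538352000},  # 2018-10-01 00:00 UTC
    ]
    for month, rows in sorted(chunk_by_month(sample).items()):
        print(month, len(rows))
```

Monthly buckets are only one possible convention; weekly or daily chunks would follow the same pattern with a different key format.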
Strongly agree with this. I’d like to see simplifications to the Provider API that allow it to be expressed fully as a flat file interface instead of an additional component to the spec. This would position the Provider API as an easily consumable data feed, accessible to smaller cities that have limited IT or engineering staff but do have data analysts familiar with spreadsheet tools. A sample trips request and response might look like this:
Parameters for the This change would conflict with #46 which pulls the API away from common skillsets present in cities and toward those typically found in software engineering orgs. Challenges would include representing routes and pagination. Pagination could be addressed out of band with Web Linking headers. |
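As a rough illustration of the out-of-band pagination idea, the sketch below pulls the `rel="next"` URL out of a Web Linking (`Link`) header value, so a client can keep fetching flat-file pages until no next link is offered. The function name `next_link` and the example URLs are hypothetical, and the parser is deliberately simplified: it does not handle commas or semicolons inside quoted parameter values.

```python
import re
from typing import Optional

# Matches one link-value such as: <https://example.com/trips?page=2>; rel="next"
_LINK_RE = re.compile(r'<([^>]*)>\s*((?:;\s*[^,;]+)*)')


def next_link(link_header: str) -> Optional[str]:
    """Return the URL whose rel includes "next", or None if there is none."""
    for url, params in _LINK_RE.findall(link_header):
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            # rel may hold several space-separated relation types.
            if name.lower() == "rel" and "next" in value.strip('"').split():
                return url
    return None


if __name__ == "__main__":
    header = (
        '<https://example.com/trips?page=2>; rel="next", '
        '<https://example.com/trips?page=9>; rel="last"'
    )
    print(next_link(header))
```

A client would request the first page, read the `Link` response header, and repeat with the returned URL until `next_link` yields `None`.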
I've put together an initial proposal for this at #68 |
I think a flat file implementation would be great at improving adoption. Will comment on the PR from here. |
I am a fan of the changes in #68, in particular the addition of |
I'm agreed on the flat file. As an academic potentially doing research with this data in the future, it's going to be more useful to have a dump of all the trips (even if it's 10GB or 1TB) than to have to repeatedly query the api. |
The changes in #68 look good, with one potential change: My assumption is that the start and end location will be used far more often than the full collection of points that make up the route. What do people think about leaving start and end location in the basic trip data structure (CSV-friendly), while offering the optional As proposed in #68, I still need a database (or ability to process and join two datasets) to get start/end location for a trip. Leaving it in the |
I agree that for many (most?) users a flat file, even a potentially large one, would be the preferred way of analyzing the data. As @migurski points out, this would pull away from #53. My concern is the same as in #46: CSVs are emphatically not a well-specified format, and I can easily see an In #68 there is some discussion of a quasi-official client-side tool for transforming JSON-data to a CSV (or other flat file). I think that could be a good compromise. |
My preference would be the client-side tool, as mentioned in #68. My worry about supporting CSVs in the same spec is losing some of the ability to have robust standardization, for example, JSON Schema. A tool like Additionally, supporting some sort of token-based auth means that the barrier to entry is already high. I wonder if we are creating a bunch of work for ourselves if we add CSV support only to require that they use a complex request to get that CSV. In order to support that, I have opened issue #79 for CORS. Can anybody think of additional changes to the spec needed for that tool? |
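For a sense of what such a conversion tool might do, here is a minimal sketch that turns a JSON array of trip records into CSV text using only the Python standard library. The helper names `flatten` and `trips_json_to_csv` and the sample field names are hypothetical, and the assumption that a response is a plain JSON array of objects is mine rather than something the spec states.

```python
import csv
import io
import json


def flatten(record, prefix=""):
    """Flatten nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        elif isinstance(value, list):
            # Lists (e.g. route points) have no natural CSV cell,
            # so they are kept as JSON text.
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


def trips_json_to_csv(payload: str) -> str:
    """Convert a JSON array of trip objects into CSV text."""
    rows = [flatten(trip) for trip in json.loads(payload)]
    # Union of all keys in first-seen order, so optional fields get a column.
    fieldnames = list(dict.fromkeys(k for row in rows for k in row))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    sample = '[{"trip_id": "a", "start": {"lat": 1.5, "lng": 2.5}}]'
    print(trips_json_to_csv(sample))
```

Keeping JSON as the validated source format and deriving CSV on the client side means the schema stays the single point of standardization, which is the trade-off raised above.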
I'm not particularly tied to CSV, although I can see it being valuable for
smaller cities (they can analyze it in Excel). But I think the important
piece here from a research perspective is that we need a way to get all of
the trips out in a single file so we can analyze them en masse, without
needing to make piecemeal requests to an API.
On Tue, Sep 18, 2018, 12:09 PM Hunter Owens wrote:
My preference would be the client-side tool, as mentioned in #68.
In order to support that, I have opened issue #79 for CORS. Can anybody think of additional changes to spec needed for that tool.
|
Also throwing my 👍 towards the conversion tool idea, and a more robust JSON-based spec. And since it seems the crowd is leaning this direction... Does anyone want to take a peek over at #46 / #53 for a discussion around JSON Schema and tightening up this spec for machine validation? @ian-r-rose made some additions to #53 that haven't made their way over yet, but I'll merge them ASAP. |
Since we have JSON Schema inclusion, I'm gonna close this issue. Hoping that we can come up with a toolset to serve as an mds-provider-bulk-downloader |
I am concerned that many cities may not have the technical capacity to consume a dynamic, query-based API.
The /trips and /events endpoints both lend themselves to a flat file format. While LA will clearly want to require a dynamic API, it may be worth broadening MDS to allow for other cities to request the same data in flat file format (e.g. monthly/weekly CSVs).
Are LADOT folks amenable to including some guidance on "flat file MDS implementations" in the spec? I suspect this would make MDS more appealing to other cities.
Happy to put in a pull request if people think this is a good idea.