With the amount of historical data that we have access to, there are many possible directions for data analysis. Broadly speaking, we're interested in measuring the efficiency of public transit in the city and how closely it matches expectations (for example, if a bus is supposed to arrive at a stop every 15 minutes, how often does it actually do so?).
We've provided a small sample of bus arrival data from two routes, and a notebook that produces the average wait times for stops along those routes. Can you produce other useful metrics? Are there any interesting patterns that appear in the data?
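As one possible starting point beyond average wait times, the sketch below computes headways (the gaps between consecutive arrivals at a stop) and the share of them that fall within the expected interval. It is only an illustration: the function names, the 2-minute tolerance, and the arrival times are made up, and it assumes you have already pulled a list of arrival timestamps for a single stop out of the sample file, whose actual structure is not described here.

```python
from datetime import datetime, timedelta


def headways(arrivals):
    """Return the gaps between consecutive arrivals, in chronological order."""
    ordered = sorted(arrivals)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def headway_adherence(arrivals, expected, tolerance=timedelta(minutes=2)):
    """Fraction of headways no longer than expected + tolerance.

    Returns None when there are fewer than two arrivals.
    The tolerance default is an arbitrary, hypothetical choice.
    """
    gaps = headways(arrivals)
    if not gaps:
        return None
    on_time = sum(1 for gap in gaps if gap <= expected + tolerance)
    return on_time / len(gaps)


# Made-up arrival times for one stop (not taken from the sample data).
arrivals = [
    datetime(2024, 1, 1, 8, 0),
    datetime(2024, 1, 1, 8, 14),
    datetime(2024, 1, 1, 8, 35),
    datetime(2024, 1, 1, 8, 49),
]

share = headway_adherence(arrivals, expected=timedelta(minutes=15))
print(f"Share of headways within expectation: {share:.2f}")
```

Running the same calculation per stop, per route, or per hour of day is one way to look for patterns such as bunching or rush-hour delays.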
The sample_routes_stops_pst_15s.json file that the example notebook loads is compressed in sample_routes_stops_pst_15s.rar, so you'll have to extract it before running the notebook.