Propagate RT delays to missing stop_sequences #162

Open
nukeador opened this issue May 21, 2024 · 6 comments

@nukeador

nukeador commented May 21, 2024

Following from #160 (comment)

The GTFS Realtime spec states that:

If one or more stops are missing along the trip the delay from the update (or, if only time is provided in the update, a delay computed by comparing the time against the GTFS schedule time) is propagated to all subsequent stops. This means that updating a stop time for a certain stop will change all subsequent stops in the absence of any other information. Note that updates with a schedule relationship of SKIPPED will not stop delay propagation, but updates with schedule relationships of SCHEDULED (also the default value if schedule relationship is not provided) or NO_DATA will.

This is a feature request for node-gtfs to propagate this delay to stops that have no RT data when earlier stops on the same trip do have it.

Some of us are encountering agencies that only provide RT trip updates for a few stops on a trip rather than all of them. This results in missing RT times for many stops along the trip.

Other apps, such as OTP, provide a config setting for granular control over this behavior.
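The parenthetical in the spec quote above (a delay computed by comparing the time against the GTFS schedule time) could be sketched roughly as follows. This is only an illustration: `gtfsTimeToSeconds`, `delayFromTime`, and the `serviceDayStartEpochSeconds` parameter are hypothetical names, not part of node-gtfs, and the sketch assumes the caller already knows the POSIX timestamp at which the trip's service day starts.

```typescript
// Parses a GTFS "HH:MM:SS" time into seconds since the start of the service
// day. Hours may exceed 23 for trips that run past midnight.
function gtfsTimeToSeconds(time: string): number {
  const parts = time.split(":").map(Number);
  if (parts.length !== 3 || parts.some((p) => Number.isNaN(p))) {
    throw new Error(`Invalid GTFS time: ${time}`);
  }
  // ((h * 60) + m) * 60 + s
  return parts.reduce((total, part) => total * 60 + part, 0);
}

// Returns the delay in seconds: positive when late, negative when early.
// Assumption: serviceDayStartEpochSeconds is supplied by the caller.
function delayFromTime(
  realtimeEpochSeconds: number, // absolute time from the realtime update
  scheduledTime: string, // arrival_time or departure_time from stop_times.txt
  serviceDayStartEpochSeconds: number,
): number {
  const scheduledEpoch =
    serviceDayStartEpochSeconds + gtfsTimeToSeconds(scheduledTime);
  return realtimeEpochSeconds - scheduledEpoch;
}
```

The resulting value can then be treated like any other delay when propagating to later stops.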

@brendannee
Member

Thanks for the details on this.

Right now, the node-gtfs library just stores exactly what it receives in the database and doesn't try to modify it in any way. It's up to the application querying the database to handle the data.

I'm thinking it might be useful to build this functionality (and a few other features, like exposing a JSON API) as part of a new library built on top of node-gtfs.

@nukeador
Author

I see. Are there any best practices, or examples of other libraries built on top of node-gtfs, that I could use for reference? Thanks!

@brendannee
Member

Great question.

I haven't used the GTFS-Realtime functionality of the library extensively, which is why there aren't a lot of features built around it (I mostly use the static GTFS functionality in lots of other apps). Often, when I need an app to query GTFS-Realtime data, I just do that directly and don't store the data in sqlite via node-gtfs.

I don't know of any other projects built on top of node-gtfs that use the GTFS-Realtime functionality, but I'd love to have some to point to.

@nukeador
Author

We are building an app that uses node-gtfs and exposes a RESTful API, with support for multiple agencies and realtime updates. Basically, it's a tailored solution to serve data to a client-side web app (PWA).

In the future we should probably standardize the data structure and field naming of the output JSON so it can be reused by others.

https://github.com/VallaBus/api-auvasa (sorry docs and comments are in Spanish)

@nukeador
Author

nukeador commented Jun 3, 2024

As far as I can tell, delay propagation is part of the behavior the GTFS Realtime spec expects:

If one or more stops are missing along the trip the delay from the update (or, if only time is provided in the update, a delay computed by comparing the time against the GTFS schedule time) is propagated to all subsequent stops. This means that updating a stop time for a certain stop will change all subsequent stops in the absence of any other information. Note that updates with a schedule relationship of SKIPPED will not stop delay propagation, but updates with schedule relationships of SCHEDULED (also the default value if schedule relationship is not provided) or NO_DATA will.
Example

For the same trip instance, three StopTimeUpdates are provided:

  • delay of 300 seconds for stop_sequence 3
  • delay of 60 seconds for stop_sequence 8
  • ScheduleRelationship of NO_DATA for stop_sequence 10

This will be interpreted as:

  • stop_sequences 1,2 have unknown delay.
  • stop_sequences 3,4,5,6,7 have delay of 300 seconds.
  • stop_sequences 8,9 have delay of 60 seconds.
  • stop_sequences 10,..,20 have unknown delay.

So I guess that's what you would expect node-gtfs to do by default when you call getStopTimeUpdates().
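The interpretation in the spec example above can be sketched as a small forward-propagation function. This is a minimal illustration, not node-gtfs code: the `StopTimeUpdate` shape and `propagateForward` are hypothetical names, and returning an unknown (`null`) delay for the skipped stop itself is an assumption.

```typescript
// Hypothetical shapes for illustration; these are not node-gtfs types.
type ScheduleRelationship = "SCHEDULED" | "SKIPPED" | "NO_DATA";

interface StopTimeUpdate {
  stopSequence: number;
  delay?: number; // seconds; absent when the update carries no delay
  scheduleRelationship?: ScheduleRelationship; // defaults to SCHEDULED
}

// Returns a delay in seconds for every stop_sequence, or null when unknown.
function propagateForward(
  stopSequences: number[],
  updates: StopTimeUpdate[],
): Map<number, number | null> {
  const bySequence = new Map<number, StopTimeUpdate>();
  for (const update of updates) bySequence.set(update.stopSequence, update);

  const result = new Map<number, number | null>();
  let current: number | null = null; // unknown until the first update

  for (const seq of [...stopSequences].sort((a, b) => a - b)) {
    const update = bySequence.get(seq);
    const relationship = update?.scheduleRelationship ?? "SCHEDULED";

    if (update && relationship === "SKIPPED") {
      // SKIPPED does not stop propagation: the carried delay is left as-is.
      // Assumption: the skipped stop itself gets no delay value.
      result.set(seq, null);
      continue;
    }
    if (update && relationship === "NO_DATA") {
      current = null; // NO_DATA stops propagation
    } else if (update) {
      current = update.delay ?? null; // SCHEDULED replaces the carried delay
    }
    result.set(seq, current);
  }
  return result;
}

// The three updates from the spec example, for a trip with 20 stops.
const specExample = propagateForward(
  Array.from({ length: 20 }, (_, i) => i + 1),
  [
    { stopSequence: 3, delay: 300 },
    { stopSequence: 8, delay: 60 },
    { stopSequence: 10, scheduleRelationship: "NO_DATA" },
  ],
);
```

With these inputs, `specExample` maps stop_sequences 1 and 2 to `null`, 3 through 7 to 300, 8 and 9 to 60, and 10 through 20 to `null`, matching the list above.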

@nukeador
Author

nukeador commented Jun 4, 2024

As a temporary workaround, I've created a function that calculates the propagated delay for a given trip_id + stop_sequence that doesn't have realtime data, so I can query it and apply the delay to the scheduled time.
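Applying a delay to a scheduled time might look something like the sketch below. It is only an illustration of the arithmetic, not the actual workaround code: `applyDelay` and its helpers are hypothetical names, and it assumes the scheduled time is a GTFS "HH:MM:SS" string and that the adjusted time does not fall before the start of the service day.

```typescript
// Parses a GTFS "HH:MM:SS" time into seconds since the service day start.
function parseGtfsTime(time: string): number {
  const parts = time.split(":").map(Number);
  if (parts.length !== 3 || parts.some((p) => Number.isNaN(p))) {
    throw new Error(`Invalid GTFS time: ${time}`);
  }
  return parts.reduce((total, part) => total * 60 + part, 0);
}

// Zero-pads a number to at least two digits.
function twoDigits(value: number): string {
  return (value < 10 ? "0" : "") + String(value);
}

// Adds a delay in seconds (negative when early) to a scheduled GTFS time and
// returns the result in the same "HH:MM:SS" form. Hours are not wrapped at
// 24, which matches how GTFS represents times past midnight.
function applyDelay(scheduledTime: string, delaySeconds: number): string {
  const total = parseGtfsTime(scheduledTime) + delaySeconds;
  if (total < 0) {
    throw new Error("Adjusted time falls before the service day start");
  }
  const hours = Math.floor(total / 3600);
  const minutes = Math.floor((total % 3600) / 60);
  const seconds = total % 60;
  return `${twoDigits(hours)}:${twoDigits(minutes)}:${twoDigits(seconds)}`;
}
```

For instance, a 300 second delay on a stop scheduled at 08:00:00 gives 08:05:00.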

I've done this for both forward and backward propagation, to cover the many trips that arrive ahead of schedule and would otherwise show only the scheduled arrival.

It's not perfect, but it's the most accurate way I've found to show timetables following the GTFS spec recommendation. Ideally, node-gtfs could implement something like this in the future as a built-in, configurable feature.
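A lookup that combines forward and backward propagation could be sketched along these lines. This is a guess at the general shape, not the actual workaround: `KnownDelay` and `estimateDelay` are hypothetical names, and the choice to prefer the nearest earlier stop and fall back to the nearest later one is an assumption. Backward propagation is not part of the spec text quoted earlier in the thread.

```typescript
// Hypothetical shape: a stop on the trip for which the feed reported a delay.
interface KnownDelay {
  stopSequence: number;
  delay: number; // seconds; negative when running early
}

// Estimates the delay for a stop_sequence that has no realtime data.
// Forward: use the closest stop at or before the target with a known delay.
// Backward (assumption): if none exists, use the closest later stop.
function estimateDelay(
  stopSequence: number,
  known: KnownDelay[],
): number | null {
  let before: KnownDelay | null = null;
  let after: KnownDelay | null = null;

  for (const entry of known) {
    if (entry.stopSequence <= stopSequence) {
      if (before === null || entry.stopSequence > before.stopSequence) {
        before = entry;
      }
    } else if (after === null || entry.stopSequence < after.stopSequence) {
      after = entry;
    }
  }

  if (before !== null) return before.delay;
  if (after !== null) return after.delay;
  return null; // no realtime data anywhere on the trip
}
```

Keeping the fallback order explicit makes it easy to turn backward propagation off, which is the kind of behavior a configurable setting would control.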
