Add trip-to-trip transfers with in-seat option #303

Merged
merged 4 commits into from
Jul 26, 2022

Conversation

gcamp
Contributor

@gcamp gcamp commented Jan 26, 2022

This PR adds new transfer_types for trip to trip transfers to define if a user can do an "in seat transfer" when the same vehicle is operating two consecutive trips and the user can stay onboard. In addition, the transfer type can define when in-seat transfers aren't allowed but can link together two different trips operationally.

Why are in-seat transfers useful? Why is vehicle operational information on trips useful?

Some transit systems have in-seat transfers as a core concept of how their system functions. This is neither an edge case nor a lightly used feature: it's core to how TriMet, King County Metro [1], and others operate. With the current definition of block_id, those feeds are not representing their systems correctly.

On the other hand, it's also useful to know how a vehicle continues between trips. The main use case is to help create predictions of vehicle arrival at the start of the trip. Without knowing the link between trips, it's not possible to propagate delay information. This is used by at least Transit and Swiftly for prediction creation, and probably by many others I don't know of.

I thought block_ids represented this information?

Before Jan 21, 2017, block_id in GTFS represented in-seat transfers while ignoring the operational-only functionality. After #44, the definition was changed to represent the "same vehicle" operational definition, and it ignored the in-seat transfer feature. Confusion about what functionality block_id represented already existed before that change, and the change didn't help.

Currently, Google Maps is still using block_id under the pre-2017 definition of in-seat transfers (Googlers can correct me if I'm wrong). At Transit, we try to deduce via heuristics whether block_id represents an in-seat transfer or not, with varying success. I'm assuming other consumers are doing something similar, since in-seat transfers are so important in some markets that they can't be ignored.

Shouldn't the in-seat transfers be based on block_id?

There are multiple problems with block_id as currently implemented:

  • Some information is lost and needs to be reconstructed by consumers. The producer knows the order of trips inside a block which is lost when using block_id.
  • "Block" often refers to a driver assignment. However, what the spec is asking for is vehicle information, which creates confusion.
  • Ambiguity around how block_id works across different services requires lengthy explanations. This interpretation across services is both error prone on the producer side and hard to implement on the consumer side. Some more ambiguity still exists around midnight when the service day changes (should it continue or not?).

Why use transfer_type?

The recent addition of trip to trip transfers in the spec allows us to make the type of transfer explicit. The idea of using transfer_type actually comes from @antrim from an old PR #32.

In addition

  • It provides in-seat transfer information, absent from block_id.
  • It makes a clear distinction between transfers where riders can stay on board and transfers where riders cannot stay on board while not duplicating that information.
  • It makes trip continuations explicit without any reconstructing of information.
  • It now allows for coupling and uncoupling of vehicles in a continuation.
  • It allows producers and consumers of block_ids to still use it in parallel.

A downside of this approach is that it's more data heavy. However, data consumers are already required to recreate this data themselves anyway.

Footnotes

  1. Notice the "to route" mention on the right

hannesj added a commit to entur/OpenTripPlanner-LegacyHSLFork that referenced this pull request Jan 26, 2022
@hannesj

hannesj commented Jan 26, 2022

Support for consuming these in OpenTripPlanner v2 is added in opentripplanner/OpenTripPlanner#3831

| `min_transfer_time` | Non-negative integer | Optional | Amount of time, in seconds, that must be available to permit a transfer between routes at the specified stops. The `min_transfer_time` should be sufficient to permit a typical rider to move between the two stops, including buffer time to allow for schedule variance on each route. |

#### Linked trips
Collaborator

💯 for this illustration

@scmcca scmcca added the GTFS Schedule Issues and Pull Requests that focus on GTFS Schedule label Jan 28, 2022
@mgilligan

+1, I have this as a backlog item from years ago and I will eventually add it to TriMet's GTFS

@gcamp
Contributor Author

gcamp commented Feb 7, 2022

Thanks for the feedback everyone!

There are no producers currently creating the data described in the PR, so a vote is not possible at the moment. We know of a couple of producers that are interested, but none that would be able to create data quickly. If there are any producers interested in producing trip-to-trip transfers with in-seat option, don't hesitate to mention it here or reach out to me.

hannesj added a commit to entur/OpenTripPlanner-LegacyHSLFork that referenced this pull request Feb 8, 2022
@github-actions

github-actions bot commented Mar 4, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Status: Stale Issues and Pull Requests that have remained inactive for 30 calendar days or more. label Mar 4, 2022
@gcamp
Contributor Author

gcamp commented Mar 4, 2022

We are still waiting on a producer to create data for this PR, but multiple producers have committed to doing it.

@github-actions github-actions bot removed the Status: Stale Issues and Pull Requests that have remained inactive for 30 calendar days or more. label Mar 5, 2022
@gcamp
Contributor Author

gcamp commented Mar 23, 2022

Still waiting on producers, but we're hearing it's actively being worked on!

Meanwhile, we now have an open source tool that can convert block-based trip linking into transfers as proposed in this PR. The conversion uses heuristics to determine whether in-seat transfers are possible, so this would be the perfect tool for consumers wanting to use the new specification. However, for producers that have the real information about in-seat transfers, it might only be a tool for exploring the potential result and for validation.
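
As a rough illustration of what such a conversion involves, here is a minimal Python sketch. It is not the tool mentioned above: the input shape (one dict per trip with its first and last stop and times), the grouping by (block_id, service_id), and the in-seat heuristic (same stop, gap of at most the hypothetical `in_seat_max_gap` seconds) are all assumptions made for the example.

```python
from collections import defaultdict


def hhmmss_to_seconds(value):
    """Convert a GTFS time such as '25:10:00' to seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def block_to_transfers(trips, in_seat_max_gap=600):
    """Emit one transfers.txt-style row per pair of consecutive trips in a block.

    Assumption: trips sharing block_id AND service_id form one block.
    Assumption: a link is in-seat (type 4) when the vehicle stays at the
    same stop and the layover is short; otherwise it is type 5.
    """
    blocks = defaultdict(list)
    for trip in trips:
        if trip["block_id"]:
            blocks[(trip["block_id"], trip["service_id"])].append(trip)

    rows = []
    for group in blocks.values():
        # block_id carries no order, so rebuild it from departure times.
        group.sort(key=lambda t: hhmmss_to_seconds(t["first_departure"]))
        for prev, nxt in zip(group, group[1:]):
            gap = (hhmmss_to_seconds(nxt["first_departure"])
                   - hhmmss_to_seconds(prev["last_arrival"]))
            same_stop = prev["last_stop_id"] == nxt["first_stop_id"]
            in_seat = same_stop and 0 <= gap <= in_seat_max_gap
            rows.append({
                "from_trip_id": prev["trip_id"],
                "to_trip_id": nxt["trip_id"],
                "transfer_type": 4 if in_seat else 5,
            })
    return rows
```

A producer that knows which continuations riders may actually stay on board for would replace the heuristic with that real information.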

@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Status: Stale Issues and Pull Requests that have remained inactive for 30 calendar days or more. label Apr 16, 2022
@gcamp
Contributor Author

gcamp commented Apr 16, 2022

Still being worked on!

@github-actions github-actions bot removed the Status: Stale Issues and Pull Requests that have remained inactive for 30 calendar days or more. label Apr 17, 2022
@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Status: Stale Issues and Pull Requests that have remained inactive for 30 calendar days or more. label May 11, 2022
@npaun
Contributor

npaun commented May 11, 2022

Still being worked on. We're actively using trip-to-trip transfers in production. Several producers are working on generating this data.

@github-actions github-actions bot removed the Status: Stale Issues and Pull Requests that have remained inactive for 30 calendar days or more. label May 12, 2022
@losvedir

At the MBTA, we're working on producing some of this data in the coming week or two. Specifically, we will be adding transfer_type=4 to some of our trips, since some of them do support "in-seat transfers", and that's not something that can currently be expressed in GTFS.

However, at least for now, we will not be adding transfer_type=5. We already publish block_id with the usual operational definition of a sequence of trips by the same vehicle, with no implication towards "in-seat transfers", which at least Swiftly uses in its generation of real-time predictions for us. I don't see much further value in transfer_type=5, since it mostly seems to redundantly specify that same data, and since we don't have any 1-to-n or n-to-1 trip transfers, which is as far as I can tell the main increase in semantic power offered by the value. I'm not keen on the denormalization of including the same block information in two different ways in the feed, and it doesn't look like we can remove block_id anytime soon.

In addition, for some use cases of the data, say an operational dashboard displaying all the trip blocks, it seems to me much easier to generate that using a block ID and sorting. To do the same with transfer_type=5 would involve recursively constructing the blocks out of a bunch of interleaved linked lists, and worrying about 1-to-n and n-to-1 trips, and cycles in errant data.
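
To make that comparison concrete, here is a minimal Python sketch of the traversal described above: it follows transfer_type 4 and 5 links to rebuild every vehicle chain, expands n-to-1 and 1-to-n links into separate chains, and rejects cycles. The function name and the row shape (dicts keyed by transfers.txt column names) are assumptions for illustration only.

```python
from collections import defaultdict

LINKED_TYPES = {"4", "5"}


def build_chains(transfers):
    """Return each chain of linked trips as a list of trip_ids."""
    successors = defaultdict(list)
    has_predecessor = set()
    all_trips = set()
    for row in transfers:
        if str(row["transfer_type"]) not in LINKED_TYPES:
            continue
        successors[row["from_trip_id"]].append(row["to_trip_id"])
        has_predecessor.add(row["to_trip_id"])
        all_trips.update((row["from_trip_id"], row["to_trip_id"]))

    chains = []
    visited = set()

    def walk(trip_id, path):
        if trip_id in path:
            raise ValueError(f"cycle in linked trips at {trip_id}")
        visited.add(trip_id)
        path = path + [trip_id]
        next_trips = successors.get(trip_id, [])
        if not next_trips:
            chains.append(path)
        for next_trip in next_trips:
            walk(next_trip, path)

    # Chains start at trips that nothing links into.
    for start in sorted(all_trips - has_predecessor):
        walk(start, [])
    # A closed loop has no start, so it is never visited.
    if visited != all_trips:
        raise ValueError("cycle in linked trips: no chain start found")
    return chains
```

With block_id the equivalent is a group-by and a sort, which is the asymmetry the comment above points out.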

We're aware that Google misuses block_id, unfortunately, but then again I doubt that they will support transfer_type=5, either. In other words, while transfer_type=4 allows us to express important new information, I can't see any concrete benefit to transfer_type=5 at this time, for us.

@gcamp
Contributor Author

gcamp commented May 26, 2022

@losvedir all the points that you are making are valid, even if mixing transfer_type=4/5 and block_id is not something we envisioned. Mixing the two can create confusion, especially considering that using block_id always requires some guessing and matching. I would suggest duplicating the value, but it's still valid to not use it if you don't want to.

I added some clarification in the spec to mention the transfers have priority vs block when there's a conflict.

@gcamp
Contributor Author

gcamp commented Jul 18, 2022

I’m opening a vote on this PR.

Consumer : Transit, OpenTripPlanner.
Producer : MBTA, Entur

I'd also like to note to any producers wanting to update their backend that we created a converter that takes a GTFS with blocks and converts it to a GTFS with type 4 and 5 transfers.

Voting ends on 2022-07-25 at 23:59:59 UTC.

@scmcca scmcca added the Status: Voting Pull Requests where the advocate has called for a vote as described in the changes.md label Jul 18, 2022
@skinkie
Contributor

skinkie commented Jul 18, 2022

Good that the N:1 and 1:N cases are documented. I wouldn't have expected this.

+1 (OpenGeo)

@hannesj

hannesj commented Jul 18, 2022

+1 from Entur

Co-authored-by: Leonard Ehrenfried <mail@leonard.io>
@prhod

prhod commented Jul 19, 2022

+1 from Hove

@flocsy
Contributor

flocsy commented Jul 20, 2022

+1 Moovit

@leonardehrenfried
Contributor

+1 OpenTripPlanner

@BodoMinea

+1 FlashWeb IT as a producer

@jfabi
Contributor

jfabi commented Jul 22, 2022

+1 from the MBTA.

The MBTA has worked to implement our bus and ferry networks’ in-seat transfers as transfer_type=4 entries, and a sample static GTFS file can be found at this link, which may be used for testing and validation. As @losvedir has mentioned, the sample does not include transfer_type=5 block transfers; however, block_id will still be populated in our trips.txt file.

Note that the MBTA has already been publishing trip-to-trip transfers (using the from_trip_id and to_trip_id fields) for the existing transfer types for the past 2+ years.

@gcamp
Contributor Author

gcamp commented Jul 26, 2022

The vote has ended with 7 votes in favour and no opposition. The proposition passes, thanks everyone!

@omar-kabbani omar-kabbani removed the Status: Voting Pull Requests where the advocate has called for a vote as described in the changes.md label Jul 26, 2022
@omar-kabbani omar-kabbani merged commit 9d5ebf1 into google:master Jul 26, 2022
omar-kabbani added a commit to MobilityData/transit that referenced this pull request Aug 4, 2022
commit 9d5ebf1
Author: Guillaume Campagna <guillaume.campagna@gmail.com>
Date:   Tue Jul 26 17:09:35 2022 -0400

    Add trip-to-trip transfers with in-seat option (google#303)

    * Add trip-to-trip transfers with in-seat option

    * Fix stop_id are **Conditionally Required** and formatting

    * Add clarification about potential conflict

    * Fix typo

    Co-authored-by: Leonard Ehrenfried <mail@leonard.io>

    Co-authored-by: Nicholas Paun <np@icebergsystems.ca>
    Co-authored-by: Leonard Ehrenfried <mail@leonard.io>

commit a132709
Author: McKenzie Maidl <40008048+mckenzie-maidl-ibigroup@users.noreply.github.com>
Date:   Tue Jul 26 13:58:04 2022 -0700

    addition of cause_detail and effect_detail to the spec (google#332)

commit 8993a24
Author: Zsombor Welker <flaktack@users.noreply.github.com>
Date:   Mon Jul 25 14:49:40 2022 +0200

    Add WheelchairAccessible documentation (google#340)
@bdferris-v2
Collaborator

Hey all, I'm adding support for this change to the MobilityData GTFS Validator and I had a question. The spec was amended to indicate that from_stop_id and to_stop_id are conditionally forbidden if transfer_type 4 or 5 are specified. I assume the same would apply to from_route_id and to_route_id as well?

@hannesj

hannesj commented Oct 6, 2022

It is not that they are forbidden, but that they can't refer to a stop of location_type 1, i.e. stations, only stops of location_type 0, i.e. stops.

Also, regarding routes, there is a pre-existing requirement in the specification:

If both to_trip_id and to_route_id are defined, the trip_id must belong to the route_id
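
For validator authors, the two rules stated in this reply could be checked roughly as follows. This is a hedged sketch, not the MobilityData validator's implementation: the function name, the lookup dictionaries, and applying the route rule to the from_ side by symmetry are assumptions.

```python
def check_transfer_row(row, stops, trips):
    """Return a list of problems found in one transfers.txt row.

    stops: stop_id -> stops.txt row (dict); trips: trip_id -> trips.txt row.
    """
    problems = []

    # Rule from the reply above: for linked trips (types 4 and 5),
    # stop ids may only refer to location_type 0 (empty means 0 in GTFS).
    if str(row.get("transfer_type", "")) in ("4", "5"):
        for field in ("from_stop_id", "to_stop_id"):
            stop_id = row.get(field)
            if stop_id and stops[stop_id].get("location_type") not in (None, "", "0"):
                problems.append(f"{field}={stop_id} is not a stop (location_type 0)")

    # Pre-existing rule: a trip_id must belong to the route_id on the same side.
    # Applying it to the from_ side as well is assumed by symmetry.
    for side in ("from", "to"):
        trip_id = row.get(f"{side}_trip_id")
        route_id = row.get(f"{side}_route_id")
        if trip_id and route_id and trips[trip_id]["route_id"] != route_id:
            problems.append(
                f"{side}_trip_id={trip_id} does not belong to {side}_route_id={route_id}"
            )
    return problems
```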

@felixguendling

felixguendling commented May 22, 2023

Hi! I'm adding support for linked trips to MOTIS as described in this PR and I'm not sure if I understand the specification correctly. So I wanted to ask/clarify here. Probably I'm missing a detail from the spec. If this is not the right place, I would be grateful for a hint where I can ask.

With block_id, it is clear that all services need to have the same traffic days (service_id):

A block consists of a single trip or many sequential trips made using the same vehicle, defined by shared service days and block_id.

For in-seat transfers, the specification only requires the "n" (from 1-to-n/n-to-1) side to be identical.

In a 1-to-n continuation, the trips.service_id for each to_trip_id MUST be identical.
In an n-to-1 continuation, the trips.service_id for each from_trip_id MUST be identical.
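
The two quoted rules can be checked mechanically. Below is a minimal Python sketch under stated assumptions: the function name and input shapes are hypothetical, and transfer_type 4 and 5 rows are grouped together as one "continuation", which is one reading of the quoted text.

```python
from collections import defaultdict


def continuation_service_id_errors(transfers, trip_service):
    """Find 1-to-n / n-to-1 continuations whose 'n' side mixes service_ids.

    transfers: transfers.txt rows as dicts.
    trip_service: trip_id -> trips.service_id.
    """
    to_trips_by_from = defaultdict(set)
    from_trips_by_to = defaultdict(set)
    for row in transfers:
        if str(row["transfer_type"]) not in ("4", "5"):
            continue
        to_trips_by_from[row["from_trip_id"]].add(row["to_trip_id"])
        from_trips_by_to[row["to_trip_id"]].add(row["from_trip_id"])

    errors = []
    for from_trip, to_trips in to_trips_by_from.items():
        # 1-to-n: every to_trip_id must share one service_id.
        if len({trip_service[t] for t in to_trips}) > 1:
            errors.append(("1-to-n", from_trip))
    for to_trip, from_trips in from_trips_by_to.items():
        # n-to-1: every from_trip_id must share one service_id.
        if len({trip_service[t] for t in from_trips}) > 1:
            errors.append(("n-to-1", to_trip))
    return errors
```

Note that nothing in this check constrains the service_id on the "1" side, which is exactly the gap the question below is about.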

Question regarding the illustration example:

Trip A
───────────────────\
                    \    Trip C
                     ─────────────
Trip B              /
───────────────────/

As I understand it: It's specified that trip A and B need to have the same service_id. Trip C has no requirements regarding its service_id.

Let's make an extreme example just to better understand the specification. Although it might happen in practice with long distance night trains.

Trip A and B service days bitset: 1100
Trip A stop times: 24:10, ..., 47:50
Trip B stop times: 00:10, ..., 47:50
Trip C stop times: 00:15, ...

Which traffic days would Trip C need to have so that both A and B are coupled on both days they operate on?
1100 (which also means that 00:15 would need to change to 48:15 so A and B can continue as C) or 0011 (with 00:15)? Or are both options valid and the data consuming system needs to detect which trip instance is the next instance (so the matching cannot be done based on the active traffic days in service_id alone)? This rule would potentially be dangerous if C does not operate always when A and B operate (so sometimes, the transfer would not be active). Most cases could probably be prevented with a heuristic based on the specification saying "the last arrival time of from_trip_id SHOULD be prior but close to the first departure time of to_trip_id" but a clear rule would be better.

If C had fewer or more service days than A and B, would this mean that the transfer exists in some cases and not in others? Or does the number of 1 entries in the service_id need to match, so that for each A+B there is a C continuation? What happens at the end of the timetable range if some 1 entries might be "cut off"?

My goal is to prevent creating transfers between trips that are not to be connected - especially in cases where times >24:00 are involved and/or the transfer goes over midnight.

Is there a reason to not require all involved services to have the same service_id so services can be easily matched like with block_id?

I found this example in the OTP repository: https://github.com/entur/OpenTripPlanner/tree/6b502d47dc60dad18c6dab66338eea60ae41f3cb/src/test/resources/gtfs/interlining

However, the example does not contain one of the corner cases I described above (no times >24:00 and all trips have the same traffic days).

It would be very helpful if the GTFS (and maybe GTFS-RT) standards provided corner-case test suites and a description of the expected behavior of a system consuming the data. If someone thinks this could be helpful, we could start collecting minimal edge-case datasets (including descriptions) in a central spot (similar to the autobahn testsuite for websockets or the Acid3 test for web browsers).

Edit: I already implemented this for HAFAS Rohdaten and initially got it wrong there, too. So this time I'm trying to get it right the first time.

@gcamp
Contributor Author

gcamp commented May 23, 2023

Hi @felixguendling,

First, it seems you're referencing an internal concept you have for service days (the service days bitset), so I didn't grasp everything in your questions. But here is my best answer:

This rule would potentially be dangerous if C does not operate always when A and B operate (so sometimes, the transfer would not be active)

If the transfer is active on some days and not others, you can duplicate A and B and have two versions of each trip: one with a trip-to-trip transfer to C and one without.

Is there a reason to not require all involved services to have the same service_id so services can be easily matched like with block_id?

For service lines that operate 24/7 and always have a trip-to-trip transfer (think of a loop route for example), there would be no way to specify with block_id a link between two service days. If we enforce that rule of always the same service_id, it would also be the case for transfer_type 4 and 5.

@felixguendling

felixguendling commented May 24, 2023

Yes, the service day bitset is an internal bitset for the service_id combining all active days from calendar.txt and calendar_dates.txt for the loaded timetable period.

If the transfer is active in some days and not others [...]

This is exactly what I wanted to compute: "What is the set of days where the transfer is active?"

Based on the concept of the service_id bitset, the rule to calculate the day offset between two trips t1 and t2 connected via a transfer_type=4 (to establish which active day in t1 corresponds to which active day in t2) can be described as follows:

t1_offset = last_stop_time(t1) / 24h
t2_offset = first_stop_time(t2) / 24h
day_change_offset = last_stop_time(t1) % 24h > first_stop_time(t2) % 24h ? 1 : 0
offset = t1_offset - t2_offset + day_change_offset

So if t1 operates on day i, we know that t2 has to operate on day i+offset; otherwise the stay-seated transfer does not exist. This allows us to calculate the set of days where the transfer is active by calculating (bitset(t1) >> offset) & bitset(t2). This can only be done if all times (and corresponding bitsets) are in the same timezone (e.g. after translation to UTC), since there's no rule that services connected by transfer_type=4 have to be in the same timezone (agency_id).
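
A small Python sketch of this rule, for readers who want to try it on the night-train example from earlier in the thread. The names are hypothetical, times are seconds since the start of the service day, and service days are modelled as sets of day indices rather than bitsets to avoid depending on a bit order. One assumption to flag: the sketch subtracts the to-trip's day overflow from the from-trip's, because that ordering is the one that maps a trip ending at 47:50 to a continuation starting at 00:15 two service days later (1100 to 0011).

```python
DAY = 24 * 3600


def day_offset(last_stop_time_t1, first_stop_time_t2):
    """Service-day offset such that t1 on day i continues as t2 on day i + offset."""
    t1_offset, t1_clock = divmod(last_stop_time_t1, DAY)
    t2_offset, t2_clock = divmod(first_stop_time_t2, DAY)
    # If t1 ends later on the clock than t2 starts, t2 departs on the next day.
    day_change_offset = 1 if t1_clock > t2_clock else 0
    return t1_offset - t2_offset + day_change_offset


def active_transfer_days(t1_days, t2_days, offset):
    """Days of t1 (as day indices) on which the stay-seated transfer exists."""
    return {day for day in t1_days if day + offset in t2_days}


def hm(hours, minutes):
    """Helper: 'HH:MM' as seconds, allowing hours >= 24."""
    return hours * 3600 + minutes * 60
```

For trips A and B ending at 47:50 and trip C starting at 00:15, `day_offset(hm(47, 50), hm(0, 15))` evaluates to 2, so service days {0, 1} on A and B pair with {2, 3} on C.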

Thank you for the infinite loop example - this really helped to understand the reason why service_id is not required to match and is also a special case which we need to consider in our routing algorithm in order to not run into an infinite (or at least unnecessarily long) loop.

Thank you for responding so quickly - even if my question was a bit confusing, sorry for that!

@felixguendling

The specification doesn't forbid having frequency based trips that are connected by transfer_type=4. What are the rules to handle this case?

isabelle-dr pushed a commit to MobilityData/transit that referenced this pull request Dec 19, 2023
* Squashed commit of the following:

commit 2e6887e
Author: scmcca <scott@mobilitydata.org>
Date:   Wed Feb 2 12:42:10 2022 -0500

    [Formatting fix] Add newlines before lists

    Improved syntax for different markdown parsers

commit 0033573
Author: Tristram Gräbener <tristramg@gmail.com>
Date:   Fri Jan 28 15:54:00 2022 +0100

    Specify that the filename are case sensitive (google#300)

    Closes google#297

commit 23d877e
Author: scott christian mccallum <scott@mobilitydata.org>
Date:   Tue Jan 18 19:09:46 2022 -0500

    "Fields" and "Values" as non-header (google#302)

* Update README.md (#63)

* issue templates

* update use cases section

* update contact links

* rename

* Delete .github/ISSUE_TEMPLATE/spec_improvement.yml

---------

Co-authored-by: scmcca <scott@mobilitydata.org>
Co-authored-by: omar-kabbani <78552622+omar-kabbani@users.noreply.github.com>
Co-authored-by: Emma Jae Blue <emma@Emma-Jae-Blue.local>