Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/adapt for restart #2

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

sidetrackedmind
Copy link
Collaborator

Updating TrynAPI schema, resolver, and s3helper code to align with Trimet API and new S3 bucket configuration.

Main Updates

  • field names are different (for instance, new "vehicleID" vs. previous "vid")
  • some field names are not available (for instance, new "heading" vs. previous "bearing")
  • old s3 bucket used to store data by the minute ${agencyId}/${year}/${month}/${day}/${hour}/${minute}/
  • new s3 bucket stores data by the hour ${agencyId}/${year}/${month}/${day}/${hour}/
  • functions have been adapted to the modified s3 setup but there is room for improvement. It works now but I would suggest an issue for future improvement. There are inefficiencies in the way it searches the bucket for new files within a timeframe.

state(agencyId: "trimet"
, startTime: 1642867201
, endTime: 1642867500
, routes: ["100"]) {
agencyId
startTime
routes {
routeId
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be consistent, can we either change agencyId and routeId to agencyID and routeID, or change vehicleID and tripID to vehicleId and tripId?

* @param agencyId - String
* @param currentTime - Number
* @return prefix - String
*/
function getBucketMinutePrefix(agencyId, currentTime) {
function getBucketHourPrefix(agencyId, currentTime) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous S3 path format with the minute was used because the S3 API allows searching for keys by prefix, and including the minute in the prefix of the path makes it possible to fetch data with a granularity of one minute. opentransit-metrics fetches data from opentransit-state-api in chunks that are not necessarily multiples of 1 hour.
If the minute was removed from the bucket prefix, it would require searching the S3 for keys with a granularity of 1 hour and then filtering within those keys to find keys with the requested minutes. I'm not sure that there is a drawback to keeping the minute in the prefix so it's probably easier that way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reviewing the existing code in getVehiclePaths, I see that it makes separate s3.listObject requests to get the list of keys for every minute within the time range, which is a lot more API requests than necessary and might reduce performance. Now I think it probably would be faster to search for a keys with a granularity of 1 hour and then filter within those keys to find the requested minutes.

const hour = currentDateTime.getUTCHours();
const minute = currentDateTime.getUTCMinutes();
return `${agencyId}/${year}/${month}/${day}/${hour}/${minute}/`;
const pacificDateTime = convertTZ(currentDateTime, 'America/Los_Angeles');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using America/Los_Angeles as the time zone could cause issues related to daylight savings since when switching from PDT to PST in the fall the there would be two hours with the same key prefix.

Using America/Los_Angeles would also be confusing when deploying opentransit-state-api for transit agencies that are not in Pacific time. It is also possible that a single instance of opentransit-state-api could support transit agencies in multiple time zones. It would probably be simpler to keep the S3 keys in UTC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants