Feature/adapt for restart #2

sidetrackedmind · 2022-02-09T20:50:31Z

Updating TrynAPI schema, resolver, and s3helper code to align with Trimet API and new S3 bucket configuration.

Main Updates

field names are different (for instance, new "vehicleID" vs. previous "vid")
some field names are not available (for instance, new "heading" vs. previous "bearing")
old S3 bucket stored data by the minute: ${agencyId}/${year}/${month}/${day}/${hour}/${minute}/
new S3 bucket stores data by the hour: ${agencyId}/${year}/${month}/${day}/${hour}/
functions have been adapted to the modified S3 setup and work now, but there is room for improvement. The way the code searches the bucket for new files within a timeframe is inefficient, so I would suggest opening an issue for a future improvement.
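
For reference, the renaming described above could be sketched roughly as follows. This is only an illustration: the two name pairs are the ones mentioned in this description, the direction of the mapping (new name to previous name) is an assumption, and `FIELD_RENAMES` and `toPreviousFieldNames` are hypothetical names that do not appear in the PR.

```javascript
// Hypothetical mapping from new field names to the previous ones.
// Only these two pairs are named in the PR description.
const FIELD_RENAMES = new Map([
  ['vehicleID', 'vid'],
  ['heading', 'bearing'],
]);

// Returns a copy of `vehicle` with known keys renamed.
// Keys without an entry in FIELD_RENAMES pass through unchanged.
function toPreviousFieldNames(vehicle) {
  const result = {};
  for (const [key, value] of Object.entries(vehicle)) {
    const renamed = FIELD_RENAMES.has(key) ? FIELD_RENAMES.get(key) : key;
    result[renamed] = value;
  }
  return result;
}
```

A table like this keeps the renames in one place, so the schema and resolver do not each need their own list of old and new names.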


youngj · 2022-02-11T02:59:48Z

README.md

+  state(agencyId: "trimet"
+    , startTime: 1642867201
+    , endTime: 1642867500
+    , routes: ["100"]) {
    agencyId
    startTime
    routes {
      routeId


To be consistent, can we either change agencyId and routeId to agencyID and routeID, or change vehicleID and tripID to vehicleId and tripId?

youngj · 2022-02-11T07:16:03Z

src/helpers/s3Helper.js

 * @param agencyId - String
 * @param currentTime - Number
 * @return prefix - String
 */
-function getBucketMinutePrefix(agencyId, currentTime) {
+function getBucketHourPrefix(agencyId, currentTime) {


The previous S3 path format with the minute was used because the S3 API allows searching for keys by prefix, and including the minute in the prefix of the path makes it possible to fetch data with a granularity of one minute. opentransit-metrics fetches data from opentransit-state-api in chunks that are not necessarily multiples of 1 hour.
If the minute were removed from the bucket prefix, it would require searching S3 for keys with a granularity of 1 hour and then filtering within those keys to find the ones with the requested minutes. I'm not sure that there is a drawback to keeping the minute in the prefix, so it's probably easier that way.

After reviewing the existing code in getVehiclePaths, I see that it makes separate s3.listObject requests to get the list of keys for every minute within the time range, which is a lot more API requests than necessary and might reduce performance. Now I think it would probably be faster to search for keys with a granularity of 1 hour and then filter within those keys to find the requested minutes.
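
A rough sketch of the hour-then-filter idea, with no S3 calls involved. The assumptions are mine, not taken from the PR: timestamps are epoch seconds, prefixes are unpadded UTC values with a 1-based month, and each key ends in `_<epochSeconds>.<ext>`. The helper names (`hourPrefix`, `hourPrefixesInRange`, `keyTimeFromName`, `filterKeysByTime`) are hypothetical.

```javascript
const SECONDS_PER_HOUR = 3600;

// Builds the UTC hour prefix for an epoch-seconds timestamp.
// ASSUMPTION: unpadded values and a 1-based month.
function hourPrefix(agencyId, epochSeconds) {
  const d = new Date(epochSeconds * 1000);
  return `${agencyId}/${d.getUTCFullYear()}/${d.getUTCMonth() + 1}/` +
    `${d.getUTCDate()}/${d.getUTCHours()}/`;
}

// Returns one prefix per UTC hour that overlaps [startTime, endTime].
function hourPrefixesInRange(agencyId, startTime, endTime) {
  const prefixes = [];
  const firstHour = Math.floor(startTime / SECONDS_PER_HOUR) * SECONDS_PER_HOUR;
  for (let t = firstHour; t <= endTime; t += SECONDS_PER_HOUR) {
    prefixes.push(hourPrefix(agencyId, t));
  }
  return prefixes;
}

// ASSUMPTION: keys look like "<prefix><name>_<epochSeconds>.<ext>".
// Returns the timestamp, or null when the key does not match.
function keyTimeFromName(key) {
  const match = /_(\d+)\.[^/.]+$/.exec(key);
  return match ? Number(match[1]) : null;
}

// Keeps only keys whose timestamp lies inside [startTime, endTime].
function filterKeysByTime(keys, startTime, endTime, getKeyTime = keyTimeFromName) {
  return keys.filter((key) => {
    const t = getKeyTime(key);
    return t !== null && t >= startTime && t <= endTime;
  });
}
```

With this shape, a time range needs one list request per hour prefix rather than one per minute, and the minute-level selection happens locally on the returned keys.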

youngj · 2022-02-11T07:21:14Z

src/helpers/s3Helper.js

-  const hour = currentDateTime.getUTCHours();
-  const minute = currentDateTime.getUTCMinutes();
-  return `${agencyId}/${year}/${month}/${day}/${hour}/${minute}/`;
+  const pacificDateTime = convertTZ(currentDateTime, 'America/Los_Angeles');


Using America/Los_Angeles as the time zone could cause issues related to daylight saving time: when switching from PDT to PST in the fall, there would be two hours with the same key prefix.

Using America/Los_Angeles would also be confusing when deploying opentransit-state-api for transit agencies that are not in Pacific time. It is also possible that a single instance of opentransit-state-api could support transit agencies in multiple time zones. It would probably be simpler to keep the S3 keys in UTC.
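
To illustrate the collision, here is a small sketch comparing a local-time prefix with a UTC prefix around the fall 2022 switch (November 6). It assumes epoch-second timestamps and unpadded path components; `localHourPrefix` and `utcHourPrefix` are hypothetical helpers written for this comment, not the PR's `convertTZ` code.

```javascript
// Hour prefix computed in a named time zone via Intl.DateTimeFormat.
function localHourPrefix(agencyId, epochSeconds, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    year: 'numeric',
    month: 'numeric',
    day: 'numeric',
    hour: 'numeric',
  }).formatToParts(new Date(epochSeconds * 1000));
  // Number() drops any zero padding the formatter may add.
  const get = (type) => Number(parts.find((p) => p.type === type).value);
  return `${agencyId}/${get('year')}/${get('month')}/${get('day')}/${get('hour')}/`;
}

// Hour prefix computed in UTC, which has no repeated hours.
function utcHourPrefix(agencyId, epochSeconds) {
  const d = new Date(epochSeconds * 1000);
  return `${agencyId}/${d.getUTCFullYear()}/${d.getUTCMonth() + 1}/` +
    `${d.getUTCDate()}/${d.getUTCHours()}/`;
}

// Two instants one hour apart: 01:30 PDT and 01:30 PST on 2022-11-06.
const beforeSwitch = Date.UTC(2022, 10, 6, 8, 30) / 1000;
const afterSwitch = Date.UTC(2022, 10, 6, 9, 30) / 1000;

// Both instants map to the same Pacific-time prefix...
console.log(localHourPrefix('trimet', beforeSwitch, 'America/Los_Angeles'));
console.log(localHourPrefix('trimet', afterSwitch, 'America/Los_Angeles'));
// ...while the UTC prefixes stay distinct.
console.log(utcHourPrefix('trimet', beforeSwitch));
console.log(utcHourPrefix('trimet', afterSwitch));
```

With local-time keys, data from both of those hours would land under one prefix, so anything that derives a time range from the prefix alone could not tell them apart.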

sidetrackedmind and others added 5 commits January 25, 2022 19:43

minor updates to adapt for restart

44d306c

changed timestamps in resolver from millisecond to second

40ef714

update readme and query example for new schema

8ffbec9

avoid crashing on errors extracting data from s3; show s3 path that c…

f25028d

…aused error

Merge pull request #1 from codeforpdx/youngj-s3-errors

c4394da

avoid crashing on errors extracting data from s3; show s3 path that caused error

sidetrackedmind requested review from youngj, kgottfri and liamphmurphy February 9, 2022 20:50

youngj reviewed Feb 11, 2022


kgottfri approved these changes Feb 22, 2022

