Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: OL Crowding Heuristics Implementation #1879

Merged
merged 25 commits into from
Nov 7, 2023
Merged

Conversation

cmaddox5
Copy link
Contributor

Asana task: Add logging to triptych app - pt 2

A lot of changes here are refactoring so that credo doesn't complain about too many parameters or functions having too many nested conditionals. The meat of this PR is in get_instance_using_dwell_time. The gist is that we wait for the train to be stopped at the previous station. Once it is stopped, we get the prediction for that particular part of the trip in order to get the predicted dwell time (departure_time - arrival_time). When now is >= to that time (minus 10 seconds just to give ourselves a little time before the train is supposed to leave), we want to show the widget. The original logic of the train being in transit to the current station will always take precedent. This is only to be used as an extra effort to show the widget a little sooner.

  • Tests added?

@cmaddox5 cmaddox5 requested review from a team and hannahpurcell and removed request for a team September 29, 2023 14:06
@hannahpurcell
Copy link
Contributor

hannahpurcell commented Oct 11, 2023

Once it is stopped, we get the prediction for that particular part of the trip in order to get the predicted dwell time (departure_time - arrival_time). When now is >= to that time (minus 10 seconds just to give ourselves a little time before the train is supposed to leave), we want to show the widget.

Before reviewing, I have questions on the PR description: do we need to calculate the dwell time? It's just the departure time minus 10 seconds.

Also, we don't want to show the widget in this case. We want to log to Splunk to say that "heuristic #1 was met" and include the crowding level at that time. Then we'll make a Splunk dashboard to answer these questions:

  • ...for what % of trips does that heuristic log appear BEFORE we'd normally show the widget?
  • ...what % of the time does the crowding level change between that heuristic log when the widget actually shows up?

I remember there being 2 heuristics to evaluate, and it seems like this PR only captures one. The second one is: two back-to-back readings of crowding are identical.

This might be TMI at this point, but rewriting the problem this way helped it click for me. We're trying to figure out if we we can have high confidence crowding predictions earlier than just "once the train has left a station". Can we have high confidence predictions when:

  • The train is scheduled to depart a station in 10s? (first heuristic)
  • Two back-to-back readings of crowding were identical? (second heuristic)

Edit: Wanted to add, the task should have been written with more and clearer detail!! The task description absolutely did not capture this work well.

@cmaddox5
Copy link
Contributor Author

Before reviewing, I have questions on the PR description: do we need to calculate the dwell time? It's just the departure time minus 10 seconds.

This is definitely a case of me overthinking what we needed... That does seem exactly right so unless I run into problems with that, I will make that change.

Also, we don't want to show the widget in this case.

Guess I got too excited trying to add the widget to the screen more often 😅 . Easy to fix though.

I remember there being 2 heuristics to evaluate, and it seems like this PR only captures one. The second one is: two back-to-back readings of crowding are identical.

This reminds me that I had a concern about this heuristic at the time and completely forgot to communicate it. Are these for two consecutive screen refreshes or some other interval to be done in the background? Both I feel pose different challenges. Two consecutive screen refreshes seems like most screens will never have a chance to catch that (although if we are just logging, not really a big issue. Just may not be a heuristic most stations use). A background interval isn't really a problem, but will mean I need to spin up a different background process to track it.

@cmaddox5 cmaddox5 assigned hannahpurcell and unassigned cmaddox5 Oct 11, 2023
@hannahpurcell
Copy link
Contributor

Are these for two consecutive screen refreshes or some other interval to be done in the background?

Ah, I don't remember. Maybe ask Paul H? We decided during that heuristics meeting and I cannot at all remember which way we decided

@github-actions
Copy link

Coverage of commit 6814b0f

Summary coverage rate:
  lines......: 40.5% (2323 of 5731 lines)
  functions..: 41.9% (1054 of 2518 functions)
  branches...: no data found

Files changed coverage rate:
                                                                           |Lines       |Functions  |Branches    
  Filename                                                                 |Rate     Num|Rate    Num|Rate     Num
  ===============================================================================================================
  lib/screens/application.ex                                               |75.0%      4|50.0%     2|    -      0
  lib/screens/departures/departure.ex                                      | 0.0%    152| 0.0%    30|    -      0
  lib/screens/ol_crowding/agent.ex                                         |60.0%      5|75.0%     4|    -      0
  lib/screens/ol_crowding/dynamic_supervisor.ex                            |50.0%      4|66.7%     3|    -      0
  lib/screens/ol_crowding/logger.ex                                        | 0.0%     20| 0.0%     4|    -      0
  lib/screens/v2/candidate_generator/widgets/train_crowding.ex             |64.5%     76|54.5%    11|    -      0

Download coverage report

@cmaddox5
Copy link
Contributor Author

Added an additional heuristic for consecutive crowing classes. Heuristic logic now live in their own function pick_heuristic. Dwell time logic is the same as before (except we look at departure_time instead of calculating it). New logic is simply log when two consecutive predictions have the same crowding classes.

@mbta mbta deleted a comment from github-actions bot Oct 23, 2023
@cmaddox5
Copy link
Contributor Author

which one did you end up choosing?

2 consecutive screen refreshes. The time-based heuristic will always get priority just so we aren't forever looking for 2 consecutive classes that are the same. Each refresh that still isn't currently past that time will compare to the previous. If it's the same, log it. Otherwise, replace what is in the cache with the new prediction.

@github-actions
Copy link

Coverage of commit f97ff2c

Summary coverage rate:
  lines......: 39.5% (2163 of 5471 lines)
  functions..: 38.5% (873 of 2268 functions)
  branches...: no data found

Files changed coverage rate:
                                                                   |Lines       |Functions  |Branches    
  Filename                                                         |Rate     Num|Rate    Num|Rate     Num
  =======================================================================================================
  lib/screens/application.ex                                       |75.0%      4|50.0%     2|    -      0
  lib/screens/departures/departure.ex                              | 0.0%    152| 0.0%    30|    -      0
  lib/screens/ol_crowding/agent.ex                                 |60.0%      5|75.0%     4|    -      0
  lib/screens/ol_crowding/dynamic_supervisor.ex                    |50.0%      4|66.7%     3|    -      0
  lib/screens/ol_crowding/logger.ex                                | 0.0%     19| 0.0%     4|    -      0
  lib/screens/v2/candidate_generator/widgets/train_crowding.ex     |53.9%     89|46.2%    13|    -      0

Download coverage report

@cmaddox5
Copy link
Contributor Author

After our chat in Slack, I agree that these heuristics should be independent. That has been changed.

@github-actions
Copy link

Coverage of commit f91f869

Summary coverage rate:
  lines......: 39.5% (2163 of 5472 lines)
  functions..: 38.5% (873 of 2269 functions)
  branches...: no data found

Files changed coverage rate:
                                                                   |Lines       |Functions  |Branches    
  Filename                                                         |Rate     Num|Rate    Num|Rate     Num
  =======================================================================================================
  lib/screens/application.ex                                       |75.0%      4|50.0%     2|    -      0
  lib/screens/departures/departure.ex                              | 0.0%    152| 0.0%    30|    -      0
  lib/screens/ol_crowding/agent.ex                                 |60.0%      5|75.0%     4|    -      0
  lib/screens/ol_crowding/dynamic_supervisor.ex                    |50.0%      4|66.7%     3|    -      0
  lib/screens/ol_crowding/logger.ex                                | 0.0%     19| 0.0%     4|    -      0
  lib/screens/v2/candidate_generator/widgets/train_crowding.ex     |53.3%     90|42.9%    14|    -      0

Download coverage report

Copy link
Contributor

@hannahpurcell hannahpurcell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some clean-up thoughts, but I think the logic is sound 👍

lib/screens/ol_crowding/dynamic_supervisor.ex Show resolved Hide resolved
Comment on lines 118 to 121
is_nil(next_train_prediction) or
alert_makes_this_a_terminal or
next_train_prediction.vehicle.carriages == [] ->
[]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since get_instance is its own function, this can all be done with pattern matching in the function def. Optional suggestion

cached_prediction,
common_params
) do
show_widget_after_dt = DateTime.add(cached_prediction.departure_time, -10)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should maybe call this var show_widget_before_dt

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well now needs to be >= this dt. I added a comment to this var that should help. Let me know if it's still unclear.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think this value should be called something else, because it suggests we're logging after departure time, but really we're logging before departure time; that is, 10s before departure time. now needs to be later than that time, which is still before the vehicle has even departed.

TL;DR the value represented by this variable is pre-departure, whether by 10s, 15s, or more.

:gt
] do
log_crowding_info(
:dwell,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe adjust this to match the "rebrand" of time-based heuristic. Perhaps :some_time_before_departure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I went with :time_based because that's what I call the heuristic in code. Is that name clear or is it too general?

Copy link
Contributor

@hannahpurcell hannahpurcell Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on the fence, but I suppose we can leave it for now (until we add another time based heuristic). But there is another comment referring to dwell time on line 284 that could use clarification

@cmaddox5 cmaddox5 assigned hannahpurcell and unassigned cmaddox5 Oct 30, 2023
@github-actions
Copy link

Coverage of commit 6a469db

Summary coverage rate:
  lines......: 39.5% (2159 of 5471 lines)
  functions..: 38.5% (873 of 2269 functions)
  branches...: no data found

Files changed coverage rate:
                                                                   |Lines       |Functions  |Branches    
  Filename                                                         |Rate     Num|Rate    Num|Rate     Num
  =======================================================================================================
  lib/screens/application.ex                                       |75.0%      4|50.0%     2|    -      0
  lib/screens/departures/departure.ex                              | 0.0%    152| 0.0%    30|    -      0
  lib/screens/ol_crowding/agent.ex                                 |60.0%      5|75.0%     4|    -      0
  lib/screens/ol_crowding/dynamic_supervisor.ex                    |50.0%      4|66.7%     3|    -      0
  lib/screens/ol_crowding/logger.ex                                | 0.0%     19| 0.0%     4|    -      0
  lib/screens/v2/candidate_generator/widgets/train_crowding.ex     |49.4%     89|42.9%    14|    -      0

Download coverage report

) do
show_widget_after_dt = DateTime.add(cached_prediction.departure_time, -10)
# cached_prediction.departure_time minus previous_departure_time_cushion is when we expect crowding to be reliable.
# When now >= this time, show the widget.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still refers to showing the widget. Perhaps instead:

Suggested change
# When now >= this time, show the widget.
# When now >= this time, log it.

Copy link
Contributor

@hannahpurcell hannahpurcell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pending two other tweaks, you should be good to go

Copy link

Coverage of commit 38d213c

Summary coverage rate:
  lines......: 39.5% (2159 of 5471 lines)
  functions..: 38.5% (873 of 2269 functions)
  branches...: no data found

Files changed coverage rate:
                                                                   |Lines       |Functions  |Branches    
  Filename                                                         |Rate     Num|Rate    Num|Rate     Num
  =======================================================================================================
  lib/screens/application.ex                                       |75.0%      4|50.0%     2|    -      0
  lib/screens/departures/departure.ex                              | 0.0%    152| 0.0%    30|    -      0
  lib/screens/ol_crowding/agent.ex                                 |60.0%      5|75.0%     4|    -      0
  lib/screens/ol_crowding/dynamic_supervisor.ex                    |50.0%      4|66.7%     3|    -      0
  lib/screens/ol_crowding/logger.ex                                | 0.0%     19| 0.0%     4|    -      0
  lib/screens/v2/candidate_generator/widgets/train_crowding.ex     |49.4%     89|42.9%    14|    -      0

Download coverage report

@cmaddox5 cmaddox5 merged commit 0f03c50 into master Nov 7, 2023
2 checks passed
@cmaddox5 cmaddox5 deleted the cm/ol-heuristics branch November 7, 2023 14:47
cmaddox5 added a commit that referenced this pull request Nov 21, 2023
* Added departure_time in epoch seconds to response. (#1911)

* feature: New widget endpoint for einks (#1909)

* Pull last deploy timestamp from config cache, add to data response

* Removed unneeded ResponseMapper

* Added frontend for bus einks

* feat: OL Crowding Heuristics Implementation (#1879)

* Added translation to logger.

* Started on heuristics work.

* Changed what value is saved in Agent.

* Fixed logic for NB triptychs.

* Tweaked shutdown logic for accuracy logger.

* Removed inspect.

* Dialyzer.

* Changed heuristic to only log, not show widget.

* Changed time we use to determine heuristic.

* Changed variable name.

* Added an additional heuristic for consecutive crowding classes.

* Added nil check.

* Refactored heuristics so they run independently.

* Improved conditional.

* Added a log.

* Improved comments.

* Made hardcoded value a parameter.

* Changed log scenario name.

* Improved var name.

* Credo.

* Addressed comments.

* feat: Mercury GL E-Ink audio (#1913)

* Added audio SSML to screen data response on gl einks with audio configured.

* Updated screens_config.

* Tweaked data so we always include the key even if there is no audio.

* Added audio column to GL & PreFare in admin table (#1916)

* Add filter to make sure ID is a number. (#1915)

* fix: Skip CSRF protection on /widget POSTs since they really fetch a new HTML page (#1918)

* fix: Skip CSRF protection on /widget POSTs since they really fetch a new HTML page

* Build browser pipeline atop browser_no_csrf pipeline

* Added suppressions for GL Surge. (#1919)

* Added suppressions for GL Surge.

* Fixed Govt Ctr alert ids.

* Added a GL surge suppression at Kenmore.

* Alert ID fix.

---------

Co-authored-by: Hannah Purcell <69368883+hannahpurcell@users.noreply.github.com>
Co-authored-by: Jon Zimbel <63608771+jzimbel-mbta@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants