Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/att event multi calendars #158

Merged
merged 5 commits into from
Feb 2, 2025
Merged

Conversation

ryanaguilar
Copy link
Contributor

@ryanaguilar ryanaguilar commented Dec 6, 2024

Description & motivation

This PR addresses an edge case that occured in the BPS Stadium implementation where a student had multiple enrollments associated with different calendars per enrollment, which resulted in a student's attendance event not contributing to their attendance count.

Currently, the attendance_event_date is converted into a k_calendar_date by joining to fct_student_school_association and deduping by selecting enrollments where is_latest_annual_entry is true.

This updated model would instead join so that the attendance_event_date is within the range of the enrollment's entry_date and exit_withdraw_date (and accounts for null exit_withdraw_dates).

However, if a student has overlapping enrollments at the same school, multiple rows will
be returned for each date, and so an additional deduplication step is introduced.

See the original Monday item here.

Breaking changes introduced by this PR:

No breaking changes, but in rare cases could change attendance count.

PR Merge Priority:

  • Medium

Changes to existing files:

New files created:

Tests and QC done:

Future ToDos & Questions:

  • Further discussion may be needed on how to pick a winner when there overlapping enrollments with multiple calendars at the same school. The rule implemented here is one option that was presented during planning.

edu_wh PR Review Checklist:

Make sure the following have been completed before approving this PR:

  • Description of changes has been added to Unreleased section of CHANGELOG.md. Add under ## New Features for features, etc.
  • If a new configuration xwalk was added:
    • The code is written such that the xwalk is optional (preferred), and this behavior was tested, OR
    • The code is written such that the xwalk is required, and the required xwalk is added to edu_project_template, and this PR is flagged as breaking change (not for patch release)
    • A description for the new xwalk has been added to EDU documentation site here
  • If a new configuration variable was added:
    • The code is written such that the variable is optional (preferred), and this behavior was tested, OR
    • The code is written such that the variable is required, and a default value was added to edu_project_template, and this PR is flagged as breaking change (not for patch release)
    • A description for the new variable has been added to EDU documentation site here
  • Reviewer confirms the grain of all tables are unchanged, OR any changes are expected, communicated, and this PR is flagged as a breaking change (not for patch release)

@ryanaguilar ryanaguilar requested a review from rlittle08 December 6, 2024 17:08
@ryanaguilar ryanaguilar marked this pull request as ready for review January 23, 2025 21:16
and stg_stu_sch_attend.attendance_event_date = dim_calendar_date.calendar_date
on fct_student_school_assoc.k_school_calendar = dim_calendar_date.k_school_calendar
and stg_stu_sch_attend.attendance_event_date = dim_calendar_date.calendar_date
and dim_calendar_date.calendar_date between fct_student_school_assoc.entry_date and coalesce(fct_student_school_assoc.exit_withdraw_date,getdate())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any specific to use getdate() (returns a timestamp) instead of current_date() (returns a datestamp) here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No specific reason, I don't think I realized there was a difference.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it affects the logic in this case, but I prefer current_date(), since the extra precision isn't needed and current_date() is available in databricks

@rlittle08
Copy link
Collaborator

I tested and deeply qc'd in Boston, TX, SC. Only changes were expected improvements. LGTM

@rlittle08 rlittle08 merged commit 8b51f86 into main Feb 2, 2025
@rlittle08 rlittle08 deleted the fix/att_event_multi_calendars branch February 2, 2025 22:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants