Feature/Contacts and StudentContactAssociations #96

Merged
merged 14 commits into from
Sep 13, 2024
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,9 @@
# Unreleased
## New features
- Add base/stage models for `contacts` and `student_contact_associations`, needed because Parent was renamed to Contact in Ed-Fi Data Standard v5.0.
- Rename `k_parent` to `k_contact` in `stg_ef3__survey_responses`.
## Under the hood
- Add columns to `base_ef3__parents` to allow data to be unioned into new `stg_ef3__contacts` model.
## Fixes

# edu_edfi_source v0.3.6
5 changes: 5 additions & 0 deletions macros/gen_skey.sql
@@ -190,6 +190,11 @@
'col_list': ['learningStandardId'],
'annualize': True
},
'k_contact': {
'reference_name': 'contact_reference',
'col_list': ['contactUniqueId'],
'annualize': False
},

'k_template': {
'reference_name': '',
45 changes: 45 additions & 0 deletions models/staging/edfi_3/base/base_ef3__contacts.sql
@@ -0,0 +1,45 @@
with contacts as (
{{ source_edfi3('contacts') }}
),
renamed as (
select
tenant_code,
api_year,
pull_timestamp,
file_row_number,
last_modified_timestamp,
filename,
is_deleted,

v:id::string as record_guid,
v:contactUniqueId::string as contact_unique_id,
v:personReference:personId::string as person_id,
v:firstName::string as first_name,
v:middleName::string as middle_name,
v:lastSurname::string as last_name,
v:maidenName::string as maiden_name,
v:generationCodeSuffix::string as generation_code_suffix,
v:personalTitlePrefix::string as personal_title_prefix,
v:genderIdentity::string as gender_identity,
v:preferredFirstName::string as preferred_first_name,
v:preferredLastSurname::string as preferred_last_name,
v:loginId::string as login_id,
{{ extract_descriptor('v:sexDescriptor::string') }} as sex,
{{ extract_descriptor('v:highestCompletedLevelOfEducationDescriptor::string') }} as highest_completed_level_of_education,
{{ extract_descriptor('v:personReference:sourceSystemDescriptor::string') }} as person_source_system,
-- references
v:personReference as person_reference,
-- unflattened lists
v:addresses as v_addresses,
v:internationalAddresses as v_international_addresses,
v:electronicMails as v_electronic_mails,
v:telephones as v_telephones,
v:languages as v_languages,
v:otherNames as v_other_names,
v:personalIdentificationDocuments as v_personal_identification_documents,

-- edfi extensions
v:_ext as v_ext
from contacts
)
select * from renamed
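
Ed-Fi descriptor values typically arrive as namespaced URIs such as `uri://ed-fi.org/SexDescriptor#Female`. The following is a minimal sketch of the kind of transformation `extract_descriptor` is presumably performing, assuming it keeps only the code value after the `#`. The macro's implementation is not shown in this diff and may differ, and the sample values are made up.

```sql
-- Plain Snowflake SQL, not the macro itself.
-- Assumption: extract_descriptor keeps the text after the final '#'.
select
    descriptor,
    split_part(descriptor, '#', -1) as code_value
from (values
    ('uri://ed-fi.org/SexDescriptor#Female'),
    ('uri://ed-fi.org/LevelOfEducationDescriptor#Doctorate')
) as t (descriptor);
```
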
5 changes: 5 additions & 0 deletions models/staging/edfi_3/base/base_ef3__parents.sql
@@ -20,6 +20,11 @@ renamed as (
v:maidenName::string as maiden_name,
v:generationCodeSuffix::string as generation_code_suffix,
v:personalTitlePrefix::string as personal_title_prefix,
-- the following three fields were introduced to the Contacts resource, which replaced Parents in v5.0
-- including them here (they will always be null) to allow the tables to be unioned in stage
v:genderIdentity::string as gender_identity,
v:preferredFirstName::string as preferred_first_name,
v:preferredLastSurname::string as preferred_last_name,
v:loginId::string as login_id,
{{ extract_descriptor('v:sexDescriptor::string') }} as sex,
{{ extract_descriptor('v:highestCompletedLevelOfEducationDescriptor::string') }} as highest_completed_level_of_education,
@@ -0,0 +1,30 @@
with student_contact_associations as (
{{ source_edfi3('student_contact_associations') }}
),
renamed as (
select
tenant_code,
api_year,
pull_timestamp,
last_modified_timestamp,
file_row_number,
filename,
is_deleted,

v:id::string as record_guid,
v:contactPriority::int as contact_priority,
v:contactRestrictions::string as contact_restrictions,
v:emergencyContactStatus::boolean as is_emergency_contact,
v:livesWith::boolean as is_living_with,
v:primaryContactStatus::boolean as is_primary_contact,
v:legalGuardian::boolean as is_legal_guardian,
{{ extract_descriptor('v:relationDescriptor::string') }} as relation_type,
-- references
v:contactReference as contact_reference,
v:studentReference as student_reference,

-- edfi extensions
v:_ext as v_ext
from student_contact_associations
)
select * from renamed
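
The `v:field::type` expressions in this model use Snowflake's VARIANT path syntax with an explicit cast. A small sketch of how those expressions behave on a single JSON record, assuming `v` is a VARIANT column holding the raw API payload; the payload below is invented for illustration.

```sql
-- Plain Snowflake SQL; the JSON document is a made-up example payload.
with raw as (
    select parse_json('{
        "contactPriority": 1,
        "livesWith": true,
        "contactReference": {"contactUniqueId": "C-1001"}
    }') as v
)
select
    v:contactPriority::int                    as contact_priority,
    v:livesWith::boolean                      as is_living_with,
    -- nested keys are reached by chaining path segments
    v:contactReference:contactUniqueId::string as contact_unique_id,
    -- a key absent from the payload yields NULL rather than an error
    v:legalGuardian::boolean                  as is_legal_guardian
from raw;
```
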
8 changes: 4 additions & 4 deletions models/staging/edfi_3/base/base_ef3__survey_responses.sql
@@ -22,10 +22,10 @@ renamed as (
v:responseDate::date as response_date,
v:responseTime::int as completion_time_seconds,
--references
- v:surveyReference as survey_reference,
- v:studentReference as student_reference,
- v:staffReference as staff_reference,
- v:parentReference as parent_reference,
+ v:surveyReference as survey_reference,
+ v:studentReference as student_reference,
+ v:staffReference as staff_reference,
+ coalesce(v:parentReference, v:contactReference) as contact_reference, -- parentReference renamed to contactReference in Data Standard v5.0
-- lists
v:surveyLevels as v_survey_levels,
-- edfi extensions
37 changes: 37 additions & 0 deletions models/staging/edfi_3/stage/stg_ef3__contacts.sql
@@ -0,0 +1,37 @@
with base_contacts as (
select * from {{ ref('base_ef3__contacts') }}
where not is_deleted
),
base_parents as (
select * rename parent_unique_id as contact_unique_id
from {{ ref('base_ef3__parents') }}
where not is_deleted
),
-- parents were renamed to contacts in Data Standard v5.0
unioned as (
select * from base_contacts
union
select * from base_parents
),
keyed as (
select
{{ dbt_utils.surrogate_key(
[
'tenant_code',
'lower(contact_unique_id)'
]
) }} as k_contact,
unioned.*
{{ extract_extension(model_name=this.name, flatten=True) }}
from unioned
),
deduped as (
{{
dbt_utils.deduplicate(
relation='keyed',
partition_by='k_contact',
order_by='api_year desc, pull_timestamp desc'
)
}}
)
select * from deduped
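
On Snowflake, `dbt_utils.deduplicate` generally amounts to keeping the first row per partition with `qualify row_number() ... = 1`. Below is a minimal sketch of that behaviour for the `deduped` CTE, written as plain Snowflake SQL with made-up rows; the exact SQL the macro compiles to depends on the dbt_utils version.

```sql
-- Plain Snowflake SQL; rows are invented, and ISO date strings stand in for timestamps.
with keyed as (
    select * from (values
        ('k1', 2023, '2023-09-01'),  -- e.g. a row from the deprecated parents endpoint
        ('k1', 2024, '2024-09-01'),  -- the same person from the contacts endpoint
        ('k2', 2024, '2024-09-02')
    ) as t (k_contact, api_year, pull_timestamp)
)
select *
from keyed
-- keep one row per k_contact: the latest api_year, then the latest pull
qualify row_number() over (
    partition by k_contact
    order by api_year desc, pull_timestamp desc
) = 1;
```

With this ordering, a person who appears under both endpoints keeps only the row from the most recent `api_year`.
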
@@ -0,0 +1 @@
{{ flatten_addresses('stg_ef3__contacts', ['k_contact']) }}
1 change: 1 addition & 0 deletions models/staging/edfi_3/stage/stg_ef3__contacts__emails.sql
@@ -0,0 +1 @@
{{ flatten_emails('stg_ef3__contacts', ['k_contact']) }}
@@ -0,0 +1 @@
{{ flatten_telephones('stg_ef3__contacts', ['k_contact']) }}
@@ -0,0 +1,45 @@
with base_stu_contact as (
select
*,
contact_reference:contactUniqueId as contact_unique_id
from {{ ref('base_ef3__student_contact_associations') }}
where not is_deleted
),
base_stu_parent as (
select
*,
parent_reference:parentUniqueId as contact_unique_id --rename to support union and key generation
from {{ ref('base_ef3__student_parent_associations') }}
where not is_deleted
),
-- parents were renamed to contacts in Data Standard v5.0
unioned as (
select * from base_stu_contact
union
select * from base_stu_parent
),
keyed as (
select
{{ gen_skey('k_student') }},
-- we can't use the gen_skey macro here because we're bringing in the deprecated parents endpoint data, which contains a parentReference that won't work
iff(
contact_unique_id is not null,
md5(cast(coalesce(cast(tenant_code as TEXT), '') || '-' || coalesce(cast(lower(contact_unique_id) as TEXT), '') as TEXT)),

Review comment (Collaborator): Could you use dbt_utils.surrogate_key? And what's the case where contact_unique_id is null?

Reply (Contributor Author): Yep, good call! Updated to use surrogate_key. And it shouldn't ever be null; that's a mistake on my part.

null
)::varchar(32) as k_contact,
{{ gen_skey('k_student_xyear') }},
api_year as school_year,
unioned.*
{{ extract_extension(model_name=this.name, flatten=True) }}

Review comment (Collaborator): Will this lose extensions from student_parent_associations?

Reply (Contributor Author): I don't think so, because the two v_ext columns have been unioned together.

Reply (Collaborator): But I think the macro will only extract the extensions configured under this.name, which is stg_ef3__student_contact_associations.

Reply (Contributor Author): The macro can take a list of model names, so I added the extensions configured under parents: {{ extract_extension(model_name=[this.name, 'stg_ef3__parents'], flatten=True) }}. I don't have an implementation with extensions to test this on, but it should work based on the documentation of the macro.

from unioned
),
deduped as (
{{
dbt_utils.deduplicate(
relation='keyed',
partition_by='k_student, k_contact',
order_by='pull_timestamp desc'
)
}}
)
select * from deduped
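
The `k_contact` expression in the `keyed` CTE hashes `tenant_code` together with the lowercased `contact_unique_id`, which is the same pair of inputs passed to `dbt_utils.surrogate_key` in `stg_ef3__contacts`. A minimal sketch of what that hash does, as plain Snowflake SQL with made-up values; it illustrates the key's properties and is not the merged code.

```sql
-- Plain Snowflake SQL; tenant codes and IDs are invented for illustration.
with unioned as (
    select * from (values
        ('tenant_a', 'ABC123'),
        ('tenant_a', 'abc123'),  -- same ID, different casing
        ('tenant_b', 'abc123')   -- same ID, different tenant
    ) as t (tenant_code, contact_unique_id)
)
select
    tenant_code,
    contact_unique_id,
    md5(
        coalesce(cast(tenant_code as text), '')
        || '-' ||
        coalesce(cast(lower(contact_unique_id) as text), '')
    ) as k_contact
from unioned;
```

The first two rows hash the same input string and so share a key, while the third row differs because the tenant is part of the hash. Since `md5` returns a 32-character hex string, the `::varchar(32)` cast in the model fits the result exactly.
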
2 changes: 1 addition & 1 deletion models/staging/edfi_3/stage/stg_ef3__survey_responses.sql
@@ -15,7 +15,7 @@ keyed as (
{{ gen_skey('k_survey') }},
{{ gen_skey('k_staff') }},
{{ gen_skey('k_student') }},
- {{ gen_skey('k_parent') }},
+ {{ gen_skey('k_contact') }},
base_survey_responses.*
{{ extract_extension(model_name=this.name, flatten=True) }}
from base_survey_responses