Feature/assessment normalizations #65

rlittle08 · 2023-07-12T21:16:07Z

Overview

Assessment score results in the ODS are often not in the format needed for downstream dashboards or analytical queries. This branch adds the ability to create "normalized" columns in fct_student_assessment and fct_student_objective_assessment, where each implementation can customize how these values are mapped. For example, you may want to map a "Percentile" result on a 1-100 scale to a "performance_level" on a 1-5 scale. Or you may want to add ordered integer values, e.g. "Low -> 1; Medium -> 2; High -> 3".
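To make the two examples above concrete, here is a minimal Python sketch of the two mapping styles: an exact-value lookup and a range (threshold) lookup. The dictionary, the band boundaries, and the function names are illustrative assumptions, not the contents of the real xwalk seeds or the SQL in this branch.

```python
# Hypothetical stand-ins for the two kinds of crosswalk rows; the values
# here are illustrative and are not copied from the real seed files.
VALUE_MAP = {"Low": "1", "Medium": "2", "High": "3"}

# (lower_bound, upper_bound, normalized_score_result), bounds inclusive.
PERCENTILE_THRESHOLDS = [
    (1, 20, "1"),
    (21, 40, "2"),
    (41, 60, "3"),
    (61, 80, "4"),
    (81, 100, "5"),
]


def normalize_by_value(score_result):
    """Exact-match lookup, e.g. 'Medium' -> '2'. Returns None if unmapped."""
    return VALUE_MAP.get(score_result)


def normalize_by_threshold(score_result):
    """Range lookup, e.g. percentile 67 -> performance level '4'."""
    for lower, upper, normalized in PERCENTILE_THRESHOLDS:
        if lower <= score_result <= upper:
            return normalized
    return None


print(normalize_by_value("Medium"))  # 2
print(normalize_by_threshold(67))    # 4
```

Both lookups return None for anything they do not cover, which corresponds to a row that finds no match in the respective xwalk.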

Description of Changes

In bld_ef3__student_assessment_long_results and bld_ef3__student_objective_assessment_long_results, add normalized_score_result to the model, to allow for normalization of results.
In fct_student_assessment and fct_student_objective_assessment, create new columns normalized_{score_name} wherever they have been configured. E.g. if you add rows to xwalk_assessment_score_values.csv for performance_level, a new column normalized_performance_level will be created with the normalized score results.

Dependent on:

These xwalks added to implementation repo:

xwalk_assessment_score_values
xwalk_assessment_score_value_thresholds
xwalk_objective_assessment_score_values
xwalk_objective_assessment_score_value_thresholds

Example PR: https://github.com/edanalytics/stadium_txdemo/pull/9

Questions:

Is "normalized" the right wording for this kind of customization? I don't want to confuse "display values" with "re-scaled values", but there is overlap there.
Is it right to overload a general "normalized_" column with various use cases, when some may need to be integers vs. characters, etc.?
Generally, are edu_wh models the correct place for this kind of normalization?
When should re-mapping of values live in student warehouse tables, vs. in dimensions or xwalks?

TODOs:

Work on a larger "assessments engine" feature that separates normalized info from assessment-specific info & efficiently serves those data to downstream purposes
In the future, we may need to allow these xwalks to join separately on subject, grade level, etc.

jalvord1

As an overall comment, these changes are minimal enough that I think it's fine to add them to edu_wh, even though we will continue to add more assessment reporting/normalization features down the line. I don't see any strong reason not to add normalized score columns to the fact tables, even if we end up doing more normalization downstream in the future, since score normalization is typically the first ask.

jalvord1 · 2023-07-12T21:39:54Z

models/build/edfi_3/assessments/bld_ef3__student_assessments_long_results.sql

+        dedupe_results.score_result,
+        coalesce(xwalk_score_value_thresholds.normalized_score_result::varchar,
+                 xwalk_score_values.normalized_score_result::varchar,
+                 score_result::varchar


We talked about whether the original score result should be the default when no normalization happens for a score, and leaned toward yes, to cover the case where normalization is not necessary. The downside is that a score that should be normalized but isn't yet included in either normalization xwalk will make its way into this column in an ugly format that might not match what reporting needs. However, for a score to be included here at all, it must first be added to the xwalk_assessment_scores xwalk, so there is at least one manual step that has to happen anyway. Someone might not know that this normalized column exists, or that its values should be integers if it's a performance level (as a random example), but I think we can communicate this out, and it avoids having to map values to themselves.
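The fallback order discussed here can be sketched in Python as follows. This is only an illustration of the coalesce ordering visible in the diff (thresholds match, then values match, then the raw result); the function and parameter names are hypothetical.

```python
def coalesce(*values):
    """Return the first argument that is not None, like SQL COALESCE."""
    for value in values:
        if value is not None:
            return value
    return None


def normalized_score_result(score_result, threshold_match=None, value_match=None):
    """Prefer a thresholds match, then a values match, then the raw result.

    Every candidate is cast to a string, mirroring the ::varchar casts.
    """
    return coalesce(
        None if threshold_match is None else str(threshold_match),
        None if value_match is None else str(value_match),
        None if score_result is None else str(score_result),
    )


print(normalized_score_result("87", threshold_match=4))         # 4
print(normalized_score_result("Proficient", value_match="3"))   # 3
print(normalized_score_result("Level III"))                     # Level III
```

The last call is the "ugly format" case: a score with no row in either xwalk passes through unchanged.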

jalvord1 · 2023-07-12T21:40:46Z

models/build/edfi_3/assessments/bld_ef3__student_assessments_long_results.sql

+        -- todo review my use of try_to_numeric here -- the idea is to allow numeric values to merge, otherwise don't merge without error
+        and try_to_numeric(dedupe_results.score_result) >= xwalk_score_value_thresholds.lower_bound
+        and try_to_numeric(dedupe_results.score_result) <= xwalk_score_value_thresholds.upper_bound
+        -- todo in future, may need to include subject & grade level in this join (with options to join across subjects)


We will definitely run into this at some point, but we can start without it, especially since there will be additional assessment normalization features anyway.
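For readers less familiar with the try_to_numeric pattern in this hunk, here is a rough Python analogue of the intent stated in the todo comment: numeric score results can match a threshold row, and non-numeric ones simply fail to match rather than raising an error. The helper names are hypothetical, and the parsing only loosely imitates the NULL-on-failure behavior, not the database's precision and scale rules.

```python
from decimal import Decimal, InvalidOperation


def try_to_number(text):
    """Return a Decimal if the text parses as a finite number, else None."""
    try:
        value = Decimal(text.strip())
    except (InvalidOperation, AttributeError):
        # Unparseable text, or a non-string input such as None.
        return None
    return value if value.is_finite() else None


def in_range(score_result, lower_bound, upper_bound):
    """Inclusive range check that never raises on non-numeric input."""
    value = try_to_number(score_result)
    if value is None:
        # In SQL a NULL comparison is not true, so the row would not join.
        return False
    return lower_bound <= value <= upper_bound


print(in_range("42", 40, 50))          # True
print(in_range("Proficient", 40, 50))  # False
print(in_range("55", 40, 50))          # False
```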

jalvord1 · 2023-07-12T21:42:58Z

models/build/edfi_3/assessments/bld_ef3__student_assessments_long_results.sql

+        and xwalk_scores.normalized_score_name = xwalk_score_value_thresholds.normalized_score_name
+        -- todo check these comparators -- what if there's a value between the upper and next lower? eg value is 20.4 and the cutoffs are 20 and 21
+        -- todo review my use of try_to_numeric here -- the idea is to allow numeric values to merge, otherwise don't merge without error
+        and try_to_numeric(dedupe_results.score_result) >= xwalk_score_value_thresholds.lower_bound


This will default to an integer since no scale argument is given. I think that's fine, but maybe we should consider allowing decimals (so try_to_decimal)? I assume you could still write the values in the xwalk as integers.

rlittle08 · Jul 13, 2023

That's a good point, maybe we should be explicit about the data type of this column -- I'm still unsure about this question I put in the PR: "Is it right to overload a general "normalized_" column with various use cases, when some may need to be integers vs. characters, etc.?"

jalvord1 · Jul 13, 2023

Yeah, that's a good question. I think in a lot of cases, though, the point of a column like this is to normalize values to a similar set of values across all assessments in the table. I don't necessarily think that's always true, but my guess is this column would be used for a single particular downstream purpose, e.g. a BI user might use a normalized column where all PLs are integers so that charts are ordered properly. But again, maybe there is another use case I'm not considering where this could have serious negative effects.
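The interaction between the integer default and the "20.4 between cutoffs 20 and 21" todo can be illustrated with a small Python sketch. It assumes that converting without a scale rounds to the nearest integer (half away from zero); that is my understanding of the warehouse function but should be verified. The band values and function names are hypothetical, and the half-open variant is one possible alternative, not something this PR implements.

```python
from decimal import Decimal, ROUND_HALF_UP

# Inclusive integer bands, as they might be written in a thresholds xwalk.
INCLUSIVE_BANDS = [(Decimal("1"), Decimal("20"), "1"), (Decimal("21"), Decimal("40"), "2")]
# Same bands expressed as lower <= value < upper, which leaves no gaps.
HALF_OPEN_BANDS = [(Decimal("1"), Decimal("21"), "1"), (Decimal("21"), Decimal("41"), "2")]


def match_inclusive(value):
    for lower, upper, level in INCLUSIVE_BANDS:
        if lower <= value <= upper:
            return level
    return None


def match_half_open(value):
    for lower, upper, level in HALF_OPEN_BANDS:
        if lower <= value < upper:
            return level
    return None


def to_integer(value):
    """Assumed behavior of a scale-0 conversion: round half away from zero."""
    return value.quantize(Decimal("1"), rounding=ROUND_HALF_UP)


score = Decimal("20.4")
print(match_inclusive(score))              # None (falls in the gap)
print(match_inclusive(to_integer(score)))  # 1 (rounded to 20 first)
print(match_half_open(score))              # 1 (no rounding needed)
print(match_inclusive(to_integer(Decimal("20.5"))))  # 2 (rounded up to 21)
print(match_half_open(Decimal("20.5")))              # 1
```

Under these assumptions, rounding to an integer hides the gap but moves 20.5 into the higher band, whereas decimal comparison against half-open bands keeps it in the lower one. Which behavior is wanted is a design decision for the xwalk authors.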

jalvord1 · 2023-07-12T21:44:51Z

models/core_warehouse/fct_student_assessment.sql

+        {% set normalized_names_thresholds = dbt_utils.get_column_values(ref('xwalk_assessment_score_value_thresholds'), 'normalized_score_name') or [] %}
+        {{ dbt_utils.pivot(
+            'normalized_score_name',
+            (normalized_names_values + normalized_names_thresholds) | unique,


The idea here is that we only want normalized versions of scores that are included in either xwalk (scores like scale_score and sem will rarely be normalized in this way, so including them would be overkill in my opinion).
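As a rough illustration of that idea, the Python sketch below pivots long-format rows into normalized_{score_name} columns, using only the names found in either xwalk. The row layout and the key name k_student_assessment are assumptions made for the example; the real model does this with dbt_utils.pivot in SQL.

```python
def unique_in_order(names):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(names))


def pivot_normalized(long_rows, value_names, threshold_names):
    """Build one record per key with a normalized_<name> column per xwalk name."""
    columns = unique_in_order(list(value_names) + list(threshold_names))
    wide = {}
    for row in long_rows:
        key = row["k_student_assessment"]  # hypothetical key column
        record = wide.setdefault(key, {f"normalized_{c}": None for c in columns})
        name = row["normalized_score_name"]
        if name in columns:
            # Scores absent from both xwalks (e.g. scale_score) get no column.
            record[f"normalized_{name}"] = row["normalized_score_result"]
    return wide


rows = [
    {"k_student_assessment": "a1", "normalized_score_name": "performance_level", "normalized_score_result": "3"},
    {"k_student_assessment": "a1", "normalized_score_name": "scale_score", "normalized_score_result": "512"},
    {"k_student_assessment": "a1", "normalized_score_name": "percentile", "normalized_score_result": "4"},
]
result = pivot_normalized(rows, ["performance_level"], ["performance_level", "percentile"])
print(result)
# {'a1': {'normalized_performance_level': '3', 'normalized_percentile': '4'}}
```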

rlittle08 added 5 commits July 7, 2023 16:42

draft normalize scores using xwalks

45d2cbe

add seasons and normalized scores

0b0034c

only add normalize cols if specified in any xwalk

b7b56f5

remove seasons from here, add todo comment

0b50098

add normalized results to stu obj assess

6aa0dc7

rlittle08 requested a review from jalvord1 July 12, 2023 21:22

jalvord1 reviewed Jul 12, 2023

rlittle08 requested a review from ejoranlienea July 13, 2023 14:20

rlittle08 marked this pull request as draft July 24, 2023 15:56
