[SNOW-171] Convert certifiedquizquestion_latest table -> dynamic table #99

Merged
merged 8 commits into from
Jan 11, 2025

@jaymedina jaymedina commented Jan 8, 2025

problem

Currently this latest table is a regular table that gets updated using tasks and streams. We can simplify our data warehouse by turning this into a dynamic table that updates itself without requiring tasks & streams.

solution

Following the instructions of this SOP:

  • Add a script to create a backup of the original table in case of issues
  • Add a script to drop the task + stream for the original table
  • Add a script to drop the original table so we can create a dynamic table with the same name
  • Create a dynamic table that applies some post-processing to the RAW data
  • Add table and column comments to the dynamic table
  • Add a script to delete the scheduled task + stream that were for the original table
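
The steps above could be sketched roughly as follows in Snowflake SQL. This is only an illustration, not the scripts from this PR: apart from `certifiedquizquestion_latest` and the column names that appear in the testing query, every object name (backup table, task, stream, source table, warehouse) is hypothetical, and the deduplication rule and `TARGET_LAG` value are assumptions.

```sql
-- Sketch only: all names except certifiedquizquestion_latest are hypothetical.

-- 1. Back up the original table (zero-copy clone) in case of issues.
CREATE TABLE certifiedquizquestion_latest_backup
    CLONE certifiedquizquestion_latest;

-- 2. Drop the task and stream that used to maintain the original table.
DROP TASK IF EXISTS upsert_certifiedquizquestion_latest_task;    -- hypothetical name
DROP STREAM IF EXISTS certifiedquizquestion_snapshots_stream;    -- hypothetical name

-- 3. Drop the original table so the dynamic table can reuse its name.
DROP TABLE IF EXISTS certifiedquizquestion_latest;

-- 4. Recreate it as a dynamic table with a table-level comment.
--    Assumption: "latest" means the most recent row per
--    (response_id, question_index), ordered by change and snapshot time.
CREATE OR REPLACE DYNAMIC TABLE certifiedquizquestion_latest
    TARGET_LAG = '1 day'                 -- assumed refresh lag
    WAREHOUSE = compute_xsmall           -- hypothetical warehouse
    COMMENT = 'Latest version of each certified quiz question response'
AS
SELECT *
FROM certifiedquizquestion_snapshots     -- hypothetical source table
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY response_id, question_index
    ORDER BY change_timestamp DESC, snapshot_timestamp DESC
) = 1;
```

Because the dynamic table refreshes itself from its defining query, no task or stream is needed once the statements above have run.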

testing

  1. Ensure there are no duplicates in the latest table
image
  2. Ensure no questions are missing for any response_id in the latest table compared to the snapshot table
-- 1. First we get the unique IDs and their latest timestamps for the snapshot side
with snapshot_ids as (
    select
        response_id,
        question_index,
        max(snapshot_timestamp) as latest_snapshot_timestamp,
        max(change_timestamp) as latest_change_timestamp
    from 
        temp_latest_unique_rows
    group by 
        response_id, question_index
),
-- 2. Then we get all the unique IDs from latest
latest_ids as (
    select 
        response_id, question_index
    from 
        temp_latest_unique_rows
),
-- 3. Identify missing IDs and exclude those present in both
missing_ids as (
    SELECT 
        COALESCE(snapshot_ids.response_id, latest_ids.response_id) AS response_id,
        COALESCE(snapshot_ids.question_index, latest_ids.question_index) AS question_index,
        CASE 
            WHEN snapshot_ids.response_id IS NULL THEN 'missing_from_snapshots'
            WHEN latest_ids.response_id IS NULL THEN 'missing_from_latest'
            WHEN snapshot_ids.question_index IS NULL THEN 'question_index_missing_from_snapshots'
            WHEN latest_ids.question_index IS NULL THEN 'question_index_missing_from_latest'
        END AS missing_status,
        snapshot_ids.latest_snapshot_timestamp,
        snapshot_ids.latest_change_timestamp
    FROM 
        snapshot_ids
    FULL OUTER JOIN 
        latest_ids
    ON 
        snapshot_ids.response_id = latest_ids.response_id
        AND snapshot_ids.question_index = latest_ids.question_index
    WHERE 
        snapshot_ids.response_id IS NULL 
        OR 
        latest_ids.response_id IS NULL
)
select * from missing_ids
where latest_snapshot_timestamp > current_timestamp - INTERVAL '30 days';

image
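
For the duplicate check in step 1, a minimal sketch of one possible query is shown here. It assumes a row is uniquely identified by (`response_id`, `question_index`), the same key used in the join above, and that the check runs against `certifiedquizquestion_latest`; both are assumptions rather than details confirmed in this PR.

```sql
-- Sketch only: assumes (response_id, question_index) is the uniqueness key.
-- An empty result means the latest table contains no duplicates.
SELECT
    response_id,
    question_index,
    COUNT(*) AS n_rows
FROM
    certifiedquizquestion_latest
GROUP BY
    response_id, question_index
HAVING
    COUNT(*) > 1;
```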

@jaymedina jaymedina marked this pull request as ready for review January 10, 2025 17:51
@jaymedina jaymedina requested a review from a team as a code owner January 10, 2025 17:51

@danlu1 danlu1 left a comment

Looks good to me. Just left a few comments.

thomasyu888 commented Jan 11, 2025

@jaymedina before merging this, you will want to update the script names because 2.28 has already been taken.

🔥 Great work! I'm excited about these changes!

@jaymedina jaymedina merged commit 54eec1a into dev Jan 11, 2025
3 checks passed
@jaymedina jaymedina deleted the snow-171-certifiedquizquestions-dynamic-table branch January 11, 2025 02:56