Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SNOW-180] nodesnapshots: Add columns related to project size and size limit #106

Merged
merged 16 commits into from
Jan 29, 2025

Conversation

jaymedina
Copy link
Contributor

@jaymedina jaymedina commented Jan 18, 2025

problem

Platform has added new metadata related to project sizes and size limits for users to access via Snowflake. The current table structure for nodesnapshots does not include these new columns and must be added.

solution

Following this SOP:

  • Introduce the new columns to the nodesnapshots table
  • Update tasks to UPSERT into snapshots table with new columns
  • Refresh the snapshots table

testing

Depends on Sage-Bionetworks/Synapse-Stack-Builder#688
Depends on Sage-Bionetworks/Synapse-Repository-Services#5221


A query of the synapse_data_warehouse_jmedina.synapse_raw.nodesnapshots table has project_storage_usage backfilled since Jan 16, when the field was first introduced:

image

53,620 rows for this field on dev as of today (Jan 29)

image

All of these rows have the correct projectId field, implying that the project_storage_usage metadata for each row is corresponding to the correct node:

SELECT
  SNAPSHOT_DATE,
  PROJECT_STORAGE_USAGE,
  project_id,
  -- Extract the project_id and remove the 'syn' prefix
  SUBSTR(PARSE_JSON(PROJECT_STORAGE_USAGE):projectId::STRING, 4) AS extracted_project_id
FROM
  nodesnapshots
WHERE
  project_storage_usage IS NOT NULL
  AND SUBSTR(PARSE_JSON(PROJECT_STORAGE_USAGE):projectId::STRING, 4) = project_id;
image

@jaymedina
Copy link
Contributor Author

@philerooski Marco's PRs that this PR depends on (linked above) are still waiting for review which means the data isn't ready yet and we don't have anything to test with in the dev stack. I'd rather have something to test with and backfill than not have something to test with and push straight to production, so I've preemptively added a refresh_nodesnapshots script in with the other two that I got from your version_history PR.

Feel free to start reviewing these edits - I based them off Marco's current PR so things like the column name, etc are subject to change.

Copy link

dpulls bot commented Jan 22, 2025

🎉 All dependencies have been resolved !

@jaymedina
Copy link
Contributor Author

jaymedina commented Jan 27, 2025

I just ran 2.32.0 and 2.32.2 on my clone DB and it doesn't seem like there is any new data to test with yet (my project_storage_usage column only has nulls) but I welcome review comments in the meantime

@jaymedina jaymedina marked this pull request as ready for review January 27, 2025 16:57
@jaymedina jaymedina requested a review from a team as a code owner January 27, 2025 16:57
@philerooski
Copy link
Collaborator

@jaymedina potentially related https://sagebionetworks.jira.com/browse/SNOW-216

Copy link
Contributor

@BryanFauble BryanFauble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Collaborator

@philerooski philerooski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to either modify your backfill script (V2.32.2) or test deploying your V scripts to a clone with a local installation of schemachange (see review comments). Re-request my review once you're happy with any changes you might make.

Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
64.7% Duplication on New Code (required ≤ 50%)

See analysis details on SonarQube Cloud

@jaymedina jaymedina requested a review from philerooski January 29, 2025 18:02
@jaymedina jaymedina merged commit cf61737 into dev Jan 29, 2025
2 of 3 checks passed
@jaymedina jaymedina deleted the snow-180-add-cols-node-latest branch January 29, 2025 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants