✨ Auto-populate Sample, Container on Biospecimen create/update #645
params = _get_sample_identifier(biospecimen)
# Add remaining sample attributes
params.update(
    {
everything in this PR makes sense to me except the idea that we might update a participant ID. I think participant ID should be part of the defining characteristics of a sample so I'm struggling to understand how we could both identify an existing sample (which implies the participant ID on the sample matches that on the specimen being registered) but then update the sample participant ID field (which implies the participant ID does not match the specimen being registered).
Oh hold on... is this related to participant.kf_id really being the primary ID for participant and participant_id being a sort of secondary/external ID? So we are updating the external ID if it changes but relying on the kf_id/PK for confirming the sample already exists?
@calkinsh Yep, the primary key for participant is participant.kf_id, and the sample has a foreign key to it, sample.participant_id, so I think that does make it a defining characteristic of the sample.
Technically we may not need Sample.participant_id or Sample.external_id because they are captured in Sample.sample_event_key, but I included them in the Sample table in case we want to populate the sample event key with something else, and because I felt it would be OK to have some redundancy to gain clarity on which participant the sample came from and what the original biospecimen's external sample ID was.
    return container


def _upsert_sample(biospecimen):
Need to change this approach. Read-modify-write is an anti-pattern and doesn't work with concurrent requests: two requests can both find no existing row and both try to insert. Use PostgreSQL's built-in upsert (INSERT ... ON CONFLICT DO UPDATE) instead.
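For illustration, here is a minimal sketch of the single-statement upsert suggested in this comment. It is not the code from this PR: it uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere (SQLite 3.24+ accepts the same INSERT ... ON CONFLICT ... DO UPDATE form that PostgreSQL does), and the three-column table and the upsert_sample helper are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sample (
        sample_event_key TEXT PRIMARY KEY,
        participant_id   TEXT NOT NULL,
        volume_ul        REAL
    )
    """
)

# One statement: insert the row, or update it if the key already exists.
# The database resolves the conflict, so there is no separate read step
# for a concurrent request to race against.
UPSERT_SQL = """
INSERT INTO sample (sample_event_key, participant_id, volume_ul)
VALUES (:sample_event_key, :participant_id, :volume_ul)
ON CONFLICT (sample_event_key) DO UPDATE SET
    participant_id = excluded.participant_id,
    volume_ul      = excluded.volume_ul
"""


def upsert_sample(conn, params):
    """Hypothetical helper: insert or update one sample row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT_SQL, params)


# Registering the same key twice leaves one row holding the latest values.
upsert_sample(conn, {"sample_event_key": "PT_1-SA_1-100",
                     "participant_id": "PT_1", "volume_ul": 10.0})
upsert_sample(conn, {"sample_event_key": "PT_1-SA_1-100",
                     "participant_id": "PT_1", "volume_ul": 25.0})

rows = conn.execute(
    "SELECT sample_event_key, participant_id, volume_ul FROM sample"
).fetchall()
```

If the service talks to PostgreSQL through SQLAlchemy, the equivalent construct is the PostgreSQL dialect's insert(...).on_conflict_do_update(...); whether that fits this codebase is an assumption to verify.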
Closing for now. New approach is to implement the Sample table only. This is an MVP to meet Portal Beta requirements. Will try to auto-populate the Sample table from Biospecimens, similar to the approach here.
Motivation
#643 introduced the Sample and Container tables in order to address the shortcomings of the Biospecimen table. Now we need a way to populate these tables. And since a sample and container may be derived from a biospecimen, we can auto-populate them.
Approach
Each time a biospecimen is created or updated via an HTTP POST/PATCH, derive the sample and container from the input biospecimen and update sample/container tables.
Sample, Container Management
Find Sample - check if a sample already exists for this biospecimen
  sample_event_key = concat(participant_id, external_sample_id, age_at_event_days)
  analyte_type
  composition
  source_text_tissue_type
  source_text_anatomic_site
  preservation_method
  method_of_sample_procurement
  concentration_mg_per_ml
Create Sample - if the sample does not exist - create it using the relevant subset of biospecimen attributes
  participant_id
  external_sample_id
  volume_ul
Update Sample - if the sample exists - update it using the relevant subset of biospecimen attributes
Find Container - check if a container already exists for this biospecimen
  biospecimen_id
Create Container - if the container does not exist - create it using the relevant subset of biospecimen attributes
  sample_id
  specimen_status
  volume_ul
Update Container - if the container exists - update it using the relevant subset of biospecimen attributes
Sum Volume - update the sample's volume_ul field with the sum of its container volumes
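The steps above can be sketched in memory as follows. This is only an illustration of the flow, not the PR's implementation: the dictionaries stand in for the Sample and Container tables, register_biospecimen and sample_identifier are hypothetical names, the "-" separator in the event key is an assumption (the description only says concat), and the "biospecimen_id" input key is assumed. Because it reads and then writes, it has the same concurrency weakness raised in the review and would need a database-level upsert in practice.

```python
# Attributes that, together with sample_event_key, identify a sample.
SAMPLE_ID_FIELDS = (
    "analyte_type",
    "composition",
    "source_text_tissue_type",
    "source_text_anatomic_site",
    "preservation_method",
    "method_of_sample_procurement",
    "concentration_mg_per_ml",
)


def sample_event_key(bs):
    """Join the three key fields; the '-' separator is an assumption."""
    parts = (bs["participant_id"], bs["external_sample_id"],
             bs["age_at_event_days"])
    return "-".join(str(p) for p in parts)


def sample_identifier(bs):
    """Tuple used to decide whether a sample already exists."""
    return (sample_event_key(bs),) + tuple(bs.get(f) for f in SAMPLE_ID_FIELDS)


def register_biospecimen(bs, samples, containers):
    """Find/create/update the sample and container for one biospecimen."""
    key = sample_identifier(bs)

    # Find or create the sample, then refresh its descriptive attributes.
    sample = samples.setdefault(key, {})
    sample["participant_id"] = bs["participant_id"]
    sample["external_sample_id"] = bs["external_sample_id"]

    # Find or create the container; one container per biospecimen.
    container = containers.setdefault(bs["biospecimen_id"], {})
    container.update(
        sample_id=key,
        specimen_status=bs.get("specimen_status"),
        volume_ul=bs.get("volume_ul"),
    )

    # Sum Volume: the sample's volume is the total of its containers' volumes.
    sample["volume_ul"] = sum(
        c["volume_ul"] or 0
        for c in containers.values()
        if c["sample_id"] == key
    )
    return sample


samples, containers = {}, {}
base = {"participant_id": "PT_1", "external_sample_id": "SA_1",
        "age_at_event_days": 100, "analyte_type": "DNA"}

# Two biospecimens from the same sample, then an update to the first one.
register_biospecimen({**base, "biospecimen_id": "BS_1", "volume_ul": 10.0},
                     samples, containers)
register_biospecimen({**base, "biospecimen_id": "BS_2", "volume_ul": 15.0},
                     samples, containers)
sample = register_biospecimen(
    {**base, "biospecimen_id": "BS_1", "volume_ul": 5.0}, samples, containers)
```

In this run both biospecimens resolve to one sample with two containers, and the final update to BS_1 changes the sample's summed volume accordingly.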