Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

On Study View handle larger number of resource files #5034

Merged
merged 10 commits into from
Nov 1, 2024

Conversation

hweej
Copy link
Contributor

@hweej hweej commented Oct 27, 2024

Fixes cBioPortal/cbioportal#11004 (associated backend PR is merged, see cBioPortal/cbioportal#11051). This PR modifies the FilesAndLinks table on the Study View to use the new resource-data-all endpoint, which reduces the number of API requests and enables showing thousands of resource files without performance degradation

@hweej hweej self-assigned this Oct 27, 2024
Copy link

netlify bot commented Oct 27, 2024

Deploy Preview for cbioportalfrontend ready!

Name Link
🔨 Latest commit 42ea827
🔍 Latest deploy log https://app.netlify.com/sites/cbioportalfrontend/deploys/672522df651b450008100e2d
😎 Deploy Preview https://deploy-preview-5034.cancerrevue.org
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

idMap: { [key: string]: ResourceData[] },
resource: ResourceData
) => {
const { resourceType } = resource.resourceDefinition;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this whole operation would be more clearly handled by lodash's keyBy function. we like to use these kind of named helpers because they make the operation more explicit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I couldn't make lodash keyBy function to work the way I wanted it to for this use case. In python this would be a case for collections.defaultDict. But I couldn't make this work where I could create new arrays and append to them.

LMK if you want me to work on this part some more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hweej sorry, i mean _.groupBy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have replaced these lines with a single lodash groupBy('patientId'). Thanks @alisman !

filtering, clean up code superfluous comments.
@inodb inodb changed the title Modify FilesAndLinks table to use the new resource-data-all endpoint. On Study View handle larger number of resource files Oct 30, 2024
const resourcesForPatientsAndSamples = resourcesForEntireStudy
// Filter the resources to consist of only studyView selected samples
// Also keep patient level resources (e.g. Those don't have a sampleId)
.filter(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more thing here: this is operation is Rsamples+Rpatients. could easily get to billion. easily optimized by putting selectedSamples in a map so you can use IN operator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have replaced this with Maps instead of arrays to remove the linear lookup. I don't think you can use the IN operator with Maps so we're opting for the .has('key') expression instead.

.filter(resource =>
selectedIds.has(resource.sampleId || resource.patientId)
)
.groupBy('patientId')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hweej one more little tweak. i'm pretty sure that using the shortcut 'patientId' is not typesafe., so as a rule we use

(r)=>r.patientId)

That way if there is no propertyId, ts will catch it.


// sampleIds (+patientIds) for the selectedSamples
const selectedIds = new Map([
...new Map(selectedSamples.map(item => [item.sampleId, item.studyId])),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is strange. why instantiate maps inside of the map? i think just

   new Map([...selectedSamples.map(xxx)])

@inodb inodb added performance This PR is related to a performance issue and removed enhancement labels Nov 1, 2024
@inodb inodb merged commit 9a02d70 into cBioPortal:master Nov 1, 2024
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance This PR is related to a performance issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ERR_INSUFFICIENT_RESOURCES for study view files and links tab
3 participants