-
Notifications
You must be signed in to change notification settings - Fork 486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
9725 files lookup performance #9736
Conversation
… "findDeep" on the requested version, instead of the full dataset. (Needs more testing). #9725
This comment has been minimized.
This comment has been minimized.
// Optimization hints: retrieve all data in one query; this prevents point queries when iterating over the files | ||
.setHint("eclipselink.left-join-fetch", "o.fileMetadatas.dataFile.ingestRequest") | ||
.setHint("eclipselink.left-join-fetch", "o.fileMetadatas.dataFile.thumbnailForDataset") | ||
.setHint("eclipselink.left-join-fetch", "o.fileMetadatas.dataFile.dataTables") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea whether things like this, that I think only get used when a file is displayed, would be better w/o the hint? I.e. would that allow getting datatables for the ten displayed files (as separate less efficient queries) rather than getting them for 10K files. (Bad example in that this is only for tabular files, but the same question for anything related to creating the view for a file versus deciding which files should be displayed.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very good point. I just haven't gotten around to finding a dataset with a ton of tabular files to confirm that this is indeed going to result in observable savings... But I can't think of a good reason not to drop this hint, no.
(will find a test dataset now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And yes, I get the part that it's not just the DataTables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, does this really mean that we don't need the hint for roleassignments either? Or rather, that the only hints we need are for the 1:1 relations on DataFile? (that would otherwise result in additional individual queries?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to report that dropping the DataTables hint does not appear to make things noticeably better for large datasets with many tab. files; but also makes things worse for those without any tab. files - individual lookup queries are then issued on every file , on account of this in the page, maybe:
for(DataFile f : dataset.getFiles()) {
if(f.isTabularData()) {
hasTabular = true;
break;
}
}
Running out of steam with this thing a little bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(The awful loop above can be fixed of course; it needs to be fixed. I was just demoralized a little by seeing things like that. The above happens 40 lines into the init method. There's almost certainly more of the same throughout the rest of the page. Things that were already in 5.13, that had gradually crept into the code, that were making things slow on some large datasets in 5.13 and are offsetting the gains from the 5.14 optimizations now. I wasn't mentally prepared for going this deep. But will keep looking into it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah fun - needed for the download all button I guess. This type of thing definitely looks like a great little post 5.14 spin-out issue (that the SPA will also have).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I became convinced last night that this really needs to be addressed as part of this PR. At least some of it, the more glaring stuff. The "awful loop" above - it is awful because it goes through every file in the entire dataset; negating the benefits of trying to only initialize one version-worth of files. AND it doesn't appear to be necessary - because for all the useful purposes the page and its fragment only need to know if tab files are present in the current version. (There is a separate loop for the tab files in the version there).
Dropping the hint for DataTable for the version doesn't appear to be as straightforward through - the page in its current form needs all the DataTables in the version, both for checking if there are tab files, and for calculating the "original" sizes, if tab files are present. Can (and probably should) be replaced with custom queries, but this may be an effort that can be delayed - will see how it goes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... the full sentence should have been "negating the benefits of trying to only initialize one version-worth of files in some large datasets with sparseley-populated versions" of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should mention that dropping most of the other hints on OneToMany objects in that method appears to be a less problematic affair. Thank you for the tip. Now seems like such an obvious thing in retrospect, of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some small comments / questions - nothing critical. Can you also ee the question @qqmeyers asked? I'm ready to approve, but want to let @landreev have a chance to review what we wrote before that. Let me know when it's set.
|
||
// We are only performing these lookups to obtain the database id | ||
// of the version that we are displaying, and then we will use it | ||
// to perform a .findFind(versionId); see below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small typo in the comment
if (dataset == null) { | ||
logger.warning("No such dataset: "+dataset); | ||
return permissionsWrapper.notFound(); | ||
} | ||
//retrieveDatasetVersionResponse = datasetVersionService.retrieveDatasetVersionById(dataset.getId(), version); | ||
retrieveDatasetVersionResponse = datasetVersionService.selectRequestedVersion(dataset.getVersions(), version); | ||
if (retrieveDatasetVersionResponse == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this necessary? i.e. we already found the dataset, can there be one without a version here? (is this some edge case I'm not immediately seeing?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I see now that it was moved up from lower down. That said see my comment lower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may have had a real data corruption case where a dataset failed to get deleted, and was left in the database without versions. I wasn't thinking about it honestly, just feel like it's a good idea to check for nulls. (but yes, it was a copy-and-paste from further below)
@@ -1958,7 +1980,7 @@ private String init(boolean initFull) { | |||
return permissionsWrapper.notAuthorized(); | |||
} | |||
|
|||
if (!retrieveDatasetVersionResponse.wasRequestedVersionRetrieved()) { | |||
if (retrieveDatasetVersionResponse != null && !retrieveDatasetVersionResponse.wasRequestedVersionRetrieved()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this null check necessary? I guess its being extra safe, but you do check for mull further up, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one IS actually necessary. Because in this PR I finally fixed the bug where the page was not working when called by versionId=
. So yes, there can be a legit case where retrieveDatasetVersionResponse is going to be null here.
(and I figured when the page gets called by the numeric id, if it doesn't exist, it should always be 404/not found, "no substitutes").
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
I may have made enough progress addressing the performance on the remaining "outlier" - very high file count - prod. datasets, making it significantly better than under 5.13, as to finally stop (will need to confirm in QA that the numbers I'm seeing are legit, of course). |
This comment has been minimized.
This comment has been minimized.
We should probably move this into QA now, so that the RC can be properly re-tested. Testing it thoroughly, especially on large datasets takes time.
just as a reference, for that infamous 42K files dataset the page takes almost a minute to load in prod. On the perf. test instance (which is less beefy than prod.), it takes closer to 2 min. under 5.13, and it was even worse in the previous iteration of this branch. With the current code, it loads in 9 sec. on the perf. instance and I expect it to be below 5 sec. in prod. The real meaning of this is less about the optimization in the current branch being particularly good and more about how awful it was in 5.13. We apparently missed the point at which something had been added to the page that compromised the effectiveness of the earlier optimizations measures. I was wrong to even consider accepting a further drop in performance, if only on a few large datasets. We should be grateful to @ErykKul for bringing the performance of the page to our attention. |
📦 Pushed preview images as
🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name. |
What this PR does / why we need it:
A few quick refinements to the optimizations added in #9558 to address some performance issues found during the testing of the 5.14 release candidate.
Which issue(s) this PR closes:
Closes #9725
Special notes for your reviewer:
Suggestions on how to test this:
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?:
Additional documentation: