-
Notifications
You must be signed in to change notification settings - Fork 500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
7804 dataverse page speedup #8143
Conversation
…ragment. (#7804) (may still rearrange the code before finalizing)
query that becomes unreasonably expensive for an installation with a lot of download/access activity. This solution is very efficient, but very PostgresQL-specific, hence the use of a stored function. (#7804)
…(but still work in progress; #7804)
…d users" in the comments, for some reason #7804)
…xternal curation labels PR. (#7804)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good - lot's of changes but straight forward - adding/moving caching code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this is looking good. I left some comments.
-- This creates a function that ESTIMATES the size of the | ||
-- GuestbookResponse table (for the metrics display), instead | ||
-- of relying on straight "SELECT COUNT(*) ..." | ||
-- Significant potential savings for an active installation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now we see a precise number like "31,523,665 Downloads". Since the number will be estimated, does it mean it will be rounded? "32,000,000 Downloads"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very good StackOverflow thread about this:
https://stackoverflow.com/questions/7943233/fast-way-to-discover-the-row-count-of-a-table-in-postgresql
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"COUNT(*)" is notoriously expensive in PostgresQL for large tables (brute force is the only counting option; so the larger the table, the longer it takes). So I'm operating on the assumption that nobody looking at the home page needs that number to be precise down to the last 3 or 4 digits.
An alternative solution would be to have some application-scope caching service calculate the exact count every hour or so (again, under the assumption that nobody should care whether the number is up-to-date in real time).
if (checkDvoCacheForCommandAuthorization(dvo.getId(), CreateDataverseCommand.class, commandMap) == null) { | ||
boolean canIssueCommand = false; | ||
canIssueCommand = permissionService.requestOn(dvRequestService.getDataverseRequest(), dvo).canIssue(command); | ||
logger.info("rerieved authorization for " + command.toString() + " on dvo " + dvo.getId()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably reduce to logger.fine.
Map<Class<? extends Command<?>>, Boolean> newDvoCommandMap = new HashMap<>(); | ||
commandMap.put(dvo.getId(), newDvoCommandMap); | ||
return addCommandtoDvoCommandMap(dvo, command, newDvoCommandMap); | ||
logger.info("using cached authorization for " + command.toString() + " on dvo " + dvo.getId()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same. logger.fine here.
|
||
// This map stores looked up permission results (boolean), for multiple DvObjects | ||
// (referenced by the Long ids), for multiple Commands. The current session | ||
// user is assumed when lookups are performed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're operating on the session does that mean that a user will now need to log out and log in again if they've been given a new role? (To be honest, I don't know if this was already required or not.) Or if a role was removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When a user logs out and logs in as another user - I understand that the ViewScope of the wrapper ensures that no stale permissions are cached. But it's one of the things that need to be tested carefully.
If a role is revoked in real time, or something else changes on the permission side, the wrapper does continue caching the stale authorization status. This is a known possibility that has always been there. In most cases, the user will NOT be able to perform any tasks they are no longer authorized to. E.g., a link to edit the dataset may still be on the page, but an attempt to actually save any changes will fail, when the authorization is re-checked on execution of the UpdateDatasetVersionCommand, etc. But, there are also some exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdurbin I'm not sure I understand where you ask "If we're operating on the session". Why do you say we're operating on a session? (this bean being view scoped)
boolean canIssueCommand = false; | ||
try { | ||
canIssueCommand = permissionService.userOn(AuthenticatedUsers.get(),dataverse).canIssueCommand("CreateNewDatasetCommand"); | ||
logger.info("rerieved auth users can create datasets"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo ("rerieved") and this should be logger.fine.
if (checkDvoCacheForCommandAuthorization(dataverse.getId(), CreateNewDatasetCommand.class, authUsersCommandMap) == null) { | ||
boolean canIssueCommand = false; | ||
try { | ||
canIssueCommand = permissionService.userOn(AuthenticatedUsers.get(),dataverse).canIssueCommand("CreateNewDatasetCommand"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this canIssueCommand
method is deprecated and the only usages are this line and the one below. Should we consider deleting this deprecated method (since the only usages are in this pull request) and figuring out how to use non-deprecated methods? I hope this means no longer passing a string like "CreateNewDatasetCommand" which seems suboptimal if we ever want to rename that command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing an obvious best way to do this even on a second look...
Wondering if that was the reason these instances of this deprecated method were left in the code, when used in this context. (Again, this is not a typical permission lookup, in that we are checking authorization for an entire Group
- of AuthenticatedUsers as a whole).
The @deprecated
comment - "use DynamicPermissionQuery instead" may not apply to this case... I'm assuming by "Dynamic" it means "RequestPermissionQuery"... but those work entirely on DataverseRequests, as the name suggests. And a DataverseRequest
in turn needs a User
, while AuthenticatedUsers is a Group
.
However, if the hard-coded command name strings are the main concern, the above can be replaced with permissionService.isUserAllowedOn(AuthenticatedUsers.get(), CreateDataverseCommand.class, dataverse);
etc.
...the only catch is that .isUserAllowedOn(...)
is also marked as deprecated.
I'm beginning to think that we should not worry much about the "deprecated" part on its own... The reason it's deprecated is because it's not going to catch any request-level group permissions - i.e. ip groups - but it should still be entirely safe to call it on AuthenticatedUsers.get()
... I'll run it by @scolapasta too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'm going to go ahead and implement the change described in the last comment - switch to a method that does not require passing the command name as a literal string; that would allow me to further streamline the code in the wrapper, and make the handling of these AuthenticatedUsers group lookups more similar to how "normal" user authorizations are handled in there.
We'll discuss the "deprecated" part of it separately, but I'm feeling more confident that it's not an issue, under the circumstances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@scolapasta I'm moving the PR back into CR. I feel it should be ready for QA; as I have addressed everything that was brought up. But could you please review this particular thread. I addressed some of the above - I got rid of what @pdurbin pointed out was the biggest problem - the hard-coded command names in the code, old-style. But I am still relying on (another) deprecated method.
Please see if my understanding and justification of why it should be ok in this particular situation makes sense. And/or if I overlooked a non-deprecated simple way of achieving the check in question.
canIssueCommand = permissionService.userOn(AuthenticatedUsers.get(),dataverse).canIssueCommand("CreateDataverseCommand"); | ||
logger.info("rerieved auth users can create dataverses"); | ||
} catch (ClassNotFoundException ex) { | ||
logger.info("ClassNotFoundException checking if authenticated users can create dataverses in dataverse."); | ||
} | ||
addCommandAuthorizationToDvoCache(dataverse.getId(), CreateDataverseCommand.class, authUsersCommandMap, canIssueCommand); | ||
} else { | ||
logger.info("using cached authUsersCanCreateDataversesInDataverse result"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments above about canIssueCommand being deprecated.
Also, "rerieved" typo and logger.fine would be better, I think.
String downloadMethods = getValueForKey(SettingsServiceBean.Key.DownloadMethods); | ||
rsyncDownload = downloadMethods != null && downloadMethods.toLowerCase().contains(SystemConfig.FileDownloadMethods.RSYNC.toString()); | ||
} | ||
return rsyncDownload; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment here and in the corresponding isRsyncOnly method in SystemConfig.java that the logic appears in both places. I could see these two methods drifting apart over time, getting out of, well, sync, if you will.
Perhaps this also applies to other methods. It looks like the wrapper used to call into SystemConfig and now the logic is duplicated. This one just stood out to me with its split and regex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I just realized that this hasn't been addressed yet. Yes, quite a few of these methods are now duplicated between the 2 places. I had a reason to choose to have these methods implemented inside the SettingsWrapper (as opposed to just calling SystemConfig and then caching the result). I can explain... but I'll take a whack at reducing the duplication first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so my original rationale for duplicating these methods was pure penny-, or rather, query-pinching. Performing a check like this (that relies on looking up a single Setting) in the wrapper - as opposed to calling SystemConfig and caching the result - saves 1 (One) query total. (Because the wrapper already has all the settings in memory, looked up all at once; and SystemConfig will in fact have to look up that setting in the database.
After way more deliberation than this issue deserved, I am undoing the duplication of logic in this, and a couple of other methods. These few query lookups - from a small, heavily cached table - are not worth the drawbacks of duplicating the code, and potentially ending up with the pages and the API getting different configuration results.
Saving these queries may be achieved in the future via some application-scoped caching mechanism, as mentioned elsewhere in the PR.
@@ -11,8 +11,8 @@ | |||
xmlns:iqbs="http://xmlns.jcp.org/jsf/composite/iqbs"> | |||
<o:resourceInclude path="/CustomizationFilesServlet?customFileType=header" rendered="#{!widgetWrapper.widgetView}"/> | |||
<o:importFunctions type="edu.harvard.iq.dataverse.authorization.groups.impl.builtin.AuthenticatedUsers" /> | |||
<ui:param name="showAddDataverseLink" value="#{permissionServiceBean.userOn(AuthenticatedUsers:get(),dataverseServiceBean.findRootDataverse()).canIssueCommand('CreateDataverseCommand') }"/> | |||
<ui:param name="showAddDatasetLink" value="#{permissionServiceBean.userOn(AuthenticatedUsers:get(),dataverseServiceBean.findRootDataverse()).canIssueCommand('AbstractCreateDatasetCommand') }"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are a couple places where we seem to be getting rid of the deprecated canIssueCommand
method. Great.
Probably what's happening here is that they were moved to the backing bean. Please see my comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So no, we are not getting rid of the deprecated canIssueCommand - it has simply been moved into the wrapper, and is now only executed once, instead of repeatedly.
I'll look into a new, non-deprecated method of doing this.
Co-authored-by: Philip Durbin <philip_durbin@harvard.edu>
Issues found:
Done with functional testing, will do a brief stress test. |
What this PR does / why we need it:
This PR should make the home/dataverse page snappier.
This is not by any means an exhaustive optimization of everything that could be improved. More like picking the most low-hanging fruit/fixing the more glaring inefficiencies. But this should be a good start.
Quantifiable results (testing the dataverse page on the root collection using a copy of the IQSS prod. database on the perf. cluster, w/ the 10 "official benchmark datasets", comparing to 5.6):
The number of database queries issued in order to load the page via a browser is reduced 3X (from 900+ to below 300);
The average time it takes to load the page, when accessing the page repeatedly from the command line w/ curl is reduced 2X.
It is harder to predict how much this will improve the performance of the page in actual IQSS production. The average load time will always be higher, or much higher, in prod. when it's under regular load; in absolute number of seconds. The improvement ratio should be of the same order (I know that some queries that are dropped in this PR take much longer on the prod. RDS than on the test cluster db). On top of that, it's harder still to predict how much better this new build will behave when the application is bogged down by something happening outside of the dataverse page itself... The only way to find out for sure will be to deploy this in our production.
As a side effect of this PR - mostly on account of the fixes in the dataverse_header.xhtml fragment - all the other pages that are loading the fragment will be issuing fewer queries too. There's a possibility that other improvements made here could be applied to other pages and components, but that should be done under separate issues.
Certain potential improvements were discussed over the course of working on this PR, but are NOT being addressed in it. For example, various application-wide caching solutions (as opposed to/in addition to the viewscoped caching wrapper services we already have in place). We will continue investigating these ideas under separate dedicated issues.
Which issue(s) this PR closes:
Closes #7804
Special notes for your reviewer:
Suggestions on how to test this:
Mostly we need to carefully retest the dataverse page in all its various modes. Since most of the changes were connected to the rendering logic in the various jsf components of the page, we need to ensure that all the links and buttons are still there, that users still can do what they are authorized to do, and the other way around. I know this sounds like "re-test everything" - sorry.
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?:
Additional documentation: