Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance problems with "Assign Role" or "Remove Assigned Role" #2036

Closed
posixeleni opened this issue Apr 16, 2015 · 40 comments
Closed

Performance problems with "Assign Role" or "Remove Assigned Role" #2036

posixeleni opened this issue Apr 16, 2015 · 40 comments
Assignees
Labels
Type: Bug a defect
Milestone

Comments

@posixeleni
Copy link
Contributor

When we try to remove an Assigned Role it spins and when we refresh the page it does nothing. Discovered while meeting with @mcrosas when looking into AJPS.

@posixeleni posixeleni added this to the 4.0.1 milestone Apr 16, 2015
@scolapasta scolapasta modified the milestones: Candidates for 4.0.1, In Design May 8, 2015
@scolapasta scolapasta modified the milestones: 4.0.3, In Design Jun 25, 2015
@pdurbin pdurbin assigned pdurbin and unassigned scolapasta Jul 14, 2015
@pdurbin pdurbin changed the title Permissions: Remove Assigned Role is Not Working Performance problems with "Assign Role" or "Remove Assigned Role" Jul 14, 2015
@pdurbin
Copy link
Member

pdurbin commented Jul 14, 2015

On my laptop on commit ab0b268 (pre-4.1 release) I loaded up production data and played around with granting and revoking a role on a dataverse ("togo") with 32 datasets and 194 files. To me it's unacceptably slow even with Solr is down, taking nearly 10 seconds:

Solr Up Solr Down
grant 0m25.653s 0m9.699s
revoke 0m23.539s 0m9.968s

Obviously, at ~25 seconds it's crazy slow when Solr is up, which is related to #50.

I'm not sure what our target is for response time for the operation to complete but I guess we'll try to make it as fast as we can. ab0b268 contains the script I'm testing with.

@pdurbin
Copy link
Member

pdurbin commented Jul 15, 2015

Based on the profiler (screenshot below), I'm pretty sure the slowdowns are in:

  • instantiating datasets (and possibly files and dataverses): DvObjectServiceBean.findDvObject()
  • committing changes to Solr over and over: SolrServer.commit()

I'm not sure how to make the instantiation of datasets faster. This issue was noted by @raprasad in #1822 and he suggested storing a version of the dataset as JSON. I could dig into query counts like he did but I trust him. :)

As for repetitive commits to Solr, I could look into batching the operations, similar in spirit to the batching we've talked about in the past.

screen shot 2015-07-15 at 2 28 22 pm

@bencomp
Copy link
Contributor

bencomp commented Jul 15, 2015

@pdurbin is an index created automatically for primary keys?

pdurbin added a commit that referenced this issue Jul 16, 2015
The profiler shows significantly better performance when we reduce the
number of calls to findDvObject so we pass around the DvObject itself
more often, rather than looking it up over and over.

The profiler also suggests that batching the commits to Solr may also
yield a good performance gain but that's for a different commit.

The index API endpoint "missing" is removed as part of this commit since
it relies on code that was rewritten as part of this commit but it was
little used anyway since there's a newer "skipIndexed" boolean for other
index endpoints for catching up an index. We can revisit if we want to
re-add the "missing" endpoint.

Finally the createSolrDoc method was used to SearchUtil and tests were
written for it.

Here are the numbers for before and after this commit from running
scripts/issues/2036/grant-role-then-revoke

        | Solr Up   | Solr Down
------- | --------- | -------
grant   | 0m25.653s | 0m9.699s
revoke  | 0m23.539s | 0m9.968s

        | Solr Up   | Solr Down
------- | --------- | -------
grant   | 0m7.422s  | 0m10.348s
revoke  | 0m6.758s  | 0m10.341s
@pdurbin
Copy link
Member

pdurbin commented Jul 16, 2015

In 75e8dba in a new branch I reduced the number of calls to findDvObject but I'd like to review it with @scolapasta before merging it. See the long commit message for details but it end with much shorter times than what I reported at #2036 (comment) before I started working on this issue. The numbers are from running scripts/issues/2036/grant-role-then-revoke

Solr Up Solr Down
grant 0m7.422s 0m10.348s
revoke 0m6.758s 0m10.341s

What lead me here was the profiler, which now no longer suggests that findDvObject is the culprit. Rather, the next thing to look at would be somehow batching the commits to Solr, but this was not the lower hanging fruit (findDvObject was):

screen shot 2015-07-16 at 3 58 31 pm

This screenshot can be compared with the "before" screenshot at #2036 (comment)

pdurbin added a commit that referenced this issue Jul 16, 2015
The profiler shows significantly better performance when we reduce the
number of calls to findDvObject so we pass around the DvObject itself
more often, rather than looking it up over and over.

The profiler also suggests that batching the commits to Solr may also
yield a good performance gain but that's for a different commit.

The index API endpoint "missing" is removed as part of this commit since
it relies on code that was rewritten as part of this commit but it was
little used anyway since there's a newer "skipIndexed" boolean for other
index endpoints for catching up an index. We can revisit if we want to
re-add the "missing" endpoint.

Finally the createSolrDoc method was used to SearchUtil and tests were
written for it.

Here are the numbers for before and after this commit from running
scripts/issues/2036/grant-role-then-revoke

        | Solr Up   | Solr Down
------- | --------- | -------
grant   | 0m25.653s | 0m9.699s
revoke  | 0m23.539s | 0m9.968s

        | Solr Up   | Solr Down
------- | --------- | -------
grant   | 0m7.422s  | 0m10.348s
revoke  | 0m6.758s  | 0m10.341s
@pdurbin
Copy link
Member

pdurbin commented Jul 16, 2015

@scolapasta as we discussed 75e8dba seemed safe enough to merge into the 4.0.2 branch (which will become 4.1) so I'm passing this to QA to check for any regressions around search permissions.

I'll continue to look at the code locally however and may commit again if I can figure out a good way to batch the Solr commits.

@pdurbin pdurbin modified the milestones: 4.1, 4.2 Jul 16, 2015
@pdurbin pdurbin removed their assignment Jul 16, 2015
@kcondon
Copy link
Contributor

kcondon commented Jul 20, 2015

@sbarbosadataverse Can you check this out for performance and pass it back to me to check the other stuff? Thx.

@sbarbosadataverse
Copy link

@kcondon I've been dealing with permission performance all week on production so of course I'll check;))

@pdurbin pdurbin assigned pdurbin and unassigned scolapasta Oct 19, 2015
@pdurbin
Copy link
Member

pdurbin commented Oct 20, 2015

To reflect the fact that I see this issue as lower priority for 4.2.1 than #2648 I'm removing the "Critical" label.

@pdurbin
Copy link
Member

pdurbin commented Oct 21, 2015

As we just decided in a meeting, this issue is like #2472 in that we think the performance improvements would be great but we don't want to hold up 4.2.1 so I'm pulling it out of that milestone. Recent changes have all been in branches so there is nothing to QA for 4.2.1. No changes.

@pdurbin
Copy link
Member

pdurbin commented Dec 17, 2015

I demoed to @kcondon a fix I just pushed to a separate branch in 1770995 that makes the indexing of permissions asynchronous when granting or revoking roles.

Per the output below, the time to grant was reduced from 14.5 seconds to 0.2 seconds.

@scolapasta I'm going put this in the 4.2.3 milestone so you can review the code and decide if you want it in. I know in the past we've had to back out of async attempts but this one is smaller and more focused so I hope it doesn't cause a problem.

Please note that this commit does not attempt to address the "6-8 seconds to look up each name" issues reported above which is an entirely different part of the code. It sounds like @scolapasta @michbarsinai and @landreev chatted about this the other day. Perhaps it should be treated as its own issue.

Below is the before and after of my fix. The madagascar dataverse contains 46 datasets and a total of 472 files as of my testing. I'm not actually sure why this one is so slow. I speculated to @sekmiller that perhaps it's because it has a greater than average number of role assignments.

BEFORE

murphy:dataverse pdurbin$ scripts/issues/2036/grant-role-then-revoke madagascar
Assigning admin to @pdurbin on madagascar...

real    0m14.567s
user    0m0.004s
sys     0m0.004s
{
  "assignee": "@pdurbin",
  "_roleAlias": "admin"
}
Retrieving ID of role to revoke...

real    0m0.075s
user    0m0.003s
sys     0m0.003s
{
  "id": 139263,
  "assignee": "@pdurbin",
  "roleId": 1,
  "_roleAlias": "admin",
  "definitionPointId": 1251
}
Revoking admin from @pdurbin on madagascar...

real    0m13.690s
user    0m0.003s
sys     0m0.003s
"Role Admin revoked for assignee @pdurbin in PSI/Madagascar"

AFTER

murphy:dataverse pdurbin$ scripts/issues/2036/grant-role-then-revoke madagascar
Assigning admin to @pdurbin on madagascar...

real    0m0.203s
user    0m0.004s
sys     0m0.004s
{
  "assignee": "@pdurbin",
  "_roleAlias": "admin"
}
Retrieving ID of role to revoke...

real    0m0.095s
user    0m0.003s
sys     0m0.003s
{
  "id": 139264,
  "assignee": "@pdurbin",
  "roleId": 1,
  "_roleAlias": "admin",
  "definitionPointId": 1251
}
Revoking admin from @pdurbin on madagascar...

real    0m0.077s
user    0m0.003s
sys     0m0.003s
"Role Admin revoked for assignee @pdurbin in PSI/Madagascar"
murphy:dataverse pdurbin$ 

@pdurbin pdurbin modified the milestones: 4.2.3, Not Assigned to a Release Dec 17, 2015
@pdurbin pdurbin assigned scolapasta and unassigned pdurbin Dec 17, 2015
@pdurbin
Copy link
Member

pdurbin commented Jan 6, 2016

@kcondon @scolapasta as we discussed this morning I just merged the 2036-async-grant branch into 4.2.3. It contains a fairly small commit (1770995) that I described previously at #2036 (comment) . In short, the change is make the indexing part of changing role assignments asynchronous.

Passing to QA. Some testing around assigning and removing roles would be appreciated, especially paying attention to how long it takes for known slow cases like the madagascar dataverse. I have not looked into memory usage, as previously mentioned at #2036 (comment) as "characteristic 6GB heap allocation/ free". Please note that while " 6-8 seconds to look up each name" has been mentioned at #2036 (comment) this is an entirely different part of the code and probably deserves its own issue. @scolapasta and @landreev discussed that slowness with @michbarsinai when he was recently in town.

@pdurbin pdurbin assigned kcondon and unassigned scolapasta Jan 6, 2016
@kcondon
Copy link
Contributor

kcondon commented Jan 7, 2016

6-8s to look up user
2-3s to save
2s-3s to delete
Basic responsiveness has been achieved with exception of user look up. Opened a separate ticket for that (#2853). Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Bug a defect
Projects
None yet
Development

No branches or pull requests

8 participants