
Download count limit is too restrictive for many uses #193

Closed
ValWood opened this issue May 6, 2015 · 32 comments


@ValWood

ValWood commented May 6, 2015

Is it possible increase the download option number (restricted to 10,000). This is quite restrictive if you need all of the annotations for a species. GOA/QucikGO has a restriction, but has the option to over ride this restriction (can download at least 500,000 annotations). Is there any reason why AmiGO cannot allow the same?

@kltm kltm changed the title AmiGO download option is very restrictive (number) Download count limit is too restrictive for many uses May 6, 2015
@kltm
Member

kltm commented May 6, 2015

Similar to #35, but with a higher limit.

@kltm
Member

kltm commented May 6, 2015

I'd prefer to wait until berkeleybop/bbop-js#16 has been crossed off, to take advantage of any performance increases and only do the numbers once.

@kltm kltm added this to the 2.3 milestone May 6, 2015
@kltm
Member

kltm commented May 6, 2015

@ValWood Yes, the limit is a bit low--we get a steady trickle of comments about it on GO Help. The current number is the product of an ad hoc process during which we had a group of users simultaneously download sets at different limits while we watched the test servers and their response times. We wanted to make sure that one user trying to download a large file could not interfere with the responsiveness of the interface (not really an issue for AmiGO 1.x).

Now that we've been in production for a bit, we should be able to get more solid numbers than the earlier guesstimate. @mugitty (whoops, wrong downstream), do you have a feel for how stressed the production servers are 1) during peak times and 2) during the busiest period when one of the servers is out of the balancer?

@kltm
Member

kltm commented May 7, 2015

@kkarra (got the right production site now), now that we've been in production for a bit, we should be able to get more solid numbers than the earlier guesstimate--do you have a feel for how stressed the production servers are 1) during peak times and 2) during the busiest period when one of the servers is out of the balancer?

@kkarra

kkarra commented May 11, 2015

  1. Do you want me to identify peak times, or do we already know them and just want the load levels during those times?
  2. Should I look back in the logs for peak times when only one server was in use?

For increasing the number of rows--was it a memory issue or something else?

@kltm
Member

kltm commented May 11, 2015

@kkarra I'd assume the peak times can be identified pretty easily by looking at the analytics. I'd be interested in machine stress (disk and CPU usage) during two different peak times: the global peak and the time of highest usage when a single machine is in use.

What I'm trying to determine is how much slack we have with our current setup for increasing the max download rows. We can also look at increasing the resources we have to allow downloads up to a certain target number (say 500k), but looking at what we already have is a start.
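As a rough illustration of the kind of stress sampling meant here, a minimal Python sketch (the tooling, interval, and window are assumptions, not what the GO production hosts actually run):

```python
# Minimal sketch: sample CPU and disk activity on a GOlr host during a
# window (e.g. a known peak time) to see how much slack is left before
# raising the download row limit. Assumes the psutil package is available.
import time
import psutil  # third-party: pip install psutil

def sample_stress(window_s=300, interval_s=5):
    """Print CPU% and disk read/write deltas every interval_s seconds."""
    prev = psutil.disk_io_counters()
    end = time.time() + window_s
    while time.time() < end:
        cpu = psutil.cpu_percent(interval=interval_s)  # blocks for interval_s
        cur = psutil.disk_io_counters()
        print("cpu=%5.1f%%  read=%8.1fMB  write=%8.1fMB"
              % (cpu,
                 (cur.read_bytes - prev.read_bytes) / 1e6,
                 (cur.write_bytes - prev.write_bytes) / 1e6))
        prev = cur

if __name__ == "__main__":
    sample_stress()
```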

@kltm kltm modified the milestones: 2.3, 2.4 Aug 26, 2015
@kltm kltm modified the milestones: 2.4, 2.5 Mar 2, 2016
@cmungall
Member

cmungall commented Mar 9, 2016

Is this an accurate status update:

  • we think that we can increase the limit by, say, 10x
  • we need to do more stress testing to ensure that we don't end up choking the server

@kkarra are you still available to help with this?

@kltm
Member

kltm commented Mar 9, 2016

I think we can raise the limits quite a lot, given testing and possibly additional hardware. The main issue is not slowing down the UX on the main parts of AmiGO.
I'd vote for starting the process by introducing a "download server" URL to the configuration, and then build up from there. Worst case, we throw AWS at it.

@cmungall
Member

cmungall commented Mar 9, 2016

I assume the download server option would need some changes in the app (still in 2.4 milestone?) and lots of coordination with production?

@kltm
Member

kltm commented Mar 9, 2016

It will be a new variable that needs to be strung through. Once it's there, we could experiment with it fairly broadly. My druthers would be to start with a separate download server, behind a load balancer, and scale up as needed.

kltm added a commit to berkeleybop/bbop-manager-golr that referenced this issue Mar 18, 2016
@kltm
Member

kltm commented Mar 19, 2016

With the next batch of commits coming down the pipe, this issue is fixed in the code. All that we need now is:

  1. a load-balanced URL to aim it at
  2. updated configs to reflect this (AMIGO_PUBLIC_GOLR_BULK_URL); see the sketch below
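For illustration, here is roughly how that bulk URL could be wired in. AmiGO itself is Perl/JavaScript, so this Python sketch is only conceptual, and the fallback variable name AMIGO_PUBLIC_GOLR_URL, the default endpoint, and the threshold are assumptions; only AMIGO_PUBLIC_GOLR_BULK_URL comes from this issue:

```python
import os

# Regular and bulk GOlr endpoints; the bulk one would sit behind its own
# load balancer so large downloads cannot drag down the main UI.
GOLR_URL = os.environ.get("AMIGO_PUBLIC_GOLR_URL",
                          "http://golr.geneontology.org/solr/")
GOLR_BULK_URL = os.environ.get("AMIGO_PUBLIC_GOLR_BULK_URL", GOLR_URL)

def endpoint_for(rows, bulk_threshold=10000):
    """Send anything over the threshold to the bulk download endpoint."""
    return GOLR_BULK_URL if rows > bulk_threshold else GOLR_URL
```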

kltm added a commit that referenced this issue Apr 6, 2016
kltm added a commit that referenced this issue Apr 27, 2016
… download agents; TODO: could be more randomized; work on #193
@kltm
Member

kltm commented Apr 28, 2016

EDITED

Okay, a little something in the way of "data" for this. I tried this against our machine here, using ~30s windows.

Response-time plots (images not reproduced here) for the four runs:

  • Five UI agents, no download agents
  • Five UI agents, one download agent trying 10000 lines
  • Five UI agents, one download agent trying 100000 lines
  • Five UI agents, one download agent trying 500000 lines

Looking at these, in this limited case, and without truly running the numbers, it doesn't look like the download agent is really dragging up the response times for the UI agents, up to 100000.
At 500000, I don't get a response from the download agent (number 6) within the 30s, and the UI agents are getting hammered.

Of course, this is just with my server settings, etc., etc. But the takeaway here is that we should not just jump to 500000. We either say something between 100000 and 500000 is good enough (going to guess the lower end of the scale here) or we implement the separate download server. Any thoughts?
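For reference, a rough sketch of the sort of harness described above (the endpoint, filter fields, and timings are assumptions; this is not the script that produced the plots):

```python
# Five small "UI" queries repeated in parallel over a ~30s window, alongside
# one large "download" query, each timed against a GOlr/Solr endpoint.
import concurrent.futures
import time

import requests  # third-party: pip install requests

GOLR = "http://golr.geneontology.org/solr/select"  # server under test

def timed_query(rows):
    """Return elapsed seconds for one query, or None on timeout/error."""
    params = {"q": "*:*", "fq": "document_category:annotation",
              "rows": rows, "wt": "json"}
    t0 = time.time()
    try:
        requests.get(GOLR, params=params, timeout=90)
    except requests.RequestException:
        return None
    return time.time() - t0

def run_window(ui_agents=5, download_rows=100000, window_s=30):
    with concurrent.futures.ThreadPoolExecutor(max_workers=ui_agents + 1) as pool:
        download = pool.submit(timed_query, download_rows)
        ui_times = []
        end = time.time() + window_s
        while time.time() < end:
            batch = [pool.submit(timed_query, 10) for _ in range(ui_agents)]
            ui_times += [f.result() for f in batch]
        ok = [t for t in ui_times if t is not None]
        mean = sum(ok) / len(ok) if ok else float("nan")
        dl = download.result()
        print("ui mean=%.2fs (%d ok, %d failed)  download=%s"
              % (mean, len(ok), ui_times.count(None),
                 "no response" if dl is None else "%.2fs" % dl))

if __name__ == "__main__":
    run_window()
```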

EDITED

@kltm
Member

kltm commented Apr 28, 2016

Or maybe the load balancer is smart enough to deal with this kind of fun? @stuartmiyasato @kkarra, would it be alright to retry some of these on the production servers at some point? Just a few 30s windows shouldn't annoy anybody too much, right...?

@cmungall
Member

There are 405k annotations for human. There is no point stopping below this number; if we do, we may as well keep the limit low and have people download via other means.

@ValWood
Author

ValWood commented Apr 28, 2016

Won't be nearly so high when the redundancy is removed ;)

@kltm
Member

kltm commented Apr 28, 2016

Okay, from @cmungall's comment, we'll set the desired limit at 500k.
That leaves us with, in order of difficulty, the following things to eliminate:

  • production servers can take the beating directly
  • production servers cannot take it, but the load balancer keeps things responsive
  • more thought brought to bear against the current production balancing setup
  • switch to a ui vs. download backend setup (a separate server, or set of servers, at a different URL)
  • remove redundant annotations (see Add ability to filter redundant annotations #43); may not be a uniform solution

To start eliminating the first two, I just want to make sure I have a thumbs up to test a little against the production setup, aiming both at individual backends and at the current load balancer (@stuartmiyasato @kkarra).

@stuartmiyasato

I am okay with testing against production.

@cmungall
Member

On 28 Apr 2016, at 9:57, Val Wood wrote:

Won't be nearly so high when the redundancy is removed ;)

Good point. This is #43. If set by default it will reduce the size of
the typical download.

@kltm
Member

kltm commented Apr 28, 2016

Okay, great. I'll probably start poking at it a little later; it will be from the LBL block.
I've added removing redundant annotations as an approach.

@stuartmiyasato

Got Nagios/Uptime Robot alerts saying AmiGO and GOlr are down. Are you testing now? Should I restart the servers or will that interrupt any tests in progress?

@kltm
Member

kltm commented Apr 28, 2016

Yes, please restart.

@kltm
Member

kltm commented Apr 28, 2016

Okay, I'm not going to add more graphs, but I'll give a summary here.
I couldn't remember what the individual GOlr backend URLs were, so I just started with the load balancer. Over the same tests, the following (more or less obvious) things seemed to be confirmed, versus the tomodachi server:

  • The balanced URL was overall slower
  • The balanced URL had a wider range of times (less consistent)
  • The balanced URL was unable to return download results even at 100k (of which tomodachi was able to return a dozen; nobody accomplished 500k in 30s)
  • The balanced URL was more robust--it did not have the pronounced knockout slots that tomodachi showed for the ui agents

Trying to see how long the 500k would actually take, I ran into a 1min timeout from nginx. If that could be upped, I could give it another try and see how long a user would actually have to wait.

Of course there are tons of uncontrolled variables here, including likely critical server settings. Considering that human is something like 10% of total annotations and a non-trivial fraction of total documents, I imagine that disk makes up a large portion of this.

@kltm
Member

kltm commented Apr 28, 2016

BTW, my testing was over before the servers went down, so I'm not sure how much of what I did was involved there.

As well, I forgot the conclusion. If people are willing to wait a minute or so for the download (this will need to be tested), it's possible to stop this at the load balancer level. If not, we'll need to proceed to the independent download server, not for QoS, but to have something fast enough to serve users large files from the index.

@stuartmiyasato

Trying to see how long the 500k would actually take, I ran into a 1min timeout from nginx. If that could be upped, I could give it another try and see how long a user would actually have to wait.

@kltm I'm not sure what configuration needs to be edited to make this happen. Can you give me a URL and/or error message to replicate the timeout?

@kltm
Member

kltm commented Apr 28, 2016

Well, I'm now getting 500k downloads completing in 40s, most of that on the transfer itself (whereas the query before was killing it), so it's hard to test. Possibly whatever happened interfered with the later testing?

The error was a fairly generic 502 from nginx, possibly referencing a timeout. I don't have a variable that we use to raise this, as it's not something we've run into before when using the nginx reverse proxy server.

@kltm
Member

kltm commented Apr 29, 2016

I'm going to go ahead and try the 500k again on the load balancer. The download query I'm using tries to start at a random spot so as to bypass any caching; let's see what happens. I'll run it for 90s.
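A minimal sketch of that random-start trick against a Solr-style endpoint (the query parameters, endpoint, and corpus size here are assumptions, not the exact request used):

```python
import random

import requests  # third-party: pip install requests

GOLR = "http://golr.geneontology.org/solr/select"  # assumed endpoint

def bulk_download(rows=500000, corpus_size=4000000):
    """Request rows annotation docs starting at a random offset so repeated
    runs are less likely to hit a pre-warmed cache."""
    start = random.randint(0, max(0, corpus_size - rows))
    params = {"q": "*:*", "fq": "document_category:annotation",
              "start": start, "rows": rows, "wt": "csv"}
    resp = requests.get(GOLR, params=params, timeout=90, stream=True)
    resp.raise_for_status()
    return resp
```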

@kltm
Member

kltm commented Apr 29, 2016

Okay, the data is just confusing. This time around, the bbop backend demonstrates few disrupted slots, while the production balanced URL experiences significant disruption; neither successfully allows a 500k download over 90s. Something to do with the peppering by the ui agents, maybe? Maybe one of the load-balanced backends was unresponsive? It would probably take a lot to untangle all of this.

I see two paths here, as it's obvious that we won't be able to just make this work:

  • put more effort into the hardware, configuration, and load balancing of what we currently have
  • put that effort instead into having a separate download URL (either real, location TBD, or AWS)

Any input from @stuartmiyasato about current commitments or capacity?

@stuartmiyasato

Mike Cherry has made it pretty clear to our lab that I won't be on the GO project for the next grant cycle, so it seems rather pointless for me to work on any local (Stanford) GO infrastructure projects. I would vote for AWS as a result, mostly due to my familiarity with it. But since I won't be managing it in the long term, my vote probably shouldn't count for much...

@kltm
Member

kltm commented May 20, 2016

Okay, I'm getting bogged down in various things and we need to get this release out, even without the large downloads fully functional. I'm going to try to make this just a configuration issue in the future (server variables and re-install), whatever solution we come up with.

For now, I'm going to thread through a new variable (AMIGO_DOWNLOAD_LIMIT/download_limit) and change the library settings to 100000 (while not 500k, it's still arguably 10x better and can be used to test the new settings). With these, and the download server addition from before, we should be able to switch to a separate download server down the line, once we have it, with just a few config changes.
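Conceptually, the new limit just clamps whatever row count a client asks for; a tiny Python sketch of the idea (only the AMIGO_DOWNLOAD_LIMIT name and the 100000 default come from above, the helper itself is hypothetical):

```python
import os

# Configured ceiling on download rows; default mirrors the new library setting.
DOWNLOAD_LIMIT = int(os.environ.get("AMIGO_DOWNLOAD_LIMIT", "100000"))

def clamp_rows(requested_rows):
    """Never ask the index for more rows than the configured download limit."""
    return min(int(requested_rows), DOWNLOAD_LIMIT)
```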

@kltm
Member

kltm commented May 20, 2016

The code is (well, should be) complete for this fix. I'm going to close this out and open a hotfix for download servers.
