Download count limit is too restrictive for many uses #193
Similar to #35, but with a higher limit.
I'd prefer to wait until berkeleybop/bbop-js#16 has been crossed, to take advantage of any performance increases and only do the numbers once.
@ValWood Yes, the limit is a bit low--we get a steady trickle of comments about it on GO Help. The current number is the product of an ad hoc process where we had a group of users simultaneously download sets at different limits while we watched the test servers and their response times. We wanted to make sure that one user trying to download a large file could not interfere with the responsiveness of the interface (not really an issue for AmiGO 1.x).
@kkarra (got the right production site now), now that we've been in production for a bit, we should have more solid numbers than the earlier guesstimate--do you have a feel for how stressed the production servers are 1) during peak times and 2) during the most heavily used time when one of the servers is out of the balancer?
For increasing the number of rows - was it a memory issue or something else?
@kkarra I'd assume the peak times can be identified pretty easily by looking at the analytics. I'd be interested in machine stress (disk and CPU usage) during two different peak times: the global peak and the time of highest usage when only a single machine is in use. What I'm trying to determine is how much slack we have in our current setup for increasing the max download rows. We can also look at increasing our resources to allow downloads up to a certain target number (say 500k), but looking at what we have now is a start.
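Something along these lines is the kind of snapshot I mean, assuming the usual sysstat tools are available on those hosts (that tooling is an assumption on my part, not something we've confirmed):

```bash
# Quick stress snapshot during a peak window: sample every 5s for a minute.
# Assumes the sysstat package (sar/iostat) is installed on the production hosts.
sar -u 5 12       # CPU utilization
iostat -x 5 12    # per-disk utilization and wait times
vmstat 5 12       # run queue, swap activity, and memory pressure
```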
Is this an accurate status update:
@kkarra are you still available to help with this?
I think we can raise the limits quite a lot, given testing and possibly additional hardware. The main issue is not slowing down the UX on the main parts of AmiGO.
I assume the download server option would need some changes in the app (still in the 2.4 milestone?) and lots of coordination with production?
It will be a new variable that needs to be strung through. Once it's there, we could experiment with it fairly broadly. My druthers would be to start with a separate download server, behind a load balancer, and scale up as needed.
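As a very rough sketch of what that separation could look like at the proxy layer, assuming an nginx front end (the hostnames and the /download/ path below are made up for illustration, not our actual layout):

```nginx
# Hypothetical routing: keep bulk-download requests off the interactive pool
# so a large export cannot starve the UI backends.
upstream amigo_interactive {
    server amigo-backend-1.example.org:8080;
    server amigo-backend-2.example.org:8080;
}

upstream amigo_download {
    server amigo-download-1.example.org:8080;
}

server {
    listen 80;

    # Interactive UI traffic stays on the balanced pool.
    location / {
        proxy_pass http://amigo_interactive;
    }

    # Large export requests (illustrative path) go to the dedicated download box.
    location /download/ {
        proxy_pass http://amigo_download;
        proxy_read_timeout 300s;
    }
}
```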
With the next batch of commits coming down the pipe, this issue is fixed in the code. All that we need now is:
… download agents; TODO: could be more randomized; work on #193
Or maybe the load balancer is smart enough to deal with this kind of fun? @stuartmiyasato @kkarra would it be alright to retry some of these on the production servers at some point? Just a few 30s windows shouldn't annoy anybody too much, right...?
There are 405k annotations for human. There is no point going beneath this number; otherwise we may as well keep it low and make people download via other means.
Won't be nearly so high when the redundancy is removed ;)
Okay, from @cmungall's comment, we'll peg the desired limit at 500k.
To start eliminating the first two, I just want to make sure I have a thumbs up to test a little against the production setup, aiming both at the individual backends and at the current load balancer (@stuartmiyasato @kkarra).
I am okay with testing against production.
On 28 Apr 2016, at 9:57, Val Wood wrote:
Good point. This is #43. If set by default it will reduce the size of …
Okay, great. I'll probably start poking at it a little later. It will be from the LBL block.
Got Nagios/Uptime Robot alerts saying AmiGO and GOlr are down. Are you testing now? Should I restart the servers or will that interrupt any tests in progress? |
Yes, please restart.
Okay, I'm not going to add more graphs, but I'll give a summary here.
Trying to see how long the 500k would actually take, I ran into a 1min timeout from nginx. If that could be upped, I could give it another try and see how long a user would actually have to wait. Of course there are tons of uncontrolled variables here, including likely critical server settings. Considering that human is something like 10% of total annotations and a non-trivial fraction of total documents, I imagine that disk makes up a large portion of this.
BTW, my testing was over before the servers went down, so I'm not sure how much what I did was involved there. Also, I forgot the conclusion: if people are willing to wait a minute or so for the download (this will need to be tested), it's possible to stop this at the load balancer level. If not, we'll need to proceed to the independent download server--not for QoS, but to have something fast enough to serve users large files from the index.
@kltm I'm not sure what configuration needs to be edited to make this happen. Can you give me a URL and/or error message to replicate the timeout?
Well, I'm now getting 500k download completes in 40s, most of that on the transfer itself (whereas before it was the query that was killing it), so it's hard to test. Possibly whatever happened earlier interfered with the later testing? The error was a fairly generic 502 from nginx, possibly referencing a timeout. I don't have a variable that we use to raise this, as it's not something we ran into when we were using the nginx reverse proxy server.
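For reference, on a stock nginx reverse proxy the knob here is usually the proxy timeout family of directives; a sketch of what upping them might look like (the 300s values are illustrative, not tested recommendations):

```nginx
# Inside the relevant server/location block for the GOlr/AmiGO proxy.
# All three directives default to 60s, which is where the ~1min cutoff comes from.
proxy_connect_timeout 60s;
proxy_send_timeout    300s;
proxy_read_timeout    300s;   # governs how long nginx waits on a slow upstream response
```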
I'm going to go ahead and try the 500k again on the load balancer. The download query I'm using tries to start at a random spot so as to bypass any attempt to cache; let's see what happens. I'll run it for 90s.
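The shape of the test is roughly the following bash sketch; the GOlr endpoint and filter field below are stand-ins for whatever the real harness uses, not the actual test script:

```bash
# Hypothetical cache-busting bulk pull against a GOlr (Solr) select endpoint:
# use a random start offset so repeated runs don't just hit a warmed cache.
GOLR="http://localhost:8080/solr/select"    # assumed endpoint, adjust per instance
START=$(( RANDOM % 10000 ))                 # bash-only $RANDOM
time curl -s "${GOLR}?q=*:*&fq=document_category:annotation&wt=csv&rows=500000&start=${START}" \
     -o /dev/null
```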
Okay, the data is just confusing. This time around, the bbop backend demonstrates few disrupted slots, while the production balanced URL experiences significant disruption; neither successfully allows a 500k download over 90s. Something to do with the peppering by the UI agents, maybe? Maybe one of the load-balanced backends was unresponsive? It would probably take a lot to untangle all of this. I see two paths here, as it's obvious that we won't be able to just make this work:
Any input from @stuartmiyasato about current commitments or capacity?
Mike Cherry has made it pretty clear to our lab that I won't be on the GO project for the next grant cycle, so it seems rather pointless for me to work on any local (Stanford) GO infrastructure projects. I would vote for AWS as a result, mostly due to my familiarity with it. But since I won't be managing it in the long term, my vote probably shouldn't count for much...
Okay, I'm getting bogged down in various things and we need to get this release out, even without the large downloads fully functional. I'm going to try to make this just a configuration issue in the future (server variables and a re-install), whatever solution we come up with. For now, I'm going to thread through a new variable (AMIGO_DOWNLOAD_LIMIT/download_limit) and change the library settings to 100000 (while not 500k, still arguably 10x better, and it can be used to test the new settings). With these, and the download server addition from before, we should be able to switch to a separate download server down the line, once we have one, with just a few config changes.
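Purely as an illustration of the intent (the real install reads its settings from the AmiGO configuration, so the exact mechanism shown here is hypothetical):

```bash
# Illustrative only: the new knob mentioned above, shown as a shell variable;
# an actual instance would set this through its own AmiGO config.
export AMIGO_DOWNLOAD_LIMIT=100000   # was 10000; eventual target is 500000
```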
The code is (well, should be) complete for this fix. I'm going to close this out and open a hotfix for download servers.
Is it possible to increase the download option number (currently restricted to 10,000)? This is quite restrictive if you need all of the annotations for a species. GOA/QuickGO has a restriction, but has the option to override it (you can download at least 500,000 annotations). Is there any reason why AmiGO cannot allow the same?