Address known issues with running Glassfish behind an Apache proxy #2180

Closed
pdurbin opened this issue May 18, 2015 · 10 comments

@pdurbin
Member

pdurbin commented May 18, 2015

Java web applications are commonly run behind a proxy such as Apache or nginx, but we've experienced difficulty running Dataverse 4.0 behind Apache in production at https://dataverse.harvard.edu

In this issue we will first enumerate the challenges we faced with running Dataverse behind a proxy. This will form the basis of our testing plan for validating this configuration.

Until this testing and validation takes place (in an environment similar to production), we plan to update the Installation Guide to advise against fronting Glassfish with Apache.

Allowing Dataverse to be run behind a proxy should solve the issue of currently having to run Glassfish as root (#1934) and might be a solution for restoring Shibboleth support (#2117).

@pdurbin pdurbin added the Type: Suggestion an idea label May 18, 2015
@scolapasta scolapasta modified the milestones: Candidates for 4.0.2, In Design Jun 1, 2015
@scolapasta
Contributor

assigning @pdurbin for now, since it may be related to the shib work

@pdurbin
Member Author

pdurbin commented Jun 29, 2015

The biggest problem we've seen so far when trying to reintroduce Apache into the mix is that using AJP causes "OutOfMemoryError: Java heap space" errors when we try to download large files as a zip, as explained at payara/Payara#350 and https://java.net/jira/browse/GRIZZLY-1787

See also writeups at https://www.java.net/forum/topic/glassfish/problem-streaming-chunked-encoding-over-ajp-gf41-apache-modproxy and https://community.oracle.com/message/13157533#13157533

It's for this reason we are experimenting with moving away from AJP in #2294.

@smillidge
Contributor

A patch could be on the way, as the Grizzly team has reproduced the problem. See https://java.net/jira/browse/GRIZZLY-1787

@pdurbin
Member Author

pdurbin commented Jun 30, 2015

@smillidge indeed in https://java.net/jira/browse/GRIZZLY-1787 @rlubke hooked me up with a patch that resolves the AJP problem in the simple apachetest app @landreev and I created for payara/Payara#350

Now we need to figure out if Dataverse behaves better with that patch in place. At payara/Payara#350 (comment) @rlubke indicated that it should be fine to take the "glassfish-grizzly-extra-all.jar" file from Glassfish 4.1 and patch the "AjpHttpRequest.class" file with the version from a jar file attached to the Grizzly ticket above. That's what I've done, and the updated "glassfish-grizzly-extra-all.jar" file is available at http://dvn-vm1.hmdc.harvard.edu/tmp/issues/2180/grizzly-patch/ . I tested it on https://shibtest.dataverse.org and it didn't seem to cause any harm. Additionally, the test we've been using to reproduce the OutOfMemoryError now works just fine. With the patch in place, I can now download the 2 GB rfile.bin.zip file ("200 OK"):

murphy:~ pdurbin$ curl -v https://shibtest.dataverse.org/api/access/datafiles/19 > /tmp/rfile.bin.zip
* Hostname was NOT found in DNS cache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 140.247.115.182...
* Connected to shibtest.dataverse.org (140.247.115.182) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384
* Server certificate: shibtest.dataverse.org
* Server certificate: InCommon Server CA
* Server certificate: AddTrust External CA Root
> GET /api/access/datafiles/19 HTTP/1.1
> User-Agent: curl/7.37.1
> Host: shibtest.dataverse.org
> Accept: */*
> 
< HTTP/1.1 200 OK
< Date: Tue, 30 Jun 2015 17:43:03 GMT
< Content-disposition: attachment; filename="dataverse_files.zip"
< Content-Type: application/zip; name="dataverse_files.zip"
< Connection: close
< Transfer-Encoding: chunked
< 
{ [data not shown]
100 1953M    0 1953M    0     0   9.9M      0 --:--:--  0:03:16 --:--:-- 6994k
* Closing connection 0
murphy:~ pdurbin$ 

@scolapasta @kcondon let's meet about next steps ( @landreev I know you're out this week). Have we addressed enough of the challenges we faced with running Dataverse behind a proxy? How much of a testing plan do we need to confirm that it's safe to use Apache and AJP? I like the AJP solution much better than the less secure "headers" solution in #2294.
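
For anyone who wants to reproduce the jar patching step described above, here is a minimal sketch in Python. It is not necessarily how the patched jar linked above was built; it relies only on the fact that a jar is a zip archive. The function name `replace_jar_entry` is hypothetical, and the path of the class inside the jar is passed in by the caller because it is not stated in this thread and should be confirmed by listing the jar's contents first.

```python
import zipfile


def replace_jar_entry(jar_path, patched_jar_path, entry_name, new_bytes):
    """Copy a jar to patched_jar_path, swapping one entry's contents.

    entry_name is the full path of the entry inside the jar (an assumption
    supplied by the caller; list the jar to confirm it). Returns True if the
    entry was found and replaced, False if the jar was copied unchanged.
    """
    replaced = False
    with zipfile.ZipFile(jar_path) as src, \
            zipfile.ZipFile(patched_jar_path, "w") as dst:
        # Iterate in archive order so META-INF/MANIFEST.MF keeps its position.
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == entry_name:
                data = new_bytes
                replaced = True
            # Passing the ZipInfo keeps each entry's name, timestamp and
            # compression type; sizes and CRC are recomputed on write.
            dst.writestr(info, data)
    return replaced
```

A caller would read the patched "AjpHttpRequest.class" from the jar attached to the Grizzly ticket, pass its bytes as `new_bytes`, and check that the function returns True before swapping the new jar into place. Writing to a separate output file leaves the original jar untouched in case the patch needs to be backed out.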

@pdurbin
Member Author

pdurbin commented Jul 1, 2015

@kcondon as we discussed with @scolapasta today I'm passing this ticket to QA now that I've done the following:

@pdurbin pdurbin modified the milestones: 4.0.2, In Design Jul 1, 2015
@pdurbin pdurbin removed their assignment Jul 1, 2015
@kcondon kcondon changed the title from "Validate that Dataverse can be run behind a proxy such as Apache or nginx" to "Address known issues with running Glassfish behind an Apache proxy" Jul 20, 2015
@kcondon
Contributor

kcondon commented Jul 20, 2015

Bottom line: Downloading a large zip file that has no known file size, because it is generated on the fly, uncovered a problem where Glassfish ran out of heap space. Another possible issue was a perceived reduction in page load performance. The latter was anecdotal and not measured.
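
To illustrate why an on-the-fly zip has no size up front and has to be sent as a chunked stream, here is a minimal sketch in Python. It is not Dataverse's Java code and says nothing about how Grizzly or AJP handle the response; it only shows a zip being produced as a sequence of chunks while holding roughly one block in memory at a time. The names `stream_zip` and `_ChunkSink` are hypothetical.

```python
import zipfile


class _ChunkSink:
    """Write-only, non-seekable sink; bytes pile up until drained."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        self._chunks.append(bytes(data))
        return len(data)

    def flush(self):
        pass

    def drain(self):
        data = b"".join(self._chunks)
        self._chunks.clear()
        return data


def stream_zip(files, block_size=64 * 1024):
    """Yield the bytes of a zip archive built from (name, fileobj) pairs.

    The total size is unknown until the last chunk is produced, which is
    why a server sending this cannot set Content-Length in advance.
    """
    sink = _ChunkSink()
    with zipfile.ZipFile(sink, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, src in files:
            # force_zip64 lets a single entry grow past 2 GiB even though
            # its size is not known when the entry header is written.
            with zf.open(name, mode="w", force_zip64=True) as entry:
                while True:
                    block = src.read(block_size)
                    if not block:
                        break
                    entry.write(block)
                    data = sink.drain()
                    if data:
                        yield data
            data = sink.drain()  # entry trailer (data descriptor)
            if data:
                yield data
    data = sink.drain()  # central directory, written when the archive closes
    if data:
        yield data
```

If any layer between this generator and the client collects all the chunks before forwarding them, memory use grows with the size of the archive, which is consistent with the heap exhaustion seen here on multi-gigabyte downloads.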

@kcondon kcondon self-assigned this Jul 20, 2015
@kcondon
Contributor

kcondon commented Jul 20, 2015

@pdurbin The install doc does not tell people where to get the new Grizzly jar file. Should I get the one above at vm1? http://guides.dataverse.org/en/4.1/installation/shibboleth.html#apply-grizzly-1787-patch

@pdurbin
Member Author

pdurbin commented Jul 21, 2015

@kcondon yes, as I wrote before: Note that http://dvn-vm1.hmdc.harvard.edu/tmp/issues/2180/grizzly-patch/glassfish-grizzly-extra-all.jar is the file to use until the v4.0.2 tag is created and we upload the patch there.

@pdurbin
Member Author

pdurbin commented Jul 21, 2015

@kcondon you're right, what I wrote in the docs is confusing. I tried again in 68b9e83 and built the docs. If this is still unclear, please let me know:

http://guides.dataverse.org/en/4.1/installation/shibboleth.html#apply-grizzly-1787-patch

(Since the patch is so tiny at 300KB I checked it into our source tree for safe keeping.)

@kcondon
Contributor

kcondon commented Jul 21, 2015

Doc looks good, large zip download also good. I compared homepage load with production data using http://www.webpagetest.org. The numbers jumped around a bit but they were in the same ballpark.

Closing
