Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test OWSLib PR allowing POST for WFS getfeature requests #297

Closed
2 of 7 tasks
huard opened this issue Oct 8, 2020 · 35 comments
Closed
2 of 7 tasks

Test OWSLib PR allowing POST for WFS getfeature requests #297

huard opened this issue Oct 8, 2020 · 35 comments
Assignees
Labels
bug Something isn't working Ouranos

Comments

@huard
Copy link
Contributor

huard commented Oct 8, 2020

Description

We need to make WFS requests for info on many subregions. For a large number of regions, the GET requests exceeds the maximum size. See geopython/OWSLib#439

Francis made a PR geopython/OWSLib#706 to solve this, and we'd need a review of the PR, as well as confirmation that it solves our issue, illustrated in https://pavics-sdi.readthedocs.io/projects/raven/en/latest/notebooks/Subset_climate_data_over_watershed.html#Subsetting-a-gridded-climate-dataset
https://github.com/Ouranosinc/raven/blob/master/docs/source/notebooks/gridded_data_subset.ipynb

The notebook sets the number of features to 2, but it should work with 100s of features.

TODO:

  • Local install of PR branch
  • Add method="POST" to flyingpigeon.subset_base.get_feature
  • Create unit test checking that it works with large number of features (that would crash GET requests).
  • Confirm PR is functional for this test case - report success in OWSLib PR comments
  • Wait for PR to be merged and new version of OWSLib released
  • Make new flyingpigeon release
  • Update flyingpigeon version in raven requirements
@Zeitsperre Zeitsperre added bug Something isn't working Ouranos labels Oct 8, 2020
@Zeitsperre
Copy link
Contributor

If I recall correctly, I had established that there was a problem via a very large request run in a Jupyter notebook. If I can find the issue where I raised this and the notebook/environment where I established the problem existed, it'll make things much easier to formally establish that the issue is fixed.

I'll have some time available to put some work into this as we near the Symposium.

@Zeitsperre
Copy link
Contributor

@f-PLT

Thanks for your work on getting POST support into OWSlib. I'm pinging you on this thread as this problem is RAVEN-centric.

I seem to not be grasping the differences between GET and POST request formatting. Here's a snippet from a request formatting function in a PR I'm currently working towards:

    wfs = WebFeatureService(GEO_URL, version="1.1.0", timeout=30)
    if method.upper() == "GET":
        try:
            resp = wfs.getfeature(
                typename=layer,
                bbox=coordinates,
                srsname="urn:x-ogc:def:crs:EPSG:4326",
                method=method,
            )
        except Exception as e:
            raise Exception(e)
    elif method.upper() == "POST":
        if parse(ows_version) < parse("0.20.0"):
            raise NotImplementedError("POST requires OWSlib >= 0.20.0")
        try:
            resp = wfs.getfeature(typename=layer, bbox=coordinates, method=method)
        except Exception as e:
            raise Exception(e)
    else:
        raise NotImplementedError("Method %s not implemented" % method)

Does the `WebFeatureService().getfeature() method work for POST queries? I've combed your PR and haven't been able to find anything suggesting that it wouldn't work, but I seem to only be receiving this cryptic error when the method is POST:

owslib.util.ServiceException: Could not determine geoserver request from http request org.geoserver.platform.AdvancedDispatchFilter$AdvancedDispatchHttpRequest@577bc7f

Any advice?

@f-PLT
Copy link

f-PLT commented Oct 23, 2020

Yes, I did the implementation for Post in 1.1.0 and 2.0.0.

I tested with a few public WFS services, with both versions, but I did have this kind of error when testing on Boreas (https://boreas.ouranos.ca/geoserver/wfs)

An ordinary Post request to that address with an xml filter worked well, but through OWSLib, that error always popped up with Post.

It has to do with this part : https://github.com/f-PLT/OWSLib/blob/wfs-getfeature-post/owslib/feature/__init__.py#L314

Basically, I took this part from the Get section of getfeature(), which takes the url from the GetCapabilities. In the case of Boreas, that address is http://boreas.ouranos.ca:80/geoserver/wfs which works for method=GET, but not with POST. I tried ordinary Post requests to that specific address and got the same kind of error as before (and yours). I gather it's an accessibility issue, as the unsecured address is not reachable outside of it's deployment network, and there's apparently no redirection to https. Might be related to how POST requests work compared to GET requests.

If I got rid of the url finding cited above and went with base_url = self.url, it would then work, but then lost the automatic redirections.

All this is from memory, I'll take an other look at it on Monday to confirm things, and I could also give you the python script I used while implementing this.

Assuming it's the same cause, of course.

@huard huard assigned tlvu and unassigned Zeitsperre Nov 13, 2020
@huard
Copy link
Contributor Author

huard commented Nov 20, 2020

def get_feature(url, typename, features):
    """Return geometry from WFS server."""
    wfs = WebFeatureService(url, version='2.0.0')
    resp = wfs.getfeature([typename], featureid=features,
                          outputFormat='application/json',
                          method="Post")
    return json.loads(resp.read())


def test_get_feature():
    url = "https://pavics.ouranos.ca/geoserver/wfs"
    typename = "public:USGS_HydroBASINS_lake_na_lev12"
    feature_pat = "USGS_HydroBASINS_lake_na_lev12.{}"

    #features = [feature_pat.format(i) for i in range(67088, 67449)]
    features = [feature_pat.format(i) for i in range(67088, 67089)]
    get_feature(url, typename, features)

Fails with

Traceback (most recent call last):
  File "/home/david/src/flyingpigeon/tests/test_subset.py", line 10, in test_get_feature
    get_feature(url, typename, features)
  File "/home/david/src/flyingpigeon/flyingpigeon/subset_base.py", line 18, in get_feature
    method="Post")
  File "/home/david/src/OWSLib/owslib/feature/wfs200.py", line 324, in getfeature
    u = openURL(url, data, method, timeout=self.timeout, headers=self.headers, auth=self.auth)
  File "/home/david/src/OWSLib/owslib/util.py", line 230, in openURL
    raise ServiceException('\n'.join([t.strip() for t in serviceException.itertext() if t.strip()]))
owslib.util.ServiceException: Could not determine geoserver request from http request org.geoserver.platform.AdvancedDispatchFilter$AdvancedDispatchHttpRequest@3b972d1a

@huard
Copy link
Contributor Author

huard commented Nov 20, 2020

@f-PLT Making this test function pass is our use case.

@huard
Copy link
Contributor Author

huard commented Nov 21, 2020

@huard
Copy link
Contributor Author

huard commented Nov 23, 2020

I see this header is already set in the code... stumped.

@tlvu
Copy link
Contributor

tlvu commented Nov 23, 2020

Got this stack-trace server-side from docker logs geoserver that looks like the same error as in #297 (comment):

23 Nov 15:31:12 ERROR [geoserver.ows] -                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
org.geoserver.platform.ServiceException: Could not determine geoserver request from http request org.geoserver.platform.AdvancedDispatchFilter$AdvancedDispatchHttpRequest@67147182
        at org.geoserver.ows.Dispatcher.dispatch(Dispatcher.java:622)
        at org.geoserver.ows.Dispatcher.handleRequestInternal(Dispatcher.java:258)
        at org.springframework.web.servlet.mvc.AbstractController.handleRequest(AbstractController.java:147)
        at org.springframework.web.servlet.mvc.SimpleControllerHandlerAdapter.handle(SimpleControllerHandlerAdapter.java:50)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:959)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:893)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:968)
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:859)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:622)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:844)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:292)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.apache.catalina.filters.CorsFilter.handleNonCORS(CorsFilter.java:436)
        at org.apache.catalina.filters.CorsFilter.doFilter(CorsFilter.java:177)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.filters.ThreadLocalsCleanupFilter.doFilter(ThreadLocalsCleanupFilter.java:28)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.filters.SpringDelegatingFilter$Chain.doFilter(SpringDelegatingFilter.java:75)
        at org.geoserver.wms.animate.AnimatorFilter.doFilter(AnimatorFilter.java:71)
        at org.geoserver.filters.SpringDelegatingFilter$Chain.doFilter(SpringDelegatingFilter.java:71)
        at org.geoserver.filters.SpringDelegatingFilter.doFilter(SpringDelegatingFilter.java:46)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.platform.AdvancedDispatchFilter.doFilter(AdvancedDispatchFilter.java:50)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:316)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:69)
        at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:126)
        at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:90)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:73)
        at org.geoserver.security.filter.GeoServerCompositeFilter.doFilter(GeoServerCompositeFilter.java:92)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:69)
        at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:114)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:73)
        at org.geoserver.security.filter.GeoServerCompositeFilter.doFilter(GeoServerCompositeFilter.java:92)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
        at org.geoserver.security.filter.GeoServerAnonymousAuthenticationFilter.doFilter(GeoServerAnonymousAuthenticationFilter.java:54)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:69)
        at org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilterInternal(BasicAuthenticationFilter.java:158)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:73)
        at org.geoserver.security.filter.GeoServerCompositeFilter.doFilter(GeoServerCompositeFilter.java:92)
        at org.geoserver.security.filter.GeoServerBasicAuthenticationFilter.doFilter(GeoServerBasicAuthenticationFilter.java:84)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:69)
        at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:91)
        at org.geoserver.security.filter.GeoServerSecurityContextPersistenceFilter$1.doFilter(GeoServerSecurityContextPersistenceFilter.java:53)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:73)
        at org.geoserver.security.filter.GeoServerCompositeFilter.doFilter(GeoServerCompositeFilter.java:92)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
        at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:213)
        at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:176)
        at org.geoserver.security.GeoServerSecurityFilterChainProxy.doFilter(GeoServerSecurityFilterChainProxy.java:152)
        at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:346)
        at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:262)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.filters.LoggingFilter.doFilter(LoggingFilter.java:87)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.filters.GZIPFilter.doFilter(GZIPFilter.java:48)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.filters.SessionDebugFilter.doFilter(SessionDebugFilter.java:48)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.filters.FlushSafeFilter.doFilter(FlushSafeFilter.java:44)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:121)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:212)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:94)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:504)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:141)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
        at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:620)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:509)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1104)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684)
        at org.apache.tomcat.util.net.AprEndpoint$SocketWithOptionsProcessor.run(AprEndpoint.java:2445)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:745)

@Zeitsperre
Copy link
Contributor

I understand that OWSlib should be working with any instance of GeoServer that uses the proper standards, but I just want to raise a point here that our current instance is quite old (http://geoserver.org/release/2.9.3/). Do we know with certainty that this isn't a bug with POST in the build?

I'll try doing a quick look over the advancements in the past few years to see if this is covered anywhere. Otherwise, perhaps the three of us can chat sometime?

@f-PLT
Copy link

f-PLT commented Nov 23, 2020

When I implemented the Post method, I ran accros this problem, but only while testing with Boreas (the only Ouranos Geoserver I tried).

I had no problems with other Geoservers, either external or locally deployed.

The only way I got it to work was to sacrifice this bit of code : https://github.com/f-PLT/OWSLib/blob/wfs-getfeature-post/owslib/feature/__init__.py#L312-L320

If I made it use the "outside" address https://pavics.ouranos.ca/geoserver/wfs, meaning only having this line instead https://github.com/f-PLT/OWSLib/blob/wfs-getfeature-post/owslib/feature/__init__.py#L320, it then worked with Boreas, but lost the ability to be redirected by what OWSLib got back from the GetCapabilities, which I gather could be a problem for other servers.

The url given to OWSLib is https://boreas.ouranos.ca/geoserver/wfs, but the GetCapabilities of the WFS is pointing to http://boreas.ouranos.ca:80/geoserver/wfs, which is restricted (and usually redirected?). It seems to the be the case with Pavics : https://pavics.ouranos.ca/geoserver/wfs?request=GetCapabilities&version=1.1.0

The same code is present for the Get method, but it does not interfere with the request.

I'm not familiar enough with how requests work and the finer points between Get vs Post requests, but my guess is it could be that the Post request does not handle redirection (or if it's possible to redirect it in the context of how OWSLib handles requests, which is outside of what I implemented).

It would also be consistent with this: https://stackoverflow.com/questions/31620987/redirecting-http-post-requests-to-https-post-requests#:~:text=You%20cannot%20redirect.,3. and https://softwareengineering.stackexchange.com/questions/99894/why-doesnt-http-have-post-redirect

From the above, to redirect a Post request seems like something we should avoid, so the ideal solution would be for the Geoserver's GetCapabilities to list the secured address.

However, I'm pretty sure we rely on Geoserver being unsecured locally so it can interact with the other services. Maybe new flag in getFeature() to force the use of the provided url, instead of searching for the provided one in the GetCapabilities? Not sure how good a design that would be though.

@f-PLT
Copy link

f-PLT commented Nov 23, 2020

I can check into it a bit more with David B. this week, see if he has a better idea.

@tlvu
Copy link
Contributor

tlvu commented Nov 23, 2020

I just want to raise a point here that our current instance is quite old (http://geoserver.org/release/2.9.3/)

Maybe we can take this opportunity to address Ouranosinc/pavics-sdi#183? Since right now we are unable to test Geoserver on other PAVICS instance than Boreas meaning if we perform this upgrade, where are we going to test the upgrade?

@tlvu
Copy link
Contributor

tlvu commented Nov 23, 2020

The url given to OWSLib is https://boreas.ouranos.ca/geoserver/wfs, but the GetCapabilities of the WFS is pointing to http://boreas.ouranos.ca:80/geoserver/wfs, which is restricted (and usually redirected?). It seems to the be the case with Pavics : https://pavics.ouranos.ca/geoserver/wfs?request=GetCapabilities&version=1.1.0

Looks like bird-house/birdhouse-deploy#14 a Twitcher/Magpie bug or new feature needed.

@tlvu
Copy link
Contributor

tlvu commented Nov 23, 2020

It would also be consistent with this: https://stackoverflow.com/questions/31620987/redirecting-http-post-requests-to-https-post-requests#:~:text=You%20cannot%20redirect.,3. and https://softwareengineering.stackexchange.com/questions/99894/why-doesnt-http-have-post-redirect

From the above, to redirect a Post request seems like something we should avoid, so the ideal solution would be for the Geoserver's GetCapabilities to list the secured address.

I've read those 2 links and agree that redirecting POST is probably a bad idea. The summary is POST is usually not an idempotent action (ordering pizza, casting a vote, ...) so should not be repeated (a redirect is like repeating the POST).

However, I'm pretty sure we rely on Geoserver being unsecured locally so it can interact with the other services.

I assume other services can be reconfigured to use the secure httpS connection. httpS is pretty much the norm now.

Maybe new flag in getFeature() to force the use of the provided url, instead of searching for the provided one in the GetCapabilities? Not sure how good a design that would be though.

If the flag is optional, then why not? You are not introducing any breaking changes and not forcing it on users either.

@f-PLT
Copy link

f-PLT commented Nov 23, 2020

I assume other services can be reconfigured to use the secure httpS connection. httpS is pretty much the norm now.

I agree 100%

If the flag is optional, then why not? You are not introducing any breaking changes and not forcing it on users either.

Well, short term, yes, I agree. It's also a really quick fix.

Just to be sure we're talking about the same thing, the flag would need to be in OWSLib's getFeature function

From a design perspective, I'm wary of simply adding a flag to solve this problem.

It's not a problem internal to OWSLib. The Post request, the HTTP protocol and getCapabilities (albeit with an url that is being redirected to httpS) are doing what they're suppose to do. The problem is (most probably) on our end with the way we've configured Geoserver.

Introducing a flag argument in a public module for a very specific use case like this seems, well, out of place? I wouldn't like to see, down the line, 10 flags or more for different things unrelated to the core of the WFS standard.

Just to be clear, I'm simply voicing my concerns here. At the end of the day, we do have time constraints and I'll implement what I'm told to implement.

@f-PLT
Copy link

f-PLT commented Nov 23, 2020

I'll also be double-checking again tomorrow to make sure a standard post request works with the Resquests library, and also without the url fetching done in OWSLib.

@huard
Copy link
Contributor Author

huard commented Nov 23, 2020

I agree with @f-PLT on not using owslib to fix issues with our own infrastructure.

@Zeitsperre
Copy link
Contributor

Zeitsperre commented Nov 23, 2020

@f-PLT

@huard and I had checked over your examples the other week, and everything seemed fine when using the Requests library. For the purposes of Raven, I could potentially patch in a workaround based on your code if the priority is getting Raven up and running sooner.

Maybe we can take this opportunity to address Ouranosinc/pavics-sdi#183? Since right now we are unable to test Geoserver on other PAVICS instance than Boreas meaning if we perform this upgrade, where are we going to test the upgrade?

@tlvu

I can throw some time into that issue this week if it's in the interest of the greater project development. For the HTTPS configurations, not quite my tempo. I know that if we can modernize the version, the data management side of the GeoServer would be much simpler for additional instances. This is a bit out of scope in this thread but let me know if there are other aspects of GeoServer that could use a helping hand.

@huard
Copy link
Contributor Author

huard commented Nov 23, 2020

There is no need for a quick fix on this.
Questions:

  • Does updating geoserver fix this ?
  • If not, where is the relevant configuration for this ?

@Zeitsperre
Copy link
Contributor

There is no need for a quick fix on this.

Good to know. If there's something else that is within Raven that needs help with, ping away. Shall I focus on Ouranosinc/pavics-sdi#183 then ?

@tlvu
Copy link
Contributor

tlvu commented Nov 24, 2020

Hitting directly geoserver http://pavics.ouranos.ca:8087/geoserver/wfs?request=GetCapabilities&version=1.1.0 (only works inside Ouranos firewall), I am getting proper

<ows:Operation name="GetCapabilities">
<ows:DCP>
<ows:HTTP>
<ows:Get xlink:href="http://pavics.ouranos.ca:8087/geoserver/wfs"/>
<ows:Post xlink:href="http://pavics.ouranos.ca:8087/geoserver/wfs"/>
</ows:HTTP>
</ows:DCP>

Not sure why behind Nginx and Twitcher/Magpie https://pavics.ouranos.ca/geoserver/wfs?request=GetCapabilities&version=1.1.0, we end up with the wrong http protocol and the wrong port

<ows:Operation name="GetCapabilities">
<ows:DCP>
<ows:HTTP>
<ows:Get xlink:href="http://pavics.ouranos.ca:80/geoserver/wfs"/>
<ows:Post xlink:href="http://pavics.ouranos.ca:80/geoserver/wfs"/>
</ows:HTTP>
</ows:DCP>

Will attempt to put Geoserver directly behind Nginx see if the error persist. If yes then it's Twitcher/Magpie and most likely relate to bird-house/birdhouse-deploy#14

@tlvu
Copy link
Contributor

tlvu commented Nov 24, 2020

Digging further, geoserver is actually not behind Twitcher/Magpie, only behind Nginx. It's the same Nginx for everything else and the usual headers for performing front proxy is there. Will dig into why geoserver is miss configured.

As for why geoserver data is not protected behind Twitcher/Magpie, it is a little weird since Magpie already have the config to protect it https://github.com/bird-house/birdhouse-deploy/blob/cc4ca20665e5eeb3d3ef51e1588521f4b42bef4d/birdhouse/config/magpie/providers.cfg.template#L74-L88 This predates me starting at Ouranos so should we enable this protection or it's not protected by Magpie for a particular reason? Someone remember?

@Zeitsperre
Copy link
Contributor

Does placing it behind Magpie complicate testing for CI ensembles/notebooks? Will users/services need authentication in order to perform WMS/WFS requests if GeoServer is behind it? I can't recall if I was ever privy to this discussion (it sounds like something my predecessor would have covered).

@f-PLT
Copy link

f-PLT commented Nov 24, 2020

There is no need for a quick fix on this.
Questions:

  • Does updating geoserver fix this ?
  • If not, where is the relevant configuration for this ?

For this issue, I don't think updating geoserver will do much. However, I did try Kartoza's latest geoserver image and the setup/configuration process has been streamlined. (env file and no need for JAVA_OPTS in the docker-compose)

If we need to add the plugin for NetCDF to Geoserver, it could be a good opportunity to update at the same time. It's not a priority right now though.

As for the configuration, I found this, which could be helpful : https://docs.geoserver.org/2.9.3/user/configuration/globalsettings.html#proxy-base-url
For version 2.9.3, I think it would have to be managed by the Tomcat properties, like the the max heap size https://github.com/kartoza/docker-geoserver/tree/2.9.4#setting-tomcat-properties

@f-PLT
Copy link

f-PLT commented Nov 24, 2020

As for why geoserver data is not protected behind Twitcher/Magpie, it is a little weird since Magpie already have the config to protect it https://github.com/bird-house/birdhouse-deploy/blob/cc4ca20665e5eeb3d3ef51e1588521f4b42bef4d/birdhouse/config/magpie/providers.cfg.template#L74-L88 This predates me starting at Ouranos so should we enable this protection or it's not protected by Magpie for a particular reason? Someone remember?

It's also before I started working for CRIM, I'll bring it up at our next standup.

@tlvu
Copy link
Contributor

tlvu commented Nov 24, 2020

Update on this.

As for the configuration, I found this, which could be helpful : https://docs.geoserver.org/2.9.3/user/configuration/globalsettings.html#proxy-base-url

This works but it's a manual UI change. I'd rather have everything automated. This would be the fall back if nothing else work.

For version 2.9.3, I think it would have to be managed by the Tomcat properties, like the the max heap size https://github.com/kartoza/docker-geoserver/tree/2.9.4#setting-tomcat-properties

This is not working. Not sure if I implemented it wrong or our geoserver do not support it (we have a custom docker build "inspired" from kartoza build).

My attempt: bird-house/birdhouse-deploy@fddbd1e (not found too many documentation about configuring tomcat properties, I might get this wrong).

Our custom geoserver docker build: https://github.com/bird-house/birdhouse-deploy/blob/cc4ca20665e5eeb3d3ef51e1588521f4b42bef4d/birdhouse/docker/geoserver/Dockerfile

I'll spend the rest of the day finding alternative. Else are we okay with the manual change? I can do the manual change right now on Boreas for you guys to continue testing.

@tlvu
Copy link
Contributor

tlvu commented Nov 24, 2020

Does placing it behind Magpie complicate testing for CI ensembles/notebooks?

Should not, Thredds is being Magpie and we have e2e notebook test using data from Thredds. Only 1 notebook test Thredds with authentication all the rest are accessing Thredds publicly.

Will users/services need authentication in order to perform WMS/WFS requests if GeoServer is behind it?

I guess yes, if some path is protected. Else, like the current Thredds, everything is currently public so no auth required. In the current state (not behind Magpie) everything is public as well.

@tlvu
Copy link
Contributor

tlvu commented Nov 24, 2020

I can do the manual change right now on Boreas for you guys to continue testing.

I just did the manual change.

https://pavics.ouranos.ca/geoserver/wfs?request=GetCapabilities&version=1.1.0 returns

<ows:Operation name="GetCapabilities">
<ows:DCP>
<ows:HTTP>
<ows:Get xlink:href="https://pavics.ouranos.ca/geoserver/wfs"/>
<ows:Post xlink:href="https://pavics.ouranos.ca/geoserver/wfs"/>
</ows:HTTP>
</ows:DCP>

This is what we want?

In the mean time, I'll keep looking for an automated way to set it.

@Zeitsperre
Copy link
Contributor

That looks right. I can try my hand at testing the POST capabilities using @f-PLT's scripts to get confirmation!

@tlvu
Copy link
Contributor

tlvu commented Nov 24, 2020

@Zeitsperre
Copy link
Contributor

Zeitsperre commented Nov 24, 2020

Seems to now be working perfectly for me as well:

url = self.urls["pavics"]
version = "1.1.0"
wfs = WebFeatureService(url, version)

result = wfs.getfeature(typename="public:CanVec_WaterBodies",
                            maxfeatures=5,
                            outputFormat="application/json",
                            method="post",
                            )

Gives me:

{"type":"FeatureCollection","totalFeatures":128796,"features":
[{"type":"Feature","id":"CanVec_WaterBodies.1","geometry":null,"properties":{"bbox":[-125.1331,64.7883,-117.4586,67.0415]}},
{"type":"Feature","id":"CanVec_WaterBodies.2","geometry":null,"properties":{"bbox":[-90.4187,46.4087,-84.6029,48.3055]}},
{"type":"Feature","id":"CanVec_WaterBodies.3","geometry":null,"properties":{"bbox":[-116.9824,60.8071,-108.9008,62.9534]}},
{"type":"Feature","id":"CanVec_WaterBodies.4","geometry":null,"properties":{"bbox":[-87.6073,41.7607,-84.7217,46.095]}},
{"type":"Feature","id":"CanVec_WaterBodies.5","geometry":null,"properties":{"bbox":[-89.58,46.4996,-84.355,49.0172]}}],
"crs":null}

Edit: Interesting that I don't receive the geometry though... Might be more to this...
Edit2: WFS @v2.0.0 gives me the geometry. The fix seems to be exactly what we need!

@huard
Copy link
Contributor Author

huard commented Nov 24, 2020

Excellent !

Trevor, we'll likely need this feature for the raven distributed model configuration we spoke about today, so this is great.

@f-PLT
Copy link

f-PLT commented Nov 24, 2020

Edit: Interesting that I don't receive the geometry though... Might be more to this...
Edit2: WFS @v2.0.0 gives me the geometry. The fix seems to be exactly what we need!

By adding propertyname=None in a wfs 1.1.0 request, you'll get it to work.

This is an unintended behavior, caused by a preset here : https://github.com/geopython/OWSLib/blob/master/owslib/feature/wfs110.py#L223

It looks voluntary, and works for a Get request, but for 2.0.0, it's defaulted to None : https://github.com/geopython/OWSLib/blob/master/owslib/feature/wfs200.py#L227

I'll see if I can fix it without breaking something else!

@f-PLT
Copy link

f-PLT commented Nov 24, 2020

Well... turns out it looks like there's a problem with propertyname even with Get. I'll investigate and create an issue on OWSLib (or might as well just address it myself since my hands are in the code). Working as intentend, just sum small quirks between 1.1.0 and 2.0.0

I'll fix my part of the code so the default propertyname="*" in 1.1.0 does not interfere with requests.

@tlvu
Copy link
Contributor

tlvu commented Nov 25, 2020

I got the automated way working but GetCapabilities returns https://pavics.ouranos.ca:443/geoserver/wfs (there's 443 in there).

Pretty sure it's still working, my previous https://github.com/bird-house/flyingpigeon/blob/7b7a5ce1dda2cde93db34667368ac224a75d58cc/tests/test_subset.py#L1-L10 still pass.

So just for your info if you notice something weird. It's live on Boreas right now for testing ... sorry I do not have another server with proper test data.

tlvu added a commit to bird-house/birdhouse-deploy that referenced this issue Nov 25, 2020
…properly-behind-proxy

Fix geoserver not configured properly behind proxy.

Hitting https://pavics.ouranos.ca/geoserver/wfs?request=GetCapabilities&version=1.1.0

Before fix (wrong scheme and wrong port):
```
<ows:Operation name="GetCapabilities">
<ows:DCP>
<ows:HTTP>
<ows:Get xlink:href="http://pavics.ouranos.ca:80/geoserver/wfs"/>
<ows:Post xlink:href="http://pavics.ouranos.ca:80/geoserver/wfs"/>
</ows:HTTP>
</ows:DCP>
```

After fix:
```
<ows:Operation name="GetCapabilities">
<ows:DCP>
<ows:HTTP>
<ows:Get xlink:href="https://pavics.ouranos.ca:443/geoserver/wfs"/>
<ows:Post xlink:href="https://pavics.ouranos.ca:443/geoserver/wfs"/>
</ows:HTTP>
</ows:DCP>
```

This config automate manual step to set proxy base url in Geoserver UI https://docs.geoserver.org/2.9.3/user/configuration/globalsettings.html#proxy-base-url

I had to override the docker image entrypoint to edit the `server.xml` on the fly before starting Geoserver (Tomcat) since setting Java proxy config did not seem to work (see first commit).

Related to Ouranosinc/raven#297.
@huard huard closed this as completed Nov 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Ouranos
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants