Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validation requests have a maximum size ~20 MB #193

Closed
dehall opened this issue Nov 14, 2024 · 6 comments
Closed

Validation requests have a maximum size ~20 MB #193

dehall opened this issue Nov 14, 2024 · 6 comments

Comments

@dehall
Copy link
Collaborator

dehall commented Nov 14, 2024

We're seeing validation requests larger than ~20 MB fail with an HTTP 400 error. These kinds of requests are uncommon but they can happen with large Bundles. In particular on Inferno we saw this reported for our Service Base URL test kit where the publication Bundle can be very large: https://chat.fhir.org/#narrow/channel/179309-inferno/topic/Inferno.20Service.20Base.20URL.20Test.20Kit.20-.20Validator.20Error.3F

It appears that the issue is with the Jackson JSON parser, which by default has a limit of 20,000,000 characters - see FasterXML/jackson-core#1014
In that thread is a suggestion on how to configure the maximum length, which we can kind of get at via the koin config in src/jvmMain/kotlin/Module.kt: (though note the jsonFactory call there is deprecated)

@@ -1,3 +1,4 @@
+import com.fasterxml.jackson.core.StreamReadConstraints
 import com.fasterxml.jackson.databind.DeserializationFeature
 import com.fasterxml.jackson.databind.SerializationFeature
 import controller.debug.debugModule
@@ -93,6 +94,7 @@ fun Application.setup() {
              * model classes, and map them to classes across JVM/Common/JS.
              */
             configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
+            jsonFactory.setStreamReadConstraints(StreamReadConstraints.builder().maxStringLength(Int.MAX_VALUE).build())
         }
     }

An alternative would be to switch to Gson as the json handler, which theoretically would be as easy as the below change (also in Module.kt), since the requisite libraries are already defined in build.gradle. This change also enabled validation of the nearly 50 MB file noted in that zulip thread but it breaks the API unfortunately - this field in ValidationOutcome is serialized as "issues" by Jackson based on the annotation but as "messages" by Gson based on the field name.
https://github.com/hapifhir/org.hl7.fhir.core/blob/master/org.hl7.fhir.validation/src/main/java/org/hl7/fhir/validation/cli/model/ValidationOutcome.java#L14

@@ -85,14 +86,17 @@ fun Application.setup() {
     }

     install(ContentNegotiation) {
-        jackson {
-            enable(SerializationFeature.INDENT_OUTPUT)
+        gson {
+            // enable(SerializationFeature.INDENT_OUTPUT)
+            setPrettyPrinting()

             /*
              * Right now we need to ignore unknown fields because we take a very simplified version of many of the fhir
              * model classes, and map them to classes across JVM/Common/JS.
              */
-            configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
+            //configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
+            // this seems to already be the behavior in gson
+
         }
     }
@dotasek
Copy link
Collaborator

dotasek commented Nov 19, 2024

I have no objection to switching to gson if it gets us over this. I have no loyalty to Jackson.

There are annotations that you can use to override the default field names in gson. I can't recall what they are right now, but I did exactly that ages ago in another project...

@dotasek
Copy link
Collaborator

dotasek commented Nov 19, 2024

Probably this: https://howtodoinjava.com/gson/gson-serializedname/

I will give that a shot. Jackson and Gson can live in harmony with regard to annotations, if I remember correctly, so we can keep both in case we need to jump back for some reason.

@dotasek
Copy link
Collaborator

dotasek commented Nov 19, 2024

What's the limit on Gson with regard to size? If it's unlimited, that's asking for DOS attacks. Better we allow a configuration for something reasonable, both here, and on the raw request.

@dehall
Copy link
Collaborator Author

dehall commented Nov 19, 2024

I don't think Gson imposes any size limits of its own.
And I'm fine with making a configurable limit, I would just ask that there be an error message that explains what's going on. Right now the error is thrown from within this method call:


and then it just returns HTTP 400, no body or anything explaining what's wrong.

@dotasek
Copy link
Collaborator

dotasek commented Nov 21, 2024

@dehall I've got a working configuration for gson, which I think is the first step. It works for me, manually and via the smoke tests I've included here: https://github.com/hapifhir/org.hl7.fhir.validator-wrapper/tree/master/http-client-tests

Could you take a look at it, and ensure that it works for you as well?

You'll have to use the following PRs for core and for the validator-wrapper:

hapifhir/org.hl7.fhir.core#1824
#195

@dehall
Copy link
Collaborator Author

dehall commented Dec 4, 2024

Closing this as completed by #195

@dehall dehall closed this as completed Dec 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants