
Request Body Size Limit? #1733

Closed
MTKnife opened this issue Mar 27, 2018 · 7 comments

Comments

@MTKnife

MTKnife commented Mar 27, 2018

I've got a Flask application doing natural language processing (NLP), and it accepts a request body consisting of a JSON array of tokens (words). Thus, a document 400 words long becomes an array of length 400, and so on.

When I pass in an array longer than about 8K characters (including the commas and such), I get no error whatsoever in the gunicorn log, but the request passed to the API is completely empty. If I pass in a smaller document, I have no problem. If I run the app with werkzeug.serving.run_simple rather than gunicorn, I have no problems with long documents.

I've had a look at #376 and #1659, and judging by the response to #1659, the request body shouldn't be limited at all. Nonetheless, I added the following to my config file:
limit_request_line = 0
limit_request_fields = 32768
limit_request_field_size = 0

The new config lines don't help, which suggests this is a different issue from #1704: I can't get my app to work even with the two settings that allow unlimited values set to 0, and at any rate I'm not getting the "Bad Request" error I've seen referenced elsewhere (I don't recall where).

Thanks for any help you can provide.

@MTKnife
Author

MTKnife commented Mar 28, 2018

OK, apparently the problem is Flask's inability to handle chunked data: https://medium.com/@DJetelina/python-and-chunked-transfer-encoding-11325245a532

As noted in #1264, Django has a similar problem.

Ugh.

EDIT: And more to the point, see #1653.
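For context (my own sketch, not from the linked article or #1653): a chunked body is sent as a series of length-prefixed chunks terminated by a zero-length chunk, which is why the server can't rely on Content-Length or know the total size up front:

```python
def chunk_encode(body: bytes, chunk_size: int = 8) -> bytes:
    """Encode a body using HTTP/1.1 chunked transfer encoding:
    each chunk is a hex length line, CRLF, the data, CRLF, and the
    stream ends with a zero-length chunk."""
    out = []
    for i in range(0, len(body), chunk_size):
        chunk = body[i:i + chunk_size]
        out.append(b"%x\r\n" % len(chunk))  # chunk size in hex
        out.append(chunk + b"\r\n")
    out.append(b"0\r\n\r\n")  # terminating zero-length chunk
    return b"".join(out)

print(chunk_encode(b"hello world"))
# b'8\r\nhello wo\r\n3\r\nrld\r\n0\r\n\r\n'
```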

@MTKnife
Author

MTKnife commented Mar 28, 2018

For anyone else experiencing this problem with a Flask app, adding this function to the app code (note the "before_request" decorator) fixes it:

from flask import request

@app.before_request
def handle_chunking():
    """
    Set the "wsgi.input_terminated" WSGI environment flag, thus enabling
    Werkzeug to pass chunked requests through as streams. The gunicorn
    server should set this flag itself, but that's not yet been implemented.
    """
    transfer_encoding = request.headers.get("Transfer-Encoding", "")
    if transfer_encoding.lower() == "chunked":
        request.environ["wsgi.input_terminated"] = True
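The same flag can also be set outside Flask with a tiny WSGI middleware (my own sketch of an equivalent approach, not part of the fix above; `inner_app` stands for whatever WSGI app you wrap):

```python
def input_terminated_middleware(inner_app):
    """Wrap a WSGI app so that chunked requests are marked as having a
    terminated input stream, equivalent to the before_request hook."""
    def wrapper(environ, start_response):
        # WSGI exposes the Transfer-Encoding header as HTTP_TRANSFER_ENCODING
        if environ.get("HTTP_TRANSFER_ENCODING", "").lower() == "chunked":
            environ["wsgi.input_terminated"] = True
        return inner_app(environ, start_response)
    return wrapper

# Usage with a Flask app:
#   app.wsgi_app = input_terminated_middleware(app.wsgi_app)
```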

@tuukkamustonen

I believe the content of this ticket is different from the title. Rename?

Also, chunked data support is handled in #1653. Close this one?

@tuukkamustonen

tuukkamustonen commented Apr 8, 2019

On the other hand, I would like to ask for clarification on body size.

I also asked about this in #1659, but I'm not sure I understand. In a nutshell, my conclusion there was that (using Flask):

  1. Chunked transfers get streamed to the application (by gunicorn -> werkzeug -> your code). It's not possible to know the stream size beforehand, so the application must read it up to a point and then kill the connection if it's too large. There's no built-in support for this; you have to check it manually.
  2. With non-chunked requests, the body gets buffered by gunicorn...? Or werkzeug? By the time you can check Content-Length in your code (or have werkzeug check it via MAX_CONTENT_LENGTH), the whole request has already been buffered, so you can only prevent your code from processing an overly large request, not prevent reading it. If someone sends a 1 GB payload, it helps little to check Content-Length after everything has already been downloaded.

If my second point is correct, then a limit_request_body option would make sense (so gunicorn could check Content-Length as soon as possible, before buffering the whole request)?

@tilgovi
Collaborator

tilgovi commented Apr 13, 2019

  1. Correct.
  2. Gunicorn does not buffer the whole input.

@tuukkamustonen

OK, so if gunicorn doesn't buffer it, werkzeug (or Flask) still might. I thought I could assume that werkzeug is smart enough to check MAX_CONTENT_LENGTH on the fly, before actually downloading the content, but after looking at the docs/code I'm not so sure anymore. Anyway, I might open a ticket on their tracker. Thanks for answering this repeat question again, @tilgovi.

@benoitc
Owner

benoitc commented Nov 22, 2019

We now support wsgi.input_terminated.
