Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not send Content-length: 0 in GET/HEAD/TRACE/OPTIONS requests (#2167) #2191

Merged
merged 1 commit into from
Oct 16, 2017
Merged

Conversation

socketpair
Copy link
Contributor

@socketpair socketpair commented Aug 12, 2017

What do these changes do?

Are there changes in behavior for the user?

Related issue number

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the changes folder
    • name it <issue_id>.<type> for example (588.bug)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

req = ClientRequest(
'post', URL('http://python.org/'),
data={}, loop=loop)
req.update_body_from_data.assert_called_once_with({})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks to Pycharm.

meth, URL('http://python.org/'), data={'life': '42'},
loop=loop)
assert '/' == req.url.path
assert b'life=42' == req.body._value
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely wrong test.

Copy link
Member

@kxepal kxepal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't introduce new behaviour while fixing some stuff. And especially, don't do that within the same commit. This makes commit history review much harder.

@@ -82,6 +88,10 @@ def __init__(self, method, url, *,
self.url = url.with_fragment(None)
self.original_url = url
self.method = method.upper()

if self.method in self.GET_METHODS and any((data, compress, chunked)):
raise ValueError('Method %s should not contain request-data.' % self.method)
Copy link
Member

@kxepal kxepal Aug 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will eventually break requests to elastic search. Technically, you can send GET with payload and that's correct from the point of spec.

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I didn't know that ElasticSearch predates RFC, where it's said that HTTP-server SHOULD ignore payload of GET-requests. I'm very upset about everything that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeap, that was a big surprise for me as well, but that's how things are. You can google the subject, iirc, on some w3c mailing list that case was discussed and considered as fine. Also, SHOULD is a strong recommendation, but it's not a mandatory behaviour as what MUST does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the checks and fix tests accordingly.

hdrs.METH_GET,
hdrs.METH_HEAD,
hdrs.METH_OPTIONS,
hdrs.METH_TRACE,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should DELETE method be here are well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing is said regarding DELETE in RFC. And also:

  1. a message-body is explicitly forbidden in responses to HEAD requests (section 9, and 9.4 specifically)
  2. an entity-body is explicitly forbidden in TRACE requests only, all other request types are unrestricted (section 9, and 9.8 specifically)

Should I add raise ValueError in constructor in these requests ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I add raise ValueError in constructor in these requests ?

Let's open another issue for that. If we can help users to not produce invalid requests that would be good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK: #2192

@@ -103,7 +106,8 @@ def __init__(self, method, url, *,
self.update_proxy(proxy, proxy_auth, proxy_from_env)

self.update_body_from_data(data)
self.update_transfer_encoding()
if data or self.method not in self.GET_METHODS:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make a sense to call update_transfer_encoding for POST/PUT with no body?

Copy link
Contributor Author

@socketpair socketpair Aug 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. It is required to send Content-Length: 0 in these requests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, ok. Asking because you can send POST without content-length set and that's fine as well. curl does so.

kxepal
kxepal previously approved these changes Aug 12, 2017
Copy link
Member

@kxepal kxepal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGFM

@socketpair socketpair changed the title Do not send Content-length: 0 in GET/HEAD/TRACE requests (#2167) Do not send Content-length: 0 in GET/HEAD/TRACE/OPTIONS requests (#2167) Aug 12, 2017
@codecov-io
Copy link

codecov-io commented Aug 12, 2017

Codecov Report

Merging #2191 into master will decrease coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #2191      +/-   ##
=========================================
- Coverage   97.24%   97.2%   -0.05%     
=========================================
  Files          39      39              
  Lines        8208    8183      -25     
  Branches     1436    1434       -2     
=========================================
- Hits         7982    7954      -28     
- Misses         98     101       +3     
  Partials      128     128
Impacted Files Coverage Δ
aiohttp/client_reqrep.py 97.36% <100%> (ø) ⬆️
tests/autobahn/client.py 94.72% <0%> (-0.74%) ⬇️
aiohttp/web_fileresponse.py 90.54% <0%> (-0.31%) ⬇️
aiohttp/web_middlewares.py 97.22% <0%> (-0.15%) ⬇️
aiohttp/http_parser.py 97.77% <0%> (-0.03%) ⬇️
aiohttp/web.py 99.65% <0%> (-0.02%) ⬇️
aiohttp/helpers.py 97.26% <0%> (-0.01%) ⬇️
aiohttp/web_request.py 99.68% <0%> (-0.01%) ⬇️
aiohttp/abc.py 100% <0%> (ø) ⬆️
aiohttp/web_urldispatcher.py 99.34% <0%> (ø) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 411af98...5d4ea3e. Read the comment docs.

@asvetlov
Copy link
Member

asvetlov commented Aug 12, 2017

Should we don't send zero Content-Length by default or add and option for skipping the header autogeneration explicitly?
I feel the second option is more correct, proper web server should respect either Content-Length or use chunked encoding, otherwise HTTP 1.1 protocol is broken, isn't it?

@fafhrd91
Copy link
Member

I think we should set and allow to skip.

@asvetlov
Copy link
Member

@fafhrd91 please make your last message clean: do you mean we should set Content-Length: 0 by default and skip it if skip_auto_headers in ClientSession contains Content-Length?

@fafhrd91
Copy link
Member

@asvetlov your interpretation is right. I think our current behavior is right we just need a way to override defaults

@socketpair
Copy link
Contributor Author

No! totally disagree with you.

  1. GET requests SHOULD NOT contain body. HEAD/TRACE MUST NOT contain body.
  2. Body presence is detected either by having content-length or transfer-ecoding.
  3. Yes, server should respect any of these headers, but presence of body in GET very often considered as a bug, since RFC say us SHOULD NOT. For example, libhtp (yes with one t) used in Suricata IDS/IPS AFAIK consider such requests as wrong. AFAIK
  4. All HTTP-clients I know do not set these headers in GET-requests. Why we should ?

@fafhrd91
Copy link
Member

I agree in general, but there is reality. Dropping content-length would be huge semantic change, can we reliably test it? content-length exists from the begging of the client code, I'd play it safe.

@socketpair
Copy link
Contributor Author

socketpair commented Aug 14, 2017

First of all, this change will actually affect only GET-requests. People should never rely on the content-length in such requests (if data was not passed). So, I consider we are not breaking anything real. But instead, we are fixing ancient bug.

@fafhrd91
Copy link
Member

Ok. Your call.

@asvetlov @kxepal ?

@kxepal
Copy link
Member

kxepal commented Aug 14, 2017

I'm with @socketpair, he has a point. Content-Length signs a presense of message body. No message body - no need to send a header. This, I think, has a good analogue: ({}).get('key') vs ({'key': None}).get('key') - both produces the same result, but in first case there is no key while in second there is, but it has None value. Missing value != empty one.

Here is a story about Content-Length: 0 with GET request: http://redmine.lighttpd.net/issues/1717
It turns into two edge case: servers have to understand Content-Length: 0 as no content at all, while clients shouldn't send such HEAD/GET requests.

@asvetlov
Copy link
Member

At least ElasticSearch uses GET with BODY for multi-get requests.
I think it is gray zone but we should support it somehow (we do call elastic by aiohttp at work for example).

@kxepal
Copy link
Member

kxepal commented Aug 15, 2017

Can we ask aioes maintainers to run their tests with this pathc and no Content-Length: 0? I'm pretty sure everything will be fine.

@asvetlov
Copy link
Member

aioes is not maintained anymore but there are async transports for official elasticsearch-py.

Do you suggest not send Content-Length if body is not present but keep it only if the body is sent?

@kxepal
Copy link
Member

kxepal commented Aug 15, 2017

Yes, I do. There indeed could be semantical difference since no body means resp.content is None while zero sized resp.content == b'', but I don't think there is a practical difference between. When you send GET/POST requests via curl with no payload, it doesn't sends Content-Length header as well:

$ curl -v -XPOST http://localhost:5984
* Rebuilt URL to: http://localhost:5984/
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 5984 (#0)
> POST / HTTP/1.1
> Host: localhost:5984
> User-Agent: curl/7.50.1
> Accept: */*

vs

$ curl -v -XPOST http://localhost:5984 --data ''
Note: Unnecessary use of -X or --request, POST is already inferred.
* Rebuilt URL to: http://localhost:5984/
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 5984 (#0)
> POST / HTTP/1.1
> Host: localhost:5984
> User-Agent: curl/7.50.1
> Accept: */*
> Content-Length: 0
> Content-Type: application/x-www-form-urlencoded

@socketpair
Copy link
Contributor Author

socketpair commented Aug 15, 2017

So, current fix is right :) and CURL has a bug (regarding POST without an entity).

@asvetlov
Copy link
Member

Well, looks good but I like to have explicit test for GET with BODY -- we should support this mode.

@socketpair
Copy link
Contributor Author

socketpair commented Aug 15, 2017

@asvetlov there is one. An original test:

https://github.com/socketpair/aiohttp/blob/7601e892343bdbc2b9b12c0cdc0d8fe729a461ae/tests/test_client_request.py#L654

Even more, I added the comment # Elasticsearch API requires to send request body with GET-requests

@asvetlov
Copy link
Member

This test doesn't perform actual call.
I believe we need a test in test_client_functional.py with explicit check that GET BODY was passed through the wire correctly.

@socketpair
Copy link
Contributor Author

Myabe just add assert int(req.headers['CONTENT_LENGTH']) > 0 ?

@asvetlov
Copy link
Member

Please no, only with functional test I could sleep safe and quiet :)

@socketpair
Copy link
Contributor Author

ohhh. I forgot about this PR. Will fix tests in nearest time.

@asvetlov
Copy link
Member

Ping

kxepal
kxepal previously approved these changes Oct 11, 2017
Copy link
Member

@kxepal kxepal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGFM

@kxepal
Copy link
Member

kxepal commented Oct 11, 2017

@socketpair @asvetlov
Anything else blocks to ship it?

@socketpair
Copy link
Contributor Author

Only specific tests are missing (@asvetlov asks me to write them). I'm in Dubai now (Gitex), so I can't work on it. Beside this, everything is ready and coulud be merged.

@socketpair
Copy link
Contributor Author

I just have rebased, no code change.

@asvetlov asvetlov merged commit dad3652 into aio-libs:master Oct 16, 2017
@asvetlov
Copy link
Member

Ok

@socketpair socketpair deleted the ctlen branch October 16, 2017 17:16
@lock
Copy link

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs.
If you feel like there's important points made in this discussion, please include those exceprts into that [new issue].
[new issue]: https://github.com/aio-libs/aiohttp/issues/new

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Oct 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bot:chronographer:provided There is a change note present in this PR outdated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants