JSON API package queries returning empty response "randomly" #10387
Comments
@pradyunsg and @uranusjr, has |
@j-martin, any way you can provide us with the complete HTTP response, including headers? |
@di I have added some extra logging to my local poetry install, but I could not get the issue to happen again. Most likely because another coworker hit (and resolved after a few retries) the same issue before me. I'll keep the extra logging in, and hopefully, it repeats itself at some point. |
This is the first time I’ve ever heard of this happening. |
Well, pip does not use the JSON endpoint for PyPI (since that’s not standardised and doesn’t have any tamper-checking possibilities). There’s no way any pip user would notice this. |
here are some headers i grabbed while pdb'ing into poetry when this issue occurred, while trying out a patch to auto retry:
(Pdb) json_response.content
b''
(Pdb) pp dict(json_response.headers)
{'Accept-Ranges': 'bytes',
'Access-Control-Allow-Headers': 'Content-Type, If-Match, If-Modified-Since, '
'If-None-Match, If-Unmodified-Since',
'Access-Control-Allow-Methods': 'GET',
'Access-Control-Allow-Origin': '*',
'Access-Control-Expose-Headers': 'X-PyPI-Last-Serial',
'Access-Control-Max-Age': '86400',
'Cache-Control': 'max-age=900, public',
'Connection': 'keep-alive',
'Content-Encoding': 'gzip',
'Content-Length': '9341',
'Content-Security-Policy': "base-uri 'self'; block-all-mixed-content; "
"connect-src 'self' https://api.github.com/repos/ "
'*.fastly-insights.com sentry.io '
'https://api.pwnedpasswords.com '
'https://2p66nmmycsj3.statuspage.io; default-src '
"'none'; font-src 'self' fonts.gstatic.com; "
"form-action 'self'; frame-ancestors 'none'; "
"frame-src 'none'; img-src 'self' "
'https://warehouse-camo.ingress.cmh1.psfhosted.org/ '
'www.google-analytics.com *.fastly-insights.com; '
"script-src 'self' www.googletagmanager.com "
'www.google-analytics.com *.fastly-insights.com '
"https://cdn.ravenjs.com; style-src 'self' "
'fonts.googleapis.com; worker-src '
'*.fastly-insights.com',
'Content-Type': 'application/json',
'Date': 'Sun, 07 Nov 2021 15:11:35 GMT',
'ETag': '"JuWbHOCwMq+jjOgbucA39g"',
'Referrer-Policy': 'origin-when-cross-origin',
'Server': 'nginx/1.13.9',
'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload',
'Vary': 'Accept-Encoding',
'X-Cache': 'HIT',
'X-Cache-Hits': '1',
'X-Content-Type-Options': 'nosniff',
'X-Frame-Options': 'deny',
'X-Permitted-Cross-Domain-Policies': 'none',
'X-PyPI-Last-Serial': '8237314',
'X-Served-By': 'cache-wdc5550-WDC',
'X-Timer': 'S1636297896.913867,VS0,VE1',
'X-XSS-Protection': '1; mode=block'}
(Pdb) json_response.request
<PreparedRequest [GET]>
(Pdb) json_response.request.__dict__
{'method': 'GET', 'url': 'https://pypi.org/pypi/cached-property/json', 'headers': {'User-Agent': 'python-requests/2.26.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'If-None-Match': '"JuWbHOCwMq+jjOgbucA39g"'}, '_cookies': <RequestsCookieJar[]>, 'body': None, 'hooks': {'response': []}, '_body_position': None}
i can reproduce this with some regularity (ubuntu 20.04, python 3.9.4, poetry 1.1.11), happy to provide additional context if helpful or try workarounds. |
This indicates that our CDN is expecting the response to have a body, so either this is a bug with Fastly or something Poetry-specific. I'd suggest attempting to remove all caching in Poetry and see if that alleviates the issue. |
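The mismatch pointed out here, a Content-Length of 9341 next to an empty body, can be flagged on the client side. A minimal sketch follows; body_is_missing is a hypothetical helper, not part of requests or Poetry. It only compares emptiness, because with Content-Encoding: gzip the declared length refers to the compressed payload rather than the decoded bytes.

```python
from typing import Mapping


def body_is_missing(headers: Mapping[str, str], body: bytes) -> bool:
    """Return True when the headers announce a payload but the body is empty."""
    # Header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        declared = int(lowered.get("content-length", "0"))
    except ValueError:
        # A malformed Content-Length tells us nothing; do not flag it.
        return False
    return declared > 0 and len(body) == 0


# Header values taken from the pdb session earlier in the thread.
headers = {"Content-Encoding": "gzip", "Content-Length": "9341"}
print(body_is_missing(headers, b""))
```

A 304 Not Modified reply or a reply to a HEAD request legitimately has no body, so a real check would also need to look at the status code and request method.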
@di Looks like CacheControl is used here. I'll see if disabling it changes anything. |
fwiw, I confirmed that when disabling CacheControl / its headers, we don't see any response issues. note the original solution here, which was to just keep retrying till pypi/cdn returned content with a body, also worked. afaics looking at the poetry code, the use of cache control is tied to its ability to use a disk cache of packages, so it's non-trivial to remove without affecting user experience. [update] actually, looking at the cache disk structure, it's just the json response caching here; the package cache itself is managed separately. |
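The retry workaround mentioned here could look roughly like the following. This is a sketch, not the patch that was submitted to Poetry; fetch_json_with_retry, its parameters, and the fake fetcher are hypothetical names used for illustration only.

```python
import json
import time
from typing import Callable, Optional


def fetch_json_with_retry(
    fetch: Callable[[], bytes],
    attempts: int = 3,
    delay: float = 0.5,
) -> Optional[dict]:
    """Call fetch() until it returns a non-empty body, then parse it as JSON."""
    for attempt in range(attempts):
        body = fetch()
        if body:
            return json.loads(body)
        # Empty body (b''): wait briefly before asking again, unless this
        # was the last allowed attempt.
        if attempt < attempts - 1:
            time.sleep(delay)
    return None


# Fake fetcher standing in for the HTTP call: two empty bodies, then JSON.
responses = iter([b"", b"", b'{"info": {"name": "cached-property"}}'])
result = fetch_json_with_retry(lambda: next(responses), attempts=3, delay=0)
print(result)
```

With requests, the callable could be something like lambda: session.get(url).content. Returning None after the last attempt leaves the decision about how to fail to the caller.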
Thanks for adding more detail @kapilt. I'll close this issue as it is not caused by pypi. |
@j-martin let's keep this open, afaics the issue is actually in pypi infrastructure: adding standard http caching headers to a request should not result in randomly broken/empty responses. ie, poetry should be able to use http caching headers without getting empty responses back from pypi infrastructure. potentially it's an issue with CacheControl, but that doesn't seem likely given that simply retrying works, the widespread usage of the package (300k+ daily downloads, used by pip, etc), and no known issues there wrt this. disabling cache control in poetry is a workaround / hack afaics. |
I see. I thought the issue came from poetry and not from pypi, but I think you are right, it's more about how fastly responds to the request. |
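For context on what "standard http caching headers" implies: when a client sends If-None-Match with an ETag it already holds, a 304 Not Modified reply carries no body and the client reuses its cached copy, while a 200 reply is expected to carry the full body. Below is a small sketch of that client-side decision. The names resolve_body and UnusableResponseError are hypothetical and are not taken from CacheControl or Poetry, and the thread does not record which status code accompanied the empty body, so this only illustrates the expected semantics.

```python
from typing import Optional


class UnusableResponseError(Exception):
    """Raised when a response cannot be turned into usable bytes."""


def resolve_body(status: int, body: bytes, cached: Optional[bytes]) -> bytes:
    """Pick the bytes a caller should use after a conditional GET."""
    if status == 304:
        # Not Modified: the server sends no body, so fall back to the cache.
        if cached is None:
            raise UnusableResponseError("304 received but nothing is cached")
        return cached
    if status == 200 and body:
        # Fresh representation: use what the server sent.
        return body
    # Anything else, including 200 with b'', is treated as a failure here.
    raise UnusableResponseError(f"status {status} with {len(body)} body bytes")
```

Under this reading, an empty body is only acceptable alongside a 304; a 200 with b'' is the case a client would retry or report.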
I'm going to close this because we haven't gotten any reports about this other than the Poetry users in this issue, so I strongly suspect this is Poetry-specific, but feel free to reopen if we have evidence otherwise. |
fwiw, i haven't actually hit this in a while. i think the actual issue was solved on the pypi cdn side, perhaps directly on the fastly side. |
Describe the bug
When querying the JSON API, the body of the response is sometimes empty (as in b''). Querying the same endpoint/package afterwards returns a proper JSON response.
Looks like a caching issue on PyPI's side.
I have submitted this PR to poetry to work around the issue.
Maybe pure pip is better at handling these, as poetry queries those endpoints before invoking pip for the actual installation.
Expected behavior
The API should return a valid JSON response the first time it is queried.
To Reproduce
poetry add <a-package-with-a-lot-of-dependencies>
b''
My Platform
macOS Big Sur and Monterey, Intel, python 3.9.7
macOS Monterey, arm64, python 3.9.7
Additional context