Error with classification() #783
Comments
thanks for the report @jordancasey !
that wouldn't work because the underlying http package is https://github.com/ropensci/crul - you can achieve the same thing e.g., like
it may have been fixed in the latest version of curl, but for now the only thing we can try is opting out of http/2, and that's what the curl maintainer suggested too, try
with
> GET /entrez/eutils/esearch.fcgi?db=taxonomy&term=Teleostei HTTP/1.1 <== HERE
Host: eutils.ncbi.nlm.nih.gov
And when you don't force to http/1, you would see:
> GET /entrez/eutils/esearch.fcgi?db=taxonomy&term=Teleostei HTTP/2 <== HERE
Host: eutils.ncbi.nlm.nih.gov
I was able to replicate that framing layer error on my macos when using http/2 at least once, so getting the same thing sometimes as you |
Hi @sckott - thanks for your reply! I've tried to force http/1, which works sometimes, but other times it still reverts to http/2 and fails. I'm trying to use taxize to automate filling in the taxonomy of a list of 16,000 taxa, which is why it's problematic when it fails sometimes.
When I try to use http_version = 0L, the failed output is:
> classification("Teleostei", db = "ncbi", http_version = 0L, verbose = TRUE)
══ 1 queries ═══════════════
Retrieving data for taxon 'Teleostei'
* Found bundle for host eutils.ncbi.nlm.nih.gov: 0x101dca350 [can multiplex]
* Re-using existing connection! (#186) with host eutils.ncbi.nlm.nih.gov
* Connected to eutils.ncbi.nlm.nih.gov (2607:f220:41e:4290::110) port 443 (#186)
* Using Stream ID: 5 (easy handle 0x10763f800)
> GET /entrez/eutils/esearch.fcgi?db=taxonomy&term=Teleostei&api_key=secret HTTP/2
Host: eutils.ncbi.nlm.nih.gov
Accept-Encoding: gzip, deflate
Accept: application/json, text/xml, application/xml, */*
User-Agent: r-curl/4.2 crul/0.8.4 rOpenSci(taxize/0.9.9)
X-USER-AGENT: r-curl/4.2 crul/0.8.4 rOpenSci(taxize/0.9.9)
< HTTP/2 200
< date: Fri, 08 Nov 2019 10:10:34 GMT
< server: Finatra
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< content-security-policy: upgrade-insecure-requests
< x-ratelimit-remaining: 9
< ncbi-phid: 322C591853035CF50000299F6251D536.1.1.m_1
< cache-control: private
< ncbi-sid: 2430B08F031608C6_9C20SID
< content-encoding: gzip
< x-ratelimit-limit: 10
< access-control-allow-origin: *
< content-type: text/xml; charset=UTF-8
* Added cookie ncbi_sid="2430B08F031608C6_9C20SID" for domain nih.gov, path /, expire 1604830235
< set-cookie: ncbi_sid=2430B08F031608C6_9C20SID; domain=.nih.gov; path=/; expires=Sun, 08 Nov 2020 10:10:35 GMT
< x-ua-compatible: IE=Edge
< x-xss-protection: 1; mode=block
<
* Connection #186 to host eutils.ncbi.nlm.nih.gov left intact
✔ Found: Teleostei
══ Results ═════════════════
● Total: 1
● Found: 1
● Not Found: 0
Error in curl::curl_fetch_memory(x$url$url, handle = x$url$handle) :
Error in the HTTP2 framing layer
When I try to use http_version = 1.1, the failed output is:
> classification("Teleostei", db = "ncbi", http_version = 1.1, verbose = TRUE)
══ 1 queries ═══════════════
Retrieving data for taxon 'Teleostei'
* Found bundle for host eutils.ncbi.nlm.nih.gov: 0x101ddd190 [can multiplex]
* Re-using existing connection! (#180) with host eutils.ncbi.nlm.nih.gov
* Connected to eutils.ncbi.nlm.nih.gov (130.14.29.110) port 443 (#180)
* Using Stream ID: 3 (easy handle 0x114a5ea00)
> GET /entrez/eutils/esearch.fcgi?db=taxonomy&term=Teleostei&api_key=secret HTTP/2
Host: eutils.ncbi.nlm.nih.gov
Accept-Encoding: gzip, deflate
Accept: application/json, text/xml, application/xml, */*
User-Agent: r-curl/4.2 crul/0.8.4 rOpenSci(taxize/0.9.9)
X-USER-AGENT: r-curl/4.2 crul/0.8.4 rOpenSci(taxize/0.9.9)
< HTTP/2 200
< date: Fri, 08 Nov 2019 10:03:35 GMT
< server: Finatra
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< content-security-policy: upgrade-insecure-requests
< x-ratelimit-remaining: 9
< ncbi-phid: D0BD50AE07953D85000051BCA3232334.1.1.m_1
< cache-control: private
< ncbi-sid: 3A1D96765A95B8F0_4A9BSID
< content-encoding: gzip
< x-ratelimit-limit: 10
< access-control-allow-origin: *
< content-type: text/xml; charset=UTF-8
* Added cookie ncbi_sid="3A1D96765A95B8F0_4A9BSID" for domain nih.gov, path /, expire 1604829816
< set-cookie: ncbi_sid=3A1D96765A95B8F0_4A9BSID; domain=.nih.gov; path=/; expires=Sun, 08 Nov 2020 10:03:36 GMT
< x-ua-compatible: IE=Edge
< x-xss-protection: 1; mode=block
<
* Connection #180 to host eutils.ncbi.nlm.nih.gov left intact
✔ Found: Teleostei
══ Results ═════════════════
● Total: 1
● Found: 1
● Not Found: 0
Error in curl::curl_fetch_memory(x$url$url, handle = x$url$handle) :
Error in the HTTP2 framing layer
Even when it works with http_version = 0L, the http version is reported as HTTP/2.0.
> classification("Teleostei", db = "ncbi", http_version = 0L, verbose = TRUE)
> GET /entrez/eutils/esearch.fcgi?db=taxonomy&term=Teleostei&api_key=secret HTTP/2
When it works with http_version = 1.1, the http version is reported as HTTP/1.0.
> classification("Teleostei", db = "ncbi", http_version = 1.1, verbose = TRUE)
> GET /entrez/eutils/esearch.fcgi?db=taxonomy&term=Teleostei&api_key=secret HTTP/1.0
Host: eutils.ncbi.nlm.nih.gov
I'm not sure whether that's relevant? Also, I'm working on a different system today (although I get the same errors on the system I used yesterday):
Session Info
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] taxize_0.9.9 purrr_0.3.2 dplyr_0.8.3
loaded via a namespace (and not attached):
[1] Rcpp_1.0.2 pillar_1.4.2 compiler_3.6.1 plyr_1.8.4 iterators_1.0.12
[6] tools_3.6.1 jsonlite_1.6 tibble_2.1.3 nlme_3.1-141 lattice_0.20-38
[11] pkgconfig_2.0.3 rlang_0.4.1 foreach_1.4.7 cli_1.1.0 rstudioapi_0.10
[16] crul_0.8.4 curl_4.2 yaml_2.2.0 parallel_3.6.1 httr_1.4.1
[21] stringr_1.4.0 xml2_1.2.2 triebeard_0.3.0 grid_3.6.1 tidyselect_0.2.5
[26] reshape_0.8.8 glue_1.3.1 httpcode_0.2.0 data.table_1.12.6 R6_2.4.0
[31] reshape2_1.4.3 magrittr_1.5 urltools_1.7.3 codetools_0.2-16 assertthat_0.2.1
[36] bold_0.9.0 ape_5.3 stringi_1.4.3 crayon_1.3.4 zoo_1.8-6
Thanks for any advice! |
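For context on the values tried above: curl's http_version option takes libcurl's integer CURL_HTTP_VERSION_* codes rather than the version number itself. The sketch below lists those codes (as defined in libcurl's curl.h) and shows why 1.1 can end up as HTTP/1.0, assuming the value is truncated to an integer before it reaches libcurl. `http_versions` and `describe_http_version` are illustrative names, not part of curl, crul, or taxize.

```r
# libcurl's CURL_HTTP_VERSION_* codes (values from libcurl's curl.h).
http_versions <- c(
  none      = 0L,  # let libcurl pick the version
  http_1_0  = 1L,  # force HTTP/1.0
  http_1_1  = 2L,  # force HTTP/1.1
  http_2_0  = 3L,  # attempt HTTP/2
  http_2tls = 4L   # attempt HTTP/2 over TLS only
)

# Hypothetical helper: report which code a user-supplied value corresponds to,
# ASSUMING the value is truncated to an integer (as as.integer() does).
describe_http_version <- function(x) {
  code <- as.integer(x)
  hit <- names(http_versions)[http_versions == code]
  if (length(hit) == 0) NA_character_ else hit
}

describe_http_version(0L)   # "none": libcurl chooses, which may still be HTTP/2
describe_http_version(1.1)  # "http_1_0": 1.1 truncates to 1
describe_http_version(2L)   # "http_1_1"
```

Under that assumption, http_version = 2L is the value that requests HTTP/1.1, while 0L leaves the choice to libcurl.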
(hope you don't mind, i edited your reply to make it easier to see the code chunks; and I replaced your api key with
i'm guessing 1.1 isn't a valid value to pass to |
On a related note, with recent versions of taxize, the |
@jordancasey Try |
Hi @sckott - thanks for editing my code & api key (oops!) When I use http_version = 2L, it works sometimes, but like before, sometimes it reverts to http/2 and fails. This is the first time that it successfully forces http/1.1 sometimes, so at least there's progress in the right direction. Output when it works:
Output when it fails:
However, I did manage to run my script on a colleague's computer, without even having to specify http version. It always used http/1.1 automatically. Here's her Session Info:
|
thanks @jordancasey - glad there's progress. I'm considering hard-coding forcing to http 1.1 for NCBI requests throughout the package. I'll ping you soon |
it's hard coded now to always do http 1.1 requests for all ncbi requests across the pkg, let me know if you still have problems |
Hi Scott, Thanks for continuing to work on this. I've updated to v0.9.91:
Unfortunately, it sometimes still reverts to http/2 (I've also run this without specifying http_version with the same results). Here's a successful run followed by a failed run:
|
hmm, the user agent string shows that you are still using taxize |
indeed, R just needed a proper restart. It's working perfectly now! Thanks, Scott! |
great, glad it works. to be clear, |
When I run the classification() function, I get an error, approximately 20% of the time:
Error in curl::curl_fetch_memory(x$url$url, handle = x$url$handle) :
Error in the HTTP2 framing layer
I tried to fix this using:
httr::set_config(httr::config(http_version = 0))
I also tried to specify NCBI:
classif <- classification(t, db="ncbi")
None of those fixes work. Any thoughts on how to fix this error?
Here's a reproducible example (again, the error message only pops up ~20% of the time, and it is unrelated to the queried taxa):
Here's my session info:
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
[5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8 LC_PAPER=fr_FR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] httr_1.4.1 curl_4.2 stringr_1.4.0 rentrez_1.2.2 taxize_0.9.9 purrr_0.3.3 dplyr_0.8.3
loaded via a namespace (and not attached):
[1] Rcpp_1.0.2 pillar_1.4.2 compiler_3.6.1 plyr_1.8.4 iterators_1.0.12 tools_3.6.1
[7] jsonlite_1.6 tibble_2.1.3 nlme_3.1-141 lattice_0.20-38 pkgconfig_2.0.3 rlang_0.4.1
[13] foreach_1.4.7 cli_1.1.0 crul_0.9.0 parallel_3.6.1 xml2_1.2.2 triebeard_0.3.0
[19] grid_3.6.1 tidyselect_0.2.5 reshape_0.8.8 glue_1.3.1 httpcode_0.2.0 data.table_1.12.6
[25] R6_2.4.0 XML_3.98-1.20 reshape2_1.4.3 magrittr_1.5 urltools_1.7.3 codetools_0.2-16
[31] assertthat_0.2.1 bold_0.9.0 ape_5.3 stringi_1.4.3 crayon_1.3.4 zoo_1.8-6
Thanks!
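Since the error is intermittent, a generic stopgap for long batch runs is to retry the failing call. The following is a minimal base-R sketch under that assumption, not something proposed by the maintainers here; `with_retry` and its `times` and `wait` arguments are hypothetical names and are not part of taxize.

```r
# Hypothetical helper: call f(...) up to `times` times, pausing `wait` seconds
# between attempts, and re-raise the last error if every attempt fails.
with_retry <- function(f, ..., times = 3, wait = 1) {
  last_error <- NULL
  for (attempt in seq_len(times)) {
    result <- tryCatch(
      list(value = f(...)),   # wrap so a successful value is never mistaken for an error
      error = function(e) e
    )
    if (!inherits(result, "error")) {
      return(result$value)
    }
    last_error <- result
    if (attempt < times) Sys.sleep(wait)
  }
  stop(last_error)
}

# Intended usage (not run here; needs taxize and network access):
# classif <- with_retry(classification, "Teleostei", db = "ncbi")
```

A retry only hides the symptom; it does not change which HTTP version is negotiated.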