rsync: failed to connect to ftp.ncbi.nlm.nih.gov #114
Comments
Dear @narsapuramvijaykumar, since Kraken v1, rsync is used for retrieving the genome FASTA files. For me the problem was our company's proxy server. Try setting the environment variable "RSYNC_PROXY=http://<PROXY_IP>:/". However, this did not resolve the issue for me because rsync uses port 873 and our proxy did not allow connections on this port. I ended up downloading the genome FASTA files with wget, followed by (unchecked):
THREADS=4
#Download taxonomy still uses wget
kraken-build --download-taxonomy --db kraken_db
#Download sequences
#Download *_genomic.fna.gz from ftp://ftp.ncbi.nlm.nih.gov/genomes/ and
gunzip *_genomic.fna.gz
FASTAS=$(find . -name '*.fna')
#Add sequences
for FASTA in ${FASTAS}
do
kraken-build \
--add-to-library ${FASTA} \
--db kraken_db
done
#Build the database
kraken-build --build --threads ${THREADS} --db kraken_db
kraken-build --clean --threads ${THREADS} --db kraken_db
Centrifuge first checks whether rsync is available and, if not, falls back to wget, which is a much better approach.
Best |
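The rsync-then-wget fallback described above could be sketched roughly as follows. This is only an illustration, not code taken from Kraken or Centrifuge; the function name pick_downloader is hypothetical. It checks only whether the tools are installed, not whether port 873 is reachable.

```shell
# Hypothetical helper (not part of Kraken or Centrifuge): report which
# download tool is installed, preferring rsync and falling back to wget.
pick_downloader() {
    if command -v rsync >/dev/null 2>&1; then
        echo rsync
    elif command -v wget >/dev/null 2>&1; then
        echo wget
    else
        echo "neither rsync nor wget found" >&2
        return 1
    fi
}
```

A caller could then branch on the result, for example: case "$(pick_downloader)" in rsync) ... ;; wget) ... ;; esac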
I will try out as you have suggested. |
I am getting an rsync issue with the bacterial and viral libraries. Archaea, human, and plasmid all downloaded fine, but with bacteria and viral I get: rsync: failed to connect to ftp.ncbi.nlm.nih.gov: Network is unreachable (101). Obviously it's not a port issue, since the other libraries do work. I'm using Kraken version 1, since I'm working with metawrap. Any suggestions? |
@your-highness @DerrickWood I am getting a similar issue. When running: I get the error: Any suggestions? Thanks! |
Hi all, sorry for the late reply. @cmajones try wget with a single genome from the NCBI server first to see if that works? |
Hi @jenniferlu717 , wget works no problem from NCBI FTP server on our cluster. Still getting same error as above with latest Github version of kraken2 and using |
What happens when you download without the --use-ftp option? |
@jenniferlu717 see below:
|
@jenniferlu717 @DerrickWood any updates on this? Seems like several users are having a similar issue and it has not been solved. Thanks! |
@cmajones definitely working on this. will update soon. Sorry for the delay. |
@cmajones I made probably a really ugly fix that allows a kraken-build --use-wget option. Please try the newest kraken version and let me know if it breaks .....stuff. |
@jenniferlu717 Thanks for looking into it! I ran:
and got a resulting folder that contains:
When I tried to run kraken2 on it, I get error message:
|
After running the --download-library commands, you need to build the database using |
I am trying to build the standard database using kraken (I can't use kraken2 because of my downstream needs). I used this code: and got this error: This code resulted in a directory with this structure:
Am I doing something wrong with the |
@7670367 please open a new issue in the future. I have taken a look and seen that some of the sequences described in the assembly_summary file have "na" as their ftp path, which is causing an issue for the script. We will make a new version of the download script to fix this issue. |
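Until the download script is updated, one possible workaround is to drop the affected rows from the assembly summary before it is processed. The sketch below is a guess at how that could look, not the maintainers' fix. It assumes a tab-separated file with the FTP path in column 20 (check this against the header line of your copy), and the function name filter_na_paths is hypothetical.

```shell
# Hypothetical helper: print only the assembly_summary rows with a usable FTP path.
# Assumption: tab-separated input with the FTP path in column 20
# (pass a different column number as the second argument if needed).
filter_na_paths() {
    # Keep header/comment lines; drop rows whose path column is "na" or empty.
    awk -F '\t' -v c="${2:-20}" '/^#/ || ($c != "na" && $c != "")' "$1"
}
```

For example: filter_na_paths assembly_summary.txt > assembly_summary.filtered.txt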
My apologies. Thanks. |
Hi, I'm facing the same problem with kraken-build command. Do you have any idea when it will be fixed? |
You can modify the rsync_from_ncbi.pl file (miniconda3/envs/kraken2/libexec/rsync_from_ncbi.pl) to solve this problem: add the following at the location shown in the picture. if ( $full_path =~/^na/){ |
works perfectly! |
Unfortunately, none of the alleged fixes have worked for me - using kraken2 2.1.2, stuck on bacteria. |
Same here, kraken2 2.1.2 fails to build standard viral library:
I will have to custom build, but I must admit it is a shame kraken2-build doesn't take in compressed fasta files the same way kraken2 does :) Even 'just' RefSeq is cumbersome to uncompress... |
I have the same problem with kraken2 2.1.2 to build standard library:
|
I'm also facing the same problem with kraken2 2.1.2 when building the bacterial library: |
When I tried to run the custom build to download the bacterial genomes using the command below, I encountered the error below.
Command
kraken-build --download-library bacteria --db $DBNAME
ERROR
Step 1/3: performing rsync dry run...
Rsync dry run complete, removing any non-existent files from manifest.
Step 2/3: Performing rsync file transfer of requested files
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (130.14.250.12): Connection timed out (110)
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (2607:f220:41e:250::12): Network is unreachable (101)
rsync error: error in socket IO (code 10) at clientserver.c(122) [Receiver=3.0.9]
rsync_from_ncbi.pl: rsync error, exited with code 10
Need some help or suggestions for alternative download option if available.
Thanks in advance.
Regards,
Vijay N
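The log above shows rsync failing to connect over both IPv4 and IPv6, so a quick first step is to check whether the rsync port (873, as mentioned in the comments) can be reached at all from the machine. The following is a minimal sketch, not part of Kraken; the function name can_connect is hypothetical, and it assumes bash and the coreutils timeout command are installed.

```shell
# Hypothetical check (not part of Kraken): can this machine open a TCP
# connection to the given host and port? Requires bash and coreutils `timeout`.
# Arguments: host, port (default 873), timeout in seconds (default 5).
can_connect() {
    timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "${2:-873}" 2>/dev/null
}
```

For example: can_connect ftp.ncbi.nlm.nih.gov 873 && echo "port 873 reachable" || echo "port 873 blocked"
If the port is blocked, downloading the files with wget as described in the comments is the alternative.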