Download: Get DSL2 singularity containers #832
Also add support for direct downloads of https container URLs
Codecov Report
@@ Coverage Diff @@
## dev #832 +/- ##
==========================================
- Coverage 77.72% 74.87% -2.85%
==========================================
Files 22 22
Lines 2496 2695 +199
==========================================
+ Hits 1940 2018 +78
- Misses 556 677 +121
Remove some code duplication, improve logic for cache dirs.
Now have two progress bars - one showing how far through the images we are and one for the specific image download. Customises the rich progress bar rendering to allow different output fields for the different types of task.
Tells tool to overwrite any existing files it finds.
* Make the different operations go in order (copy cache first, then downloads, then pulls)
* Refactored the confusing function that copied cached files and returned file paths
* Improved progress bars with better, changing, description text
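The ordering described above (copy from cache, then downloads, then pulls) could be sketched roughly like this. It is only an illustration, not the code in this PR: `plan_operations` and `image_filename` are hypothetical names, and the file-naming scheme is an assumption made for the example.

```python
from pathlib import Path


def image_filename(container):
    """Hypothetical naming scheme (an assumption): flatten the name into one file name."""
    name = container.split("://", 1)[-1]
    for char in "/:":
        name = name.replace(char, "-")
    return name if name.endswith((".img", ".sif")) else name + ".img"


def plan_operations(containers, cache_dir):
    """Sort containers into three groups, returned in the order they should run."""
    to_copy, to_download, to_pull = [], [], []
    for container in containers:
        cached = Path(cache_dir) / image_filename(container) if cache_dir else None
        if cached is not None and cached.exists():
            # Already in the cache: only a local copy is needed
            to_copy.append(container)
        elif container.startswith("http"):
            # A URL: can be fetched directly from Python
            to_download.append(container)
        else:
            # Anything else is handed to `singularity pull`
            to_pull.append(container)
    return [("copy", to_copy), ("download", to_download), ("pull", to_pull)]
```

Grouping first and running afterwards keeps the cheap local copies ahead of the slower network operations, whatever order the containers were found in.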
To prevent accidentally using partial broken downloads, use an additional .partial file extension whilst download is running and rename the file to remove this when complete.
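A minimal sketch of that `.partial` approach, assuming the download arrives as an iterable of byte chunks. `save_with_partial` is a hypothetical name, and deleting the partial file on failure is an assumption of this example rather than something stated here.

```python
import os


def save_with_partial(chunks, output_path):
    """Write chunks to '<output_path>.partial', then rename once complete."""
    partial_path = output_path + ".partial"
    try:
        with open(partial_path, "wb") as fh:
            for chunk in chunks:
                fh.write(chunk)
        # Only a finished download ever gets the final file name
        os.replace(partial_path, output_path)
    except BaseException:
        # Assumption: clean up the broken partial file (also on ctrl-c)
        if os.path.exists(partial_path):
            os.remove(partial_path)
        raise
```

Because the final name only appears after the rename, a later run that checks for `output_path` never mistakes an interrupted download for a usable image.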
Simultaneous downloads are awesome, but now |
self.kill_with_fire
Force-pushed from 31c7fdd to 7dd36f0.
Still thinking about complicated cases.
I don't think that this will change anything for Sarek.
OK, we'll figure it out then.
Force-pushed from 90ee064 to 6e943dc.
OK, I've done a load more testing and bug fixing and written some docs, so hopefully this should be about ready to go now. @drpatelh / @maxulysse would be great if you guys could give it another test! I rewrote how the logging works for the Singularity pull containers so now that's beautiful as well 🤩 (and you know that something is actually happening...) Phil
Singularity image pull demo: singularity_pull.mp4
Trying the latest version now.
Now, my only issue is to figure out how not to download every module... I don't really care if people download both the gatk and the gatk_spark modules, it won't take that much space. But the extra containers for annotation are something else. Sounds good to you?
Sounds great!
Tried it and works well
Just waiting on @KevinMenden to be happy now 👍🏻
I'm happy, just had this weird error yesterday 🤔 but that wasn't breaking anything. Just didn't show the log for some reason.
With DSL2, we have multiple containers and they are specified within module files. These are no longer found by `nextflow config` and therefore `nf-core download` doesn't see them (#818).

This PR adds several new bits of functionality to `nf-core download` for singularity containers:

Finding DSL2 containers

The tool now scrapes any `.nf` files in `modules/` recursively, and parses them to look for lines that look like this:

If multiple matches are found in a file, any that start with `http` are prioritised and the first match after that is used.

Direct image downloads

The code can now directly download images if we have a URL. If the container name starts with `http` then we download this directly in Python. If not, then we pass to a `singularity pull` command as before. Python downloads are made pretty with a nice progress bar etc.

Caching downloads

Finally, the code now checks for the `NXF_SINGULARITY_CACHEDIR` environment variable and saves singularity images there first if set, before copying to the archive. This speeds things up massively if running a few times and will be increasingly important with shared images across multiple DSL2 pipelines. If the env var is not set then the files are saved directly to the archive, but a log message with a tip about it is shown.

Still to do:
Note 1: This PR does not fix #513, but I suspect that this will be less of a problem as all pipelines transition to DSL2 so I think that we can probably ignore that issue now.
PR checklist
* `CHANGELOG.md` is updated
* `docs` is updated
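The selection rule described under "Finding DSL2 containers" (prefer matches starting with `http`, otherwise take the first match) might look something like the sketch below. The exact line pattern the tool searches for is not shown in this description, so `CONTAINER_RE` is purely an assumed, hypothetical pattern, and `pick_container` / `find_containers` are illustrative names rather than the PR's own functions.

```python
import re
from pathlib import Path

# ASSUMPTION: a hypothetical pattern for lines such as: container "some/image:tag"
CONTAINER_RE = re.compile(r"""^\s*container\s+["']([^"']+)["']""", re.MULTILINE)


def pick_container(nf_source):
    """Return one container for a module file, or None if nothing matches."""
    matches = CONTAINER_RE.findall(nf_source)
    if not matches:
        return None
    # sorted() is stable: http matches move to the front, file order is kept otherwise
    return sorted(matches, key=lambda name: not name.startswith("http"))[0]


def find_containers(modules_dir):
    """Scan every .nf file under modules_dir recursively and collect the chosen containers."""
    found = set()
    for nf_file in Path(modules_dir).rglob("*.nf"):
        chosen = pick_container(nf_file.read_text())
        if chosen is not None:
            found.add(chosen)
    return sorted(found)
```

Using a set means an image shared by several modules is only listed, and later fetched, once.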