Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --extra-inputs to containers-add #190

Merged
merged 28 commits into from
Mar 8, 2023

Conversation

nobodyinperson
Copy link
Contributor

@nobodyinperson nobodyinperson commented Jan 19, 2023

This PR adds the datalad containers-add --extra-inputs option to specify additional files that should be included in the run-record and thus auto-fetched upon datalad containers-run.

Furthermore the --call-fmt now also has a {img_dir} placeholder for easier specification of files next to the image.

I added a test for containers-add asserting that the correct values land in .datalad/config. There is currently no test for containers-run as I couldn't get this to work in CI. I tried a shell-script as 'container' and sourced files as 'overlays', works on my machine, weird errors in CI.

Full test script test/run.sh for reference, a comparable pytest is shipped with this PR:

#!/bin/sh
cd $(dirname "$(readlink -f "$0")")
(chmod -R +w ds;rm -rf ds) 2>/dev/null
set -x
datalad create ds
cd ds
mkdir containers
sudo apptainer build containers/alpine.sif docker://alpine
sudo apptainer overlay create --size 64 containers/overlay.img
sudo chown $USER:$USER containers/overlay.img
sudo apptainer overlay create --size 64 containers/overlay2.img
sudo chown $USER:$USER containers/overlay2.img
datalad $@ containers-add alpine -i containers/alpine.sif
datalad $@ containers-add alpine-with-overlay \
    --call-fmt 'apptainer exec --overlay {img_dir}/overlay.img {img} {cmd}' \
    --extra-inputs containers/overlay.img \
    -i containers/alpine.sif
datalad $@ containers-add alpine-with-two-overlays \
    --call-fmt 'apptainer exec --overlay {img_dir}/overlay.img --overlay {img_dir}/overlay2.img:ro {img} {cmd}' \
    --extra-inputs containers/overlay.img containers/overlay2.img \
    -i containers/alpine.sif
datalad save
echo 1 > file.txt
datalad save
cat .datalad/config
datalad $@ containers-run -n alpine -o alpine-uname-a.txt 'uname -a > {outputs[0]}'
cat alpine-uname-a.txt
datalad $@ containers-run -n alpine-with-overlay -o alpine-with-overlay-uname-a.txt 'uname -a > {outputs[0]}'
cat alpine-with-overlay-uname-a.txt
datalad $@ containers-run -n alpine-with-two-overlays -o alpine-with-two-overlays-uname-a.txt 'uname -a > {outputs[0]}'
cat alpine-with-two-overlays-uname-a.txt

closes #189

@codeclimate
Copy link

codeclimate bot commented Jan 19, 2023

Code Climate has analyzed commit d46b443 and detected 2 issues on this pull request.

Here's the issue category breakdown:

Category Count
Duplication 2

View more on Code Climate.

@bpoldrack
Copy link
Member

Thanks for approaching that, @nobodyinperson!

WRT to the CI failure: The new parameter needs to also become part of the actual signature:
https://github.com/datalad/datalad-container/pull/190/files#diff-74ae0e7052ad9775f337a738a2a3915e46a23f555f2acbf4c24c72f6ebbab4caR182

@nobodyinperson
Copy link
Contributor Author

Thanks for approaching that, @nobodyinperson!

WRT to the CI failure: The new parameter needs to also become part of the actual signature: https://github.com/datalad/datalad-container/pull/190/files#diff-74ae0e7052ad9775f337a738a2a3915e46a23f555f2acbf4c24c72f6ebbab4caR182

Yep, working on it!

@nobodyinperson nobodyinperson force-pushed the 189-specifying-extra-inputs branch from 002e5b3 to 92c0989 Compare January 19, 2023 11:28
@codecov
Copy link

codecov bot commented Jan 19, 2023

Codecov Report

Base: 93.92% // Head: 93.49% // Decreases project coverage by -0.43% ⚠️

Coverage data is based on head (d46b443) compared to base (66941a9).
Patch coverage: 85.71% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #190      +/-   ##
==========================================
- Coverage   93.92%   93.49%   -0.43%     
==========================================
  Files          19       19              
  Lines         889      938      +49     
==========================================
+ Hits          835      877      +42     
- Misses         54       61       +7     
Impacted Files Coverage Δ
datalad_container/containers_add.py 89.67% <66.66%> (-1.94%) ⬇️
datalad_container/containers_run.py 84.37% <70.00%> (-2.67%) ⬇️
datalad_container/tests/test_containers.py 100.00% <100.00%> (ø)
datalad_container/tests/test_run.py 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@nobodyinperson nobodyinperson marked this pull request as ready for review January 19, 2023 16:09
@nobodyinperson
Copy link
Contributor Author

I'm happy with this so far. I give up on implementing a test for containers-run as I wasn't able to get it working in CI. 🤷

@bpoldrack
Copy link
Member

Thank you very much, @nobodyinperson . I'm looking into the test issues to finish this.

Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

brief comments on how to harmonize with other interfaces etc

datalad_container/containers_add.py Outdated Show resolved Hide resolved
datalad_container/containers_add.py Outdated Show resolved Hide resolved
datalad_container/containers_add.py Outdated Show resolved Hide resolved
datalad_container/containers_run.py Outdated Show resolved Hide resolved
datalad_container/containers_run.py Outdated Show resolved Hide resolved
@nobodyinperson
Copy link
Contributor Author

@yarikoptic I implemented all of your suggestions. I also made img_dirpath and img_dspath available in --extra-input and added a check in containers-add for that (for --call-fmt invalid placeholers are silently ignored until containers-run, might want to add a similar check there as well).

I tried again to make a test for containers-run with extra-input but I couldn't get it work. Please one of you experts who are more familiar with the code base step in as I am hitting a dead end here and just burning time.

@yarikoptic
Copy link
Member

@nobodyinperson
Copy link
Contributor Author

just to make sure -- do you mean TODO at https://github.com/datalad/datalad-container/pull/190/files#diff-c9a6a0cde74f0c35ded765fc396d537f9485fe47645c4e00698ab951138042d6R218 ?

Yes, in general a test for containers-run. The above linked comments are not enough to test the new --extra-input feature.

@yarikoptic
Copy link
Member

@nobodyinperson I sent a PR for your fork with suggestive changes/fixup to the test: nobodyinperson#1

@nobodyinperson
Copy link
Contributor Author

Thanks @yarikoptic. How is the run-record accessed? The ´ds.repo apparently can't access the git log.

@yarikoptic
Copy link
Member

probably look at rerun for somewhat ultimate referral on how "accessed": https://github.com/datalad/datalad/blob/HEAD/datalad/local/rerun.py#L282

@nobodyinperson
Copy link
Contributor Author

Thanks @yarikoptic, I was now finally able to add a proper test for containers-run asserting the extra_inputs land in the run-record. ✅

Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me . Just nit picking to make a little more consistent with the rest of inconsistencies we have ;)

datalad_container/containers_run.py Outdated Show resolved Hide resolved
datalad_container/containers_run.py Outdated Show resolved Hide resolved
datalad_container/containers_run.py Outdated Show resolved Hide resolved
datalad_container/containers_run.py Outdated Show resolved Hide resolved
datalad_container/tests/test_run.py Show resolved Hide resolved
nobodyinperson and others added 12 commits February 17, 2023 10:48
It now fails on the 2nd test with

	E           datalad.runner.exception.CommandError: CommandError: 'datalad containers-run -n sub/mycontainer XXX' failed with exitcode 1 under /home/yoh/.tmp/datalad_temp_tree_test_extra_inputsratuqu5q [out: 'image=sub/containers/container.img cmd=XXX img_dspath=sub img_dirpath=sub/containers
	E           run(error): /home/yoh/.tmp/datalad_temp_tree_test_extra_inputsratuqu5q (dataset) [Input did not match existing file: overlay1.img]
	E           run(ok): /home/yoh/.tmp/datalad_temp_tree_test_extra_inputsratuqu5q (dataset) [echo image=sub/containers/container.img ...]
	E           action summary:
	E             get (notneeded: 4)
	E             run (error: 1, ok: 1)
	E             save (notneeded: 2)']

and not sure yet if legit since that overlay1.img extra_input is relative
path, and intentionally or not is not present in super dataset:

- we could have had it as a feature that somehow that extra_input is relative
to current invocation path, while instructing people to use {img_dspath}
whenever it needs to be relative from the img_ds.

  Then we would need to add one more overlay1.img in superdataet

- but above would be a bit too of an obscure feature I think, so not sure if
should be facilitated.
Co-authored-by: Yaroslav Halchenko <debian@onerussian.com>
Co-authored-by: Yaroslav Halchenko <debian@onerussian.com>
Co-authored-by: Yaroslav Halchenko <debian@onerussian.com>
@nobodyinperson nobodyinperson force-pushed the 189-specifying-extra-inputs branch from 70b4a1a to af1034b Compare February 17, 2023 09:53
@nobodyinperson
Copy link
Contributor Author

@yarikoptic I addressed all your remarks and rebased the branch onto master to get rid of the conflict around from datalad.interface.{utils,base} import eval_results.

Concerning the code style discussions I'd recommend using black as code formatter. Takes the discussion out of the formatting, it's one de-facto standard for Python code style.

@yarikoptic yarikoptic added minor Increment the minor version when merged CHANGELOG-missing labels Feb 18, 2023
@yarikoptic
Copy link
Member

@yarikoptic I addressed all your remarks and rebased the branch onto master to get rid of the conflict around from datalad.interface.{utils,base} import eval_results.

Concerning the code style discussions I'd recommend using black as code formatter. Takes the discussion out of the formatting, it's one de-facto standard for Python code style.

We do use black with pre-commit in dandi-cli and I quite like it. Given that there is no lineup of outstanding PRs we might as well add it here, may be even with pre-commit. Filed #196 to gather feedback ;)

@@ -0,0 +1,3 @@
### 🚀 Enhancements and New Features

- Add `--extra-inputs` to `containers-add`. Fixes [#189](https://github.com/datalad/datalad-container/issues/189) via [PR #190](https://github.com/datalad/datalad-container/pull/190) (by [@nobodyinperson](https://github.com/nobodyinperson))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feel welcome to expand here on how/when to use this feature, e.g. close to your use case.

xi_kwargs = dict(
img_dspath=image_dspath,
img_dirpath=op.dirname(image_path) or ".",
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

codeclimate btw warned about code duplication here. I think it is ok, although indeed duplicates with cmd_kwargs definition above so we could have some common_kwargs to be this structure and defined before any use and then only enhanced for cmd_kwargs. But really might be not worth the hassle this far, just wanted to leave a note that I am ok to suppress codeclimate for this.

…mand errors

On appveyor env we got

	Selecting previously unselected package singularity-ce.
	(Reading database ... 300710 files and directories currently installed.)
	Preparing to unpack /tmp/singularity-ce.deb ...
	Unpacking singularity-ce (3.11.0-focal) ...
	dpkg: dependency problems prevent configuration of singularity-ce:
	 singularity-ce depends on uidmap; however:
	  Package uidmap is not installed.
	dpkg: error processing package singularity-ce (--install):
	 dependency problems - leaving unconfigured
	Processing triggers for man-db (2.9.1-1) ...
	Errors were encountered while processing:
	 singularity-ce
	Reading package lists... Done
	Building dependency tree
	Reading state information... Done
	Correcting dependencies... Done
	The following additional packages will be installed:
	  uidmap
	The following NEW packages will be installed:
	  uidmap
	0 upgraded, 1 newly installed, 0 to remove and 155 not upgraded.
	1 not fully installed or removed.
	Need to get 26.4 kB of archives.
	After this operation, 172 kB of additional disk space will be used.
	Do you want to continue? [Y/n] Abort.
	Command exited with code 1
@yarikoptic
Copy link
Member

pushed potential fix for appveyor failure to configure singularity. If doesn't work out -- can troubleshoot in a separate PR

@yarikoptic
Copy link
Member

It did turn green -- let's proceed, thank you @nobodyinperson !

@yarikoptic yarikoptic merged commit afc5486 into datalad:master Mar 8, 2023
@github-actions
Copy link

PR released in 1.2.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
minor Increment the minor version when merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Specifying more extra_inputs like overlays
4 participants