-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DataLad as DVC for ML analysis #569
Conversation
This adds a separate clean rule for non-narrative sections. At the moment, it only contains DVC. This section does not have any relation to the DataLad-101 narrative, and hence doesn't need to be redone with a complete rebuild of the book. Other such unrelated sections can be added to this rule (which could also be renamed), but so far, this section is the only one in the book (apart from usecases) that does not depend on the narrative. Rebuilding DVC is also tedious, because this section also pushes back to GitHub-based repos
Also, ideas on where to put such a comparison would be appreciated :) currently I just put it into the "clutter" chapter in "Beyond basics" |
Preview link to the section: https://datalad-handbook--569.org.readthedocs.build/en/569/beyond_basics/101-168-dvc.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some review on initial portion... yet to digest in full -- but looks great overall.
Took about a week longer than I wanted it to take, but this section is ready now ;-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor recommended suggestions in the opening... yet to read more. So far so good ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reached the end
Thanks much @yarikoptic. I'll add the remainder of your comments locally, and then I'll squash the past 20 commits. |
DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Fix version specification Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: tweak language Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Publish -> push Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: Typo Co-authored-by: Yaroslav Halchenko <debian@onerussian.com> DVC: link more footnotes
@yarikoptic, it doesn't seem as if DVC people are giving feedback on this. Do you think this is ready to be merged? My general plan would be to use some of the content in this tutorial also for a stand-alone, DataLad-centric usecase (for the "DataLad for machine-learning" workshop), but if this is ready from your POV, I would be happy to reduce the number of open PRs by one. :) |
yes, let's merge!!! Thank you for this section - I think it is great and will be a valuable addition to the handbook. |
This is a section that takes a quite elaborate DVC tutorial and recreates it with DataLad.
I've split the original tutorial into three "parts" (Versioning data, Sharing data, Analysis), and in each part, I execute the DVC workflow and a corresponding DataLad workflow.
Instead of adjusting the workflow into a YODA-principle or similar DataLad-centric workflow, I'm recreating the workflow as closely as I can with DataLad as for a tool comparison in the "domain" of DVC.
Both DVC and DataLad pipelines are executed and included with output. Apart from a slightly different order and skipping the tutorial section in which only scripts are executed without pipelining ("Version Datasets and Models"), I've recreated the tutorial very closely, and didn't make any code changes.
I would appreciate feedback on the following:
metrics/accuracy.json
) across more than two refs?CC @mih @yarikoptic