Merge pull request #569 from datalad-handbook/dvc
DataLad as DVC for ML analysis
adswa committed Sep 18, 2020
2 parents b91a59f + d000d6f commit 85135bb
Showing 70 changed files with 40,297 additions and 5 deletions.
10 changes: 9 additions & 1 deletion Makefile
@@ -19,6 +19,12 @@ clean-examples:
# wipe out the RIA store
@rm -vrf /home/me/myriastore

# do not touch whats in the DataLad narrative, only certain unrelated wdirs and examples
clean-DVC:
# wipe out the DVC comparison
@find docs/beyond_basics/_examples -name DL-101-168* -type f | xargs rm -vrf
@chmod +w -R /home/me/DVCvsDL; rm -vrf /home/me/DVCvsDL

# wipe out usecases
clean-usecases:
# check if we have something like .xsession or a .bashrc
@@ -38,4 +44,6 @@ clean:
@chmod +w -R /home/me/pushes; rm -vrf /home/me/pushes
@rm -vrf /home/me/makepushtarget.py
# wipe out the RIA store
@rm -vrf /home/me/myriastore
@rm -vrf /home/me/myriastore
# wipe out the DVC comparison
@chmod +w -R /home/me/DVCvsDL; rm -vrf /home/me/DVCvsDL
2 changes: 1 addition & 1 deletion docs/artwork
2 changes: 2 additions & 0 deletions docs/basics/101-127-yoda.rst
@@ -343,6 +343,8 @@ of all the contents you created in the wake of your analysis project.
This established trust in your results, and enables others to understand
where files derive from.

.. _yodaproc:

The YODA procedure
^^^^^^^^^^^^^^^^^^

8 changes: 5 additions & 3 deletions docs/basics/101-138-sharethirdparty.rst
@@ -57,7 +57,7 @@ web server, but also a third party services cloud storage such as
`Google <https://google.com>`_,
`Amazon S3 buckets <https://aws.amazon.com/s3/?nc1=h_ls>`_,
`Box.com <https://www.box.com/en-gb/home>`_,
`Figshare <https://figshare.com/>`_,
`Figshare <https://figshare.com/>`__,
`owncloud <https://owncloud.org/>`_,
`sciebo <https://sciebo.de/>`_,
or many more. The key to achieve this lies within :term:`git-annex`.
@@ -497,6 +497,7 @@ books, or the cropped logos from chapter :ref:`chapter_run`::
$ datalad get books/TLCL.pdf
get(ok): /home/some/other/user/DataLad-101/books/TLCL.pdf (file) [from dropbox-for-friends]
.. _gitlfs:
Use GitHub for sharing content
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -545,6 +546,7 @@ from GitHub.
Unfortunately, it is impossible to :command:`drop` contents from Git LFS:
`help.github.com/en/github/managing-large-files <https://help.github.com/en/github/managing-large-files/removing-files-from-git-large-file-storage#git-lfs-objects-in-your-repository>`_
.. _figshare:
Built-in data export
^^^^^^^^^^^^^^^^^^^^
@@ -563,7 +565,7 @@ special features a DataLad dataset provides will be available, such as its
history or configurations.
Another example is :command:`export-to-figshare`. Running this command allows
you to publish the dataset to `Figshare <https://figshare.com/>`_. As the
you to publish the dataset to `Figshare <https://figshare.com/>`__. As the
:command:`export-archive` is used by it to prepare content for upload to
Figshare, annexed files also will be annotated as available from the archive on
Figshare using ``datalad-archive`` special remote. As a result, if you publish
@@ -577,7 +579,7 @@ be able to fetch content from the tarball shared on Figshare via DataLad.
access to the server and client side of your GitLab instance. Find out more
`here <https://docs.gitlab.com/ee/administration/git_annex.html>`_.
Alternatively, GitHub can integrate with
`GitLFS <https://git-lfs.github.com/>`_, a non-free, centralized service
`GitLFS <https://git-lfs.github.com/>`__, a non-free, centralized service
that allows to store large file contents. The last paragraph in this
section shows an example on how to use their free trial version.
7 changes: 7 additions & 0 deletions docs/basics/101-180-FAQ.rst
@@ -388,6 +388,12 @@ that it appears similar to git-annex.
A more elaborate delineation from related solutions can be found in the DataLad
`developer documentation <http://docs.datalad.org/en/latest/related.html>`_.

What is the difference between DataLad and DVC?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

`DVC <https://dvc.org/>`_ is a version control system for machine learning projects.
We have compared the two tools in a dedicated handbook section, :ref:`dvc`.

DataLad version-controls my large files -- great. But how much is saved in total?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -443,6 +449,7 @@ MEG, EEG, iEEG, and ECoG data. It publishes hosted data as DataLad datasets on
obtain the datasets just as any other DataLad datasets with :command:`datalad clone`
or :command:`datalad install`.

.. _gitannexbranch:

What is the git-annex branch?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^