-
Notifications
You must be signed in to change notification settings - Fork 311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ADAMContext/GenomicRDD/pipe docs #1422
Add ADAMContext/GenomicRDD/pipe docs #1422
Conversation
Test PASSed. |
The `JavaADAMContext` class provides Java-friendly methods that are equivalent | ||
to the `ADAMContext` methods. Specifically, these methods use Java types, and do | ||
not make use of default parameters. In addition to the load/save methods | ||
described above, the `ADAMContext` adds the implicit methods needed for using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we explicitly say that the Scala and Java labels in the above function calls indicate whether they can be used with JavaADAMContext/ADAMContext?
docs/source/60_building_apps.md
Outdated
<dependency> | ||
<groupId>org.bdgenomics.adam</groupId> | ||
<artifactId>adam-core${binary.version}</artifactId> | ||
<version>0.21.0</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would prefer a ${version}
or ${adam.version}
variable here, so that the docs don't need to be updated on every new release
docs/source/60_building_apps.md
Outdated
conversion from a `SparkContext` to an `ADAMContext`. To use this, import the | ||
implicit, and call an `ADAMContext` method: | ||
|
||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does the syntax highlighting language used in Github-flavored markdown cause problems with pandoc generation? If not, it would be useful to have ```scala
here (and elsewhere below)
docs/source/60_building_apps.md
Outdated
|
||
In Java, instantiate a JavaADAMContext, which wraps an ADAMContext: | ||
|
||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...and ```java
here
|
||
With an `ADAMContext`, you can load: | ||
|
||
* Single reads as an `AlignmentRecordRDD`: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this be easier to read as a table?
With an `ADAMContext`, you can load: | ||
|
||
* Single reads as an `AlignmentRecordRDD`: | ||
* From SAM/BAM/CRAM using `loadBam` (Scala only) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If something says (Scala only)
here, does that mean the method is not usable from Java or is simply inconvenient to do so? I've been working with some of these from Java without issue.
docs/source/60_building_apps.md
Outdated
functional [transformations]{#transforming} across the whole dataset. | ||
|
||
Around an `RDD`, ADAM adds metadata which describes the genome, samples, or | ||
pipeline that a dataset came from. Specifically, ADAM supports the following |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't obvious which definition pipeline
has here; this is about read groups?
docs/source/60_building_apps.md
Outdated
We provide four implementations: | ||
|
||
* `ADAMContext.sameTypeConversionFn`: For piped commands that do not change the | ||
type of the `GenomicRDD` (e.g., `AlignmentRecordRDD` -> `AlignmentRecordRDD`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-> → →
(this must be my favorite review comment ever)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
// save to vcf | ||
variantContexts.saveAsVcf("hdfs://mynamenode/my/variants.vcf") | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be useful to add a section/paragraph describing how data streamed from GenomicRDD
s to external apps is post-partitioning; performance implications thereof. I see now this is mentioned above on line 483.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, just a couple of nits for examples/further clarification.
Although `GenomicRDD`s do not extend Apache Spark's `RDD` class, `RDD` | ||
operations can be performed on them using the `transform` method. Currently, | ||
we only support `RDD` to `RDD` transformations that keep the same type as | ||
the base type of the `GenomicRDD`. To apply an `RDD` transform, use the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Add an example: i.e. We can transform AlignmentRecordRDDs to increment the position of the AlignmentRecord by 1 for every AlignmentRecord, but we cannot transform an AlignmentRecordRDD to a FeatureRDD.
I think it will make it much more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, sounds good, will add.
Addressed reviewer comments and pulled changes in from #1384. |
Test PASSed. |
Should I rebase and merge as is or did you want to squash some of the commits first? |
3f1d348
to
6cbecf2
Compare
Resolves bigdatagenomics#1368. Also adds docs for ADAMContext and GenomicRDD.
6cbecf2
to
6ec6749
Compare
Hi @heuermh! This should be good to go now. I figured that #1435 and this PR were sufficiently distinct that it made more sense to just merge #1435 into trunk separately. I wanted to pull #1384 into this since we were touching the same section of the document and would conflict otherwise. Anyways, I've got this squashed down as desired. |
Test PASSed. |
Test FAILed. Build result: FAILURE[...truncated 16 lines...] > /home/jenkins/git2/bin/git rev-parse origin/pr/1422/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a --contains 8933ec928ae71b6fd796e053ba91083c0b56f46a # timeout=10 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/1422/merge^{commit} # timeout=10Checking out Revision 8933ec928ae71b6fd796e053ba91083c0b56f46a (origin/pr/1422/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 8933ec928ae71b6fd796e053ba91083c0b56f46aFirst time build. Skipping changelog.Triggering ADAM-prb ? 2.3.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.11,2.0.0,centosTriggering ADAM-prb ? 2.3.0,2.10,2.0.0,centosTriggering ADAM-prb ? 2.6.0,2.11,2.0.0,centosTriggering ADAM-prb ? 2.6.0,2.10,2.0.0,centosTriggering ADAM-prb ? 2.3.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.11,1.6.1,centosADAM-prb ? 2.3.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.11,2.0.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,2.0.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,2.0.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,2.0.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,1.6.1,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'Test FAILed. |
Jenkins, retest this please. |
Test PASSed. |
Thank you, @fnothaft @devin-petersohn! |
Resolves #1368. Also added latest citation information to the README.