added documentation api
Matthew Jones authored and Matthew Jones committed Jul 30, 2021
1 parent bd2db32 commit d5b94ec
Showing 9 changed files with 199 additions and 10 deletions.
29 changes: 29 additions & 0 deletions docs/api/data.rst
@@ -0,0 +1,29 @@
===========
Data
===========
.. module:: cassiopeia.data
.. currentmodule:: cassiopeia

CassiopeiaTrees
~~~~~~~~~~~~~~~~~~~

The main data structure that Cassiopeia uses for all tree-based analyses is the CassiopeiaTree:

.. autosummary::
   :toctree: reference/

   data.CassiopeiaTree

Utilities
~~~~~~~~~~~~~~~~~~~

We also provide several utilities for working with phylogenetic data:

.. autosummary::
   :toctree: reference/

   data.compute_dissimilarity_map
   data.get_lca_characters
   data.sample_bootstrap_allele_tables
   data.sample_bootstrap_character_matrices
   data.to_newick
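
As a rough illustration of what a Newick export involves, here is a minimal, self-contained sketch that serializes a rooted tree to a Newick string. It does not use Cassiopeia: the function name ``to_newick_sketch`` and the parent-to-children dictionary are assumptions made for this example, not the signature or behavior of ``data.to_newick``.

```python
from typing import Dict, List


def to_newick_sketch(children: Dict[str, List[str]], root: str) -> str:
    """Serialize a rooted tree to Newick.

    `children` maps each internal node to its list of children (a
    hypothetical representation chosen for this sketch). Nodes that
    do not appear as keys are treated as leaves.
    """

    def render(node: str) -> str:
        kids = children.get(node, [])
        if not kids:
            # A leaf is written as its bare label.
            return node
        # An internal node is "(child1,child2,...)label".
        return "(" + ",".join(render(kid) for kid in kids) + ")" + node

    # A Newick string is terminated by a semicolon.
    return render(root) + ";"


if __name__ == "__main__":
    tree = {"root": ["a", "internal"], "internal": ["b", "c"]}
    print(to_newick_sketch(tree, "root"))
```

The recursion keeps internal-node labels, which is optional in Newick; dropping ``+ node`` for internal nodes would give a topology-only string.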
17 changes: 17 additions & 0 deletions docs/api/index.rst
@@ -0,0 +1,17 @@
===
API
===


Import Cassiopeia as::

    import cassiopeia as cas

.. toctree::
   :maxdepth: 1

   preprocess
   data
   solver
   simulator
   plotting
18 changes: 18 additions & 0 deletions docs/api/plotting.rst
@@ -0,0 +1,18 @@
==========
Plotting
==========

.. module:: cassiopeia.pl
.. currentmodule:: cassiopeia

Plotting
~~~~~~~~~~~~~~~~~~~

Currently, our plotting functionality relies on the iTOL framework:

.. autosummary::
   :toctree: reference/

   pl.upload_and_export_itol


43 changes: 43 additions & 0 deletions docs/api/preprocess.rst
@@ -0,0 +1,43 @@
===========
Preprocess
===========
.. module:: cassiopeia.pp
.. currentmodule:: cassiopeia

Data Preprocessing
~~~~~~~~~~~~~~~~~~~

We have several functions that are part of our pipeline for processing sequencing data from single-cell lineage tracing technologies:

.. autosummary::
   :toctree: reference/

   pp.align_sequences
   pp.call_alleles
   pp.call_lineage_groups
   pp.collapse_umis
   pp.convert_fastqs_to_unmapped_bam
   pp.error_correct_cellbcs_to_whitelist
   pp.error_correct_intbcs_to_whitelist
   pp.error_correct_umis
   pp.filter_bam
   pp.filter_molecule_table
   pp.filter_cells
   pp.filter_umis
   pp.resolve_umi_sequence




Data Utilities
~~~~~~~~~~~~~~~~~~~

We also have several functions that are useful for converting between data formats for downstream analyses:

.. autosummary::
   :toctree: reference/

   pp.compute_empirical_indel_priors
   pp.convert_alleletable_to_character_matrix
   pp.convert_alleletable_to_lineage_profile
   pp.convert_lineage_profile_to_character_matrix
41 changes: 41 additions & 0 deletions docs/api/simulator.rst
@@ -0,0 +1,41 @@
===========
Simulator
===========
.. module:: cassiopeia.sim
.. currentmodule:: cassiopeia


Cassiopeia's simulators are split into those that simulate topologies and those that simulate data on top of those topologies.

Tree Simulators
~~~~~~~~~~~~~~~~~~~

We have several frameworks available for simulating topologies:

.. autosummary::
   :toctree: reference/

   sim.BirthDeathFitnessSimulator
   sim.CompleteBinarySimulator
   sim.SimpleFitSubcloneSimulator


Data Simulators
~~~~~~~~~~~~~~~~~~~

These simulators are subclasses of the ``DataSimulator`` class and implement the ``overlay_data`` method, which simulates data according to a given topology.

.. autosummary::
   :toctree: reference/

   sim.Cas9LineageTracingDataSimulator

Leaf SubSamplers
~~~~~~~~~~~~~~~~~~~
These are utilities for subsampling lineages for benchmarking purposes, for example by sampling a random proportion of leaves or by grouping cells together into clades to model spatial data.

.. autosummary::
   :toctree: reference/

   sim.SupercellularSampler
   sim.UniformLeafSubsampler
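
To make "sampling a random proportion of leaves" concrete, the following is a small, self-contained sketch of uniform leaf subsampling on a plain list of leaf names. It is not the implementation of ``sim.UniformLeafSubsampler``: the function name, the ``ratio`` and ``seed`` parameters, and the rule of keeping at least one leaf are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence


def subsample_leaves_uniformly(
    leaves: Sequence[str], ratio: float, seed: Optional[int] = None
) -> List[str]:
    """Keep a uniformly random subset of leaves, without replacement.

    `ratio` is the fraction of leaves to keep (an assumed parameter
    for this sketch). At least one leaf is always kept.
    """
    if not leaves:
        raise ValueError("cannot subsample an empty set of leaves")
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in the interval (0, 1]")

    rng = random.Random(seed)  # local generator keeps the sketch reproducible
    n_keep = max(1, int(len(leaves) * ratio))
    return rng.sample(list(leaves), n_keep)


if __name__ == "__main__":
    cells = [f"cell{i}" for i in range(10)]
    print(subsample_leaves_uniformly(cells, ratio=0.5, seed=0))
```

In a benchmarking setting, the tree would then be pruned to the kept leaves so that reconstructions can be compared against the subsampled ground truth.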
41 changes: 41 additions & 0 deletions docs/api/solver.rst
@@ -0,0 +1,41 @@
===========
Solver
===========
.. module:: cassiopeia.solver
.. currentmodule:: cassiopeia

CassiopeiaSolvers
~~~~~~~~~~~~~~~~~~~

We have several algorithms available for solving phylogenies:

.. autosummary::
   :toctree: reference/

   solver.HybridSolver
   solver.ILPSolver
   solver.MaxCutSolver
   solver.MaxCutGreedySolver
   solver.NeighborJoiningSolver
   solver.PercolationSolver
   solver.SharedMutationJoiningSolver
   solver.SpectralSolver
   solver.SpectralGreedySolver
   solver.UPGMASolver
   solver.VanillaGreedySolver


Dissimilarity Maps
~~~~~~~~~~~~~~~~~~~

For use in our distance-based solvers and for comparing character states, we also provide several dissimilarity functions:

.. autosummary::
   :toctree: reference/

   solver.dissimilarity_functions.cluster_dissimilarity
   solver.dissimilarity_functions.hamming_distance
   solver.dissimilarity_functions.hamming_similarity_normalized_over_missing
   solver.dissimilarity_functions.hamming_similarity_without_missing
   solver.dissimilarity_functions.weighted_hamming_distance
   solver.dissimilarity_functions.weighted_hamming_similarity
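
To give a feel for what a dissimilarity function between two character-state vectors computes, here is a minimal, self-contained sketch of a Hamming-style distance that skips positions with missing data. It is an illustration only, not the code behind any function listed above: the name ``hamming_ignoring_missing``, the use of ``-1`` as the missing-state indicator, and the normalization by the number of comparable positions are assumptions made for this example.

```python
from typing import Sequence

MISSING = -1  # assumed missing-state indicator for this sketch


def hamming_ignoring_missing(
    a: Sequence[int], b: Sequence[int], missing: int = MISSING
) -> float:
    """Fraction of comparable positions at which two state vectors differ.

    Positions where either vector holds the missing indicator are skipped.
    Returns 0.0 when no position is comparable.
    """
    if len(a) != len(b):
        raise ValueError("state vectors must have the same length")

    compared = 0
    mismatches = 0
    for x, y in zip(a, b):
        if x == missing or y == missing:
            continue  # no information at this character
        compared += 1
        if x != y:
            mismatches += 1

    return mismatches / compared if compared else 0.0


if __name__ == "__main__":
    # Three comparable positions, one mismatch.
    print(hamming_ignoring_missing([1, 0, 2, -1], [1, 3, 2, 4]))
```

Applying such a function to every pair of samples yields a dissimilarity map of the kind a distance-based solver consumes.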
8 changes: 4 additions & 4 deletions notebooks/benchmark.ipynb
@@ -8,9 +8,9 @@
"\n",
"This notebook serves as an entry point for understanding how to interface with Cassiopeia for the purposes of simulating trees and data and benchmarking algorithms.\n",
"\n",
-"You can install Cassiopeia by following the guide [here](https://cassiopeia-lineage.readthedocs.io/en/testdeployment/installation).\n",
+"You can install Cassiopeia by following the guide [here](https://cassiopeia-lineage.readthedocs.io/en/latest/installation).\n",
"\n",
-"All of our documentation is hosted [here](https://cassiopeia-lineage.readthedocs.io/en/testdeployment/)."
+"All of our documentation is hosted [here](https://cassiopeia-lineage.readthedocs.io/en/latest/)."
]
},
{
@@ -50,7 +50,7 @@
"\n",
"We can use a simple birth-death model with fitness to simulate trees.\n",
"\n",
-"Specifically, this is a continuous-time birth-death process in which birth and death events are sampled from independent waiting distributions. Importantly, we can incorporate fitness into this framework by modulating the `scale` of the birth waiting distribution. This is done by sampling a random number of fitness events per generation, each with a fitness effect drawn from a distribution. The documentation for this class can be found [here](https://cassiopeia-lineage.readthedocs.io/en/testdeployment/api/reference/cassiopeia.sim.BirthDeathFitnessSimulator.html). "
+"Specifically, this is a continuous-time birth-death process in which birth and death events are sampled from independent waiting distributions. Importantly, we can incorporate fitness into this framework by modulating the `scale` of the birth waiting distribution. This is done by sampling a random number of fitness events per generation, each with a fitness effect drawn from a distribution. The documentation for this class can be found [here](https://cassiopeia-lineage.readthedocs.io/en/latest/api/reference/cassiopeia.sim.BirthDeathFitnessSimulator.html). "
]
},
{
@@ -204,7 +204,7 @@
"\n",
"Cassiopeia has implemented several CassiopeiaSolvers for reconstructing trees. Each of these can take in several class-specific parameters and at a minimum implements the `solve` routine which operates on a CassiopeiaTree. \n",
"\n",
-"The full list of solvers can be found [here](https://cassiopeia-lineage.readthedocs.io/en/testdeployment/api/solver.html). For a full tutorial on tree reconstruction, refer to the [Tree Reconstruction notebook](https://github.com/YosefLab/Cassiopeia/blob/testdeployment/notebooks/reconstruct.ipynb).\n",
+"The full list of solvers can be found [here](https://cassiopeia-lineage.readthedocs.io/en/latest/api/solver.html). For a full tutorial on tree reconstruction, refer to the [Tree Reconstruction notebook](https://github.com/YosefLab/Cassiopeia/blob/latest/notebooks/reconstruct.ipynb).\n",
"\n",
"Here we use the VanillaGreedySolver, which was described in the [Cassiopeia paper published in 2020](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02000-8)."
]
6 changes: 3 additions & 3 deletions notebooks/preprocess.ipynb
@@ -24,7 +24,7 @@
"\n",
"\n",
"## Pipeline API\n",
-"All of the key modules of the preprocessing pipeline can be invoked by a call from `cassiopeia.pp`. Assuming the user would like to begin at the beginning of the pipeline, we'll start with the `convert` stage. You can find all documentation on our [main site](https://cassiopeia-lineage.readthedocs.io/en/testdeployment/).\n",
+"All of the key modules of the preprocessing pipeline can be invoked by a call from `cassiopeia.pp`. Assuming the user would like to begin at the beginning of the pipeline, we'll start with the `convert` stage. You can find all documentation on our [main site](https://cassiopeia-lineage.readthedocs.io/en/latest/).\n",
"\n",
"An alternative to running the pipeline interactively is to take advantage of the command line tool `cassiopeia-preprocess`, which takes in a configuration file (for example in Cassiopeia/data/preprocess.cfg) and runs the pipeline end-to-end. For example, if you have a config called `example_config.cfg`, this can be invoked from the command line with:\n",
"\n",
@@ -333,7 +333,7 @@
"\n",
"The `min_umi_per_cell` and `min_avg_reads_per_umi` behave the same as the \"resolve\" step.\n",
"\n",
-"See the [documentation](https://cassiopeia-lineage.readthedocs.io/en/testdeployment/api/reference/cassiopeia.pp.filter_molecule_table.html#cassiopeia.pp.filter_molecule_table) for more details."
+"See the [documentation](https://cassiopeia-lineage.readthedocs.io/en/latest/api/reference/cassiopeia.pp.filter_molecule_table.html#cassiopeia.pp.filter_molecule_table) for more details."
]
},
{
@@ -365,7 +365,7 @@
"\n",
"The `min_umi_per_cell` and `min_avg_reads_per_umi` behave the same as the \"resolve\" step.\n",
"\n",
-"See the [documentation](https://cassiopeia-lineage.readthedocs.io/en/testdeployment/api/reference/cassiopeia.pp.call_lineage_groups.html#cassiopeia.pp.call_lineage_groups) for more details."
+"See the [documentation](https://cassiopeia-lineage.readthedocs.io/en/latest/api/reference/cassiopeia.pp.call_lineage_groups.html#cassiopeia.pp.call_lineage_groups) for more details."
]
},
{
6 changes: 3 additions & 3 deletions notebooks/reconstruct.ipynb
@@ -6,7 +6,7 @@
"source": [
"# Reconstructing trees with Cassiopeia\n",
"\n",
-"Cassiopeia offers several utilities for reconstructing phylogenies, carrying users from the allele tables they've created in the [benchmarking tutorial]() to the full phylogenies. This tutorial serves as a general overview of the tools that Cassiopeia offers for tree reconstruction."
+"Cassiopeia offers several utilities for reconstructing phylogenies, carrying users from the allele tables they've created in the preprocessing tutorial to the full phylogenies. This tutorial serves as a general overview of the tools that Cassiopeia offers for tree reconstruction."
]
},
{
@@ -1115,7 +1115,7 @@
"source": [
"### Creating and working with CassiopeiaSolvers\n",
"\n",
-"As mentioned previously, Cassiopeia works with a general class of CassiopeiaSolvers. We have implemented several solvers, which you can find [here](https://cassiopeia-lineage.readthedocs.io/en/testdeployment/api/solver.html).\n",
+"As mentioned previously, Cassiopeia works with a general class of CassiopeiaSolvers. We have implemented several solvers, which you can find [here](https://cassiopeia-lineage.readthedocs.io/en/latest/api/solver.html).\n",
"\n",
"Perhaps the most popular are the `VanillaGreedySolver`, `ILPSolver`, `HybridSolver`, and `NeighborJoiningSolver`. Here, we'll provide a quick overview of each of these.\n",
"\n",
@@ -1272,7 +1272,7 @@
"\n",
"The `ILPSolver` is an implementation of the Steiner-Tree approach described in [Jones et al., 2020](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02000-8). The constructor takes in several options controlling the size and complexity of the potential graph to infer as well as stopping criteria for the integer-linear program (ILP) optimization routine.\n",
"\n",
-"There are several parameters of interest which can all be explored on our [documentation website](https://cassiopeia-lineage.readthedocs.io/en/testdeployment/api/reference/cassiopeia.solver.ILPSolver.html#cassiopeia.solver.ILPSolver). Because this process can take a long time, we'll restrict the potential graph layer size to 500 nodes and the convergence time to 500s. A more realistic solver might use our defaults - namely, a maximum potential graph layer size of 10,000 and a convergence time of 12,600s (3.5hr).\n",
+"There are several parameters of interest which can all be explored on our [documentation website](https://cassiopeia-lineage.readthedocs.io/en/latest/api/reference/cassiopeia.solver.ILPSolver.html#cassiopeia.solver.ILPSolver). Because this process can take a long time, we'll restrict the potential graph layer size to 500 nodes and the convergence time to 500s. A more realistic solver might use our defaults - namely, a maximum potential graph layer size of 10,000 and a convergence time of 12,600s (3.5hr).\n",
"\n",
"The `ILPSolver` logs the progress of the potential graph inference and optimization in a user-defined logfile (by default, `stdout.log`). This logfile will also be output here.\n",
"\n",
