docs/source/ge/10xChromium3v2.md (+34 -32)
@@ -79,15 +79,16 @@ For the purpose of demonstration, we will use the __10x Genomics Single Cell 3'
```{eval-rst}
.. note::
-
Mereu E, Lafzi A, Moutinho C, Ziegenhain C, McCarthy DJ, Álvarez-Varela A, Batlle E, Sagar, Grün D, Lau JK, Boutet SC, Sanada C, Ooi A, Jones RC, Kaihara K, Brampton C, Talaga Y, Sasagawa Y, Tanaka K, Hayashi T, Braeuning C, Fischer C, Sauer S, Trefzer T, Conrad C, Adiconis X, Nguyen LT, Regev A, Levin JZ, Parekh S, Janjic A, Wange LE, Bagnoli JW, Enard W, Gut M, Sandberg R, Nikaido I, Gut I, Stegle O, Heyn H (2020) **Benchmarking single-cell RNA-sequencing protocols for cell atlas projects.** *Nat Biotechnol* 38:747–755. https://doi.org/10.1038/s41587-020-0469-4
+
Setty M, Kiseliovas V, Levine J, Gayoso A, Mazutis L, Pe'er D (2019) **Characterization of cell fate probabilities in single-cell data with Palantir.** *Nat Biotechnol* 37:451-460. https://doi.org/10.1038/s41587-019-0068-4
+
```
-
where the authors benchmarked quite a few different scRNA-seq methods using a standardised sample: a mixture of different human, mouse and dog cells. We are going to use the data from the __10x Genomics Single Cell 3' V2__ method. There are quite a few experiments with this technology, and specifically, we will just use the [10X 2x 5K cells 250K reads](https://www.ebi.ac.uk/ena/browser/view/PRJNA551745?show=reads) experiment as an example. You can download the `fastq` file from [this ENA page](https://www.ebi.ac.uk/ena/browser/view/PRJNA551745?show=reads). There are two runs, but I'm just downloading the first run for the demonstration.
+
where the authors developed a computational method called `Palantir` to perform trajectory analysis on scRNA-seq data. They applied the method to human bone marrow scRNA-seq data to study haematopoietic differentiation. The library preparation method is __10x Genomics Single Cell 3' V2__. There are quite a few samples in this study, and you can find the raw `FASTQ` files via the accession code [PRJEB37166](https://www.ebi.ac.uk/ena/browser/view/PRJEB37166) from **ENA**. The full metadata can be obtained from the [Human Cell Atlas data portal](https://explore.data.humancellatlas.org/projects/091cf39b-01bc-42e5-9437-f419a66c8a45/project-metadata). Note that the `FASTQ` files are also available from the Human Cell Atlas website, but I found it easier to download them from the **ENA** webpage. Here, for the demonstration, we will just use the `HS_BM_P1_cells_1` sample from the donor `HS_BM_P1`. We could download the files as follows:
@@ -127,19 +128,19 @@ If you understand the __10x Genomics Single Cell 3' V2__ experimental procedures
> Use 4 cores for the preprocessing. Change accordingly if using more or fewer cores.
-
`--genomeDir mix_hg38_mm10/star_index`
+
`--genomeDir hg38/star_index`
-
> Pointing to the directory of the star index. The public data from the above paper was produced using the HCA reference sample, which consists of human PBMCs (60%), and HEK293T (6%), mouse colon (30%), NIH3T3 (3%) and dog MDCK cells (1%). Therefore, we need to use the species mixing reference genome. We also need to add the dog genome, but the dog cells only take 1% of all cells, so I did not bother in this documentation.
+
> Pointing to the directory of the star index. The public data from the above paper was produced from CD34+ cells sorted by FACS from the bone marrow of human donors. Therefore, we are using the human reference.
`--readFilesCommand zcat`
> Since the `fastq` files are in `.gz` format, we need the `zcat` command to extract them on the fly.
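To get a feeling for what "extract them on the fly" means, here is a minimal Python sketch that streams records from a gzipped `fastq` file without writing the uncompressed file to disk first. It is only an illustration and not what `STAR` does internally. The function name `read_fastq_gz` is made up for this example, and the code assumes the common layout of four lines per record.

```python
import gzip
from typing import Iterator, Tuple


def read_fastq_gz(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from a gzipped FASTQ file.

    Assumption: every record takes exactly four lines
    (header, sequence, '+' separator, quality).
    """
    # "rt" decompresses while reading and returns text lines,
    # so the uncompressed file never has to exist on disk.
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of file
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # skip the '+' separator line
            quality = handle.readline().rstrip("\n")
            # Drop the leading '@' from the header line.
            yield header[1:], sequence, quality
```

Because the function is a generator, only one record is held in memory at a time, which is the same idea as piping the file through `zcat`.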
-
`--outFileNamePrefix mereu2020/star_outs/`
+
`--outFileNamePrefix setty2019/star_outs/`
-
> We want to keep everything organised. This directs all output files inside the `mereu2020/star_outs` directory.
+
> We want to keep everything organised. This directs all output files inside the `setty2019/star_outs/` directory.
> If you check the manual, we should put two files here. The first file contains the reads that come from cDNA, and the second file should contain the cell barcode and UMI. In __10x Genomics Single Cell 3' V2__, cDNA reads come from Read 2, and the cell barcode and UMI come from Read 1. Check [the 10x Genomics Single Cell 3' V2 GitHub Page](https://teichlab.github.io/scg_lib_structs/methods_html/10xChromium3.html) if you are not sure.
@@ -151,7 +152,7 @@ If you understand the __10x Genomics Single Cell 3' V2__ experimental procedures
> The name of the parameter is pretty much self-explanatory. If using `--soloType CB_UMI_Simple`, we can specify where the cell barcode and UMI start and how long they are in the reads from the first file passed to `--readFilesIn`. Note the position is 1-based (the first base of the read is 1, NOT 0).
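The 1-based convention is easy to get wrong if you are used to 0-based indexing, so here is a small Python sketch of how a start position and a length map onto a read. The layout used below (a 16 bp cell barcode starting at position 1, followed by a 10 bp UMI starting at position 17) is my assumption for the __10x Genomics Single Cell 3' V2__ Read 1, so check the library structure page before relying on it. The helper names `extract_by_position` and `split_read1` are hypothetical and only exist for this illustration.

```python
from typing import Tuple

# Assumed Read 1 layout for 10x Genomics Single Cell 3' V2:
# 16 bp cell barcode, then 10 bp UMI (positions are 1-based).
CB_START, CB_LEN = 1, 16
UMI_START, UMI_LEN = 17, 10


def extract_by_position(read: str, start: int, length: int) -> str:
    """Return `length` bases of `read` beginning at 1-based position `start`."""
    if start < 1:
        raise ValueError("start is 1-based, so it must be >= 1")
    begin = start - 1  # convert to Python's 0-based index
    end = begin + length
    if end > len(read):
        raise ValueError("read is too short for the requested region")
    return read[begin:end]


def split_read1(read1: str) -> Tuple[str, str]:
    """Split Read 1 into (cell_barcode, umi) using the assumed layout."""
    cell_barcode = extract_by_position(read1, CB_START, CB_LEN)
    umi = extract_by_position(read1, UMI_START, UMI_LEN)
    return cell_barcode, umi
```

The only real work is the `start - 1` conversion: position 1 in the parameters corresponds to index 0 in the Python string.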
> The plain text file containing all possible valid cell barcodes, one per line. __10x Genomics Single Cell 3' V2__ is a commercial platform. The whitelist is taken from their commercial software `cellranger`.
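To make the role of the whitelist concrete, here is a minimal Python sketch that loads a one-barcode-per-line list into a set and checks whether an observed barcode is in it. This is only an illustration with exact matching. Real pipelines may also rescue barcodes that are one mismatch away from a whitelisted one, which is not shown here. The function names `load_whitelist` and `is_valid_barcode` are made up for this example.

```python
from typing import Iterable, Set


def load_whitelist(lines: Iterable[str]) -> Set[str]:
    """Build a set of valid barcodes from lines of text, one barcode per line.

    `lines` can be an open file handle or any iterable of strings.
    Blank lines and surrounding whitespace are ignored.
    """
    return {line.strip() for line in lines if line.strip()}


def is_valid_barcode(barcode: str, whitelist: Set[str]) -> bool:
    """Exact-match lookup of an observed cell barcode in the whitelist."""
    return barcode in whitelist
```

A set is used because a whitelist can contain a very large number of barcodes, and a set lookup stays fast no matter how long the list is.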
@@ -174,11 +175,12 @@ If you understand the __10x Genomics Single Cell 3' V2__ experimental procedures
If everything goes well, your directory should look the same as the following: