git clone git@github.com:AllenInstitute/BarWare-pipeline
cd BarWare-pipeline
git submodule update --init
R -e 'install.packages("BarMixer", type = "source", repos = NULL)'
chmod +x BarCounter-release/barcounter
The BarWare pipeline is currently configured to demultiplex 10x Genomics 3' scRNA-seq data. Before analysis with BarCounter and BarMixer, we recommend that you run cellranger count to obtain the necessary scRNA-seq inputs for use with this pipeline.
If you are using a different analysis pipeline or method and would like to use BarWare, please let us know on the Issues page.
You'll need 2 critical input files before running the BarWare pipeline:
1: A Well Sheet .csv file
You will need to generate a Well Sheet .csv file to specify which wells will be demultiplexed. This .csv should have the following columns:
- well_id: An identifier for each well
- fastq_path: The path to the directory containing the HTO FASTQ files for your well
- fastq_prefix: The prefix for your well that is appended to your HTO FASTQ files (multiple lanes will be automatically identified)
- cellranger_outs: The full path to the cellranger count outs/ directory for each well
Example well_sheet.csv
well_id,fastq_path,fastq_prefix,cellranger_outs
X017-P1C1W1,/mnt/barware-manuscript/X017_fastq/,Pool-16-HTO,/mnt/barware-manuscript/code-testing/X017-P1C1W1/outs/
X017-P1C1W2,/mnt/barware-manuscript/X017_fastq/,Pool-24-HTO,/mnt/barware-manuscript/code-testing/X017-P1C1W2/outs/
X017-P1C1W3,/mnt/barware-manuscript/X017_fastq/,Pool-32-HTO,/mnt/barware-manuscript/code-testing/X017-P1C1W3/outs/
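Before starting a run, it can help to confirm that the Well Sheet matches the layout described above. The following is a minimal sketch, not part of BarWare: the function names check_well_sheet_header and list_missing_dirs are hypothetical, and it assumes the columns appear in exactly the order shown in the example and that no field contains a comma.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with BarWare) for checking a Well Sheet.

# Print "ok" if the first line is the expected header; otherwise report it and return 1.
check_well_sheet_header() {
  local sheet="$1"
  local expected="well_id,fastq_path,fastq_prefix,cellranger_outs"
  local header
  # Read the first line and drop any carriage return left by Windows line endings.
  header="$(head -n 1 "$sheet" | tr -d '\r')"
  if [ "$header" = "$expected" ]; then
    echo "ok"
  else
    echo "unexpected header: $header"
    return 1
  fi
}

# Print one line for each fastq_path or cellranger_outs directory that does not exist.
list_missing_dirs() {
  local sheet="$1"
  # Skip the header row, then split each remaining row on commas.
  tail -n +2 "$sheet" | tr -d '\r' |
    while IFS=, read -r well_id fastq_path fastq_prefix cellranger_outs || [ -n "$well_id" ]; do
      [ -z "$well_id" ] && continue
      [ -d "$fastq_path" ] || echo "$well_id: missing fastq_path $fastq_path"
      [ -d "$cellranger_outs" ] || echo "$well_id: missing cellranger_outs $cellranger_outs"
    done
}
```

After sourcing these functions, running check_well_sheet_header well_sheet.csv followed by list_missing_dirs well_sheet.csv should print "ok" and nothing else when the sheet is usable.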
2: A Sample Sheet .csv file
The sample_sheet.csv file specifies which samples are associated with which barcodes. This .csv should have the following columns:
- sample_id: The name of each multiplexed sample
- pool_id: An identifier for the pool of samples to demultiplex
- hash_name: The name of the HTOs used for hashing
- hash_tag: The sequence of the HTO barcodes used for hashing
Example sample_sheet.csv
sample_id,pool_id,hash_name,hash_tag
2735BW-MEM-1,X017-P1,HT1,GTCAACTCTTTAGCG
2735BW-MEM-2,X017-P1,HT2,TGATGGCCTATTGGG
2735BW-NIV-1,X017-P1,HT3,TTCCGCCTCTCTTTG
2735BW-NIV-2,X017-P1,HT4,AGTAAGTTCAGCGTA
2735BW-NON-1,X017-P1,HT5,AAGTATCGTTTCGCA
2735BW-NON-2,X017-P1,HT6,GGTTGCCAGATGTCA
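Two easy mistakes in a Sample Sheet are a mistyped barcode and the same barcode assigned twice within one pool. The sketch below flags both. It is an illustration only: check_sample_sheet is a hypothetical name, the rule that hash_tag contains only A, C, G, and T is an assumption based on the example rows, and the pipeline itself may or may not enforce either condition.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with BarWare) for checking a Sample Sheet.
# Assumes the column order sample_id,pool_id,hash_name,hash_tag shown in the example.

# Print one message per problem row; return non-zero if any problem was found.
check_sample_sheet() {
  local sheet="$1"
  tr -d '\r' < "$sheet" | awk -F, '
    NR == 1   { next }    # skip the header row
    $0 == ""  { next }    # skip blank lines
    # Assumption: a hash_tag is a plain nucleotide sequence.
    $4 !~ /^[ACGT]+$/ { print "line " NR ": hash_tag is not an A/C/G/T sequence: " $4; bad = 1 }
    # True from the second time a (pool_id, hash_tag) pair is seen.
    seen[$2 FS $4]++  { print "line " NR ": hash_tag " $4 " repeated in pool " $2; bad = 1 }
    END { exit bad + 0 }
  '
}
```

Line numbers in the messages count the header as line 1, so they match what a text editor shows.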
With these in hand, you're ready for the BarWare pipeline.
A convenient wrapper script, 01_run_BarCounter.sh, is provided in BarWare to run multiple wells in sequence using BarCounter. This script takes the following parameters:
- -s: the full path to the sample_sheet.csv file
- -w: the full path to the well_sheet.csv file
- -o: the full path of a directory to use for outputs
For example:
bash BarWare-pipeline/01_run_BarCounter.sh \
-s $(pwd)/X017_sample_sheet.csv \
-w $(pwd)/X017_well_sheet.csv \
-o $(pwd)/X017_demultiplex_results
Stage 1 will generate outputs for each well:
<output_dir>/
    <well_id>/
        hto_counts/
            <fastq_prefix>_Tag_Counts.csv
            <fastq_prefix>_BarCounter.log
            <fastq_prefix>_valid_barcodes.txt
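To confirm that Stage 1 finished for every well before moving on, one option is to look for each expected count file. The sketch below assumes the listing above nests as <output_dir>/<well_id>/hto_counts/ and that the Well Sheet has no commas inside fields. The function name list_missing_tag_counts is hypothetical and not part of BarWare.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with BarWare).
# Assumed layout: <output_dir>/<well_id>/hto_counts/<fastq_prefix>_Tag_Counts.csv

# Print the path of every expected Tag_Counts.csv file that is missing.
list_missing_tag_counts() {
  local out_dir="$1"
  local sheet="$2"
  # Skip the header row of the Well Sheet, then split each row on commas.
  tail -n +2 "$sheet" | tr -d '\r' |
    while IFS=, read -r well_id fastq_path fastq_prefix cellranger_outs || [ -n "$well_id" ]; do
      [ -z "$well_id" ] && continue
      expected="$out_dir/$well_id/hto_counts/${fastq_prefix}_Tag_Counts.csv"
      [ -f "$expected" ] || echo "$expected"
    done
}
```

An empty result from list_missing_tag_counts "$(pwd)/X017_demultiplex_results" "$(pwd)/X017_well_sheet.csv" would mean a count file exists for every well in the sheet.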
BarMixer demultiplexing can be run using the 02_run_BarMixer.sh shell script. This script has 3 parameters:
- -s: the full path to the sample_sheet.csv file
- -w: the full path to the well_sheet.csv file
- -o: the full path of a directory to use for outputs
Note that the parameters -s, -w, and -o should be the same for both Stage 1 and Stage 2.
For example:
bash BarWare-pipeline/02_run_BarMixer.sh \
-s $(pwd)/X017_sample_sheet.csv \
-w $(pwd)/X017_well_sheet.csv \
-o $(pwd)/X017_demultiplex_results
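Because both stages take identical arguments, a small wrapper can keep them in sync. This is a sketch rather than part of BarWare: run_barware is a hypothetical function, and it assumes only the two script names and the -s, -w, and -o flags shown above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not shipped with BarWare): run Stage 1, then Stage 2,
# passing the same sample sheet, well sheet, and output directory to both.

run_barware() {
  local pipeline_dir="$1"   # directory containing the two stage scripts
  local sample_sheet="$2"   # full path to sample_sheet.csv
  local well_sheet="$3"     # full path to well_sheet.csv
  local out_dir="$4"        # full path of the output directory

  # Stage 2 runs only if Stage 1 exits successfully.
  bash "$pipeline_dir/01_run_BarCounter.sh" -s "$sample_sheet" -w "$well_sheet" -o "$out_dir" &&
    bash "$pipeline_dir/02_run_BarMixer.sh" -s "$sample_sheet" -w "$well_sheet" -o "$out_dir"
}
```

A call such as run_barware BarWare-pipeline "$(pwd)/X017_sample_sheet.csv" "$(pwd)/X017_well_sheet.csv" "$(pwd)/X017_demultiplex_results" would then replace the two separate commands.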
Stage 2 will generate outputs for each well, and combined results for all wells. Final outputs separated by sample for downstream use are in the merged_h5/ subfolder.
<output_dir>/
    <well_id>/
        hto_processed/
            <well_id>_hto_category_table.csv.gz
            <well_id>_hto_count_matrix.csv.gz
            <well_id>_hto_processing_metrics.json
            <well_id>_hto_report.html
        rna_metadata/
            <well_id>.h5
            <well_id>_well_metadata_report.html
            <well_id>_well_metrics.json
        split_h5/
            <well_id>_<hash_tag>.h5
            <well_id>_multiplet.h5
            <well_id>_split_h5_metrics.json
            <well_id>_split_report.html
    merged_h5/
        merge_report.html
        <pool_id>_<sample_id>.h5
        <pool_id>_multiplet.h5
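For downstream work, the per-sample files in merged_h5/ are usually what you want, without the multiplet file. The sketch below lists them. It assumes merged_h5/ sits directly under the output directory and that multiplet files end in _multiplet.h5, as in the listing above. list_sample_h5 is a hypothetical name, not a BarWare command.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with BarWare).
# Assumed layout: <output_dir>/merged_h5/<pool_id>_<sample_id>.h5

# Print the path of each per-sample .h5 file, skipping multiplet files.
list_sample_h5() {
  local out_dir="$1"
  local f
  for f in "$out_dir"/merged_h5/*.h5; do
    [ -e "$f" ] || continue        # the glob matched nothing
    case "$f" in
      *_multiplet.h5) ;;           # skip the multiplet file
      *) echo "$f" ;;
    esac
  done
}
```

The output can be piped into whatever loads the files next, for example list_sample_h5 "$(pwd)/X017_demultiplex_results" > sample_h5_files.txt.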
Image retrieval
A pre-built Docker image containing the BarWare pipeline can be downloaded from Docker Hub using:
docker pull hypercompetent/barware:latest
Image building
If you would like to re-build the Docker image, the Dockerfile is provided in the BarWare-pipeline repository:
cd BarWare-pipeline
docker build ./ -t barware:v1.0
The license for this package is available on GitHub in the file LICENSE in this repository.
We are not currently supporting this code. We are releasing it to the community AS IS and cannot provide any guarantee of support. The community is welcome to submit issues, but you should not expect an active response.
If you contribute code to this repository through pull requests or other mechanisms, you are subject to the Allen Institute Contribution Agreement, which is available in the file CONTRIBUTING in this repository.