BC_tamoxifen_response

This repository contains the code for processing ER+ breast cancer scRNA-seq data, starting from 10x Genomics FASTQ files, and for generating the figures.

Installing

git clone https://github.com/hyunsoo77/BC_tamoxifen_response.git

scRNA-seq data processing

Step 1: Align the sequences in the scRNA-seq FASTQ files to the GRCh38 reference transcriptome with 10x Genomics cellranger count. This yields one filtered_feature_bc_matrix.h5 file for each of the two samples.
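
As a rough sketch of Step 1, the loop below prints one cellranger count command per sample instead of running it. The FASTQ directory, the reference path, and the assumption that the FASTQ sample prefix equals the sample name are placeholders that do not come from this repository. Recent Cell Ranger versions may require additional options (for example --create-bam), so check the documentation for your installed version.

```shell
#!/bin/sh
set -eu

# Hypothetical locations; replace with your own paths.
dir_fastq="/path/to/fastq"
dir_ref="/path/to/refdata-gex-GRCh38"

# Print (rather than run) the cellranger count command for one sample.
# $1 = sample name, used both as the run id and as the FASTQ sample prefix.
print_cellranger_cmd() {
  echo "cellranger count --id=$1 --transcriptome=${dir_ref} --fastqs=${dir_fastq} --sample=$1"
}

for sample in Tumor5 Tumor5_TAM; do
  print_cellranger_cmd "${sample}"
done
```

Each run writes its matrix to <id>/outs/filtered_feature_bc_matrix.h5, which is the file Step 2 expects.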

Step 2: Create the following directory structure by copying or linking the files.

../count_er+bc-pairs
├── Tumor5
│   └── outs
│       └── filtered_feature_bc_matrix.h5
└── Tumor5_TAM
    └── outs
        └── filtered_feature_bc_matrix.h5
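
One way to build this layout with symbolic links is sketched below. The helper name link_sample and the location of the cellranger output are assumptions for illustration; only the target layout is taken from the tree above.

```shell
#!/bin/sh
set -eu

# Link one sample's matrix into the layout shown above.
# $1 = directory holding the cellranger count runs (use an absolute path,
#      because a relative symlink target is resolved from the link's location)
# $2 = target directory, e.g. ../count_er+bc-pairs
# $3 = sample name
link_sample() {
  mkdir -p "$2/$3/outs"
  # -s: symbolic link, -f: replace an existing link of the same name
  ln -sf "$1/$3/outs/filtered_feature_bc_matrix.h5" \
         "$2/$3/outs/filtered_feature_bc_matrix.h5"
}

# When called with two arguments, link both samples.
if [ "$#" -ge 2 ]; then
  for sample in Tumor5 Tumor5_TAM; do
    link_sample "$1" "$2" "${sample}"
  done
fi
```

If saved as a script (the name link_counts.sh is hypothetical), it could be run as: ./link_counts.sh /path/to/cellranger_runs ../count_er+bc-pairs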

Step 3: Make a Seurat object for each sample with the following command:

./make_sc-rna-seq_seurat_obj.R --dir_count ../count_er+bc-pairs --dir_output ./output_er+bc-pairs --dir_seurat_obj ./output_er+bc-pairs/rds_er+bc-pairs --type_qc arguments --min_ncount_rna 5000 --min_nfeature_rna 2000 --th_percent.mt 25 --max_dimstouse 30 --seurat_resolution 0.8 --method_to_update_cell_types epithelial_cell_types --method_to_identify_subtypes none --type_infercnv_argset vignettes --infercnv_pos_notpos er+bc-pairs Tumor5

The example above covers only Tumor5; to make the Seurat object for Tumor5_TAM, change the last argument. After both samples have been processed, the output directory ./output_er+bc-pairs contains the following:

output_er+bc-pairs/
├── infercnv
│   ├── er+bc-pairs_Tumor5_cnv_postdoublet
│   └── er+bc-pairs_Tumor5_TAM_cnv_postdoublet
├── log
├── rds_er+bc-pairs
│   ├── er+bc-pairs_Tumor5_sc-rna-seq_sample_seurat_obj.rds
│   ├── er+bc-pairs_Tumor5_TAM_sc-rna-seq_sample_seurat_obj.rds
│   └── wilcox_degs
├── tsv
│   ├── infercnv_input_barcode_group_er+bc-pairs_Tumor5.tsv
│   └── infercnv_input_barcode_group_er+bc-pairs_Tumor5_TAM.tsv
└── xlsx
    ├── er+bc-pairs_Tumor5_sc-rna-seq_pipeline_summary.xlsx
    └── er+bc-pairs_Tumor5_TAM_sc-rna-seq_pipeline_summary.xlsx
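
Because the Step 3 command differs between samples only in its last argument, both commands can be generated from one template. The sketch below prints them rather than running them; the helper name seurat_obj_cmd is an assumption, and the options are copied unchanged from the command above.

```shell
#!/bin/sh
set -eu

# Print the Step 3 command for one sample; only the last argument ($1) changes.
seurat_obj_cmd() {
  echo "./make_sc-rna-seq_seurat_obj.R --dir_count ../count_er+bc-pairs --dir_output ./output_er+bc-pairs --dir_seurat_obj ./output_er+bc-pairs/rds_er+bc-pairs --type_qc arguments --min_ncount_rna 5000 --min_nfeature_rna 2000 --th_percent.mt 25 --max_dimstouse 30 --seurat_resolution 0.8 --method_to_update_cell_types epithelial_cell_types --method_to_identify_subtypes none --type_infercnv_argset vignettes --infercnv_pos_notpos er+bc-pairs $1"
}

for sample in Tumor5 Tumor5_TAM; do
  seurat_obj_cmd "${sample}"
done
```

To execute the commands instead of only inspecting them, the printed lines can be piped to sh from the directory that contains make_sc-rna-seq_seurat_obj.R.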

Step 4: Merge the per-sample Seurat objects into a single merged Seurat object with the following command:

./make_sc-rna-seq_merged_seurat_obj.R --dir_output ./output_er+bc-pairs --dir_seurat_obj ./output_er+bc-pairs/rds_er+bc-pairs --k.anchor 5 --max_dimstouse 30 --seurat_resolution 0.8 --cancer_type_for_parsing_rds_filename er+bc-pairs --type_parsing_rds_filename_for_donor 2nd_item_after_parsing_with_underbar --harmony_theta 0  er+bc-pairs

The output file is written to ./output_er+bc-pairs/rds_er+bc-pairs, the directory specified by the --dir_seurat_obj argument.

output_er+bc-pairs/
│   ...
├── rds_er+bc-pairs
│   ├── er+bc-pairs_Tumor5_sc-rna-seq_sample_seurat_obj.rds
│   ├── er+bc-pairs_Tumor5_TAM_sc-rna-seq_sample_seurat_obj.rds
│   ├── er+bc-pairs_sc-rna-seq_merged_seurat_obj.rds
│   └── wilcox_degs
...
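
Before moving on to the notebooks, it can help to confirm that the three .rds files listed above exist. The check below is a minimal sketch; the helper name check_rds is an assumption, and the file names are taken from the tree above.

```shell
#!/bin/sh
set -eu

# Report which expected .rds files are present in a --dir_seurat_obj directory.
# Returns non-zero if any of them is missing.
check_rds() {
  dir="$1"
  missing=0
  for f in \
    er+bc-pairs_Tumor5_sc-rna-seq_sample_seurat_obj.rds \
    er+bc-pairs_Tumor5_TAM_sc-rna-seq_sample_seurat_obj.rds \
    er+bc-pairs_sc-rna-seq_merged_seurat_obj.rds
  do
    if [ -f "${dir}/${f}" ]; then
      echo "found:   ${f}"
    else
      echo "missing: ${f}"
      missing=1
    fi
  done
  return "${missing}"
}

# Example call; '|| true' keeps the script going when files are absent.
check_rds ./output_er+bc-pairs/rds_er+bc-pairs || true
```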

Jupyter notebook

Figures were generated with Jupyter notebooks. To install Jupyter Notebook or JupyterLab, see jupyter.org. Before running a notebook, change dir_rna and/or dir_atac so that they point to the merged Seurat object or the final ArchRProject object you generated. The output includes PDF files, which are written to the "pdf" directory.

./
├── figure1_01_umap.ipynb
├── figure1_02_barplot.ipynb
├── figure2_01_umap.ipynb
├── figure2_02_dge.ipynb
├── figure3_01_umap.ipynb
├── figure3_02_boxplot.ipynb
├── figure3_03_dge_pairs.ipynb
├── figure4_01_umap.ipynb
├── figure4_02_dge.ipynb
├── figure5_01_umap.ipynb
├── figure5_02_barplot.ipynb
├── figure5_03_drug_effect.ipynb
├── figure_s1_01_dge.ipynb
├── figure_s2_01_barplot.ipynb
├── figure_s2_02_dge.ipynb
├── log
├── pdf
│   ├── ...
│   ├── barplot_er+bc-pairs_cluster_type_prop_rna.pdf
│   ├── ...
│   ├── heatmap_er+bc-pairs_control_vs_tamoxifen_Tumor_cells_zscore.pdf
│   ├── ...
│   ├── umap_er+bc-pairs_cluster_labels_rna.pdf
│   ├── umap_er+bc-pairs_cluster_types_rna.pdf
│   ├── umap_er+bc-pairs_log2fc_t47d_down_genes_rna.pdf
│   └── umap_er+bc-pairs_samples_rna.pdf
├── r
├── reference
├── txt
│   └── sessionInfo.txt
└── xlsx
    ├── ...
    └── er+bc-pairs_control_vs_tamoxifen_Tumor_cells.xlsx

Let's check the cell numbers for each cell type.

The scRNA-seq pipeline is under active development. Other single-cell data analysis projects will use either the current version with different parameters or an upgraded version of the pipeline.
