I'm a Bioinformatics Data Scientist passionate about translating complex biological data into clinical insights. My expertise lies at the intersection of biostatistics, data analysis, and genomics, with a focus on cancer research, immunotherapy, and public health.
I leverage a unique background in clinical healthcare to build robust analytical pipelines in R, uncovering biomarkers from large-scale datasets like TCGA and GEO to help advance personalized medicine.
This section lists my key technical skills; my pinned repositories below show them applied in practice.
- Languages: R (tidyverse, ggplot2, survminer), SQL
- Bioinformatics & Genomics:
  - TCGA & GEO Data Acquisition and Processing (`TCGAbiolinks`, `GEOquery`)
  - RNA-Seq and Gene Expression Analysis (TPM/FPKM Normalization, GSVA)
  - Biomarker Discovery and Validation
- Statistics & Machine Learning:
  - Survival Analysis (Kaplan-Meier Curves, Log-rank Test, Cox Proportional Hazards Models)
  - Predictive Modeling (Logistic Regression)
  - Hypothesis Testing (Chi-squared Test, etc.)
- Data Visualization & Reporting:
  - Publication-Quality Graphics (`ggplot2`, `pheatmap`)
  - Interactive Dashboards (Power BI)
  - Data Reporting (Excel, Google Sheets)
I specialize in conducting end-to-end data analysis projects that answer critical questions in biomedical research. My work typically involves:
- Developing and implementing robust, reproducible analysis pipelines in R.
- Analyzing large-scale genomic and clinical data to identify potential prognostic or predictive biomarkers.
- Building and interpreting statistical models to assess the significance of clinical and biological variables.
- Communicating complex findings through clear data visualizations, reports, and interactive dashboards.