Tszon/README.md

👋 Hi there, I'm Tszon Tseng :)

I've been a Data Science Intern at BEAT (Better Environment and Transportation) since June 2025 🌍.
Actively seeking long-term Data Science/AI roles in 2025!


🚀 About Me

🎯 Aspiring Data Scientist & AI Engineer, certified by DataCamp

🎓 BSc in Astrophysics with Space Science (1st Class)

I started by studying galaxies, but found my real passion in data-driven discovery. Today, I design end-to-end ML projects, from raw data pipelines to interactive dashboards, that reveal insights and drive smarter decisions.

🛠️ Tech Stack

Languages
Python | SQL (PostgreSQL, Snowflake) | Bash | LaTeX

Data Science & ML
Pandas | NumPy | Matplotlib | Seaborn | SciPy
Scikit-learn | XGBoost | PyTorch | SHAP | LIME
emcee (Bayesian Inference)

Tools & Platforms
Git & GitHub | Docker | VS Code | Streamlit | MLflow

✨ Currently focused on:

  • ⚑ Scientific Computing & Simulation: transportation sector
  • 🧠 LLMs & Generative AI: Hugging Face, fine-tuning transformers
  • 📈 Time-Series Forecasting: LSTM, CNNs, SARIMAX for predictive analytics
  • 🔎 Interpretable ML: SHAP, LIME for model transparency

📂 Featured Projects

📊 End-to-End Customer Churn Prediction

  • Built an ML pipeline (Logistic Regression, Random Forest, XGBoost, Voting Classifier) → ROC-AUC: 0.87
  • Applied KMeans & HDBSCAN clustering with UMAP to segment customers into actionable personas
  • Used SHAP explainability to reveal key churn drivers (tenure, contract type, fibre optic service)
  • Deployed an interactive Streamlit app for business users
  • Containerised workflow with Docker, reducing setup time by >80% and ensuring reproducibility
  • Tracked experiments, metrics, and models with MLflow, improving transparency and versioning across the pipeline
  • Designed an A/B testing simulator with Chi-Square tests to measure the impact of retention strategies
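
To give a flavour of the A/B testing idea above, here is a minimal sketch (not the project's actual code). It simulates a control and a treatment group and runs a Pearson chi-square test on the resulting 2x2 retained/churned table. The function names, group sizes, and retention rates are all hypothetical, chosen only for illustration, and the test is written with the standard library rather than whatever the project uses.

```python
import math
import random


def simulate_group(n_customers, retention_rate, rng):
    """Return (retained, churned) counts for one simulated group (hypothetical helper)."""
    retained = sum(rng.random() < retention_rate for _ in range(n_customers))
    return retained, n_customers - retained


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    No continuity correction is applied. Returns (statistic, p_value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Assumed scenario: the retention strategy lifts retention from 73% to 78%.
rng = random.Random(42)
control = simulate_group(2000, 0.73, rng)
treatment = simulate_group(2000, 0.78, rng)
statistic, p_value = chi_square_2x2([control, treatment])
print(f"chi2 = {statistic:.2f}, p = {p_value:.4f}")
```

A small p-value would suggest the difference in retention between the two groups is unlikely to be due to chance alone; in practice a library routine such as SciPy's `chi2_contingency` would be the more usual choice.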

🌱 My Current Learning Roadmap

  • Big Data Platforms (PySpark, AWS, Databricks)
  • Deep Reinforcement Learning

📫 Let's Connect!


"The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka!' but 'That's funny...'" - Isaac Asimov

Popular repositories

  1. End-to-End_DS_ML_Project (Public)

    I built an end-to-end customer churn segmentation and prediction project.

    Jupyter Notebook 1

  2. Personal-Projects (Public)

    Here's a website and a game demo I built using the Unity engine :)

    Mathematica

  3. BSc_Astrophysics_Projects (Public)

    I built an Orbital Transfer Optimisation model and published it in LaTeX, and collected data from the observatory to visualise the initial mass function of star formation.

    Jupyter Notebook

  4. Tszon (Public)