This repository is an archive of projects from my undergraduate studies. It contains technical reports from class and personal projects, as well as research posters presented at conferences. Since the repository also serves as my portfolio for potential employers, I describe my contributions to each project below, in reverse chronological order starting with my most recent and ongoing work.
File name: CardiacModelFieldEffect.pdf
Over the summer, I took part in the FROST Undergraduate Research Program, where I worked under the supervision of Dr. Joyce Lin to assist with her publication analyzing ephaptic effects in cardiac tissue. I am continuing this research as my senior project, and the document provided is the poster presented at the Cal Poly Mathematics Research Conference.
- Ran over 1000 simulations using MATLAB to gather cardiac tissue data
- Developed MATLAB scripts to extract data and produce visualizations
- Applied numerical techniques to reduce errors in figures
- Documented and presented research findings to over 50 people at a research conference
File name: LoanDefaultRisk.pdf
This was a three-person class project for data science process and ethics. The goal was to develop and test different linear classifiers that predict an applicant's risk of defaulting on a loan. An additional challenge was implementing these linear classifiers from scratch, along with submitting to the Kaggle competition.
- Designed and implemented imputation techniques for missing attributes
- Developed additional features and a preprocessing pipeline
- Developed and implemented logistic regression and a support vector classifier using gradient descent
- Organized and documented model metrics and results
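
For readers unfamiliar with the approach, here is a minimal sketch of logistic regression trained with batch gradient descent in plain Python. It is an illustration only, not the code from the report: the function names (`fit_logistic`, `predict`), the learning rate, the epoch count, and the toy data are all hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one sample: p(x) - y.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step against the averaged gradient.
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b, threshold=0.5):
    """Return 0/1 labels by thresholding the predicted probability."""
    return [
        int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold)
        for xi in X
    ]


# Toy, linearly separable data (hypothetical, not the loan dataset).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
w, b = fit_logistic(X, y)
preds = predict(X, w, b)
```

A support vector classifier can be trained with the same loop by swapping the log loss gradient for the hinge loss subgradient plus a regularization term.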
File name: IowaAlcoholSalesModel.pdf
This was a three-person class project for data science process and ethics. The goal was to develop both interpretive and predictive models of Iowa alcohol consumption, using alcohol sales transactions from the Iowa government website as a proxy. An additional challenge was implementing linear regression from scratch.
- Gathered external datasets for additional features
- Designed and implemented imputation techniques for missing attributes
- Developed and implemented a linear regression algorithm, including ridge regression
- Organized and documented model metrics and results
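
As a rough illustration of what "from scratch" means here, the sketch below fits linear regression with an optional ridge penalty using gradient descent in plain Python. It is not the code from the report, and the report may have used a different solver (for example the closed-form normal equations). The name `fit_ridge`, the hyperparameters, and the toy data are hypothetical.

```python
def fit_ridge(X, y, lam=0.0, lr=0.05, epochs=5000):
    """Minimize mean squared error + lam * ||w||^2 by gradient descent.

    The intercept b is not penalized. lam=0 gives ordinary least squares.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Residual for one sample: prediction minus target.
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += 2.0 * err * xi[j] / n
            grad_b += 2.0 * err / n
        # The ridge penalty adds 2 * lam * w to the weight gradient.
        w = [wj - lr * (gj + 2.0 * lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, not the Iowa dataset).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 3.0, 5.0, 7.0]

w_ols, b_ols = fit_ridge(X, y, lam=0.0)      # recovers slope 2, intercept 1
w_ridge, b_ridge = fit_ridge(X, y, lam=1.0)  # slope is shrunk toward zero
```

Comparing `w_ols` with `w_ridge` shows the effect of the penalty: the ridge slope is smaller in magnitude, which trades a little bias for lower variance.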
File name: CalPassReport.pdf
This was a three-person class project for knowledge and discovery through data. The goal was to develop a chatbot that answers questions about the scheduling of computer science and statistics classes at Cal Poly.
- Gathered and extracted data via web scraping and regular expressions from Cal Poly's scheduling websites
- Designed and implemented a database to store faculty and scheduling information
- Developed and implemented a query classifier that achieved 88% accuracy
File name: ProductPriceParityReport.pdf
This was a four-person open class project for mathematical foundation of data science. The goal was to engineer a process that automates the computation of product price parity. The dataset was provided by DxHub.
- Extracted, cleaned, and standardized a JSON-wrapped dataset of store and product information
- Engineered features and preprocessed the dataset
- Implemented a product category classifier that achieved 73% accuracy
File name: SpotifyLinearModels.pdf
This was a four-person open class project for applied linear models. The goal was to analyze trends in the valence of popular music worldwide from 2010 to 2019, then produce a linear model that predicts valence scores from selected music metrics and year.
- Performed exploratory data analysis relating different music metrics to valence scores across years
- Produced correlation plots and interaction plots
- Documented model performance and error analysis