MMBazel/Springboard-DataScienceTrack-Student

Springboard Data Science Career Track

Hi!

My name is Mikiko Bazeley, and this is my repo for the Springboard Data Science Career Track.

From October 2018 to April 2019, I completed a number of projects, including two capstones, as part of the DS track.

All of the documentation, code, and notes can be found here, as well as links to other resources I found helpful for successfully completing the program.

For questions or comments, please feel free to reach out on LinkedIn.

If you find my repo useful, let me know or consider buying me a coffee ☕: https://www.buymeacoffee.com/mmbazel

Regards, Mikiko

Project List by Unit of Study

For a comprehensive list of the projects and the skills each one requires, see below.

1. The Python Data Science Stack

Topics covered:

  • Python
  • Matplotlib, Seaborn—visualization tools in Python
  • Writing clear, elegant, readable code in Python using the PEP 8 standard

2. Data Wrangling

Topics covered:

  • Deep dive into Pandas for data wrangling
  • Data in files: Work with a variety of file formats, from plain text (.txt) to more structured and nested formats such as CSV and JSON
  • Data in databases: Get an overview of relational and NoSQL databases and practice data querying with SQL
  • APIs: Collect data from the internet using Application Programming Interfaces (APIs)

Projects:

3. Data Story

4. Statistical Inference

Topics covered:

  • Theory of inferential statistics
  • Statistical significance
  • Parameter estimation
  • Hypothesis testing
  • Correlation and regression
  • Exploratory data analysis
  • A/B testing

5. Machine Learning

Topics covered:

  • Scikit-learn
  • Supervised and unsupervised learning
  • Top machine learning techniques:
    • Linear and logistic regression
    • Naive Bayes
    • Support vector machines
    • Decision trees
    • Clustering
  • Ensemble learning with random forests and gradient boosting
  • Best practices
  • Evaluating and tuning machine learning systems

6. Capstone Project 1: Building a Data Product

7. The Natural Language Processing (NLP) Track

Topics covered:

  • How to work with text and natural language data
  • NLP in Python, using common libraries such as NLTK and spaCy
  • Basics of Deep Learning in NLP using word2vec and TensorFlow
  • Data Science at Scale using Spark
  • Software Engineering for Data Scientists

8. Second Capstone Project: NLP