High Level Outline for 2019 DSI residential bootcamp

This document lists the high-level topics that the DSI bootcamp will cover. Each instructor will build out the details of their section.

July 1, AM: Intro to computers for scientists

  1. High-level questions to frame the program (not to be answered today)
  • When should I use a terminal, script, program, text editor, or IDE?
  • When should I use R vs. Python?
  • Does this fit in memory? Do I need more than my laptop?
  • It's OK to freeze and not know what to do; that means you are about to learn. Feel the burn.
  • Which ML tool is right for the job?
  • How do I translate a research question into a data question? How do I translate a data answer into a research answer?
  2. operating systems
  3. the shell
  • Command Line Interface (CLI):
    1. We use the Bourne Again Shell (Bash)
    2. use SageMaker for a unified terminal environment
    3. mkdir, cd, ls, history
    4. installation on student laptops is homework
  • GUI
  4. version control (git/GitHub)
  5. text editor vs. word processor vs. IDE (e.g., vi vs. VS Code vs. RStudio)
  6. grab bag
  • "#" starts a comment in Bash, R, and Python
  7. Reading list:
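
The framing question "Does this fit in memory?" can often be answered with back-of-the-envelope arithmetic before any data is loaded. The sketch below is one possible way to do that estimate; the function name `estimated_size_gb` and the example table dimensions are hypothetical, and the default of 8 bytes per value assumes every column holds 64-bit numbers (text columns usually take more).

```python
def estimated_size_gb(n_rows, n_cols, bytes_per_value=8):
    """Rough in-memory size of a numeric table, in gigabytes (10**9 bytes).

    Assumes every cell is a 64-bit number unless bytes_per_value is changed.
    """
    return n_rows * n_cols * bytes_per_value / 1e9


# Hypothetical example: 10 million rows by 20 numeric columns.
size_gb = estimated_size_gb(10_000_000, 20)
print(f"about {size_gb:.1f} GB")  # prints "about 1.6 GB"
```

If the estimate is close to or larger than the laptop's RAM, that is a hint that the work may need a bigger machine or a different approach.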

July 1, PM: R

  1. History
  2. RStudio and installation
  3. tour / hotkeys
  4. projects and working directory
  5. command line from inside RStudio
  6. Base R essential tools
  7. c(...), functions and how they work
  8. "anatomy of coding" aka syntax or grammar
  9. "?", "<-"
  10. indexing (start from 1)
  11. operators
  12. Data Frames
  13. details
  14. getting data in
  15. manipulating
  16. Simple Plots
  17. Tidying up
  18. Reading list:

July 2, AM: Python3

  1. History
  2. Anaconda/Spyder/Jupyter
  3. tour and hotkeys
  4. working directory
  5. command line from inside Spyder
  6. Python 3 essentials
  7. int,float,string,lists
  8. indexing (start at 0)
  9. functions, "anatomy"
  10. assignment operator and a couple others
  11. Data frames - import pandas as pd
  12. simple plot
  13. Load data into data frame
  14. Reading List:
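
The Python 3 essentials listed above (basic types, zero-based indexing, the anatomy of a function, and assignment) can be sketched in a few lines. This is only an illustration, not course material: the variable names and the `mean` function are made up for the example.

```python
# Assignment uses "=". Each name below is bound to a value of a basic type.
count = 3              # int
ratio = 0.5            # float
name = "DSI"           # string
values = [10, 20, 30]  # list

# Indexing starts at 0 in Python (R starts at 1).
first = values[0]      # 10
last = values[-1]      # 30, negative indexes count from the end


# Anatomy of a function: "def", a name, parameters, a body, and a return value.
def mean(numbers):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(numbers) / len(numbers)


average = mean(values)  # 20.0
print(first, last, average)
```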

July 2, PM: Thinking like a data scientist (aka Putting it all together (aka the most fun part))

This is language agnostic. The prompt works for R and Python. We give examples in both languages. The goal of this part is to show how a data scientist operates and thinks.

  1. reading in data
  2. munging data
  3. plotting data
  4. presenting data
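
The four steps above can be sketched end to end with nothing but the Python standard library. Everything specific here is an assumption made for illustration: the data is made up, the column names `site` and `temperature_c` are hypothetical, and the text bar chart is a stand-in for a real plotting library.

```python
import csv
import io

# Made-up data standing in for a real file; the columns are hypothetical.
RAW = """site,temperature_c
A,21.5
B,
C,19.0
A,23.5
"""

# 1. Read the data in: each row becomes a dict keyed by column name.
rows = list(csv.DictReader(io.StringIO(RAW)))

# 2. Munge: drop rows with a missing value and convert text to numbers.
clean = [
    {"site": r["site"], "temperature_c": float(r["temperature_c"])}
    for r in rows
    if r["temperature_c"]
]

# Still munging: group the readings by site and average them.
by_site = {}
for r in clean:
    by_site.setdefault(r["site"], []).append(r["temperature_c"])
means = {site: sum(v) / len(v) for site, v in by_site.items()}


# 3. Plot: a crude text bar, one "#" per degree (rounded).
def text_bar(value):
    return "#" * round(value)


# 4. Present: one labeled line per site, in alphabetical order.
for site in sorted(means):
    print(f"{site}: {means[site]:5.1f} C  {text_bar(means[site])}")
```

The same read, munge, plot, present shape carries over when the string is replaced by a real file and the text bars by a real plot.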