SideCar Academy Batch 2 Students Repository

Welcome to Lisbon Data Science SideCar Academy Batch 2 Students repository!

Your first step in this journey is to carefully read the steps in this tutorial. You'll learn:

  • ➡️ How to set up your environment;
  • ➡️ The weekly workflow to follow during the Academy
  1. Initial Setup

    1. Windows 10 Setup
    2. MacOS Intel Setup
    3. MacOS M1 Setup
    4. Ubuntu Setup
    5. Setup for all Operating Systems
    6. Setup Git and GitHub
    7. Setup your Workspace Repository
    8. Get the Learning Material
  2. Learning Unit Workflow

  3. Updates to Learning Units

  4. Help

    1. Learning Unit
    2. Troubleshooting
    3. Other

Initial Setup


Windows 10 Setup

This section deals with setting up Windows Subsystem for Linux (WSL) on Windows 10.

If you are using MacOS or Linux you can skip this section.

Why do I need to install WSL?

Because command-line syntax differs between Windows and macOS/Linux, it would be a great challenge for us to support and provide instructions for both. For this reason, we ask you to install Windows Subsystem for Linux, which lets you run Linux commands inside Windows.

⚠️ Keep in mind that WSL is simply an extension to your Windows operating system, so installing it will not change anything else on your laptop. It is also quick to install. ⚠️

Step 1: Follow this guide to setup WSL on Windows 10.

Step 2: Open a terminal (remember this!!) and run the following command:

sudo apt update && sudo apt upgrade && sudo apt install git

Step 3: Open a terminal (remember this!!) and check if you already have python3.7 by using the command below. If your version is Python 3.7.x (x = any number), you can skip to step 4; otherwise continue with steps 3.1 and 3.2.

python3.7 --version

Step 3.1: Run the following command to set up the Python 3.7 package source (if you get an error with this command, see "When setting up python 3.7 I get an error" under Troubleshooting):

sudo add-apt-repository ppa:deadsnakes/ppa

Step 3.2: Run the following commands to install Python 3.7

sudo apt update && sudo apt install python3.7 -y

Step 4: Run the following command to get pip and venv:

sudo apt update && sudo apt upgrade && sudo apt install python3-pip python3.7-venv -y

Why do we install these?

We'll be using pip which is the reference Python package manager. You should always use a virtual environment to install python packages. We'll use venv to set them up.
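The decision in Step 3 can also be scripted. What follows is only a minimal sketch and not part of the official setup; the helper name `is_python37` is made up for illustration.

```bash
# Hypothetical helper: succeeds only for version strings of the form "Python 3.7.x".
is_python37() {
  case "$1" in
    "Python 3.7."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Capture the reported version; "|| true" keeps the script going if python3.7 is missing.
version="$(python3.7 --version 2>&1 || true)"

if is_python37 "$version"; then
  echo "Found $version: skip to Step 4"
else
  echo "Python 3.7 not found: continue with Steps 3.1 and 3.2"
fi
```

The same check applies unchanged to the Ubuntu setup, where only the step numbers differ.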


MacOS Intel Setup

Some of the steps in the following sections will require Homebrew for MacOS. Homebrew will make it easier to install software that we will use later on.

Step 1: To open the terminal, choose one:

  • In Finder, open the /Applications/Utilities folder, then double-click Terminal.

  • Press cmd + space, type terminal, and press enter.

    The terminal should now be open:

Step 2: To install Homebrew for MacOS, copy and paste the following line in the terminal:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"

Step 2.1: Sometimes it's necessary to install the Xcode command line tools. To do so, run the following command before installing Homebrew:

xcode-select --install

You may be prompted to install the Command Line Developers Tools. Confirm and, once it finishes, continue installing Homebrew by pressing enter again.

Step 3: open a terminal and run the following command:

brew update --verbose

Step 4: then run the following command:

brew install git

Step 5: then run the following command:

brew install python@3.7

Step 6: then run the following command:

brew link python@3.7

MacOS M1 Setup

So you got the new M1 and you're super happy with how fast it is. Unfortunately, dealing with Apple silicon requires a little workaround. But don't worry, we'll get there in the end.
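If you are not sure which setup section applies to your machine, the output of `uname` can tell you. This is a rough sketch under the assumption that every Linux machine follows the Ubuntu instructions; `setup_section` is a hypothetical helper, not something the academy provides.

```bash
# Hypothetical helper: map the kernel name and CPU architecture to a section of this README.
setup_section() {
  case "$1/$2" in
    Darwin/arm64)  echo "MacOS M1 Setup" ;;
    Darwin/x86_64) echo "MacOS Intel Setup" ;;
    Linux/*)       echo "Ubuntu Setup (or Windows 10 Setup when running WSL)" ;;
    *)             echo "unknown platform" ;;
  esac
}

# uname -s prints the kernel name, uname -m the machine architecture.
setup_section "$(uname -s)" "$(uname -m)"
```

Note that a terminal started under Rosetta reports x86_64 even on an M1 machine, so run this in a regular terminal window.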

Step 1: To open the terminal, choose one:

  • In Finder, open the /Applications/Utilities folder, then double-click Terminal.

  • Press cmd + space, type terminal, and press enter.

    The terminal should now be open:


Step 1.1: To use Intel-based software, you'll need Rosetta 2. Most of you should already have it installed for various reasons. If you don't, simply run the following line in the terminal:

softwareupdate --install-rosetta

This will launch the rosetta installer and you’ll have to agree to a license agreement.

Step 2: To install Homebrew x86 version, aka ibrew for MacOS, copy and paste the following line in the terminal:

arch -x86_64 /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"

Step 2.1: Sometimes it's necessary to install the Xcode command line tools. To do so, run the following command before installing Homebrew:

xcode-select --install

Step 3: Add an ibrew alias to your ~/.zshrc

echo 'alias ibrew="arch -x86_64 /usr/local/bin/brew"' >> ~/.zshrc

Step 4: Activate the alterations done to .zshrc

source ~/.zshrc

Step 5: Install python 3.7 with ibrew

ibrew install python@3.7

Step 6: Add python 3.7 to $PATH

echo 'export PATH="/usr/local/opt/python@3.7/bin:$PATH"' >> ~/.zshrc

Step 7: Re-activate the alterations done to .zshrc

source ~/.zshrc
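Steps 3 and 6 are meant to add a line to ~/.zshrc, and repeating them adds the same line again. A minimal sketch of an append that is safe to repeat follows. The helper `append_once` is hypothetical, and a temporary file stands in for ~/.zshrc so the sketch can be run without touching your real configuration.

```bash
# Hypothetical helper: append a line to a file only if that exact line is not there yet.
append_once() {
  local file="$1" line="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

rc_file="$(mktemp)"   # stand-in for ~/.zshrc

# Single quotes keep $PATH unexpanded, so it is evaluated each time the shell starts.
path_line='export PATH="/usr/local/opt/python@3.7/bin:$PATH"'

append_once "$rc_file" "$path_line"
append_once "$rc_file" "$path_line"   # second call changes nothing

cat "$rc_file"
```

The line has to be written into the file with `echo` or `printf`. Redirecting the output of `export` itself writes nothing useful, because `export` prints nothing when it succeeds.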

Ubuntu Setup

So you're using Ubuntu, huh? Well, kudos to you. You just need to install a couple of packages.

Step 1: Open a terminal and check what version of Python you have by using the command below. If your version is Python 3.7.x (x = any number), you can skip to step 2, otherwise continue with step 1.1 and 1.2

python3.7 --version

Step 1.1: Run the following command to set up the Python 3.7 package source (if you get an error with this command, see "When setting up python 3.7 I get an error" under Troubleshooting):

sudo add-apt-repository ppa:deadsnakes/ppa

Step 1.2: Run the following commands to install Python 3.7

sudo apt update && sudo apt install python3.7 -y

Step 2: Run the following command to get pip and venv:

sudo apt update && sudo apt upgrade && sudo apt install python3-pip python3.7-venv -y

Why do we install these?

We'll be using pip which is the reference Python package manager. You should always use a virtual environment to install python packages. We'll use venv to set them up.


Setup for all Operating Systems

Creating a Python Virtual Environment

Below are the instructions that are enough to get the setup done and get you up and running :) You can also follow this guide for a more in-depth set of instructions that accomplish exactly the same thing.

⚠️ You should always be using a virtual environment to install python packages. ⚠️ We'll use venv to set them up.

To install and update packages, we'll be using pip which is the reference Python package manager.

Step 1: Start by ensuring pip, setuptools, and wheel are up to date:

python3 -m pip install --user --upgrade pip setuptools wheel

Step 2: Create a virtual environment with the name slu00:

python3 -m venv ~/.virtualenvs/slu00

Step 3: Activate the environment:

source ~/.virtualenvs/slu00/bin/activate

Note: after you activate your virtual environment, you should see its name in parentheses at the far left of your command line, like this:

mig@my-machine % source ~/.virtualenvs/slu00/bin/activate
(slu00) mig@my-machine %

You can confirm that your virtual environment is active using the which command (it outputs the location of your virtual environment's Python installation):

(slu00) mig@my-machine % which python
/Users/mig/.virtualenvs/slu00/bin/python
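Another way to check is the `VIRTUAL_ENV` variable, which the activate script sets to the environment's directory. The sketch below wraps that check in a small function; the name `venv_status` is made up for illustration.

```bash
# Hypothetical helper: report the active virtual environment, if any.
# The activate script sets VIRTUAL_ENV to the environment's directory.
venv_status() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "active: $(basename "$VIRTUAL_ENV")"
  else
    echo "no virtual environment is active"
  fi
}

venv_status
```

With the slu00 environment activated as in Step 3, this should report `active: slu00`.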

Step 4: Now update pip.

(slu00) pip install -U pip

Setup Git and GitHub

Having a GitHub account and knowing the basics of committing and pushing changes are mandatory for this academy.

⚠️If you don't have a GitHub account, complete the following steps:

  1. Sign up for a GitHub account if you don't already have one.

⚠️If you have a GitHub account but git is not set up in your system, complete the following steps:

  1. Checking for existing SSH keys
  2. Generating a new SSH key and adding it to the ssh-agent
  3. Adding a new SSH key to your GitHub account
  4. Testing your SSH connection

Setup your Workspace Repository

The workspace directory/repository is where you will place everything you are working on, solve exercises, make changes to files, etc.

Creating the Workspace

  1. Log into GitHub
  2. Create a new private GitHub repository called sidecar-academy-workspace, see Creating a new repository.
    1. You need to explicitly select Private - This is your work and nobody else's.
    2. Initialize with a README. This is mostly just so that you don't initialize an empty repo.
    3. Add a Python .gitignore.


Cloning the Workspace

  1. Open a Terminal or Git Bash. The next steps all happen in this terminal.

  2. Clone your <username>/sidecar-academy-workspace repository

    If you're not sure where to clone the repository, you can create a ~/projects folder and clone it there

  3. Run the clone command. If you have your SSH keys set up as instructed:

git clone git@github.com:<username>/sidecar-academy-workspace

Otherwise:

git clone https://github.com/<username>/sidecar-academy-workspace
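The two commands above differ only in the form of the repository address. As an illustration, the sketch below builds either form from a username and a repository name. `clone_url` is a hypothetical helper and `my-username` is a placeholder for your own GitHub username.

```bash
# Hypothetical helper: build the SSH or HTTPS address of a GitHub repository.
clone_url() {
  local user="$1" repo="$2" protocol="${3:-https}"
  case "$protocol" in
    ssh) echo "git@github.com:${user}/${repo}" ;;
    *)   echo "https://github.com/${user}/${repo}" ;;
  esac
}

# "my-username" is a placeholder for your own GitHub username.
clone_url "my-username" "sidecar-academy-workspace" ssh
clone_url "my-username" "sidecar-academy-workspace"

# To actually clone:
#   git clone "$(clone_url "my-username" "sidecar-academy-workspace" ssh)"
```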

Get the Learning Material

You will be cloning the sidecar-academy-batch2 repository. All of the learning material you need will be made available on this repo as the academy progresses.

  1. Open a Terminal or Git Bash. The next steps all happen in this terminal.
  2. Clone the students repository sidecar-academy-batch2
git clone https://github.com/LDSSA/sidecar-academy-batch2.git

Or if you have your ssh keys set up:

git clone git@github.com:LDSSA/sidecar-academy-batch2

Working on the Learning Unit

All learning units come as a set of Jupyter Notebooks. Notebooks are documents that can contain text, images and live code that you can run interactively.

In this section we will launch the Jupyter Notebook application. The application is accessed through the web browser.

Once you have the application open, feel free to explore the structure of the first learning unit. It will give you a handle on what to expect and what rules the instructors follow (and the effort they put in) when creating a learning unit.

So let's start the Jupyter Notebook app:

  1. Activate your virtual environment

    source ~/.virtualenvs/slu00/bin/activate
  2. Enter the Learning unit directory in your workspace directory (sidecar-academy-workspace).

    Note: It is VERY IMPORTANT that you ALWAYS work on the files on your sidecar-academy-workspace repository, and NEVER work on files that are in your sidecar-academy-batch2 repository!

    cd ~/projects/sidecar-academy-workspace/sample/"SLU00 - LU Tutorial"
  3. Installing the necessary packages

    pip install -r requirements.txt
  4. Run Jupyter Notebook. If you are running WSL on Windows 10, run the following:

    jupyter notebook --NotebookApp.use_redirect_file=False

Otherwise:

```bash
jupyter notebook
```
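If you switch between machines, the choice between the two commands can be automated. The sketch below relies on a common heuristic, namely that the kernel version string in /proc/version mentions "microsoft" under WSL; that heuristic and the helper name `jupyter_command` are assumptions, not something this tutorial prescribes.

```bash
# Hypothetical helper: pick the Jupyter command that suits the current system.
# Assumption: under WSL, /proc/version contains the word "microsoft".
jupyter_command() {
  if grep -qi microsoft /proc/version 2>/dev/null; then
    echo "jupyter notebook --NotebookApp.use_redirect_file=False"
  else
    echo "jupyter notebook"
  fi
}

echo "Run: $(jupyter_command)"
```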

When you run the jupyter notebook command, your browser should pop up with Jupyter open. If this does not happen, copy the link shown in your terminal (the one that contains localhost) and paste it into your browser's address bar.

Note: If you see scary-looking error messages in the terminal, don't worry, you can just ignore them.

The Exercise Notebook

Make sure you open and go through the Learning Notebook first.

Every learning unit contains an exercise notebook with exercises you will work on. So let's have a look at the sample Learning Unit.

  1. In the Jupyter Notebook UI in the browser, open the exercise notebook
  2. Follow the instructions provided in the notebook

Besides the exercises and the cells for you to write solutions, you will see other cells with a series of assert statements. This is how we (and you) will determine if a solution is correct. If all assert statements pass, meaning you don't get an AssertionError or any other kind of exception, the solution is correct.

Once you've solved the whole notebook, we recommend following this simple checklist to avoid unexpected surprises.

  1. Save the notebook (again)
  2. Run "Restart & Run All"
  3. At this point the notebook should have run without any failing assertions

Commit and Push

Now you have worked on the sample learning unit and you have some uncommitted changes. It's time to commit the changes, which just means adding them to your sidecar-academy-workspace repository history, and to push this history to your remote on GitHub.

  • Using the terminal commit and push the changes
git add .
git commit -m 'Testing the sample notebook'
git push
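To see what these commands do without touching your real workspace, you can try them in a throwaway repository. The sketch below is only an illustration: the directory, the file name and the demo identity are made up, and `git push` is left out because the throwaway repository has no remote.

```bash
demo_repo="$(mktemp -d)"   # throwaway repository, unrelated to your real workspace
cd "$demo_repo"
git init -q

# Identity for this demo repository only; your real setup uses your own name and email.
git config user.name "Demo Student"
git config user.email "student@example.com"

echo "my solution" > exercise.txt
git status --short          # shows "?? exercise.txt": the file is not tracked yet

git add .
git commit -q -m 'Testing the sample notebook'

git status --short          # prints nothing: everything is committed
git log --oneline           # one line per commit, newest first
```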

Learning Unit Workflow

You will need to follow this workflow whenever new learning materials are released.

Learning units will be announced in the academy's #announcements channel. At this point they are available in the sidecar-academy-batch2 repository. A new Learning Unit is released according to the SideCar Academy's calendar!

The steps you followed during the initial setup are exactly what you are going to be doing for each new Learning Unit. Here's a quick recap:

  1. If you haven't, activate your virtual environment

    source ~/.virtualenvs/slu00/bin/activate
  2. Once a new Learning Unit is available, pull the changes from the sidecar-academy-batch2 repo:

    • enter the ~/projects/sidecar-academy-batch2/ using the cd command, then use the git pull command:
    cd ~/projects/sidecar-academy-batch2/
    git pull
  3. Copy the Learning Unit to your sidecar-academy-workspace repo

    cp -r ~/projects/sidecar-academy-batch2/"<specialization ID> - <specialization name>"/"<learning unit ID> - <learning unit name>" ~/projects/sidecar-academy-workspace/"<specialization ID> - <specialization name>"

    For example, for the S01 - Bootcamp and Binary Classification and SLU01 - Pandas 101, it would look like this:

    cp -r ~/projects/sidecar-academy-batch2/"S01 - Bootcamp and Binary Classification"/"SLU01 - Pandas 101" ~/projects/sidecar-academy-workspace/"S01 - Bootcamp and Binary Classification"
  4. Create a new virtual environment for the Learning Unit you'll be working on.

    • To do this you will run the following command:
    python3 -m venv ~/.virtualenvs/<learning unit ID>
    • and you would replace the <learning unit ID> with the learning unit ID, such that for SLU01, for example, the command would be:
    python3 -m venv ~/.virtualenvs/slu01
  5. Activate your virtual environment

    source ~/.virtualenvs/slu01/bin/activate
  6. Install the python packages from requirements.txt for the specific Learning Unit (you must do this for each Learning Unit, and there are multiple Learning Units in a Specialization)

    pip install -r ~/projects/sidecar-academy-workspace/"<specialization ID> - <specialization name>"/"<learning unit ID> - <learning unit name>"/requirements.txt

    For example, for the S01 - Bootcamp and Binary Classification and SLU01 - Pandas 101, it would look like this:

    pip install -r ~/projects/sidecar-academy-workspace/"S01 - Bootcamp and Binary Classification"/"SLU01 - Pandas 101"/requirements.txt
  7. Change to the sidecar-academy-workspace dir

    cd ~/projects/sidecar-academy-workspace
  8. Open Jupyter Notebook

    jupyter notebook
  9. Work

  10. Once all tests pass or once you're happy, save your work, close the browser tab with the Jupyter Notebook, close the terminal and open a new terminal

  11. Then commit the changes and push

    cd ~/projects/sidecar-academy-workspace
    git add .
    git commit -m "Worked on SLU01 exercises"
    git push
  12. Profit
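The copy step and the naming of the virtual environment in the workflow above can be sketched as two small functions. Both helpers (`copy_learning_unit`, `venv_name`) are hypothetical, and temporary directories stand in for ~/projects/sidecar-academy-batch2 and ~/projects/sidecar-academy-workspace so the sketch is safe to run.

```bash
# Hypothetical helper: copy one learning unit from the materials repo to the workspace.
# Creating the specialization folder first ensures the unit lands inside it.
copy_learning_unit() {
  local materials="$1" workspace="$2" specialization="$3" unit="$4"
  mkdir -p "$workspace/$specialization"
  cp -r "$materials/$specialization/$unit" "$workspace/$specialization"
}

# Hypothetical helper: derive the venv name from the unit folder,
# e.g. "SLU01 - Pandas 101" becomes slu01.
venv_name() {
  local unit_id="${1%% - *}"
  printf '%s\n' "$unit_id" | tr '[:upper:]' '[:lower:]'
}

# Throwaway directories standing in for the two repositories.
materials="$(mktemp -d)"
workspace="$(mktemp -d)"
spec="S01 - Bootcamp and Binary Classification"
unit="SLU01 - Pandas 101"
mkdir -p "$materials/$spec/$unit"
touch "$materials/$spec/$unit/requirements.txt"

copy_learning_unit "$materials" "$workspace" "$spec" "$unit"
ls "$workspace/$spec/$unit"
echo "venv: ~/.virtualenvs/$(venv_name "$unit")"
```

Quoting every path matters here because the folder names contain spaces.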

Updates to Learning Units

As much as we try to have processes in place to prevent errors and bugs in the learning units, some make it through to you. If the problem is not in the exercise notebook, you can just pull the new version from the students repo and replace the file. If the correction is in the exercise notebook, however, you can't just replace the file: your work is there and you'd lose it!

When a new version of the exercise notebook is released (and announced), two things follow. First, if you submit an old version of the notebook, it will be flagged as out of date and not graded. Second, you will have to merge the work you've already done into the new version of the notebook.

At the moment our suggestion to merge the changes is:

  1. Rename the old version
  2. Copy the new exercise notebook over
  3. Open both and copy paste your solutions to the new notebook
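On the command line, the first two steps could look like the sketch below. The notebook file name and both directories are placeholders (temporary folders stand in for the unit folder in each repository), so adapt the paths before using it on real files.

```bash
workspace_unit="$(mktemp -d)"   # stand-in for the unit folder in your workspace
materials_unit="$(mktemp -d)"   # stand-in for the same unit in sidecar-academy-batch2
notebook="Exercise notebook.ipynb"   # hypothetical file name

echo "old version with my solutions" > "$workspace_unit/$notebook"
echo "new version from the instructors" > "$materials_unit/$notebook"

# 1. Rename the old version so your solutions are kept.
mv "$workspace_unit/$notebook" "$workspace_unit/Exercise notebook (old).ipynb"

# 2. Copy the new exercise notebook over.
cp "$materials_unit/$notebook" "$workspace_unit/$notebook"

# 3. Both files now sit side by side, ready for copying solutions across in Jupyter.
ls "$workspace_unit"
```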

We understand it's not ideal and are working on improving this workflow using nbdime. If you are comfortable installing Python packages you can try it out, but we offer no support for this at the moment.

Help

During the academy you will surely run into problems and have doubts about the material. We provide you with some different channels to ask for help.

Learning Unit

If you feel something is not clear enough or there is a bug in the learning material please follow these steps. Remember, there is no such thing as a dumb question, and by asking questions publicly you will help others!

If you have more conceptual questions about the materials or how to approach a problem, you can also reach out to the instructors on Slack. You can find the main contact for the learning unit in the Portal; this instructor can help you out or redirect you to someone who is available at the moment.

Troubleshooting

  1. When I open Windows Explorer through Ubuntu it goes to a different folder than in the guide
  2. Ubuntu on Windows 10 high CPU usage crashes
  3. When I pull from the sidecar-academy-batch2 repository I get an error
  4. When I try to open jupyter notebook I get an error
  5. When I use the cp command the > sign appears and the command does not execute
  6. When setting up python 3.7 I get an error
  7. Nothing happens when I type my password
  8. I still have a NotImplemented error
  9. I get an error when creating the virtual environment
  10. My problem is not listed here what should I do?
  11. Tutorial videos from Prep Course 2020

When I open Windows Explorer through Ubuntu it goes to a different folder than in the guide

Please make sure:

  • you are running the command explorer.exe . including the dot at the end.
  • you are running Windows 10 version 1909 or newer.

Ubuntu on Windows 10 high CPU usage crashes

  • First please make sure you are running Windows 10 version 1909 or newer.
  • Then, try following these steps

When I pull from the sidecar-academy-batch2 repository I get the error

error: Your local changes to the following files would be overwritten by merge:
<some files>
Please commit your changes or stash them before you merge.
Aborting

git is telling us that you made changes to files in the ~/projects/sidecar-academy-batch2 folder, and it is not pulling the instructors' changes because they would overwrite yours. To fix this, do the following:

  1. make sure that any change you made to the files on ~/projects/sidecar-academy-batch2 (that you don't want to lose) is saved in your ~/projects/sidecar-academy-workspace repository (see https://github.com/LDSSA/sidecar-academy-batch2#updates-to-learning-units for how to do this), and if you don't want to keep the changes you made to these files, just continue on to the next step

  2. go to the ~/projects/sidecar-academy-batch2 folder and run:

    cd ~/projects/sidecar-academy-batch2
    git stash
  3. now you can pull from the sidecar-academy-batch2 repository:

    git pull
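If you want to see what `git stash` does before running it on the real repository, the sketch below reproduces the situation in a throwaway repository. The directory, the file name and the demo identity are made up for illustration.

```bash
demo_repo="$(mktemp -d)"   # throwaway repository standing in for sidecar-academy-batch2
cd "$demo_repo"
git init -q
git config user.name "Demo Student"            # demo-only identity
git config user.email "student@example.com"

echo "instructor version" > notebook.txt
git add notebook.txt
git commit -q -m "Initial material"

echo "my local edit" >> notebook.txt   # the kind of change that blocks git pull
git status --short                     # shows " M notebook.txt"

git stash                              # shelve the local edit
git status --short                     # prints nothing: the folder matches the last commit again
git stash list                         # the edit is still stored as stash@{0}
```

The stashed edit is not deleted; `git stash pop` would reapply it. For this academy it is safer to keep your work in the workspace repository instead.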

When I try to open jupyter notebook I get the error

migs-MBP% jupyter notebook
zsh: command not found: jupyter

Before opening jupyter notebook activate your virtual environment:

source ~/.virtualenvs/slu00/bin/activate

When I use the cp command the > sign appears and the command does not execute

cp -r ~/projects/sidecar-academy-batch2/"S01 - Bootcamp and Binary Classification"/"SLU01 - Pandas 101" ~/projects/sidecar-academy-workspace/"S01 - Bootcamp and Binary Classification"
>

Make sure to use straight quotes (") and not curly ones (“ ”). The > prompt means the shell is still waiting for a closing quote.
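Curly (typographic) quotes often sneak in when a command is copied from a document or chat message. A quick way to spot them before running a command is sketched below; `has_curly_quotes` is a hypothetical helper and the two sample commands are shortened for illustration.

```bash
# Hypothetical helper: succeeds when the text contains curly single or double quotes.
has_curly_quotes() {
  case "$1" in
    *“*|*”*|*‘*|*’*) return 0 ;;
    *) return 1 ;;
  esac
}

pasted='cp -r “SLU01 - Pandas 101” ~/projects/sidecar-academy-workspace'
typed='cp -r "SLU01 - Pandas 101" ~/projects/sidecar-academy-workspace'

for cmd in "$pasted" "$typed"; do
  if has_curly_quotes "$cmd"; then
    echo "curly quotes found, retype them: $cmd"
  else
    echo "looks fine: $cmd"
  fi
done
```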

When setting up python 3.7 I get an error

When I run this command:

sudo add-apt-repository ppa:deadsnakes/ppa

I get this error:

W: GPG error: http://apt.postgresql.org/pub/repos/apt focal-pgdg InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 7FCC7D46ACCC4CF8

Solution: Take the id that follows NO_PUBKEY (in my case it's 7FCC7D46ACCC4CF8) and run the following command:

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 7FCC7D46ACCC4CF8
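Since the key id differs from machine to machine, it can be pulled out of the error message instead of being copied by hand. The sketch below uses the error line quoted above as sample input and only prints the resulting command, so nothing is changed on your system.

```bash
# Sample input: the error line from above. Paste your own message here instead.
error_line="W: GPG error: http://apt.postgresql.org/pub/repos/apt focal-pgdg InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 7FCC7D46ACCC4CF8"

# Keep only the hexadecimal id that follows NO_PUBKEY.
key_id="$(printf '%s\n' "$error_line" | sed -n 's/.*NO_PUBKEY \([0-9A-F]*\).*/\1/p')"

# Print the command instead of running it, so you can review it first.
echo "sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys $key_id"
```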

Nothing happens when I type my password

In step two it asks me for the computer password. However, I am not able to type anything.

Solution: When you write your password you might not get any visual feedback and that's okay! Write it as normal and hit enter when you're done!

I still have a NotImplemented error

I've completed the exercise in the Exercise Notebook but when I run the cell I get a NotImplementedError.

Solution: The raise NotImplementedError() statements are added to the exercise cells as placeholders for where you're supposed to add your solution/code. They are meant to be removed!

I get an error when creating the virtual environment

I ran python3 -m venv ~/.virtualenvs/slu00, but got the following error:

The virtual environment was not created successfully because ensurepip is not available.

This can happen if either you skipped the installation of python-pip, or the version of the python you're calling doesn't have python pip installed.

As we're using python3.7 for this academy, and if you've followed all the steps in this README correctly, you should be able to create the virtual environment with:

python3.7 -m venv ~/.virtualenvs/slu00
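To find out which interpreter is affected, you can check whether each one can import the ensurepip module that venv relies on. This is a rough diagnostic sketch; `check_ensurepip` is a hypothetical helper, and on Ubuntu the module typically comes with the python3.7-venv package installed during setup.

```bash
# Hypothetical helper: report, for each interpreter, whether ensurepip can be imported.
check_ensurepip() {
  local py
  for py in python3 python3.7; do
    if ! command -v "$py" >/dev/null 2>&1; then
      echo "$py: not installed"
    elif "$py" -c "import ensurepip" >/dev/null 2>&1; then
      echo "$py: ensurepip available, venv creation should work"
    else
      echo "$py: ensurepip missing"
    fi
  done
}

check_ensurepip
```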

My problem is not listed here what should I do?

If the above steps didn't solve the problem for you, please contact us on Slack or open an issue in this repo.

Tutorial videos from Prep Course 2020

If you want a visual guide, you can look at the tutorial videos from the Prep Course of year 2020.

⚠️ These videos are out of date, and should only be used as a visual guide of what the setup process looks like. The steps you should follow are detailed in this document.

Other

If your problem doesn't fit in any of the previous categories, head over to Slack and ask. Someone will surely point you in the right direction.

If you're looking for some specific part of our organization head over to the Member Directory and search for the area of responsibility you're looking for.