Writing good Jupyter notebooks

Adapted from an invited lecture presented in Dr. Marques' Introduction to Data Science class (Fall 2020), "Answering Questions with Data": bridging the gap between technical analysis and the stakeholders' point of view with Jupyter notebooks.

The lecture covers two topics:

How to write well-structured, understandable, resilient, flexible Jupyter notebooks
How to present the results of our investigations to the stakeholders, the people who asked the questions

We start with a Jupyter notebook that produces the right result but lacks good structure and proper coding practices, then transform it into a good notebook.

What is a good notebook?

The overall organization is logical
Important assumptions and decisions are spelled out
Code is easy to understand
Code is flexible (easy to modify)
Code is resilient (hard to break)
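
As a rough illustration of the "flexible" and "resilient" items above, the sketch below keeps column names in constants and checks the data before using it. The column names, the sample records, and the helper functions are hypothetical, made up for this example; they are not taken from the notebooks in this repository.

```python
# Hypothetical column names, defined once so a renamed column is a one-line change.
GENDER_COLUMN = "gender"
SALARY_COLUMN = "salary"


def validate_records(records):
    """Fail early, with a clear message, if the data is not what the analysis expects."""
    if not records:
        raise ValueError("No records to analyze")
    for position, record in enumerate(records):
        for column in (GENDER_COLUMN, SALARY_COLUMN):
            if column not in record:
                raise ValueError(f"Record {position} is missing column '{column}'")
    return records


def mean_salary_by_gender(records):
    """Average salary per gender, computed only after the records pass validation."""
    totals = {}
    for record in validate_records(records):
        total, count = totals.get(record[GENDER_COLUMN], (0, 0))
        totals[record[GENDER_COLUMN]] = (total + record[SALARY_COLUMN], count + 1)
    return {gender: total / count for gender, (total, count) in totals.items()}


# Made-up sample data, only to show the functions in use.
sample = [
    {"gender": "F", "salary": 52_000},
    {"gender": "M", "salary": 50_000},
    {"gender": "F", "salary": 48_000},
]
print(mean_salary_by_gender(sample))
```

With the constants in one place, adapting the analysis to a dataset with different column names means editing two lines, and a missing column produces an explicit error instead of a confusing failure several cells later.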

We will transform the original notebook into a good one, step by step. Each step addresses a set of related items.

Step 1: the original notebook, the one that lacks structure and proper coding practices.
Step 2: add a description, organize into sections, add exploratory data analysis.
Step 3: make data clean-up more explicit, and explain why certain numbers were chosen (the assumptions behind them).
Step 4: make the code more flexible with constants, and make the code more difficult to break (more resilient).
Step 5: make the graphs easier to read.
Step 6: describe the limitations of the conclusion.
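
To make Step 3 more concrete, here is a minimal sketch of an explicit clean-up step that states its assumption and reports what it removed. The salary range, the field names, and the sample rows are assumptions chosen for illustration; the notebooks in this repository may clean the data differently.

```python
from collections import Counter

# Assumption for illustration: salaries are annual, and values outside this
# range are treated as data-entry errors rather than real observations.
SALARY_RANGE = (1_000, 1_000_000)


def clean_salaries(rows):
    """Return (kept_rows, dropped_reasons) so the clean-up is visible to the reader."""
    kept = []
    dropped = Counter()
    low, high = SALARY_RANGE
    for row in rows:
        salary = row.get("salary")
        if salary is None:
            dropped["missing salary"] += 1
        elif not low <= salary <= high:
            dropped["salary outside plausible range"] += 1
        else:
            kept.append(row)
    return kept, dropped


# Made-up rows, only to exercise each branch of the clean-up.
rows = [
    {"gender": "F", "salary": 61_000},
    {"gender": "M", "salary": None},
    {"gender": "F", "salary": 5},
    {"gender": "M", "salary": 58_000},
]
kept, dropped = clean_salaries(rows)
print(f"Kept {len(kept)} of {len(rows)} rows")
for reason, count in dropped.items():
    print(f"Dropped {count}: {reason}")
```

Printing the counts next to the reasons lets a reader judge whether the clean-up removed too much data, and the comment on the constant records why that range was chosen.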

Reworked sections are marked with a note.

Invited lecture presentation

The presentation used in the class is in the file presentation.pdf in this repository.

This blog post is a written, simplified version of the presentation.

Running the notebooks

Clone this repository
cd <folder for the cloned repository>
Create a Python environment: python3 -m venv env
Activate the environment: source env/bin/activate (Mac and Linux), or env\Scripts\activate.bat (Windows)
Update pip: python -m pip install --upgrade pip
Install dependencies (only once): pip install -r requirements.txt
Run the notebooks: jupyter lab

