This repository contains documentation and tutorials that can help a JavaScript (or any non-Python) developer get skilled up in data analysis using Python, with the help of AI assistance.
Build content and samples to help non-Python developers navigate data science projects in a self-guided manner such that we:
- Cultivate consistency in development environments with GitHub Codespaces.
- Cultivate curiosity in self-guided learning journeys with GitHub Copilot and OpenAI.
- Cultivate collaboration by developing content and code with Jupyter Notebooks and Python.
For some projects (e.g., USACO), the goal is to also foster a culture of contribution to open-source efforts for a new generation of devs.
The project is inspired by some challenges and opportunities I encountered as both a web developer and an educator. You can learn more about my story from this talk at PyData NYC in November 2023.
Here are some of the use cases I wanted to tackle:
- USA Computing Olympiad - a competitive programming contest for high-schoolers. Python is currently one of 3 languages supported for submissions, but it's the least supported in terms of published solutions. Can I use Copilot and Jupyter Notebooks to build a curriculum and interactively explore competitive problem-solving challenges with my high-schooler in a discussion-driven approach that cultivates curiosity?
- Accessibility Test Reports - Web accessibility compliance continues to be a critical problem, with 96% of top sites still showing failures. Tools like Playwright help generate automated test reports, but analyzing the data in bulk reports is not easy. Can I use Copilot to write Python code that visualizes the data, and by doing this also gain skills and understanding of key tools and techniques for data analysis with Python?
- Prompt Engineering Exploration - Generative AI is becoming more popular, and prompt construction and engineering are becoming key skills for developers. Most LLM providers offer a Python SDK and/or API that can be used to interactively explore this topic and gain intuition for usage within application domains of interest. Can I use Jupyter Notebooks with OpenAI API key integration to build my own intuition around prompt engineering, and document my learnings interactively to build prompt libraries for key needs?
- Automatic Generation of Visualizations using LLMs - Projects like Microsoft LIDA now allow us to use natural language queries to generate visualizations of our data. They can visualize things based on an explicit query, but can also take the extra step of suggesting visualizations or infographics that you may not have thought of. Can I use such tools interactively and build my own intuition and expertise in data analysis and visualization by learning from the generated code and outcomes?
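To make the accessibility-report use case a little more concrete, here is a minimal sketch of the kind of Python a beginner might start from. It uses only the standard library. The report shape (a list of pages, each with a `violations` list whose entries carry `id` and `impact`) and the helper names `count_violations` and `text_bar_chart` are assumptions made for illustration, not the format of any specific tool or of this repository's data.

```python
import json
from collections import Counter

# Hypothetical sample report. The field names ("url", "violations", "id",
# "impact") are assumptions for illustration, not a documented tool format.
SAMPLE_REPORT = """
[
  {"url": "https://example.com/", "violations": [
    {"id": "color-contrast", "impact": "serious"},
    {"id": "image-alt", "impact": "critical"}
  ]},
  {"url": "https://example.com/about", "violations": [
    {"id": "color-contrast", "impact": "serious"}
  ]},
  {"url": "https://example.com/contact", "violations": []}
]
"""


def count_violations(pages):
    """Count how many pages trigger each violation rule id."""
    counts = Counter()
    for page in pages:
        # Using a set means a rule is counted at most once per page.
        rule_ids = {v["id"] for v in page.get("violations", [])}
        counts.update(rule_ids)
    return counts


def text_bar_chart(counts, width=20):
    """Render the counts as a plain-text bar chart, largest first."""
    if not counts:
        return "(no violations)"
    top = max(counts.values())
    lines = []
    for rule, n in counts.most_common():
        # Scale each bar relative to the most frequent rule.
        bar = "#" * max(1, round(width * n / top))
        lines.append(f"{rule:<16} {bar} {n}")
    return "\n".join(lines)


pages = json.loads(SAMPLE_REPORT)
counts = count_violations(pages)
print(text_bar_chart(counts))
```

In a notebook, the same `counts` could later be handed to a plotting library once the basics feel comfortable. The text chart keeps this first step free of extra dependencies.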
The first challenge is to set up a development environment that I and my collaborators (e.g., my high-school son) can use
- from anywhere (any device)
- with a repeatable runtime so setup effort is minimal
- with a consistent environment for easy debugging across users
GitHub Codespaces with Jupyter Notebooks makes this easy!
- Dev Container uses configuration as code for repeatability.
- Default container supports universal dev (Python, Node.js etc.)
- The `requirements.txt` supports auto-updates post-setup.
- The `customizations` for VS Code extensions set a consistent IDE.
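The points above can be pictured with a small sketch of what such a dev container configuration might contain. It is written as Python that prints JSON so it can be run in a notebook cell. Every value here is an assumption for illustration (the container name, the choice of `postCreateCommand` to install from `requirements.txt`, and the extension list), not a copy of this repository's actual `devcontainer.json`.

```python
import json

# Illustrative values only: these are assumptions, not this repository's
# actual dev container configuration.
devcontainer = {
    "name": "python-data-analysis",  # hypothetical container name
    # Universal image with Python, Node.js, and other common runtimes.
    "image": "mcr.microsoft.com/devcontainers/universal:2",
    # Assumed hook: install Python dependencies after the container is created.
    "postCreateCommand": "pip install -r requirements.txt",
    "customizations": {
        "vscode": {
            # Extensions installed for every user of the container.
            "extensions": [
                "ms-python.python",
                "ms-toolsai.jupyter",
                "GitHub.copilot",
            ]
        }
    },
}


def render_devcontainer(config):
    """Serialize the configuration as indented JSON text."""
    return json.dumps(config, indent=2)


print(render_devcontainer(devcontainer))
```

The printed JSON is the kind of content that would typically live in a `.devcontainer/devcontainer.json` file, so that every collaborator gets the same runtime and editor setup.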
The dev container can be used with either GitHub Codespaces (online, in the cloud) or Docker Desktop (offline, on a local device). It is set up for default usage with a Visual Studio Code editor frontend. With this in mind, here are some learning resources for beginners:
- Python Developer Roadmap - step-by-step topics guidance for learning
- Get Started with Python in VS Code - learn fundamentals.
- Data Science in VS Code - skill up on Jupyter Notebooks!
- GitHub Codespaces for Machine Learning - build classifiers & more.
- OpenAI Cookbook - example code, guides for prompt engineering
- Getting Started With GitHub Copilot - understand capabilities
- GitHub Copilot: Fly with Python ... - RealPython community article
- How to use GitHub Copilot: Prompts, tips, use cases - GitHub blog
- Create Dynamic Data Viz with OpenAI & React - OSS Dashboard
- Chat with your CSV.. - OSS Demo with Langchain/Streamlit
- PandasAI - Analyze data using natural lang (OpenAI, HuggingFace)