
Building a Data Pipeline with Apache Cassandra and Python


jrwils/sparkifydb_cassandra


Data Modeling with Apache Cassandra

This project, part of the Udacity Data Engineering Nanodegree, builds an ETL data pipeline that loads music streaming data files into Apache Cassandra. The goal is to model a Cassandra database structure that supports the queries outlined in the notebook.
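
Because Cassandra tables are usually designed around the queries they serve, a rough sketch of that idea may help. The helper `build_create_table`, the table name `songs_by_session`, and all column names below are hypothetical and not taken from the notebook, which remains the reference for the actual queries and schema.

```python
def build_create_table(table, columns, partition_key, clustering=()):
    """Return a CQL CREATE TABLE statement for one query-specific table.

    columns:       mapping of column name -> CQL type
    partition_key: columns that decide which node stores a row
    clustering:    columns that order rows inside a partition
    """
    col_defs = ", ".join(f"{name} {cql_type}" for name, cql_type in columns.items())
    # The partition key is wrapped in its own parentheses; clustering columns follow it.
    key_parts = ["(" + ", ".join(partition_key) + ")", *clustering]
    return (
        f"CREATE TABLE IF NOT EXISTS {table} "
        f"({col_defs}, PRIMARY KEY ({', '.join(key_parts)}))"
    )


# Hypothetical example: look up songs played within a given session.
columns = {
    "session_id": "int",
    "item_in_session": "int",
    "artist": "text",
    "song": "text",
}
statement = build_create_table(
    "songs_by_session", columns, ["session_id"], ["item_in_session"]
)
print(statement)
```

In this sketch the partition key matches the column a query would filter on, and the clustering column keeps rows unique and ordered inside each partition.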

Project Requirements

jupyterlab==3.0.16
pandas==1.3.0
cassandra-driver==3.25.0

You also need a test environment with an Apache Cassandra database. This project was tested against a local Cassandra instance running in Docker.
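
Connecting to such a local instance with `cassandra-driver` might look like the minimal sketch below. It assumes the Docker container publishes Cassandra's default CQL port 9042 on 127.0.0.1. The keyspace name `sparkify` and the helpers `keyspace_cql` and `connect_local` are hypothetical, and `SimpleStrategy` with a replication factor of 1 is only reasonable for a single-node test setup.

```python
def keyspace_cql(name, replication_factor=1):
    """Build a CREATE KEYSPACE statement for a single-node test cluster."""
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        f"WITH REPLICATION = {{'class': 'SimpleStrategy', "
        f"'replication_factor': {replication_factor}}}"
    )


def connect_local(keyspace, host="127.0.0.1", port=9042):
    """Connect to a local Cassandra node and switch to the given keyspace.

    Host and port are assumptions about the local Docker setup.
    """
    # Imported here so the CQL helper above works without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster([host], port=port)
    session = cluster.connect()
    session.execute(keyspace_cql(keyspace))
    session.set_keyspace(keyspace)
    return cluster, session


print(keyspace_cql("sparkify"))
```

With Cassandra running, `cluster, session = connect_local("sparkify")` would return a session ready for queries; calling `cluster.shutdown()` afterwards releases the connection.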

For information on how to install and run Jupyter Notebooks, click here.
