Sparkify, a startup, wants to analyze the data it has been collecting on songs and user activity on its new music streaming app. The data lives in a directory of CSV files that record user activity on the app. The goal of this project is to build a data model in an Apache Cassandra keyspace using Python, so that the analytics team can efficiently run the specific queries that matter to the business.
Example query: SELECT artist, song, length FROM session_library WHERE sessionId = 338 AND itemInSession = 4
Goal 2. Get song information, ordered by listening sequence, for a specific user during a specific session.
Example query: SELECT artist, song, firstName, lastName FROM user_library WHERE userId = 10 AND sessionId = 182
Example query: SELECT firstName, lastName FROM song_library WHERE song = 'All Hands Against His Own'
1. session_library:
- Partition Key: sessionId
- Clustering Columns: itemInSession
2. user_library:
- Partition Key: userId, sessionId (composite)
- Clustering Columns: itemInSession
3. song_library:
- Partition Key: song
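
The key layout listed above can be turned into CQL `CREATE TABLE` statements. The following is a minimal sketch rather than the project's actual code: the `TABLES` dictionary and the `create_table_cql` helper are hypothetical names, and the column types (`int`, `text`, `float`) are assumptions, since the types are not stated here. It only builds the statement strings, so it runs without a Cassandra cluster.

```python
# Key layout copied from the table list above.
# Column types are ASSUMED for illustration; adjust them to the real CSV data.
TABLES = {
    "session_library": {
        "columns": {"sessionId": "int", "itemInSession": "int",
                    "artist": "text", "song": "text", "length": "float"},
        "partition_key": ["sessionId"],
        "clustering": ["itemInSession"],
    },
    "user_library": {
        "columns": {"userId": "int", "sessionId": "int", "itemInSession": "int",
                    "artist": "text", "song": "text",
                    "firstName": "text", "lastName": "text"},
        "partition_key": ["userId", "sessionId"],
        "clustering": ["itemInSession"],
    },
    "song_library": {
        "columns": {"song": "text", "firstName": "text", "lastName": "text"},
        "partition_key": ["song"],
        "clustering": [],
    },
}


def create_table_cql(name, spec):
    """Build a CREATE TABLE statement from a table spec (hypothetical helper)."""
    cols = ", ".join(f"{col} {cql_type}" for col, cql_type in spec["columns"].items())
    partition = ", ".join(spec["partition_key"])
    # A composite partition key needs its own inner parentheses in CQL.
    if len(spec["partition_key"]) > 1:
        partition = f"({partition})"
    primary_key = ", ".join([partition] + spec["clustering"])
    return f"CREATE TABLE IF NOT EXISTS {name} ({cols}, PRIMARY KEY ({primary_key}))"


for table_name, table_spec in TABLES.items():
    print(create_table_cql(table_name, table_spec))
```

Each resulting string could then be passed to a driver session's `execute` method. Note that a table whose primary key is `song` alone keeps only one row per song title, because later inserts with the same key overwrite earlier ones.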