Marlowess/spark-exercises

PySpark - Exercises

This is a collection of exercises for Spark solved in Python (PySpark).

Clone this repository to your machine, then set up a virtual environment for its dependencies:

  • Install virtualenv using pip > pip install virtualenv
  • Create a new virtual environment in this repo > virtualenv env
  • Activate it (Linux/macOS) > source env/bin/activate

Install the dependencies by running

pip install -r requirements.txt
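
Once the steps above are done, a short sanity check can confirm that the interpreter is running inside the virtual environment and that PySpark is importable. The script below is only a sketch: the helper names `in_virtualenv` and `is_installed` are made up for illustration, and it assumes that requirements.txt lists pyspark, which is not confirmed here.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and the
    # stdlib venv module) make sys.prefix differ from sys.base_prefix.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base


def is_installed(package: str) -> bool:
    """Return True when `package` can be located by this interpreter."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    # Assumption: requirements.txt includes pyspark; change the name if not.
    print("pyspark importable:", is_installed("pyspark"))
```

If the first line prints False, the environment was probably not activated before running pip, so the dependencies may have been installed globally instead.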

References

  1. Apache Spark official website: https://spark.apache.org/
  2. Exercises source: https://dbdmg.polito.it/wordpress/teaching/big-data-architectures-and-data-analytics-2019-2020
