Naliaka/datastreaming_kafka_Spark_Python_2020_UDACITY

Datastreaming_nanodegree_2020_UDACITY

This repository contains the exercises, projects, and extracurricular material for the Udacity Data Streaming Nanodegree.

Lessons

PART 1: Data Ingestion with Kafka & Kafka Streaming

PART 2: Streaming API Development and Documentation

Project

Overview

  • Data Ingestion with Kafka & Kafka Streaming: a streaming event pipeline built around Apache Kafka and its ecosystem. Using public data from the Chicago Transit Authority, the project constructs an event pipeline around Kafka that simulates and displays the status of train lines in real time. Tools: Python, Kafka, the Faust stream processor, and KSQL.

  • Analyze San Francisco Crime Rate with Apache Spark Streaming: statistical analyses of a real-world dataset of San Francisco crime incidents, extracted from Kaggle, computed with Apache Spark Structured Streaming. Tools: Python, Kafka, Spark Streaming.
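
To give a feel for the kind of statistic the second project computes, here is a minimal sketch in plain Python of counting incidents per crime type in fixed (tumbling) time windows. It is not the project code: the field names `call_time` and `crime_type`, the sample events, and both helper functions are hypothetical and chosen only for illustration. In Spark Structured Streaming the same grouping is typically expressed with `groupBy(window(...), ...)` followed by `count()`.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical sample events; the real dataset's schema may differ.
SAMPLE_EVENTS = [
    {"call_time": "2018-12-31T23:57:00", "crime_type": "Assault"},
    {"call_time": "2018-12-31T23:12:00", "crime_type": "Assault"},
    {"call_time": "2018-12-31T22:40:00", "crime_type": "Theft"},
]


def window_start(ts: datetime, minutes: int) -> datetime:
    """Floor a timestamp to the start of its tumbling window.

    Assumes `minutes` divides 60 evenly (e.g. 10, 15, 30, 60).
    """
    discard = timedelta(
        minutes=ts.minute % minutes,
        seconds=ts.second,
        microseconds=ts.microsecond,
    )
    return ts - discard


def count_by_window(events, minutes: int = 60) -> Counter:
    """Count events per (window start, crime type) pair."""
    counts = Counter()
    for event in events:
        ts = datetime.fromisoformat(event["call_time"])
        counts[(window_start(ts, minutes), event["crime_type"])] += 1
    return counts


if __name__ == "__main__":
    # Print one line per (window, crime type) group, oldest window first.
    for (start, crime_type), n in sorted(count_by_window(SAMPLE_EVENTS).items()):
        print(start.isoformat(), crime_type, n)
```

A streaming engine performs this aggregation incrementally as events arrive and also has to handle late data; the sketch processes a finished list only to show the grouping logic.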

Licence

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. Please refer to Udacity Terms of Service for further information.
