This project focuses on real-time cryptocurrency data analysis using Apache Kafka, MongoDB, and Streamlit for data visualization. The system fetches live cryptocurrency data from the CoinCap public API, streams it through Kafka, stores it in MongoDB, and visualizes it to observe price fluctuations over time.
The project follows a Producer-Consumer pattern:
- Producer: Continuously fetches cryptocurrency data from the API and publishes it to a Kafka topic.
- Consumer: Consumes data from the Kafka topic, processes it using Apache Spark, and stores the processed data in MongoDB for further visualization.
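For illustration, here is a minimal sketch of the producer side, assuming the `kafka-python` and `requests` packages, the CoinCap `/v2/assets` endpoint, and the `crypto-currency` topic created in the setup steps below. The helper name `parse_asset` is hypothetical, not part of the project code:

```python
import json
import time

COINCAP_ASSETS_URL = "https://api.coincap.io/v2/assets"  # CoinCap public API
KAFKA_TOPIC = "crypto-currency"

def parse_asset(asset: dict) -> dict:
    """Keep only the fields we want to stream downstream."""
    return {
        "symbol": asset["symbol"],
        "name": asset["name"],
        "price_usd": float(asset["priceUsd"]),
        "fetched_at": asset.get("timestamp"),  # added by the producer loop
    }

def run_producer(poll_seconds: int = 10) -> None:
    """Fetch assets from CoinCap and publish each one to Kafka as JSON."""
    import requests                  # assumed dependency
    from kafka import KafkaProducer  # assumed dependency (kafka-python)

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    while True:
        resp = requests.get(COINCAP_ASSETS_URL, timeout=10)
        resp.raise_for_status()
        now_ms = int(time.time() * 1000)
        for asset in resp.json()["data"]:
            producer.send(KAFKA_TOPIC, parse_asset({**asset, "timestamp": now_ms}))
        producer.flush()
        time.sleep(poll_seconds)

if __name__ == "__main__":
    run_producer()
```

Serializing each record to JSON keeps the topic human-readable, which makes the console consumer shown later useful for debugging.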
The project uses the following technologies:
- Python: Core language for data fetching and processing.
- Apache Kafka: Real-time data streaming.
- MongoDB: NoSQL database for storing historical data.
- CoinCap API: Source for live cryptocurrency data.
- Pandas: Data analysis and manipulation.
- Streamlit: Interactive dashboard for data visualization, with plotting via Matplotlib.
- Docker: Containerization of the application.
Run docker-compose up to start the services defined in the compose file.
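As a reference point, a compose file for this stack typically looks like the sketch below. The image names, versions, and ports here are assumptions; the project's actual compose file may differ:

```yaml
version: "3.8"
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.4.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:7.4.0
    depends_on: [zookeeper]
    ports: ["9092:9092"]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  mongodb:
    image: mongo:7
    ports: ["27017:27017"]
```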
Follow the steps below to set up the project on your local system:
Download and extract Kafka:
wget https://downloads.apache.org/kafka/3.8.1/kafka_2.12-3.8.1.tgz
tar -xvf kafka_2.12-3.8.1.tgz
cd kafka_2.12-3.8.1
Start ZooKeeper and the Kafka broker (in separate terminals):
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
Create the crypto-currency topic:
bin/kafka-topics.sh --create --topic crypto-currency --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
Optionally, verify the topic with the console producer and consumer:
bin/kafka-console-producer.sh --topic crypto-currency --bootstrap-server localhost:9092
bin/kafka-console-consumer.sh --topic crypto-currency --bootstrap-server localhost:9092 --from-beginning
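A minimal consumer sketch, assuming the `kafka-python` and `pymongo` packages and a hypothetical `crypto.prices` collection; the Spark processing step is omitted here for brevity, and `to_document` is a hypothetical helper:

```python
import json

def to_document(message_value: dict) -> dict:
    """Shape one Kafka record into the MongoDB document we store."""
    return {
        "symbol": message_value["symbol"],
        "price_usd": float(message_value["price_usd"]),
        "fetched_at": message_value.get("fetched_at"),
    }

def run_consumer() -> None:
    """Read records from the crypto-currency topic and persist them."""
    from kafka import KafkaConsumer  # assumed dependency (kafka-python)
    from pymongo import MongoClient  # assumed dependency

    consumer = KafkaConsumer(
        "crypto-currency",
        bootstrap_servers="localhost:9092",
        auto_offset_reset="earliest",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )
    collection = MongoClient("mongodb://localhost:27017")["crypto"]["prices"]
    for msg in consumer:
        collection.insert_one(to_document(msg.value))

if __name__ == "__main__":
    run_consumer()
```

Storing one document per observation keeps the history append-only, so the dashboard can later query a symbol's full price series.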
Install and start MongoDB (macOS, via Homebrew):
brew tap mongodb/brew
brew install mongodb-community
brew services start mongodb/brew/mongodb-community
Install Streamlit and launch the dashboard:
pip install streamlit
streamlit run streamlit_crypto_data_visualization.py
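The dashboard side can be sketched as below, assuming `streamlit`, `pandas`, and `pymongo` plus the hypothetical `crypto.prices` collection from the consumer; `st.line_chart` stands in here for the Matplotlib plotting the project uses, and `price_change_pct` is a hypothetical helper:

```python
def price_change_pct(prices: list[float]) -> float:
    """Percent change from the first to the last observed price."""
    if len(prices) < 2 or prices[0] == 0:
        return 0.0
    return (prices[-1] - prices[0]) / prices[0] * 100.0

def render_dashboard() -> None:
    """Plot the stored price history for one selected asset."""
    import pandas as pd              # assumed dependency
    import streamlit as st           # assumed dependency
    from pymongo import MongoClient  # assumed dependency

    st.title("Crypto Price Fluctuations")
    collection = MongoClient("mongodb://localhost:27017")["crypto"]["prices"]
    symbol = st.selectbox("Asset", sorted(collection.distinct("symbol")))
    df = pd.DataFrame(collection.find({"symbol": symbol}, {"_id": 0}))
    if df.empty:
        st.info("No data yet: start the producer and consumer first.")
        return
    st.metric("Change", f"{price_change_pct(df['price_usd'].tolist()):.2f}%")
    st.line_chart(df.set_index("fetched_at")["price_usd"])

if __name__ == "__main__":
    render_dashboard()
```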