The purpose of this project is to practice and experiment with the microservice architecture and Kubernetes. The software finds and downloads news articles from various sources and performs linguistic analysis on them.
This software consists of 12 microservices
The API server is written in Go. It provides a REST API that the frontend uses to retrieve data from the databases, and it also serves the frontend itself.
Two Redis instances store hashes of already seen articles and links.
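The deduplication idea can be sketched as follows. This is only an illustration, not the project's code: an in-memory Python set stands in for the Redis instance, and the use of SHA-256 as well as the names `fingerprint` and `SeenStore` are assumptions made for the example.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a hex SHA-256 digest of the text (the hash choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class SeenStore:
    """Hypothetical in-memory stand-in for a Redis set of already seen hashes."""

    def __init__(self) -> None:
        self._hashes: set[str] = set()

    def add_if_new(self, text: str) -> bool:
        """Record the hash of text; return True only if it was not seen before."""
        digest = fingerprint(text)
        if digest in self._hashes:
            return False
        self._hashes.add(digest)
        return True


seen_links = SeenStore()
print(seen_links.add_if_new("https://example.com/a"))  # first visit
print(seen_links.add_if_new("https://example.com/a"))  # same link again
```

With a real Redis instance, a comparable check could be built on `SADD`, which reports how many members were newly added to the set.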
A microservice implemented in Python that receives links from the link queue and downloads the corresponding articles. Each downloaded article is pushed onto the article queue.
A service implemented in Go that visits a set of websites, extracts potentially interesting links, and pushes them onto the link queue.
Two Redis instances serve as queues that hold the links not yet visited and the articles not yet processed.
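The flow through the two queues can be sketched like this. It is a simplified model under stated assumptions: `collections.deque` objects stand in for the Redis queues, `fake_download` is a hypothetical placeholder for the real HTTP download, and all function names are invented for the example.

```python
from collections import deque

# In-memory stand-ins for the two Redis queues (assumption for this sketch).
link_queue: deque = deque()
article_queue: deque = deque()


def crawl(links: list) -> None:
    """Crawler side: push discovered links onto the link queue."""
    for link in links:
        link_queue.append(link)


def fake_download(link: str) -> dict:
    """Hypothetical placeholder for the real article download."""
    return {"url": link, "text": f"article body for {link}"}


def download_pending() -> int:
    """Downloader side: move every queued link to the article queue."""
    processed = 0
    while link_queue:
        link = link_queue.popleft()  # FIFO: oldest link first
        article_queue.append(fake_download(link))
        processed += 1
    return processed


crawl(["https://example.com/a", "https://example.com/b"])
print(download_pending(), len(article_queue))
```

Decoupling the crawler and the downloader through a queue lets each side run and scale independently, which is presumably why the project uses separate queue instances.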
A service implemented in Go. It takes articles from the article queue, extracts information such as keywords, and pushes the results into the two databases.
This part of the software is also implemented in Go and is responsible for calculating the important words of the current day.
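How the important words are scored is not described here, so the following is only one possible approach, written in Python for brevity rather than in Go. It assumes that a word is important today if its relative frequency today is high compared with its smoothed relative frequency in earlier articles. The function name, the whitespace tokenization, and the scoring formula are all assumptions.

```python
from collections import Counter


def important_words(today_docs, earlier_docs, top_k=3):
    """Rank today's words by frequency ratio against earlier text (assumed scoring)."""
    today = Counter(w for doc in today_docs for w in doc.lower().split())
    earlier = Counter(w for doc in earlier_docs for w in doc.lower().split())
    today_total = sum(today.values())
    earlier_total = sum(earlier.values())
    vocab_size = len(set(today) | set(earlier))

    scores = {}
    for word, count in today.items():
        p_today = count / today_total
        # Add-one smoothing so that words unseen in earlier text do not divide by zero.
        p_earlier = (earlier[word] + 1) / (earlier_total + vocab_size)
        scores[word] = p_today / p_earlier

    # Highest score first; ties are broken alphabetically for a stable result.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


print(important_words(
    ["election results election", "the election"],
    ["the weather the news", "the match"],
))
```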
This service is implemented in Python and calculates the entropy and perplexity of each news article based on an n-gram language model.
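To make the relation between entropy and perplexity concrete, here is a minimal sketch that uses a bigram model with add-one smoothing. The model order, the smoothing method, and the function names are assumptions for illustration; the actual service may use a different n-gram order or library.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams; return (context_counts, bigram_counts, vocab_size)."""
    contexts = Counter()
    bigrams = Counter()
    vocab = {"</s>"}  # the end marker is a predictable token too
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        vocab.update(tokens)
        for prev, cur in zip(padded, padded[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    return contexts, bigrams, len(vocab)


def cross_entropy(tokens, contexts, bigrams, vocab_size):
    """Average negative log2 probability per predicted token (bits per token)."""
    padded = ["<s>"] + tokens + ["</s>"]
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        # Add-one (Laplace) smoothing keeps unseen bigrams from getting probability 0.
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log2(p)
    return -log_prob / (len(padded) - 1)


def perplexity(tokens, contexts, bigrams, vocab_size):
    """Perplexity is 2 raised to the cross-entropy measured in bits."""
    return 2 ** cross_entropy(tokens, contexts, bigrams, vocab_size)


model = train_bigram([["the", "cat", "sat"], ["the", "dog", "sat"]])
print(perplexity(["the", "cat", "sat"], *model))
print(perplexity(["sat", "the", "cat"], *model))
```

A word order the model has seen during training yields a lower perplexity than a scrambled one, which is the property such a score relies on.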
This analyzer is implemented in Python and extracts frequent n-grams with context tokens from headlines. It is based on https://towardsdatascience.com/how-i-used-natural-language-processing-to-extract-context-from-news-headlines-df2cf5181ca6
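A rough sketch of counting frequent n-grams in headlines is shown below. It does not reproduce the method of the linked article. It assumes whitespace tokenization and interprets "context tokens" as the tokens directly before and after each n-gram occurrence; both choices, and the function name, are assumptions.

```python
from collections import Counter, defaultdict


def frequent_ngrams(headlines, n=2, min_count=2):
    """Return {ngram: (count, neighbour_counts)} for n-grams seen at least min_count times."""
    counts = Counter()
    contexts = defaultdict(Counter)
    for headline in headlines:
        tokens = headline.lower().split()
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            counts[gram] += 1
            if i > 0:  # token directly before the n-gram
                contexts[gram][tokens[i - 1]] += 1
            if i + n < len(tokens):  # token directly after the n-gram
                contexts[gram][tokens[i + n]] += 1
    return {
        gram: (count, dict(contexts[gram]))
        for gram, count in counts.items()
        if count >= min_count
    }


print(frequent_ngrams([
    "Central bank raises rates",
    "Central bank cuts rates again",
]))
```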
An Elasticsearch instance that makes the articles searchable.
A MongoDB instance. This is the main database of the project.
A Redis instance used as a cache for the API server.
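The caching pattern can be illustrated with a small cache-aside sketch. It is written in Python for brevity even though the API server is in Go, and it uses a dictionary instead of Redis. The expiry time, the class name `TTLCache`, and the `load` callback are assumptions; the real cache keys and lifetimes are not specified here.

```python
import time
from typing import Callable


class TTLCache:
    """Hypothetical in-memory stand-in for a Redis cache with expiring entries."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict = {}  # key -> (expiry time, value)

    def get_or_load(self, key: str, load: Callable[[], object]) -> object:
        """Return the cached value, or call load() and cache its result on a miss."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the database
        value = load()  # cache miss or expired entry: ask the database
        self._entries[key] = (now + self._ttl, value)
        return value


cache = TTLCache(ttl_seconds=60)
print(cache.get_or_load("articles:today", lambda: ["article 1", "article 2"]))
```

Passing the clock in as a parameter makes the expiry behaviour easy to test without waiting for real time to pass.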
sudo ./scripts/install_minikube.sh
sudo ./scripts/install_kubectl.sh
sudo ./scripts/reinit_minkube.sh
sudo ./scripts/apply_kubernetes.sh
Now minikube is running
sudo ./scripts/start_minkube.sh
sudo minikube stop
sudo ./scripts/reinit_minkube.sh
sudo ./scripts/apply_kubernetes.sh
Now minikube is running
sudo kubectl delete <kind> <name>
<kind> is, for example, deployment or cronjob
sudo kubectl apply -f <filename>
sudo kubectl get pods
sudo kubectl logs <pod name>
minikube service api-server
./connectToPod.sh (podId-as-argument)
mongo --username <someusername> --password <somepassword> --authenticationDatabase admin