GitHub Trends Aggregator is a full-stack, real-time web service that aggregates trending GitHub repositories, processes them, and displays them dynamically. It supports filtering by language and sorting by stars, forks, current-period stars, or interest score.
GitHub Trends Aggregator periodically scrapes the GitHub Trending page using a resilient HTML parser and processes repository data to extract relevant metrics such as stars, forks, and a custom "interest score". The data is kept in an in-memory store and is accessible via a RESTful API and a WebSocket for live updates. An interactive client-side interface built with HTML and CSS presents the data in a user-friendly format, complete with filtering and sorting.
Data Aggregation:
- Scrapes the GitHub Trending HTML with a resilient goquery-based parser, implemented in the fetcher module. A goroutine runs the scraping process at one-minute intervals.
- Extracts essential repository information (author, name, description, language, stars, forks, period stars) and computes an additional interest score.
RESTful API Endpoints:
- GET /trends: Retrieve the list of trending repositories, with support for filtering (by programming language) and sorting (by stars, forks, period stars, or interest score).
- GET /trends/{id}: Get detailed information about a specific repository (returned in JSON format).
Real-time Updates:
- WebSocket support (/ws endpoint) to broadcast updates to connected clients immediately upon data changes.
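The broadcast step can be pictured as a small fan-out hub. This sketch uses only the standard library: each client is modelled as a buffered channel, whereas in the real service a goroutine per Gorilla WebSocket connection would presumably drain such a channel and write to the socket. Hub, Register, Unregister, and Broadcast are hypothetical names, and skipping slow clients is an assumed policy.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub fans one message out to every registered client (hypothetical type).
type Hub struct {
	mu      sync.Mutex
	clients map[chan string]struct{}
}

func NewHub() *Hub {
	return &Hub{clients: make(map[chan string]struct{})}
}

// Register adds a client and returns the channel it should read from.
func (h *Hub) Register() chan string {
	ch := make(chan string, 8)
	h.mu.Lock()
	h.clients[ch] = struct{}{}
	h.mu.Unlock()
	return ch
}

// Unregister removes a client and closes its channel.
func (h *Hub) Unregister(ch chan string) {
	h.mu.Lock()
	if _, ok := h.clients[ch]; ok {
		delete(h.clients, ch)
		close(ch)
	}
	h.mu.Unlock()
}

// Broadcast sends msg to every client and returns how many accepted it.
// A client whose buffer is full is skipped instead of blocking the others.
func (h *Hub) Broadcast(msg string) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	sent := 0
	for ch := range h.clients {
		select {
		case ch <- msg:
			sent++
		default:
		}
	}
	return sent
}

func main() {
	hub := NewHub()
	a, b := hub.Register(), hub.Register()
	hub.Broadcast("trends updated")
	fmt.Println(<-a, "/", <-b)
}
```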
Go Concurrency:
- Goroutines: handle scheduled data fetching and multiple WebSocket connections.
- Mutex Locking: the in-memory data store is protected by a sync.RWMutex.
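As an illustration of the locking described above, here is a minimal sketch of an in-memory store guarded by sync.RWMutex. The Store and Repo types and their method names are assumptions made for this example; the project's own store may be shaped differently.

```go
package main

import (
	"fmt"
	"sync"
)

// Repo holds two of the fields listed in this README; the real type has more.
type Repo struct {
	Name  string
	Stars int
}

// Store is a hypothetical in-memory store protected by a sync.RWMutex.
type Store struct {
	mu    sync.RWMutex
	repos []Repo
}

// Replace swaps in a fresh snapshot while holding the write lock.
func (s *Store) Replace(repos []Repo) {
	cp := make([]Repo, len(repos))
	copy(cp, repos)
	s.mu.Lock()
	s.repos = cp
	s.mu.Unlock()
}

// All returns a copy while holding the read lock, so many readers can
// proceed in parallel and callers cannot mutate the stored slice.
func (s *Store) All() []Repo {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]Repo, len(s.repos))
	copy(out, s.repos)
	return out
}

func main() {
	var s Store
	var wg sync.WaitGroup

	// One writer refreshes the snapshot while several readers query it.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 1; i <= 100; i++ {
			s.Replace([]Repo{{Name: "oumi-ai/oumi", Stars: i}})
		}
	}()
	for r := 0; r < 4; r++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 100; i++ {
				_ = s.All()
			}
		}()
	}
	wg.Wait()
	fmt.Println("final snapshot:", s.All())
}
```

The read lock suits this workload because HTTP and WebSocket handlers read far more often than the scraper writes.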
Backend:
- Written in Go as a microservice that performs HTML scraping and data processing and exposes RESTful endpoints.
- Utilizes Gorilla Mux for routing and Gorilla WebSocket for real-time communication.
- In-memory store for rapid development; easily extendable to persistent databases such as PostgreSQL.
Frontend:
- GET /trends
  Retrieves trending repositories. Supports query parameters:
  - language: Filter by programming language.
  - sort_by: Sort by stars, forks, current_period_stars, or interest_score.
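The language and sort_by parameters above might be applied to the stored list roughly as follows. This is a sketch under assumptions: filterAndSort and the Repo fields are hypothetical, the case-insensitive language match and descending order are guesses at sensible behavior, and an unrecognized sort_by value simply leaves the order unchanged.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Repo mirrors the sortable fields named in this README (assumed shape).
type Repo struct {
	Name               string
	Language           string
	Stars              int
	Forks              int
	CurrentPeriodStars int
	InterestScore      float64
}

// filterAndSort keeps repos matching language (empty means all) and orders
// them by the sortBy key, highest value first.
func filterAndSort(repos []Repo, language, sortBy string) []Repo {
	out := make([]Repo, 0, len(repos))
	for _, r := range repos {
		if language == "" || strings.EqualFold(r.Language, language) {
			out = append(out, r)
		}
	}

	var key func(Repo) float64
	switch sortBy {
	case "stars":
		key = func(r Repo) float64 { return float64(r.Stars) }
	case "forks":
		key = func(r Repo) float64 { return float64(r.Forks) }
	case "current_period_stars":
		key = func(r Repo) float64 { return float64(r.CurrentPeriodStars) }
	case "interest_score":
		key = func(r Repo) float64 { return r.InterestScore }
	default:
		return out // unknown or empty sort_by: keep the original order
	}
	sort.SliceStable(out, func(i, j int) bool { return key(out[i]) > key(out[j]) })
	return out
}

func main() {
	repos := []Repo{
		{Name: "a", Language: "Go", Stars: 10, Forks: 5},
		{Name: "b", Language: "Python", Stars: 50, Forks: 1},
		{Name: "c", Language: "go", Stars: 30, Forks: 9},
	}
	// Equivalent to a request such as GET /trends?language=go&sort_by=stars
	for _, r := range filterAndSort(repos, "go", "stars") {
		fmt.Println(r.Name, r.Stars)
	}
}
```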
- GET /trends/{id}
  Returns detailed information for a specific repository identified by id, in JSON format.
- WebSocket /ws
  Establishes a real-time connection to receive live updates whenever new trends are broadcast.
To run the server, you need Go 1.18+ installed on Linux.
Run the Makefile target from the root directory:
make main
The server will start on port 8080 (browser: http://localhost:8080).
Or run the server manually from the server directory:
go get
go mod tidy
go run main.go
- Get detailed information about a specific repository:
curl http://localhost:8080/trends/1 | jq
jq will show the JSON in a readable format.
The response will look something like this:
{
  "id": "oumi-ai/oumi",
  "secondary": 1,
  "author": "oumi-ai",
  "name": "oumi",
  "url": "https://github.com/oumi-ai/oumi",
  "description": "Everything you need to build state-of-the-art foundation models, end-to-end.",
  "language": "Python",
  "stars": 3549,
  "forks": 0,
  "current_period_stars": 1350,
  "updated_at": "2025-02-03T21:07:22.533614942+03:00",
  "interest_score": 3549
}
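A Go client could decode a response like the one above into a struct. The Repository type below is inferred from the sample JSON and is not necessarily the server's own type; in particular, treating interest_score as a float64 is an assumption, since the sample only shows a whole number.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Repository mirrors the fields of the sample response (inferred, not official).
type Repository struct {
	ID                 string    `json:"id"`
	Secondary          int       `json:"secondary"`
	Author             string    `json:"author"`
	Name               string    `json:"name"`
	URL                string    `json:"url"`
	Description        string    `json:"description"`
	Language           string    `json:"language"`
	Stars              int       `json:"stars"`
	Forks              int       `json:"forks"`
	CurrentPeriodStars int       `json:"current_period_stars"`
	UpdatedAt          time.Time `json:"updated_at"`
	InterestScore      float64   `json:"interest_score"` // assumed numeric type
}

// sample is a shortened copy of the response shown in this README.
const sample = `{
  "id": "oumi-ai/oumi",
  "secondary": 1,
  "author": "oumi-ai",
  "name": "oumi",
  "url": "https://github.com/oumi-ai/oumi",
  "language": "Python",
  "stars": 3549,
  "forks": 0,
  "current_period_stars": 1350,
  "updated_at": "2025-02-03T21:07:22.533614942+03:00",
  "interest_score": 3549
}`

// decodeRepository parses one repository object from JSON bytes.
func decodeRepository(data []byte) (Repository, error) {
	var r Repository
	err := json.Unmarshal(data, &r)
	return r, err
}

func main() {
	repo, err := decodeRepository([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(repo.Author, repo.Name, repo.Stars, repo.UpdatedAt.Year())
}
```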
- Connect to the WebSocket via terminal:
You need to install wscat first:
npm install -g wscat
Then connect to the WebSocket:
wscat -c ws://localhost:8080/ws
- The other endpoints have a nice representation in the browser.
To run the server in Docker, you need to have Docker installed on your machine.
From the root directory:
docker image build -t my_image1 ./server
docker container run -it --name my_container1 -p 8080:8080 my_image1
After that, you can access the server at http://localhost:8080.
After you're done, you can remove the container and the image:
docker container rm -f my_container1
docker image rm -f my_image1
- Backend: Go, Gorilla Mux, Gorilla WebSocket, goquery.
- Frontend: HTML, CSS.
Feel free to explore the code. Any questions, feedback or suggestions are always welcome!
Have a good coding day!