The service scrapes all news headlines from nytimes.com and exposes them through a GraphQL API.
The articles can be queried using the following GraphQL schema:
type News {
  title: String
  link: String
}
type Query {
  news: [News!]!
}
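As a rough illustration of how a client might use this schema, the sketch below posts a `news` query over HTTP using only the JDK HTTP client. The object name `NewsQueryExample`, its helper functions, and in particular the endpoint path `/graphql` are assumptions made for this example and are not taken from the project. The port matches the `SERVER_HTTP_PORT` default.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object NewsQueryExample {
  // Selects both fields that the News type exposes.
  val newsQuery: String = "{ news { title link } }"

  // GraphQL over HTTP conventionally wraps the query in a JSON object.
  // Only backslashes and double quotes are escaped, which is enough for this query.
  def requestBody(query: String): String = {
    val escaped = query.replace("\\", "\\\\").replace("\"", "\\\"")
    s"""{"query":"$escaped"}"""
  }

  // ASSUMPTION: the "/graphql" path is hypothetical; check the running service for the real one.
  def fetchNews(endpoint: String = "http://localhost:8080/graphql"): String = {
    val request = HttpRequest
      .newBuilder(URI.create(endpoint))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(requestBody(newsQuery)))
      .build()
    // Returns the raw JSON response body as a string.
    HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).body()
  }
}
```

With the service running, `NewsQueryExample.fetchNews()` should return the JSON response as a string, provided the assumed endpoint path is correct.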
Once the service is started, it begins scraping headlines and redirects the user to the GraphQL Playground page.
The service can be customised by changing the following settings:
Setting | Description | Default value |
---|---|---|
SERVER_HTTP_PORT | Server port | 8080 |
DB_NAME | Database name | wardrobe |
DB_HOST | Database server host | localhost |
DB_PORT | Database server port | 5432 |
DB_USER | Database user | user |
DB_PASSWORD | Database user password | 1234 |
DB_WHETHER_CREATE_SCHEMA | Whether to create the database schema when the service starts | true |
NY_TIMES_URL | The URL of the New York Times website | https://www.nytimes.com/ |
SCRAPE_REPEAT_INTERVAL | Scrape repeating interval | every 4 hours |
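
To make the defaults in the table concrete, here is a minimal sketch that resolves the settings from environment variables and falls back to the documented default values. It uses only the standard library. The project lists pureconfig for configuration loading, so this is not how the service itself loads its configuration, and reading from environment variables is an assumption. The names `SettingsExample`, `Settings`, and `load` are hypothetical. `SCRAPE_REPEAT_INTERVAL` is left out because its value format is not specified here.

```scala
object SettingsExample {
  // Hypothetical container for the settings listed in the table.
  final case class Settings(
    serverHttpPort: Int,
    dbName: String,
    dbHost: String,
    dbPort: Int,
    dbUser: String,
    dbPassword: String,
    createSchema: Boolean,
    nyTimesUrl: String
  )

  // Looks up each setting in the given map and uses the documented default when it is absent.
  // Malformed numeric or boolean values throw an exception; a real loader would report them properly.
  def load(env: Map[String, String] = sys.env): Settings =
    Settings(
      serverHttpPort = env.get("SERVER_HTTP_PORT").map(_.toInt).getOrElse(8080),
      dbName = env.getOrElse("DB_NAME", "wardrobe"),
      dbHost = env.getOrElse("DB_HOST", "localhost"),
      dbPort = env.get("DB_PORT").map(_.toInt).getOrElse(5432),
      dbUser = env.getOrElse("DB_USER", "user"),
      dbPassword = env.getOrElse("DB_PASSWORD", "1234"),
      createSchema = env.get("DB_WHETHER_CREATE_SCHEMA").map(_.toBoolean).getOrElse(true),
      nyTimesUrl = env.getOrElse("NY_TIMES_URL", "https://www.nytimes.com/")
    )
}
```

Passing an explicit map, for example `SettingsExample.load(Map("SERVER_HTTP_PORT" -> "9090"))`, overrides a single value and keeps the remaining defaults.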
- Clone the project
- Build the project: `sbt compile`
- Run the tests: `sbt test`
- Scala 2.13.6 as the main application programming language
- http4s typeful, functional, streaming HTTP for Scala
- sangria a GraphQL implementation for Scala
- scala-scraper a Scala library for scraping content from HTML pages
- quill compile-time language integrated queries for Scala
- cats to write more functional and less boilerplate code
- cats-effect The Haskell IO monad for Scala
- pureconfig for loading configuration files
- refined for type constraints, avoiding unnecessary testing and boilerplate
- circe a JSON library for Scala
- scalatest and ScalaCheck for unit and property-based testing
- testcontainers to run system-dependent services for integration testing