Scrapes a real estate website, saves the records, and lets me know when something looks worth a closer look. Runs on a scheduler: it scrapes periodically and, every midnight, cleans up properties that are at least 5 days old. Runs on free services (hopefully forever).
Self-driving scraper, sometimes emails me.
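The midnight cleanup rule can be sketched in plain Java. This is only an illustration, not the project's actual code: the class name `RetentionSketch`, the method `isExpired`, and the reading of "at least 5 days old" as "scraped 5 or more days before the cleanup run" are all assumptions. In a Spring Boot app such a check might be triggered by a method annotated with `@Scheduled(cron = "0 0 0 * * *")`, but how this project wires its scheduler is not shown here.

```java
import java.time.Duration;
import java.time.LocalDateTime;

public class RetentionSketch {
    // Assumption: records are removed once they are 5 or more days old.
    static final Duration MAX_AGE = Duration.ofDays(5);

    // Returns true when a record scraped at `scrapedAt` should be removed
    // by a cleanup run happening at `now`.
    static boolean isExpired(LocalDateTime scrapedAt, LocalDateTime now) {
        LocalDateTime cutoff = now.minus(MAX_AGE);
        return !scrapedAt.isAfter(cutoff);
    }

    public static void main(String[] args) {
        // Hypothetical cleanup run at midnight.
        LocalDateTime midnight = LocalDateTime.of(2024, 1, 10, 0, 0);
        System.out.println(isExpired(midnight.minusDays(5), midnight)); // true
        System.out.println(isExpired(midnight.minusDays(4), midnight)); // false
    }
}
```

Keeping the rule in a small pure function like this makes the 5-day boundary easy to unit test without a database or a running scheduler.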
- Mailjet - mailing
- Heroku - Java + Postgres
- UptimeRobot - polls heroku health endpoint to keep it alive
Uptime pings: https://real-estate-scraper1.herokuapp.com/api/v1/health
Returns last 100 properties with points: https://real-estate-scraper1.herokuapp.com/api/v1/property
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See running for notes on how to run the project on a system.
- Clone the project to your local environment:
git clone https://github.com/indrekru/real-estate-scraper.git
- You need Maven installed on your environment (the first command is for macOS with Homebrew, the second for Debian/Ubuntu):
brew install maven
sudo apt-get install maven
Once you have maven installed on your environment, install the project dependencies via:
mvn install
Run all tests:
mvn test
Once you have installed the dependencies, the project can be run directly from the main method in Application.java, or from the command line:
mvn spring-boot:run -Dspring.profiles.active=dev
Open a browser and go to http://localhost:8080/api/v1/health. You should see the health JSON response.
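The same check can be done from code instead of a browser. The sketch below is a hedged example using the JDK's built-in `java.net.http.HttpClient`, so it assumes Java 11 or newer. The class name `HealthCheckSketch` and the helper `healthRequest` are made up for illustration and are not part of the project. Only the `/api/v1/health` path and the default `http://localhost:8080` base URL come from this README.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class HealthCheckSketch {
    // Builds a GET request for the health endpoint under the given base URL.
    static HttpRequest healthRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/v1/health"))
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        // Defaults to the local dev server; pass another base URL as the first argument.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:8080";
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(healthRequest(baseUrl), HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        } catch (IOException e) {
            // The app is probably not running, or the URL is wrong.
            System.out.println("Health endpoint not reachable: " + e.getMessage());
        }
    }
}
```

Passing the deployed base URL as the first argument would hit the same path on Heroku, which is roughly what an uptime monitor does on each poll.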
To deploy a new version to Heroku:
git push heroku master
Check the prod logs:
heroku logs --tail -a real-estate-scraper1
- Spring Boot - Spring Boot 2
- Maven - Dependency Management
If you have any improvement suggestions please create a pull request and I'll review it.
- Indrek Ruubel - Initial work - Github
See also the list of contributors who participated in this project.
This project is licensed under the MIT License
- Big thanks to Pivotal for Spring Boot framework, love it!
- Also check out my Spring Boot 2 OAuth2 resource server example: https://github.com/indrekru/spring-boot-2-oauth2-resource-server