Scraper for gathering all available nodes for goobox.
To run Goobox Nodes Scraper you first need to install the requirements. Then you can either use the public Docker image or build it from sources.
- Docker: install it following the official docs.
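You can check that Docker is installed and available with:
docker --version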
You can use the public Docker image to run the service. For example, to run the Storj nodes scraper and collect the results into a single CSV file:
docker run -v /your/output/dir:/srv/apps/goobox-nodes-service/output goobox/goobox-nodes-scraper:latest scrapy crawl storj_nodes -o output/out.csv -t csv
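The output format follows Scrapy's Feed exports, so you can switch formats by changing the flags; for instance, to export JSON instead of CSV:
docker run -v /your/output/dir:/srv/apps/goobox-nodes-service/output goobox/goobox-nodes-scraper:latest scrapy crawl storj_nodes -o output/out.json -t json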
To build Goobox Nodes Scraper from sources, clone this project and build the image.
git clone https://github.com/goobox/goobox-nodes-scraper.git && cd goobox-nodes-scraper
python3.6 make build
Once the build is completed you can run the scraper using the scrapy command from the entry point:
python3.6 make run scrapy
The entry point provides self-describing help that can be queried:
python3.6 make run -h
Each command also has its own help:
python3.6 make run scrapy -h
To run the scraper for collecting Storj nodes, first create a directory to keep the output:
mkdir output
The scraper will gather Storj node information, generate a CSV file and place it in the directory created above. If you prefer a different kind of export, you can use any format supported by Scrapy's Feed exports.
python3.6 make run scrapy crawl storj_nodes -o output/out.csv -t csv
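For instance, to export JSON lines instead of CSV, which is convenient for large crawls since each record is written on its own line:
python3.6 make run scrapy crawl storj_nodes -o output/out.jl -t jsonlines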
Once the scraper has finished, the output CSV file is available in the output directory.
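For example, here is a minimal Python sketch to inspect the exported nodes, assuming the file was written to output/out.csv as above (the available columns depend on the fields the spider exports):

import csv

# Read the exported CSV and print each node record as a dict,
# keyed by the CSV header row.
with open("output/out.csv", newline="") as f:
    for node in csv.DictReader(f):
        print(node)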
This product includes GeoLite2 data created by MaxMind.