bcrawler

A standalone crawling tool for extracting data from the web.

Usage

Usage of ./bcrawler:
  -alsologtostderr
    	log to standard error as well as files
  -conf string
    	dir for parsers conf (default "./conf")
  -dir string
    	the data dir (default "data")
  -log_dir string
    	If non-empty, write log files in this directory
  -logtostderr
    	log to standard error instead of files
  -q string
    	the queue dir (default "q")
  -sleep int
    	in seconds (default -1)
  -start string
    	the parser name for the start url (default "addr_year")
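
As a rough illustration of the options above, here is a minimal sketch of how the non-logging flags could be declared with Go's standard flag package. The names options and parseOptions are hypothetical and are not taken from main.go; the three logging flags are left out because the source does not say which logging library registers them.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options mirrors the non-logging flags from the usage text.
// The struct and its field names are assumptions made for this sketch.
type options struct {
	conf  string // -conf: dir for parsers conf
	dir   string // -dir: the data dir
	queue string // -q: the queue dir
	sleep int    // -sleep: in seconds
	start string // -start: the parser name for the start url
}

// parseOptions registers the flags on a fresh FlagSet, using the defaults
// shown in the usage text, and parses args into an options value.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("bcrawler", flag.ContinueOnError)
	fs.StringVar(&o.conf, "conf", "./conf", "dir for parsers conf")
	fs.StringVar(&o.dir, "dir", "data", "the data dir")
	fs.StringVar(&o.queue, "q", "q", "the queue dir")
	fs.IntVar(&o.sleep, "sleep", -1, "in seconds")
	fs.StringVar(&o.start, "start", "addr_year", "the parser name for the start url")
	err := fs.Parse(args)
	return o, err
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	// Print the effective settings, including any defaults that were not overridden.
	fmt.Printf("%+v\n", o)
}
```

Using a dedicated FlagSet rather than the package-level flag functions keeps the sketch easy to test, since parseOptions can be called with any argument slice.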

The conf directory (./conf by default) must contain a parsers subdirectory, which stores all the parser configurations in JSON format.

