azanbinzahid/web-crawler

A web crawler is a program or automated script which browses the Web in a methodical, automated manner. Many legitimate sites, in particular search engines, use crawling as a means of providing up-to-date data. Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine, which indexes the downloaded pages to provide fast searches. Crawlers can also automate maintenance tasks on a website, such as checking links or validating HTML code.

How to run:

  1. Run 'crawler.py' with Python 3 (no compilation step is needed).
  2. Enter a full site address when prompted, or choose one of the listed options.
  3. The program adds each URL to a set (to avoid duplicates) and recursively follows URLs on the same domain, extracted by parsing the anchor tags and href attributes in each page's HTML.

*Tested on Windows 10 and Ubuntu with Python 3.
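
The crawling loop described in step 3 can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the actual code in 'crawler.py' (the class and function names here are hypothetical, and the real script may parse links or manage recursion differently):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, limit=50):
    """Visit up to `limit` pages on the same domain as start_url."""
    domain = urlparse(start_url).netloc
    visited = set()            # the set prevents visiting a URL twice
    stack = [start_url]
    while stack and len(visited) < limit:
        url = stack.pop()
        if url in visited:
            continue
        visited.add(url)
        try:
            html = urlopen(url).read().decode("utf-8", errors="replace")
        except Exception:
            continue           # skip unreachable or non-HTML pages
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if urlparse(absolute).netloc == domain:
                stack.append(absolute)     # same domain only
    return visited
```

Using a set for `visited` makes the duplicate check O(1), and filtering on `urlparse(...).netloc` is what keeps the crawl confined to the starting site's domain.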
