JCrawl is a basic web crawler implemented in Java. Starting from a given URL, it scrapes web pages and extracts the links they contain. Web crawling is the process of navigating web pages and extracting information from them; it is commonly used by search engines and web scrapers.
- Web crawling from a starting URL.
- Limit the number of links to scrape by setting a breakpoint.
- Extract links from web pages.
- Java Development Kit (JDK) installed on your system.
- Clone or download this repository to your local machine.
- Compile the `JCrawl.java` file using `javac`: `javac JCrawl.java`
- Run the crawler: `java JCrawl`
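
The features listed above (a starting URL, a breakpoint that caps the number of links, and link extraction) could be sketched roughly as follows. This is a minimal illustration, not the actual JCrawl source: the class name `JCrawlSketch`, the methods `extractLinks` and `crawl`, and the pluggable `fetcher` function are all hypothetical, and the sketch assumes the breakpoint counts distinct URLs including the starting one. Links are found with a simplified regular expression rather than a real HTML parser, and the demo uses in-memory pages instead of network requests.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and behavior are assumptions, not JCrawl's real code.
public class JCrawlSketch {
    // Simplified match for href="..." or href='...'; a real HTML parser is more robust.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    /** Extracts http(s) links from html, resolving relative ones against baseUrl. */
    static List<String> extractLinks(String html, String baseUrl) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String href = m.group(1).trim();
            int hash = href.indexOf('#');
            if (hash >= 0) href = href.substring(0, hash); // drop the fragment part
            if (href.isEmpty()) continue;
            try {
                String resolved = URI.create(baseUrl).resolve(href).toString();
                if (resolved.startsWith("http://") || resolved.startsWith("https://")) {
                    links.add(resolved);
                }
            } catch (IllegalArgumentException e) {
                // Skip hrefs that are not valid URIs.
            }
        }
        return links;
    }

    /**
     * Breadth-first crawl from startUrl. Stops once `breakpoint` distinct URLs
     * (assumed here to include the start URL) have been collected.
     * `fetcher` maps a URL to its HTML, or null if the page is unavailable.
     */
    static Set<String> crawl(String startUrl, int breakpoint, Function<String, String> fetcher) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        if (breakpoint <= 0) return seen;
        seen.add(startUrl);
        queue.add(startUrl);
        while (!queue.isEmpty() && seen.size() < breakpoint) {
            String url = queue.poll();
            String html = fetcher.apply(url);
            if (html == null) continue;
            for (String link : extractLinks(html, url)) {
                if (seen.size() >= breakpoint) break;
                if (seen.add(link)) queue.add(link); // enqueue only unseen links
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // In-memory pages stand in for real HTTP responses.
        Map<String, String> pages = Map.of(
            "https://example.com/", "<a href=\"/a\">A</a> <a href='/b#top'>B</a>",
            "https://example.com/a", "<a href=\"/c\">C</a>");
        crawl("https://example.com/", 3, pages::get).forEach(System.out::println);
    }
}
```

Passing the fetcher as a function keeps the crawl logic independent of how pages are downloaded. To crawl real sites, the in-memory map could be replaced with a function that downloads each page, for example with `java.net.http.HttpClient` on Java 11 or later.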