url-parser

Here are 25 public repositories matching this topic...

A simple Python web crawler that processes URLs from web pages, handles redirects, and skips non-HTML content. It supports HTTP/HTTPS, calculates same-domain link ratios, avoids duplicate URLs, and saves results in a TSV file. Designed for easy scalability and future extensions.

  • Updated Oct 15, 2024
  • Python
