scraper

Here are 498 public repositories matching this topic...

cheeriojs / cheerio

The fast, flexible, and elegant library for parsing and manipulating HTML and XML.

html jquery parser scraper dom cheerio selector hacktoberfest htmlparser2 htmlparser

Updated Nov 1, 2024
TypeScript

mendableai / firecrawl

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

markdown crawler data scraper ai html-to-markdown web-crawler scraping webscraping rag llm ai-scraping

Updated Nov 1, 2024
TypeScript

apify / crawlee

Crawlee: a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

nodejs javascript npm crawler scraper automation typescript web-crawler headless scraping crawling web-scraping web-crawling headless-chrome apify puppeteer playwright

Updated Nov 1, 2024
TypeScript

mishushakov / llm-scraper

Turn any webpage into structured data using LLMs

scraper browser ai artificial-intelligence openai llama gpt browser-automation puppeteer playwright gpt-4 llm langchain

Updated Aug 30, 2024
TypeScript

finic-ai / finic

Stealth browsers as a service. Connect your scraper or automation to a fleet of cloud-hosted browsers configured for reliability and stealth.

integrations scraper automation rpa generative-ai

Updated Oct 20, 2024
TypeScript

consumet / api.consumet.org

A Modern Search Engine API for Anime, Movies/TVShows, Books, Light Novels, Manga, etc.

Updated Oct 28, 2024
TypeScript

getmaxun / maxun

Free, open-source no-code web data extraction platform. Build custom robots to automate data scraping [In Beta]

api scraper automation browser spreadsheet web-scraper self-hosted web-scraping browser-automation low-code no-code web-automation rpa robotic-process-automation playwright maxun website-to-api

Updated Nov 1, 2024
TypeScript

maoserr / epublifier

Converts some webnovels to epub format

scraper ui epub extension-chrome extension-firefox

Updated Nov 1, 2024
TypeScript

linvo-io / linvo-scraper

LinkedIn automation bot covering a wide range of scraping tasks. Valid for 2022; used by Linvo.io.

scraper automation linkedin hacktoberfest puppeteer hactoberfest-accepted

Updated Aug 22, 2023
TypeScript

josephlimtech / linkedin-profile-scraper-api

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.

Updated Apr 5, 2024
TypeScript

lmmfranco / nintendo-switch-eshop

Crawler for Nintendo Switch eShop

game crawler scraper nintendo lib price switch eshop nintendo-switch

Updated Nov 5, 2021
TypeScript

jacktuck / unfurl

Metadata scraper with support for oEmbed, Twitter Cards and Open Graph Protocol for Node.js ⚡

nodejs slack metadata scraper microservice open-graph oembed twitter-cards embed micro ogp meta-tags unfurl

Updated Apr 9, 2024
TypeScript

yjl9903 / AnimeGarden

Anime Garden (動漫花園) mirror site / anime BitTorrent resource aggregator

torrent scraper anime animation bangumi anitomy dmhy animelist anime-tracker animespace animegarden

Updated Nov 1, 2024
TypeScript

consumet / consumet.ts

Node.js library that provides high-level APIs for obtaining information on various entertainment media such as books, movies, comic books, anime, and manga.