The fast, flexible, and elegant library for parsing and manipulating HTML and XML.
Updated Nov 1, 2024 - TypeScript
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Crawlee: a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP, in both headful and headless mode, with proxy rotation.
Turn any webpage into structured data using LLMs
Stealth browsers as a service. Connect your scraper or automation to a fleet of cloud-hosted browsers configured for reliability and stealth.
A Modern Search Engine API for Anime, Movies/TVShows, Books, Light Novels, Manga, etc.
Free, open-source no-code web data extraction platform. Build custom robots to automate data scraping [In Beta]
Converts some webnovels to epub format
LinkedIn automation bot with extensive scraping support. Valid for 2022; used by Linvo.io
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.
Metadata scraper with support for oEmbed, Twitter Cards and Open Graph Protocol for Node.js ⚡
動漫花園 mirror site / anime BT (BitTorrent) resource aggregator
Node.js library that provides high-level APIs for obtaining information on various entertainment media such as books, movies, comic books, anime, and manga.
A simple browser/client-side web scraper.