Visual scraper interface that exports a Puppeteer script you can run anywhere. You can try it out at https://jawa.sh
Jawa lets you visually click elements on any website and then export the selectors as a config that you can run in any Node.js environment to scrape the content when needed.
This repo consists of:
- web app
- cli
- browser extension
The web app provides an embedded browser for visually selecting elements and creating a scraper config that you can download and run through the CLI or in the cloud.
You can now run your scraper config in the cloud directly from the web app. Cloud scrapers use the same Jawa CLI. Currently, cloud scrapers have limited availability.
If you need more usage you can check out Jawa Pro.
The CLI is a simple tool for running configs created and exported from the web app. You can run it like this:
npx jawa path/to/scraper/config/file.json
To see all the options, run npx jawa --help
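If you want to trigger the CLI from a Node.js script (for example, from a scheduled job), one option is to spawn the same npx command as a child process. The following is only a minimal sketch under that assumption: the helper names jawaArgs and runJawa are hypothetical and not part of Jawa, and the only Jawa-specific detail used is the npx jawa invocation shown above.

```javascript
const { spawn } = require('node:child_process')

// Hypothetical helper: builds the argument list for `npx jawa <config>`.
function jawaArgs(configPath) {
  return ['jawa', configPath]
}

// Hypothetical helper: runs the CLI and resolves with its exit code
// (null if the process was terminated by a signal).
// On Windows you may need `npx.cmd` or the `shell: true` option.
function runJawa(configPath) {
  return new Promise((resolve, reject) => {
    const child = spawn('npx', jawaArgs(configPath), { stdio: 'inherit' })
    child.on('error', reject) // e.g. npx is not on the PATH
    child.on('close', (code) => resolve(code))
  })
}

// Example usage (not executed here):
// runJawa('path/to/scraper/config/file.json').then((code) => {
//   process.exitCode = code ?? 1
// })
```

Spawning the CLI keeps the script independent of the package's programmatic API, at the cost of only getting an exit code and console output back.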
The jawa package now also exports a scrape function, so it can be used outside of the CLI in your apps or services:
// ES modules
import { scrape } from 'jawa'
// CommonJS
const { scrape } = require('jawa')
The browser extension runs the embedded browser that powers the visual scraper interface.
It is available on:
- Chrome Web Store
- Chrome extensions also work on all Chromium-based browsers, such as:
  - Opera
  - Microsoft Edge
  - Brave