Web Scraping Project with Node.js and Playwright

Table of Contents

Introduction
Features
Requirements
Installation
Usage
Contributing
License

Introduction

This project scrapes content from a Shopee merchant page and individual product pages. The scraped data is then posted to our server via an API.

Features

Automates login to Shopee
Handles captcha via a Python script
Scrapes data from a Shopee merchant page
Scrapes details from individual product pages
Updates scraped data to the server using an API
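
The scraping and upload features listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the CSS selector, the `MERCHANT_URL` and `API_ENDPOINT` variable names, and the JSON payload shape are all assumptions, and the login and captcha steps are left out.

```javascript
// Collect unique product links from a merchant page.
// The selector below is hypothetical; the real Shopee markup will differ.
async function scrapeProductLinks(page, merchantUrl) {
  await page.goto(merchantUrl, { waitUntil: 'domcontentloaded' });
  // $$eval runs the callback in the browser against every matching element.
  const links = await page.$$eval('a[href*="-i."]', (anchors) =>
    anchors.map((anchor) => anchor.href)
  );
  // Drop duplicates while keeping the original order.
  return [...new Set(links)];
}

// POST the scraped items to the server as JSON.
// fetchImpl defaults to the global fetch available in Node.js 18 and later.
async function postProducts(apiUrl, products, fetchImpl = fetch) {
  const response = await fetchImpl(apiUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ products }), // assumed payload shape
  });
  if (!response.ok) {
    throw new Error(`API responded with status ${response.status}`);
  }
  return response.status;
}

// Wire both steps together with a real browser.
async function main() {
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    const links = await scrapeProductLinks(page, process.env.MERCHANT_URL);
    await postProducts(process.env.API_ENDPOINT, links.map((url) => ({ url })));
  } finally {
    // Always release the browser, even if scraping or posting fails.
    await browser.close();
  }
}
```

Passing `page` and `fetchImpl` in as arguments keeps both helpers easy to exercise with stand-ins, without launching a browser or hitting the real API.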

Requirements

Node.js
Playwright
Python (for captcha handling)
A Shopee account to scrape with

Installation

Clone this repository:

```
git clone https://github.com/nsanzimfura-eric/web-scraping.git
```

Navigate into the project directory:
```
cd web-scraping
```
Install dependencies:
```
npm install
```
Create your .env file from the sample:
```
cp .env.sample .env
```
Update .env with your API endpoints and Shopee account details.
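
Because the scraper depends on these values, it can help to check them once at startup. The sketch below is one possible approach; the variable names `API_ENDPOINT`, `SHOPEE_USERNAME`, and `SHOPEE_PASSWORD` are hypothetical, so check .env.sample for the names the project actually uses. It also assumes the values are already present in `process.env`, for example through Node.js 20.6 or later started with `--env-file=.env`.

```javascript
// Hypothetical variable names; .env.sample defines the real ones.
const REQUIRED_VARS = ['API_ENDPOINT', 'SHOPEE_USERNAME', 'SHOPEE_PASSWORD'];

// Return a config object, or throw listing every missing variable.
function loadConfig(env = process.env) {
  const missing = REQUIRED_VARS.filter(
    (name) => !env[name] || env[name].trim() === ''
  );
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(', ')}`
    );
  }
  return {
    apiEndpoint: env.API_ENDPOINT,
    username: env.SHOPEE_USERNAME,
    password: env.SHOPEE_PASSWORD,
  };
}
```

Calling `loadConfig()` before launching the browser reports a missing or empty value immediately, rather than partway through a login attempt.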

Usage

To start the scraper:
```
npm start
```

Contributing

Nsanzimfura Eric is the author of and a contributor to this web-scraping app.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.