Welcome to the Dynamic Job Scraper project! This tool is designed to streamline your job search by automatically extracting and compiling job listings from various online sources. Whether you are looking for remote opportunities or local positions, this scraper will save you time and effort by consolidating job data into easy-to-use CSV files.
- Multiple Keyword Search: Search for job listings using multiple keywords separated by commas.
- Real-Time Scraping: Scrape job listings from multiple sources (Wanted and We Work Remotely) in real time.
- Export to CSV: Save search results as CSV files for easy sharing and analysis.
- User-Friendly Interface: Simple and clean web interface built with PicoCSS.
- In-Memory File Handling: Efficient in-memory file creation and handling.
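As a rough illustration of the in-memory file handling and CSV export features, the sketch below builds a CSV entirely in memory with Python's standard `io` and `csv` modules. The function name `jobs_to_csv_bytes`, the column names, and the sample rows are assumptions made for illustration, not the project's actual code.

```python
import csv
import io


def jobs_to_csv_bytes(header, rows):
    """Write rows to an in-memory CSV and return the encoded bytes.

    Hypothetical helper: the project's real export code may differ.
    """
    # newline="" keeps the csv module's own line endings untouched.
    text_buffer = io.StringIO(newline="")
    writer = csv.writer(text_buffer)
    writer.writerow(header)
    writer.writerows(rows)
    # Encode to bytes so the result can be wrapped in io.BytesIO and
    # served as a file download without writing anything to disk.
    return text_buffer.getvalue().encode("utf-8")


if __name__ == "__main__":
    sample_rows = [
        ["Backend Engineer", "Example Co", "https://example.com/jobs/1"],
        ["Data Analyst", "Sample, Inc.", "https://example.com/jobs/2"],
    ]
    payload = jobs_to_csv_bytes(["title", "company", "link"], sample_rows)
    print(payload.decode("utf-8"))
```

Fields that contain commas, such as a company name, are quoted automatically by `csv.writer`, so the output stays valid CSV.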
QuantumScraper/
│
├── extractors/
│ ├── job_data_wwr.py
│ ├── job_data.py
│ ├── wanted_job_search.py
│ └── wwr.py
│
├── templates/
│ ├── export.html
│ ├── home.html
│ └── search.html
│
├── file.py
├── main.py
├── requirements.txt
├── .gitignore
└── .gitattributes
- `job_data_wwr.py`: Defines the `JobDataWWR` class for managing job data sourced from We Work Remotely.
- `job_data.py`: Defines the `JobData` class for managing job data sourced from Wanted.
- `wanted_job_search.py`: Implements the scraping logic for the Wanted job site.
- `wwr.py`: Implements the scraping logic for the We Work Remotely job site.
- `export.html`: Template for exporting job search results.
- `home.html`: Homepage template for the web interface.
- `search.html`: Template for displaying search results.
- `file.py`: Contains functions to save job data to files.
- `main.py`: The main application script that runs the Flask web server and handles routing.
- `requirements.txt`: Lists the required Python packages for the project:
beautifulsoup4==4.9.3
playwright==1.12.2
requests==2.25.1
- JobDataWWR Class (`job_data_wwr.py`)
  - Attributes: `title`, `company`, `position`, `region`, `link`
  - Method: `to_list` - Converts job attributes to a list format.
- JobData Class (`job_data.py`)
  - Attributes: `title`, `company_name`, `reward`, `link`
  - Method: `to_list` - Converts job attributes to a list format.
- WantedJobSearch Class (`wanted_job_search.py`)
  - Manages job searches on the Wanted job site using Playwright and BeautifulSoup.
  - Methods: `add_keyword`, `add_keywords_from_input`, `run_playwright`, `scrape_keyword`, `save_to_csv`
- WWRJobSearch Class (`wwr.py`)
  - Manages job searches on the We Work Remotely job site using requests and BeautifulSoup.
  - Methods: `add_keyword`, `add_keywords_from_input`, `scrape_page`, `scrape_keyword`, `get_pages`, `pages_save_to_csv`, `keyword_search_save_to_csv`
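To make the class descriptions above more concrete, here is a minimal sketch of what a data class such as `JobData` might look like. Only the attribute names and the `to_list` method name come from the descriptions above; the use of `dataclass`, the field types, and the order of values returned by `to_list` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class JobData:
    """Sketch of a job record for the Wanted site (assumed layout)."""

    title: str
    company_name: str
    reward: str
    link: str

    def to_list(self):
        # Return the attributes in a fixed order so that each job maps
        # to one CSV row. The order used here is an assumption.
        return [self.title, self.company_name, self.reward, self.link]


if __name__ == "__main__":
    job = JobData(
        title="Python Developer",
        company_name="Example Co",
        reward="N/A",
        link="https://example.com/jobs/42",
    )
    print(job.to_list())
```

A `JobDataWWR` class could presumably follow the same pattern with its own attributes (`title`, `company`, `position`, `region`, `link`).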
- The Flask app (`main.py`) serves as the web interface for the Dynamic Job Scraper.
- Users can perform job searches, view results, and export data to CSV files.
- Python 3.x
- pip (Python package installer)
- Clone the repository:
    git clone https://github.com/your-username/QuantumScraper.git
    cd QuantumScraper
- Create and activate a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
- Install the required packages:
    pip install -r requirements.txt
- Install Playwright browsers:
    playwright install
- Start the Flask server:
    python main.py
- Open your web browser and navigate to http://localhost:8080.
- Use the home page to search for job listings by entering keywords separated by commas.
- View and export the search results.
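The comma-separated keyword input described above could be split into individual search terms along these lines. The function name `parse_keywords` is hypothetical, and the whitespace trimming, lowercasing, blank-entry skipping, and de-duplication are all assumptions; the project's `add_keywords_from_input` method may behave differently.

```python
def parse_keywords(raw_input):
    """Split a comma-separated string into a list of clean keywords."""
    keywords = []
    for part in raw_input.split(","):
        # Normalizing case is an assumption, not confirmed project behavior.
        keyword = part.strip().lower()
        # Skip empty entries (e.g. from "python,,flask") and duplicates.
        if keyword and keyword not in keywords:
            keywords.append(keyword)
    return keywords


if __name__ == "__main__":
    print(parse_keywords("Python, flask ,, python"))
```

Each keyword in the resulting list could then be passed to a scraper one at a time.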
Contributions are welcome! If you have suggestions for improvements or new features, feel free to open an issue or submit a pull request. We appreciate your help in enhancing this project.
For any questions, suggestions, or issues, please contact:
- Name: Minho Song
- Email: hominsong@naver.com
- GitHub: minhosong88