A web scraper built with Selenium and Python that fetches audiobook details and a user-specified number of reviews from www.audible.com and writes them to a CSV file
Dataset Link Updated on 14-06-2021
- Working laptop/desktop
- Python. The latest version should not be a problem. Disclaimer: Python 3.7.8 is installed on my system.
- Selenium. The version was 3.141.0 when I installed it.
- Google Chrome
- chromedriver. My Chrome version was 90, so this chromedriver was the most compatible one for me. To download a compatible chromedriver, check your Chrome version first and then download the matching chromedriver.
- CSV reader (Microsoft Excel, etc.)
- IDE (PyCharm)
- ChroPath, for finding a specific XPath if needed.
Install everything in the requirements list above, then set the Python path and the chromedriver path. I am not using virtual environments (not a fan of 🐍).
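Because a mismatched chromedriver is a common setup problem, here is a minimal sketch of a version check. It is not part of this repository: the function names `major_version`, `versions_compatible`, and `installed_chromedriver_version` are hypothetical, and it assumes that Chrome and chromedriver are compatible when their major version numbers match and that `chromedriver` is reachable on your PATH.

```python
import re
import subprocess


def major_version(version_text):
    """Return the leading major version number found in a version string."""
    match = re.search(r"(\d+)\.\d+", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    return int(match.group(1))


def versions_compatible(chrome_version, chromedriver_version):
    """Assumption: both programs must share the same major version."""
    return major_version(chrome_version) == major_version(chromedriver_version)


def installed_chromedriver_version(executable="chromedriver"):
    """Ask the chromedriver binary for its version string.

    Assumes the executable is on PATH; pass a full path otherwise.
    """
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

To use it, copy your Chrome version from the browser's "About" page and call `versions_compatible(your_chrome_version, installed_chromedriver_version())`. If it returns `False`, download the chromedriver that matches your Chrome version.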
Click this badge for the process demo
- The robot needs two links: the latest product list link and the audible.com website link.
- The Reviews_Crawler function takes the number of reviews you want (in multiples of 10).
- The show_more_open_times variable controls how many times the "show more" button is clicked while collecting reviews. Ten reviews are visible before the first click, and each click loads 10 more. For example, if I set show_more_open_times to 3 in main.py, the robot clicks the "show more" button 2 times, yielding 30 reviews (the initial 10 plus 20 more) and creating 30 separate review columns in the CSV.
- The csv file is encoded in utf-8
- Unfortunately, this version (1.0) has neither a pause button nor multithreading, so iterating over 1200 books or more will take a significant amount of time. If the run is stopped, it starts again from the beginning and writes duplicate entries into the CSV.
- I have not tested the iteration over 1200 books, so please ping me if you run into any issues.
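The notes above can be illustrated with a small sketch. It is not the code from main.py: the helpers `review_plan`, `already_scraped`, and `append_book`, and the column names `title` and `review_N`, are hypothetical. It assumes the arithmetic from the example above (a value of 3 means 2 clicks and 30 reviews) and shows one possible way to avoid duplicate rows after a restart, by reading the titles already saved in the UTF-8 CSV.

```python
import csv
import os


def review_plan(show_more_open_times, reviews_per_page=10):
    """Return (clicks, total_reviews) for a show_more_open_times value.

    Follows the example in the notes: 3 gives 2 clicks and 30 reviews.
    """
    if show_more_open_times < 1:
        raise ValueError("show_more_open_times must be at least 1")
    clicks = show_more_open_times - 1
    return clicks, show_more_open_times * reviews_per_page


def already_scraped(csv_path, key_column="title"):
    """Collect the keys of books already saved, so a restart can skip them."""
    if not os.path.exists(csv_path):
        return set()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return {row[key_column] for row in csv.DictReader(handle)}


def append_book(csv_path, title, reviews, total_reviews):
    """Append one book as a row with one column per review.

    Missing reviews are left blank; reviews beyond total_reviews are dropped.
    """
    fieldnames = ["title"] + [f"review_{i}" for i in range(1, total_reviews + 1)]
    # Write the header only when the file is new or empty.
    write_header = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    row = {"title": title}
    for i, name in enumerate(fieldnames[1:]):
        row[name] = reviews[i] if i < len(reviews) else ""
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
```

A scraping loop could call `already_scraped(path)` once at startup and skip any book whose title is in the returned set before calling `append_book`. Titles are not guaranteed to be unique, so a product URL or ID would likely be a safer key if one is available.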