[Aborted] Glassdoor Interview Questions Scrapper

This project has been aborted. Several issues prevented scraping; some were resolved, but others could not be:

The ChromiumDriver used by Selenium is a test driver, and Google does not allow a proper user sign-in from it. I tried running a Chrome instance on a separate remote-debugging port and pointing the code at that port, but that did not work either (either chromedriver was not picking up the port, or Selenium was not using it: a mismatch).

$ /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222

$ chromedriver_mac_arm64/chromedriver --remote-debugging-port=9515

$ /opt/homebrew/bin/chromedriver --remote-debugging-port=9515
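One way to narrow down a port mismatch like the one described above is to check whether Chrome is actually answering on the debugging port before Selenium tries to attach. The sketch below is only an illustration, not the code used in this project: `probe_debug_port` and `attach_to_running_chrome` are hypothetical helper names, and it assumes Chrome was started with `--remote-debugging-port=9222` as in the first command. Also note that chromedriver normally takes its own listening port via `--port`, so passing it `--remote-debugging-port` may be part of the mismatch.

```python
import http.client
import json
import urllib.error
import urllib.request


def probe_debug_port(port, host="127.0.0.1", timeout=2.0):
    """Return Chrome's /json/version payload, or None if nothing answers.

    Chrome serves a small JSON document on this path when it was started
    with --remote-debugging-port=<port>.
    """
    url = f"http://{host}:{port}/json/version"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, non-HTTP reply, or invalid JSON.
        return None


def attach_to_running_chrome(port, host="127.0.0.1"):
    """Attach Selenium to an already running Chrome (needs selenium installed)."""
    # Imported lazily so probe_debug_port works without selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    # "debuggerAddress" tells chromedriver to attach instead of launching Chrome.
    options.add_experimental_option("debuggerAddress", f"{host}:{port}")
    return webdriver.Chrome(options=options)
```

If `probe_debug_port(9222)` returns `None`, Chrome is not listening where the code expects it, and attaching will fail regardless of the Selenium options.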

Another approach was to stop Google from detecting the test Chrome browser by installing the undetected_chromedriver driver. This let me sign in to my Google account, but testing became tedious because Google asked for 2-step verification on every run.

python3 -m pip install undetected_chromedriver

To get around that, I had to venture into cookies, which preserve the session for about 30 minutes.

def loadCookies(self):

def saveCookies(self):
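The bodies of these two methods are not shown here, so the following is only a minimal sketch of how saving and restoring cookies could look, not the project's actual implementation. It uses free functions instead of methods, the file name `cookies.json` is an assumption, and it relies only on Selenium's `get_cookies()` and `add_cookie()` driver methods.

```python
import json
from pathlib import Path

# Hypothetical location for the saved session; not taken from the project.
COOKIE_FILE = Path("cookies.json")


def save_cookies(driver, path=COOKIE_FILE):
    """Write the driver's current cookies to a JSON file."""
    cookies = driver.get_cookies()  # list of dicts, one per cookie
    Path(path).write_text(json.dumps(cookies), encoding="utf-8")


def load_cookies(driver, path=COOKIE_FILE):
    """Add previously saved cookies to the driver; return how many were added.

    The browser must already be on a page of the cookies' domain,
    otherwise Selenium rejects the add_cookie call.
    """
    path = Path(path)
    if not path.exists():
        return 0
    cookies = json.loads(path.read_text(encoding="utf-8"))
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)
```

A typical flow would be to open the site, call `load_cookies(driver)`, refresh the page, and call `save_cookies(driver)` again after a successful manual sign-in.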

The above worked fine; however, the issue now is with the HTML. Glassdoor never loads the entire page, and for some reason Selenium cannot identify the elements.
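One possible cause of elements not being found is that they are rendered after the initial page load. Whether that is what happens on Glassdoor is not confirmed here, so the sketch below is only something that could be tried. `wait_for_elements` is a hypothetical helper that polls the driver until a CSS selector matches or a timeout expires; Selenium's own `WebDriverWait` serves the same purpose.

```python
import time


def wait_for_elements(driver, css_selector, timeout=15.0, poll_interval=0.5):
    """Poll until at least one element matches, or return [] after the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        # "css selector" is the locator strategy string behind By.CSS_SELECTOR.
        elements = driver.find_elements("css selector", css_selector)
        if elements:
            return elements
        if time.monotonic() >= deadline:
            return []
        time.sleep(poll_interval)
```

An empty result after a generous timeout would suggest that the content sits inside an iframe, is withheld from automated browsers, or needs scrolling to load, rather than simply being slow.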

Given the above issues, I have aborted the project. An alternative would be to write a Chrome extension using jQuery, but that is out of scope for this project at the moment (15/04/2024).
