I have looked at the conda-forge recipe. It appears it should now be straightforward to add Windows support to it (alternatively, Windows users have to install it manually). Solved: Windows support conda-forge/leveldb-feedstock#12
plyvel
I believe we can switch out plyvel for python-leveldb with almost no fuss.
python-virtualdriver
This is for running xvfb, which won't work on Windows anyway, so we just need to figure out a package management solution / environment.yaml that accommodates both (most likely just making the python-xvfb install a manual step, as installing xvfb is manual anyway; maybe moving to pip will work around this).
Make some tweaks in deploy_firefox so we're not manually making paths by concatenating strings.
Also suggest making some tweaks in deploy_firefox so that we let geckodriver set a profile path and we then read it back. This will help with the goal of restoring stateful crawls and will make this work easier.
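A rough sketch of what the two deploy_firefox tweaks could look like, not the actual OpenWPM code. The `browser_<id>` directory layout and both function names are made up for illustration. Reading the profile from the `moz:profile` capability assumes geckodriver reports the profile directory there in the session capabilities, which would need checking against the geckodriver version in use.

```python
from pathlib import Path


def build_browser_paths(root, browser_id):
    """Build per-browser paths with pathlib instead of string concatenation.

    The directory layout here is hypothetical, for illustration only.
    """
    base = Path(root) / f"browser_{browser_id}"
    return base / "profile", base / "geckodriver.log"


def profile_path_from_capabilities(capabilities):
    """Read back the profile directory that geckodriver chose.

    Assumes the session capabilities dict carries it under "moz:profile".
    Returns None when the key is missing.
    """
    profile = capabilities.get("moz:profile")
    return Path(profile) if profile else None
```

With a live Selenium session this would be called as `profile_path_from_capabilities(driver.capabilities)`, so the path separators are handled by `pathlib` on each platform rather than hard-coded.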
Find a replacement for the log interceptor, which uses mkfifo and is therefore Unix-only. This Stack Overflow thread has something that maybe we can drop in as a replacement. Alternatively, I used a different approach in faust-selenium and created something to constantly "tail" geckodriver.log (https://github.com/birdsarah/faust-selenium/blob/master/crawler/geckodriver_log_reader.py). A third option is to just save geckodriver.log at the end and not weave it into our logging. @englehardt - what is the motivation for interleaving the geckodriver logs?
A first step could be to skip geckodriver logs on Windows; they're not crawl-essential as best as I can tell.
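To make the "tail" option concrete, here is a minimal sketch of a thread that follows a log file and forwards complete lines to a Python logger, without mkfifo. It is not the faust-selenium reader linked above. The `LogTailer` name, the `geckodriver:` prefix, and the polling interval are assumptions for illustration, and the log file is assumed to exist before the thread starts.

```python
import logging
import threading
import time


class LogTailer(threading.Thread):
    """Follow a log file and forward each complete line to a logger."""

    def __init__(self, path, logger, poll_interval=0.1):
        super().__init__(daemon=True)
        self.path = path
        self.logger = logger
        self.poll_interval = poll_interval
        self._stop_event = threading.Event()

    def stop(self):
        """Ask the thread to exit once it has drained the file."""
        self._stop_event.set()

    def run(self):
        buffer = ""
        with open(self.path, "r", encoding="utf-8", errors="replace") as f:
            while True:
                chunk = f.readline()
                if chunk:
                    # readline may return a partial line at end of file,
                    # so hold it until the newline arrives.
                    buffer += chunk
                    if buffer.endswith("\n"):
                        self.logger.info("geckodriver: %s", buffer.rstrip())
                        buffer = ""
                elif self._stop_event.is_set():
                    if buffer:
                        self.logger.info("geckodriver: %s", buffer.rstrip())
                    break
                else:
                    time.sleep(self.poll_interval)
```

Usage would be roughly: start the tailer after geckodriver has created its log, then call `stop()` and `join()` when the browser shuts down. Polling is less immediate than a FIFO, but it only relies on ordinary file reads.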
Future (open issues):
Add CircleCI tests and test on Win, OSX, and Linux (at least once per PR - or once a week).
An alternate version of OpenWPM was created as a proof of concept and has been used to run crawls on Windows. It uses basically the same OpenWPM instrumentation extension, but replaces the socket with a websocket and uses Kafka to orchestrate the crawl: https://github.com/birdsarah/faust-selenium
A path, I hope, to supporting Windows. There may be some limitations, but this is a first step.