A sweet, tiny site parser written in Python.
[new] Now with Cloudflare bypass
<selector> [a@ <selector>] [% <python_code>]
Note: python_code is evaluated relative to the last matched tag. Use . (dot) to get an attribute or call a method.
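To make the expression syntax concrete, here is a minimal sketch of how such a string could be split into selector steps and a trailing Python snippet, and how the snippet could be applied to a tag. It is not the project's actual implementation: `split_expression`, `apply_code`, and `FakeTag` are hypothetical names, and the sketch assumes that `@` separates steps, that the first `%` starts the Python code, and that neither character appears inside a selector.

```python
class FakeTag:
    """Hypothetical stand-in for a parsed HTML tag (not the project's class)."""

    def __init__(self, text, attrs):
        self.text = text
        self.attrs = attrs

    def get(self, name):
        return self.attrs.get(name)


def split_expression(expression):
    """Split '<selector> [a@ <selector>] [% <python_code>]' into its parts.

    Assumption: '@' and '%' never occur inside a selector.
    """
    selector_part, percent, code = expression.partition('%')
    steps = [step.strip() for step in selector_part.split('@')]
    return steps, (code.strip() if percent else None)


def apply_code(tag, code):
    """Evaluate code such as '.text' relative to the tag, i.e. as 'tag.text'."""
    return eval('tag' + code, {'tag': tag})


tag = FakeTag('Home', {'href': '/index.html'})
steps, code = split_expression('a%.get("href")')
print(steps, apply_code(tag, code))
```

With this naive split, a trailing `@` (as in `'.links a@'`) yields an empty final step, which a caller could treat as "follow the link and return the page itself".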
Make sure the system libraries required by lxml are installed:
[!] On Termux, run the command without sudo
sudo apt-get install libxml2 libxslt
Before use, install the requirements:
pip3 install -r requirements.txt
for page in parser / '.links a@':
    for p in page / 'h3:-soup-contains("Some title")+p':
        print(p.text)
./parser.py http://site.org 'a@a@title%.text'
./parser.py http://site.org 'a%.get("href")'