A simple tool, built with Scrapy and Graphviz, to find and visualise links between two sets of websites
linkminer uses Scrapy to build a higher-level network graph from two sets of URLs, which is then visualised with Graphviz. We use this tool internally for competitive intelligence, i.e. to find out which customers have some kind of relationship with specific competitors.
Install via PyPI:
pip install linkminer
Install via Git:
git clone https://github.com/INNOVINATI/linkminer.git
cd linkminer
virtualenv venv  # optional
source venv/bin/activate  # optional
python setup.py install
Extract links between two given sets of URLs:
from linkminer.miner import LinkMiner
source_urls = [...]
target_urls = [...]
m = LinkMiner(source_urls, target_urls)
m.extract()
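To make the idea of "links between two sets of URLs" concrete, here is a minimal standard-library sketch of the matching step. It is not linkminer's actual implementation: the helper names `domain` and `find_edges`, the `outbound` dictionary, and the example URLs are all hypothetical, and the crawling itself (which linkminer delegates to Scrapy) is replaced by hand-written input.

```python
from urllib.parse import urlparse


def domain(url):
    """Return the lower-cased host of a URL without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def find_edges(outbound_links, target_urls):
    """Return sorted (source_domain, target_domain) pairs.

    outbound_links is a hypothetical dict mapping each source URL to the
    links found on that site; a real crawler would collect these.
    """
    targets = {domain(u) for u in target_urls}
    edges = set()
    for source, links in outbound_links.items():
        for link in links:
            # Keep only links that point at one of the target domains.
            if domain(link) in targets:
                edges.add((domain(source), domain(link)))
    return sorted(edges)


# Hand-written stand-in for crawl results (assumed data, not real sites).
outbound = {
    "https://customer-a.example": [
        "https://www.competitor-x.example/partners",
        "https://unrelated.example",
    ],
    "https://customer-b.example": ["https://blog.example/post"],
}
edges = find_edges(outbound, ["https://competitor-x.example"])
print(edges)
```

Matching on the normalised host rather than the full URL is one possible design choice; it treats any page on a target site as evidence of a relationship.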
Render the graph:
m.render('testfile')
Export graph and data as JSON file:
m.export_json('testfile')
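For readers unfamiliar with Graphviz, the following sketch shows how a list of source/target pairs could be written in Graphviz's DOT language using only the standard library. The function `to_dot` and the sample edge are hypothetical and only illustrate the kind of graph being rendered; the output of `m.render()` may look different.

```python
def to_dot(edges):
    """Build a DOT digraph string from (source, target) pairs."""
    lines = ["digraph links {"]
    for src, dst in edges:
        # Quote node names so dots and hyphens in domains are valid DOT.
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Assumed example edge, not real data.
dot = to_dot([("customer-a.example", "competitor-x.example")])
print(dot)
```

Saved to a file such as links.dot, text like this can be turned into an image with the Graphviz command line tool, for example `dot -Tpng links.dot -o links.png`.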