Skip to content

Latest commit

 

History

History
39 lines (35 loc) · 1014 Bytes

README.md

File metadata and controls

39 lines (35 loc) · 1014 Bytes

linkminer

A simple tool to find and visualise links between two sets of websites built with Scrapy and Graphviz

About

linkminer uses the power of Scrapy to build a higher-level network graph based on two sets of URLs which is then visualised with Graphviz. We are using this tool internally for Competitive Intelligence, i.e. when we want to find out which customers have some kind of relationship with specific competitors.

Getting started

Install via PyPi:

pip install linkminer

Install via Git:

git clone https://github.com/INNOVINATI/linkminer.git
cd linkminer-master
virtualenv venv #Optional
source venv/bin/activate #Optional
pip setup.py install

Usage

Extract links from 2 given sets of URLs:

from linkminer.miner import LinkMiner

source_urls = [...]
target_urls = [...]

m = LinkMiner(source_urls, target_urls)
m.extract()

Render the graph:

m.render('testfile')

Export graph and data as JSON file:

m.export_json('testfile')