Automatically identify and highlight targets (Russian soldiers, right-wing extremists etc) in visualized social networks
UlyssesNYC https://medium.com/@DataSalo
- Make sure you have Python version 3.8 or greater installed.
- Download the tool's repository using the command:
  git clone https://github.com/DataSalo/SocNet_Dynamic_Image_Search.git
- Move to the tool's directory and install the necessary requirements:
  cd SocNet_Dynamic_Image_Search
  pip install -r requirements.txt
- For users who also wish to carry out a Selenium-driven friend-of-a-friend VK search, please follow the instructions here for Selenium driver installation.
- Go to SocNet_Dynamic_Image_Search/code and run python socnet_app.py.
- Go to http://127.0.0.1:8050/ in your browser to display the precomputed network whose id is stored in cached_network_id.txt (how to compute and cache new networks is discussed later).
- The network is visualized, but none of the people nodes are labeled. Run an image search in the upper-left corner of the screen for a photo category of interest such as "soldier", "guns", "confederate flag" or "man in cowboy hat".
- The nodes with matching photographs are now filled in with those photos.
- Use the mouse to drag the network and zoom into the network cluster of interest.
- Click any node to display the associated person's name, photograph, and social media profile link.
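The search step above can be pictured as a nearest-neighbour lookup over the cached image embeddings. The following is a minimal sketch, not the tool's actual code. It assumes that the query text and every photo have already been mapped into a shared vector space by some text-image model (this README does not say which one). The names `cosine_similarity` and `matching_nodes`, the threshold value, and the toy vectors are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as matching nothing
    return dot / (norm_a * norm_b)


def matching_nodes(
    query: Sequence[float],
    image_embeddings: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # hypothetical cut-off, not taken from the tool
) -> List[str]:
    """Return node ids whose photo embedding is close to the query, best first."""
    scored = [
        (cosine_similarity(query, embedding), node_id)
        for node_id, embedding in image_embeddings.items()
    ]
    hits = [(score, node_id) for score, node_id in scored if score >= threshold]
    # Sort by descending score, breaking ties by node id for stable output.
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [node_id for _, node_id in hits]


# Toy three-dimensional embeddings (hypothetical data for illustration only).
toy_embeddings = {
    "user_a": [0.9, 0.1, 0.0],
    "user_b": [0.0, 1.0, 0.0],
    "user_c": [0.7, 0.7, 0.0],
}
toy_query = [1.0, 0.0, 0.0]
print(matching_nodes(toy_query, toy_embeddings, threshold=0.6))
```

In this picture, only the nodes returned by `matching_nodes` would have their photos filled in, which mirrors how unmatched nodes stay blank in the visualization.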
- Go to SocNet_Dynamic_Image_Search/code and run vk_foaf_crawler.py.
- Follow the prompt to specify the VK id of the user whose friend-of-a-friend network you wish to crawl. The example id used in the demo was 414930480.
- Follow the prompts to enter the email and password of an accessible VK account.
- The script will identify all friends of the user and then obtain the existing links between those friends. It will also download all recently posted images associated with these users.
- The network will be cached locally. It will be assigned a network id associated with the specified VK account.
- Setting the id in the cached_network_id.txt file to this network id will ensure that this cached network gets displayed when we launch the Socnet Visualization App.
Enter VK id (example: 414930480):
414930480
Enter VK email:
ulyssesnycc@gmail.com
Enter VK password:
password123
Scrapping central target
https://vk.com/id414930480
Scrapping friends of target
Identified 94 friends.
Scraping friend-of-a-friend network
Downloading images associated with the target and friends
Computing and caching the downloaded image embeddings.
Saving friend-of-a-friend network.
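The crawl described above (find the target's friends, then find which of those friends know each other) can be sketched on already-collected friend lists. This is an illustrative sketch only: the real crawler gathers its data through Selenium, and the function name `friend_links`, the input dictionary shape, and the friend ids `f1`, `f2`, `f3`, `x9` are hypothetical. Including the target-to-friend edges in the result is also an assumption.

```python
from typing import Dict, Iterable, Set, Tuple


def friend_links(
    target_id: str, friends_of: Dict[str, Iterable[str]]
) -> Set[Tuple[str, ...]]:
    """Return undirected edges linking the target to its friends,
    plus edges between any two of those friends who know each other."""
    friends = set(friends_of.get(target_id, ()))
    edges: Set[Tuple[str, ...]] = set()
    for friend in friends:
        # Assumption: the central target is kept as a node in the network.
        edges.add(tuple(sorted((target_id, friend))))
        for other in friends_of.get(friend, ()):
            # Keep only links that stay inside the target's friend circle.
            if other in friends and other != friend:
                edges.add(tuple(sorted((friend, other))))
    return edges


# Hypothetical friend lists keyed by user id; only the target id is from the demo.
toy_friend_lists = {
    "414930480": ["f1", "f2", "f3"],
    "f1": ["414930480", "f2", "x9"],  # x9 is outside the friend circle
    "f2": ["414930480", "f1"],
    "f3": ["414930480"],
}
for edge in sorted(friend_links("414930480", toy_friend_lists)):
    print(edge)
```

Sorting each pair before storing it means a link reported from both sides (f1 lists f2 and f2 lists f1) is recorded only once.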
- My choice of Cytoscape.js was partly driven by its ability to embed images within nodes, as well as by its easy callbacks between graph interactions and the HTML surrounding the network (which allowed me to display enlarged photos and profile info with each click). Other, more sophisticated tools don't always allow for this level of interaction.
- Right now, the visualization tool requires that the network of interest be cached and stored locally under a specific network id. This is because the associated images and searchable image embeddings must also be stored locally. Currently, the cached network id must be specified within the cached_network_id.txt config file prior to app launch. Eventually, I'd like to make the cached networks accessible directly from the app, so that the user can seamlessly switch between multiple cached networks of choice.
- Eventually, I'd like to split the VK friend-of-a-friend crawler into a separate repo. For those wondering why the crawler depends on something as cumbersome as Selenium: it is much harder to crawl user-friends at scale using more streamlined tools like the VK API.
- Eventually, I'd like to expand the repertoire of crawlers to other social networks (including Twitter / Instagram) in order to better align with investigator use-cases.
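Until network switching is available inside the app, changing the displayed network means editing cached_network_id.txt by hand. A small helper along the following lines could do the same thing. It is a sketch under one assumption that this README does not confirm: that the file contains nothing but the network id as plain text. The function names are hypothetical, and the example writes to a temporary directory rather than to the real config file.

```python
import tempfile
from pathlib import Path


def read_cached_network_id(config_path: Path) -> str:
    """Return the network id stored in the config file, without whitespace."""
    return config_path.read_text(encoding="utf-8").strip()


def write_cached_network_id(config_path: Path, network_id: str) -> None:
    """Overwrite the config file so it points at the given cached network."""
    cleaned = str(network_id).strip()
    if not cleaned:
        raise ValueError("network id must not be empty")
    # Assumption: the file holds only the id, followed by a newline.
    config_path.write_text(cleaned + "\n", encoding="utf-8")


# Demonstrate on a throwaway file so no real configuration is touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_config = Path(tmp_dir) / "cached_network_id.txt"
    write_cached_network_id(demo_config, "414930480")
    print(read_cached_network_id(demo_config))
```

The app would still need to be restarted after the file changes, since the id is read prior to launch.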