Telegram scraping tool for researching mis-/disinformation and investigating shady goings-on.

TechRahul20/TelegramScraper

TelegramScraper

A toolkit for scraping Telegram to investigate shady goings-on.

Installation

  1. Download all files and save them to a directory of your choice.

  2. Ensure pandas and telethon are installed.

pip install pandas
pip install telethon

  3. Obtain your Telegram API details from my.telegram.org (further instructions to be added here).

  4. In a terminal, navigate to the installation directory (e.g., Desktop) and run setup.py.

cd Desktop
python3 setup.py

  5. Running setup.py walks you through the Telegram API login and prepares the toolkit with your details.
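
Before running setup.py, it can help to confirm that both dependencies are importable. The following is a minimal sketch, not part of the toolkit; the helper name `missing_packages` is hypothetical, and it relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util

# Packages the README asks you to install before running setup.py.
REQUIRED = ["pandas", "telethon"]


def missing_packages(names):
    """Return the top-level package names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
        print("Install with: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```

If anything is reported missing, install it with pip as shown in step 2 and run the check again.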

N.B.: There is currently no easy installation. I'm working on packaging everything properly to make this straightforward for non-technical users.

Usage

Once installation is complete, launch the toolkit by running launcher.py:

cd Desktop
python3 launcher.py

The launcher guides you through each of the tools. Here is an overview:

  1. Scrape group members: Scrapes all members of a Telegram group you belong to. Exports a .csv file containing each member's username (when available), user ID, name, group name and group ID. The file is named after the group.

  2. Scrape forwards from chats you are in: Scrapes all forwards from a chat you follow. Saves from, from ID, to and to ID to forwards_data.csv. It can then scrape forwards from every channel it discovers to build a larger network map. This second step takes a long time to run, but is worthwhile for a broader analysis.

  3. Scrape forwards from a channel: Scrapes all forwards from any channel you specify. It can then scrape forwards from every channel it discovers to build a larger network map. This second step takes a long time to run, but is worthwhile for a broader analysis.

     Currently, this tool only records the from user and to user, and saves them to ef_edgelist.csv.
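
To illustrate the idea behind the forward-scraping tools, here is a rough sketch of how forward records could be expanded through discovered channels and then collapsed into an edge list. It is not the toolkit's code. The function names, the `fetch_forwards` callback, the in-memory `FAKE_FORWARDS` data and the `weight` column are all assumptions made for the example; a real run would fetch records from Telegram instead.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in data: forwards found in each channel.
# "from" is the original source, "to" is the channel it was forwarded into.
FAKE_FORWARDS = {
    "channel_a": [
        {"from": "channel_b", "from_id": 2, "to": "channel_a", "to_id": 1},
        {"from": "channel_b", "from_id": 2, "to": "channel_a", "to_id": 1},
        {"from": "channel_c", "from_id": 3, "to": "channel_a", "to_id": 1},
    ],
    "channel_b": [
        {"from": "channel_c", "from_id": 3, "to": "channel_b", "to_id": 2},
    ],
    "channel_c": [],
}


def fake_fetch(channel):
    """Stand-in for a real scraper call; returns forward records for a channel."""
    return FAKE_FORWARDS.get(channel, [])


def snowball(seed, fetch_forwards, max_rounds=1):
    """Scrape the seed, then up to max_rounds rounds of newly discovered sources."""
    seen = {seed}
    frontier = [seed]
    records = []
    for _ in range(max_rounds + 1):
        next_frontier = []
        for channel in frontier:
            for record in fetch_forwards(channel):
                records.append(record)
                source = record["from"]
                if source not in seen:
                    seen.add(source)
                    next_frontier.append(source)
        frontier = next_frontier
        if not frontier:
            break
    return records


def build_edgelist(records):
    """Collapse records into (source, target, weight) rows, sorted by name."""
    counts = Counter((r["from"], r["to"]) for r in records)
    return [(src, dst, n) for (src, dst), n in sorted(counts.items())]


def write_edgelist(edges, handle):
    """Write the edge list as CSV to any file-like object."""
    writer = csv.writer(handle)
    writer.writerow(["source", "target", "weight"])
    writer.writerows(edges)


records = snowball("channel_a", fake_fetch, max_rounds=1)
edges = build_edgelist(records)
buffer = io.StringIO()
write_edgelist(edges, buffer)
print(buffer.getvalue())
```

The `seen` set keeps each channel from being scraped twice, and `max_rounds` caps how far the expansion goes, which matters because each extra round can multiply the number of channels to visit.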

Upcoming updates

  1. An option to export either all data (from user, from user ID, to user and to user ID) or just an edge list for direct analysis.

  2. Unique save-file names for each group/chat scraped.

  3. Tool to archive all messages and media from a chat.
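
As an illustration of the unique-file-name item above, one possible approach is to combine a sanitised group name, the group ID and a UTC timestamp. This is only a sketch under those assumptions; the function name `unique_filename` and the naming scheme are hypothetical, not the planned implementation.

```python
import re
from datetime import datetime, timezone


def unique_filename(group_name, group_id, suffix="members", when=None):
    """Build a file name that is safe on disk and distinct per group and run."""
    when = when or datetime.now(timezone.utc)
    # Replace every run of characters outside A-Z, a-z, 0-9 with one underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", group_name).strip("_").lower()
    slug = slug or "group"  # fall back if nothing usable remains
    stamp = when.strftime("%Y%m%dT%H%M%SZ")
    return f"{slug}_{group_id}_{suffix}_{stamp}.csv"


if __name__ == "__main__":
    print(unique_filename("My Group / Chat #1!", 12345))
```

Including the group ID keeps two groups with the same display name apart, and the timestamp stops a later scrape from overwriting an earlier one.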

Known bugs

  1. Sometimes the toolkit crashes if you use scrape group members, return to the launcher, and then select scrape forwards from chats you are in. This is an API error and can be avoided by restarting the launcher.

  2. Scrape forwards from chats you are in displays an error message when you try to pull from a group rather than a channel. I'm working on a fix that omits groups from the generated list.
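
One way the second bug could be approached is to filter the dialog list before showing it. The sketch below uses a hypothetical `DialogInfo` stand-in rather than real Telethon objects. Its `is_channel` and `is_group` fields mirror flags that Telethon's dialog objects are understood to expose, where supergroups report both as true; check this against the Telethon version you are using.

```python
from dataclasses import dataclass


@dataclass
class DialogInfo:
    """Hypothetical stand-in for the fields of a Telegram dialog used here."""
    name: str
    is_channel: bool
    is_group: bool


def broadcast_channels(dialogs):
    """Keep broadcast channels only, dropping groups and supergroups."""
    return [d for d in dialogs if d.is_channel and not d.is_group]


dialogs = [
    DialogInfo("News Channel", is_channel=True, is_group=False),
    DialogInfo("Big Supergroup", is_channel=True, is_group=True),
    DialogInfo("Small Group", is_channel=False, is_group=True),
    DialogInfo("Private Chat", is_channel=False, is_group=False),
]

for dialog in broadcast_channels(dialogs):
    print(dialog.name)
```

Checking both flags matters under that assumption, because filtering on `is_channel` alone would still let supergroups through.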

Feedback

Please send all feedback to @jordanwildon on Twitter or to jordanwildon@protonmail.com.

License

This project is still being tested and is not currently licensed. Please contact @jordanwildon on Twitter or email jordanwildon@protonmail.com for usage information and restrictions.

Credits

All tools created by Jordan Wildon (@jordanwildon) and Alex Newhouse (@AlexBNewhouse).
