A script that takes WhatsApp messages in .txt format, parses the DateTime, Sender and Message, and outputs them as a CSV file for further analysis/visualisation.
Set-up instructions have been divided into 'New User Set-up', directly below, and 'Experienced User Set-up', further down the page.
- Click 'Clone or Download', then select 'Download as .ZIP'.
- Unzip the file and put the 'whatsapp-scraper-master' folder into your 'Documents' directory.
- Download Anaconda, a Python distribution, from https://www.anaconda.com/distribution/
- Choose your OS and download the 'Python 3.7 version'.
- Once it has finished downloading, install it, choosing the default options.
- Open the 'Anaconda Navigator' application.
- Click the 'Environments' tab on the left.
- Click the 'Import' button at the bottom.
- Click the folder button next to 'Specification File'. Choose the 'environment.yml' file that's in the 'whatsapp-scraper-master' directory.
- Click 'Import'.
- Once the environment is imported, it will appear in the list of environments. It will be called 'whatsapp-conda-env'.
- Click the play button next to the 'whatsapp-conda-env' environment, and choose 'Open Terminal'.
- In the terminal, type the following commands one at a time, pressing Enter after each:
cd Documents
cd whatsapp-scraper-master
python whatsapp_scraper.py
The script will run, creating a new directory for each chat .txt file. Each directory contains the original .txt file along with a sorted .csv file.
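For readers curious about what the conversion involves, here is a minimal sketch of turning chat lines into sorted CSV rows. It is not the code in whatsapp_scraper.py. It assumes an export format of "DD/MM/YYYY, HH:MM - Sender: Message" (real exports vary with phone locale and app version), and the names `LINE_PATTERN`, `STAMP_PREFIX`, `parse_chat` and `to_csv`, as well as the header row, are hypothetical.

```python
import csv
import io
import re
from datetime import datetime

# Assumed export format: "DD/MM/YYYY, HH:MM - Sender: Message".
# Real exports differ by phone locale and app version.
LINE_PATTERN = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{4}), (\d{1,2}:\d{2}) - ([^:]+): (.*)$"
)
# Matches any line that starts with a timestamp, with or without a sender.
STAMP_PREFIX = re.compile(r"^\d{1,2}/\d{1,2}/\d{4}, \d{1,2}:\d{2} - ")


def parse_chat(lines):
    """Return a list of (datetime, sender, message) tuples."""
    rows = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_PATTERN.match(line)
        if match:
            date_part, time_part, sender, message = match.groups()
            stamp = datetime.strptime(
                f"{date_part} {time_part}", "%d/%m/%Y %H:%M"
            )
            rows.append([stamp, sender, message])
        elif STAMP_PREFIX.match(line):
            # Timestamped line with no sender (a system notice): skip it.
            continue
        elif rows:
            # A line without a timestamp continues the previous message.
            rows[-1][2] += "\n" + line
    return [tuple(row) for row in rows]


def to_csv(rows):
    """Serialise parsed rows to CSV text, sorted by timestamp."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["DateTime", "Sender", "Message"])
    for stamp, sender, message in sorted(rows, key=lambda row: row[0]):
        writer.writerow([stamp.isoformat(sep=" "), sender, message])
    return buffer.getvalue()


sample = [
    "12/03/2019, 09:15 - Alice: Hello\n",
    "12/03/2019, 09:16 - Bob: Hi there\n",
    "second line of Bob's message\n",
]
print(to_csv(parse_chat(sample)))
```

Multi-line messages are folded into the preceding row so that each CSV row stays one complete message; the csv module quotes the embedded line break automatically.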
To import the data into Excel 365 -
- Open Excel.
- Click the 'Data' panel.
- Click 'From Text/CSV'.
- Choose the .csv you want to import.
- Set 'File Origin' to '--None--' (at the top of the list), 'Delimiter' to 'Comma', and 'Data Type Detection' to 'Based on entire dataset'.
- Click 'Load'. The data will now be imported to the open worksheet.
To import the data into Excel 2016 -
- Open Excel.
- Click the 'Data' panel.
- Click 'New Query'.
- Click 'From File', then 'From CSV'.
- Choose the .csv you want to import.
- Set 'File Origin' to '--None--' (at the top of the list), 'Delimiter' to 'Comma', and 'Data Type Detection' to 'Based on entire dataset'.
- Click 'Load'. The data will now be imported to the open worksheet.
- Open 'Anaconda Navigator'.
- Click the 'Environments' tab on the left.
- Repeat the steps in '4. Running the Script'.
Python 3.7+
virtualenv
Git Bash
Command Line Interface
NOTE: All commands that use "git" are done in Git Bash. It lets you use MinGW/Linux tools with Git at the command line.
Python 3.7+
virtualenv
Git
Command Line Interface
git clone git@github.com:UoMResearchIT/whatsapp-scraper.git
cd whatsapp-scraper
$ virtualenv <virtualenv_name>
$ <virtualenv_name>\Scripts\activate
$ pip install -r requirements.txt
$ virtualenv <virtualenv_name>
$ source <virtualenv_name>/bin/activate
$ pip install -r requirements.txt
- Place the .txt files you want to process into the whatsapp-scraper directory.
- In the command line, run: python whatsapp_scraper.py
- The script will process all the .txt files in whatsapp-scraper and create a new folder for each file, containing the original .txt file and the new .csv file.
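The folder-per-chat layout described above can be sketched as follows. This is an illustration only, not the script's own code: the function name `organise_chats` is hypothetical, the source does not say whether the original .txt file is moved or copied (this sketch moves it), and the .csv written here is just a header-only placeholder.

```python
import shutil
import tempfile
from pathlib import Path


def organise_chats(directory):
    """Move each .txt file into a folder named after it.

    Returns the paths of the placeholder .csv files created alongside.
    """
    directory = Path(directory)
    csv_paths = []
    # sorted() materialises the listing before any file is moved.
    for txt_file in sorted(directory.glob("*.txt")):
        chat_dir = directory / txt_file.stem
        chat_dir.mkdir(exist_ok=True)
        # Assumption: the original file is moved rather than copied.
        shutil.move(str(txt_file), str(chat_dir / txt_file.name))
        csv_path = chat_dir / (txt_file.stem + ".csv")
        # Placeholder content; the real script writes the parsed rows here.
        csv_path.write_text("DateTime,Sender,Message\n", encoding="utf-8")
        csv_paths.append(csv_path)
    return csv_paths


# Demonstration in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "family_chat.txt").write_text(
        "12/03/2019, 09:15 - Alice: Hello\n", encoding="utf-8"
    )
    created = organise_chats(tmp)
    print([str(path.relative_to(tmp)) for path in created])
```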
To import the data into Excel -
- Open Excel.
- Click the 'Data' panel.
- Click 'From Text/CSV'.
- Choose the .csv you want to import.
- Set 'File Origin' to '--None--' (at the top of the list), 'Delimiter' to 'Comma', and 'Data Type Detection' to 'Based on entire dataset'.
- Click 'Load'. The data will now be imported to the open worksheet.
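If you would rather stay in Python than use Excel, the .csv can also be read with the standard library. The sketch below counts messages per sender. It assumes the file has a header row with a 'Sender' column, which is a guess based on the fields named at the top of this page, so check your own .csv first. The function name `messages_per_sender` and the sample data are hypothetical.

```python
import csv
import io
from collections import Counter


def messages_per_sender(csv_file):
    """Count rows per sender in an open CSV file object.

    Assumes a header row containing a 'Sender' column.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row["Sender"] for row in reader)


# In-memory stand-in for a real output file; with a real file, use
# open(path, newline="", encoding="utf-8") instead.
example = io.StringIO(
    "DateTime,Sender,Message\n"
    "2019-03-12 09:15:00,Alice,Hello\n"
    "2019-03-12 09:16:00,Bob,Hi there\n"
    "2019-03-12 09:17:00,Alice,How are you?\n"
)

counts = messages_per_sender(example)
for sender, total in counts.most_common():
    print(sender, total)
```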