This script is meant to read backup files created by the messaging app Signal.
Non-standard Python libraries:
- cryptography
  $ pip install cryptography
- Google's Protocol Buffers
  $ pip install protobuf
A Signal backup file consists of a number of consecutive frames. The first frame, the header, is unencrypted. The subsequent frames are encrypted with keys that are derived from the backup file's password and from the salt and initialization vector contained in the header.
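The paragraph above does not spell out how the keys are derived. The following is a minimal sketch of one plausible shape of that step, using only the standard library: the password is stretched by repeated SHA-512 hashing seeded with the header's salt, and the result is expanded with HKDF into a cipher key and a MAC key. The round count, the 32-byte truncation, the `Backup Export` info string and all function names are assumptions for illustration, not details confirmed here. How the initialization vector enters is left out.

```python
import hashlib
import hmac


def stretch_password(password: str, salt: bytes, rounds: int = 250_000) -> bytes:
    """Stretch the password with iterated SHA-512 (round count is an assumption)."""
    # Assumption: spaces in the password are only visual grouping and are dropped.
    secret = password.replace(" ", "").encode("utf-8")
    digest = hashlib.sha512(salt)  # the salt is mixed into the first round only
    block = secret
    for _ in range(rounds):
        digest.update(block)
        digest.update(secret)
        block = digest.digest()
        digest = hashlib.sha512()  # later rounds start from a fresh hash
    return block[:32]  # assumption: only the first 32 bytes are kept


def hkdf_sha256(key_material: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256 and an all-zero salt."""
    pseudo_random_key = hmac.new(b"\x00" * 32, key_material, hashlib.sha256).digest()
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(
            pseudo_random_key, block + info + bytes([counter]), hashlib.sha256
        ).digest()
        output += block
        counter += 1
    return output[:length]


def derive_keys(password: str, salt: bytes, rounds: int = 250_000):
    """Return a hypothetical (cipher_key, mac_key) pair of 32 bytes each."""
    stretched = stretch_password(password, salt, rounds)
    expanded = hkdf_sha256(stretched, b"Backup Export", 64)  # info string assumed
    return expanded[:32], expanded[32:]
```

In a real reader the salt would come from the parsed header frame. The HKDF step is written out by hand here only to keep the sketch free of third-party imports.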
The structure of the frames is defined in Google's Protocol Buffers specification language and can be found in the local copy of backups.proto (source). Frames can be reconstituted from their raw bytes by compiling the specification file with Google's protocol buffer compiler (protoc), which results in a Python library (backups_pb2.py) that can be employed to unpack (unmarshal) a frame.
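Before a frame can be handed to the generated library, its raw bytes have to be cut out of the file. A minimal sketch of that step for the unencrypted header frame follows. It assumes the frame is preceded by a 4-byte big-endian length, which is not stated above. The function name `read_length_prefixed` is hypothetical, and the `backups_pb2.BackupFrame` message name in the comments is likewise an assumption about the generated module.

```python
import io
import struct


def read_length_prefixed(stream) -> bytes:
    """Read one frame: a 4-byte big-endian length followed by that many bytes."""
    prefix = stream.read(4)
    if len(prefix) != 4:
        raise EOFError("truncated length prefix")
    (length,) = struct.unpack(">I", prefix)
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("truncated frame payload")
    return payload


if __name__ == "__main__":
    # Stand-in for an opened backup file: one 3-byte frame followed by more data.
    fake_file = io.BytesIO(struct.pack(">I", 3) + b"abc" + b"more data")
    raw_frame = read_length_prefixed(fake_file)
    print(raw_frame)

    # With the compiled specification available, the raw bytes could then be
    # unmarshalled roughly like this (message name assumed, not verified):
    #   import backups_pb2
    #   frame = backups_pb2.BackupFrame()
    #   frame.ParseFromString(raw_frame)
```

Encrypted frames would additionally need to be authenticated and decrypted before parsing, which this sketch does not attempt.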
- The header frame is the single first frame in a Signal backup file.
- The version frame is the second frame in a Signal backup file, recording its version number.
- A statement frame is a frame containing an SQLite statement, representing conversational data, as Signal stores conversational data in an SQLite database.
- An attachment frame is a frame containing an attachment file for a message.
- A preference frame is a generic frame format that can encapsulate certain preferences for the Signal app.
- The end frame is the final frame in a Signal backup file, signalling the end of the file.
- An avatar frame is a frame containing a contact's avatar image.
- A sticker frame is a frame containing a sticker image.
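Since statement frames carry SQLite statements, the conversational data can be rebuilt by replaying them against a fresh database. The sketch below illustrates the idea with Python's built-in `sqlite3` module. It assumes each statement frame has already been reduced to a `(sql, parameters)` pair. The function name `apply_statements` and the example table are hypothetical. It also skips statements that try to create `sqlite_`-prefixed tables, because SQLite reserves those names for internal use.

```python
import sqlite3


def apply_statements(connection, statements):
    """Replay (sql, parameters) pairs on a database; return how many were run."""
    applied = 0
    for sql, parameters in statements:
        # SQLite refuses to create tables whose names start with "sqlite_".
        if sql.lower().startswith("create table sqlite_"):
            continue
        connection.execute(sql, parameters)
        applied += 1
    connection.commit()
    return applied


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    example_statements = [
        ("CREATE TABLE sms (_id INTEGER PRIMARY KEY, body TEXT)", ()),
        ("CREATE TABLE sqlite_sequence(name, seq)", ()),
        ("INSERT INTO sms VALUES (?, ?)", (1, "hello")),
    ]
    print(apply_statements(connection, example_statements))
    print(connection.execute("SELECT body FROM sms").fetchall())
```

Passing the parameters separately, rather than formatting them into the SQL string, lets `sqlite3` handle quoting and binary values.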
- https://github.com/signalapp/: the codebase for Signal on GitHub.
- https://github.com/xeals/signal-back: a Signal backup file reader implemented in Go.
- https://github.com/bepaald/signalbackup-tools: Suite of tools to work with Signal backup files.
- https://github.com/GjjvdBurg/signal2html: Script to generate HTML files from a Signal backup file (depends on signalbackup-tools).
This code was intended for private use, but I am releasing it in case it is useful to someone else. I have only tested it on some backup files generated by the Android app, the oldest backup file being version 34. I have not put much effort into script robustness or error handling.
$ python signal_backup_reader.py --help
usage: signal_backup_reader.py [-h] [--no-attachments] [--no-sqlite]
                               BackupFile Password OutputFolder

Read and extract data from a Signal backup file.

positional arguments:
  BackupFile        Signal backup file.
  Password          Password.
  OutputFolder      Output folder.

optional arguments:
  -h, --help        show this help message and exit
  --no-attachments  Disable the extraction of attachments.
  --no-sqlite       Disable executing the SQL commands on an SQLite database.
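For readers who want to drive the script from other Python code, the help text above corresponds to a command-line interface along the following lines. This is a sketch reconstructed from the help output only. The function name `build_parser` is hypothetical, and the script's actual parser may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the help text shown above."""
    parser = argparse.ArgumentParser(
        prog="signal_backup_reader.py",
        description="Read and extract data from a Signal backup file.",
    )
    parser.add_argument("BackupFile", help="Signal backup file.")
    parser.add_argument("Password", help="Password.")
    parser.add_argument("OutputFolder", help="Output folder.")
    parser.add_argument(
        "--no-attachments",
        action="store_true",
        help="Disable the extraction of attachments.",
    )
    parser.add_argument(
        "--no-sqlite",
        action="store_true",
        help="Disable executing the SQL commands on an SQLite database.",
    )
    return parser


if __name__ == "__main__":
    # The file name, password and folder below are placeholders.
    arguments = build_parser().parse_args(
        ["signal.backup", "12345 67890", "output", "--no-sqlite"]
    )
    print(arguments.BackupFile, arguments.no_attachments, arguments.no_sqlite)
```

Both switches default to off, so attachments are extracted and the SQL statements are executed unless the corresponding flag is given.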