A tool for analyzing Android malware source code capabilities.
android-malware-capabilities-analyzer
- Table of Contents
- Introduction
- Context
- Motivation
- Proposed Solution
- Features
- Requirements
- Usage
- File Structure
- ToDos
android-malware-capabilities-analyzer is a tool that collects information from the source code of Android malware to help researchers infer what the malware is capable of. I will explain how it works and how it was developed, and then test it on some malware samples that I was able to collect.
To give some context, Cyber Threat Intelligence focuses on gathering data about past, current and future cyber-attacks in order to extract useful knowledge from them. This insight into attack trends and techniques can be used to improve organizations' security, reduce their risk and make penetration tests and attack simulations more realistic, so that organizations are ready when faced with genuine attacks.
A key step in the Cyber Threat Intelligence gathering process is identifying the capabilities of malware deployed by a threat actor. Knowing what functionality a piece of malware has, what it has access to, how it behaves and what its targets are helps to detect, contain and eliminate it. This information can be used to derive Indicators of Compromise (IoCs) and Tactics, Techniques and Procedures (TTPs), and to learn which tools the threat actor uses. All of this data is important, and is shared between organizations in a collaborative effort to improve resilience against these attacks.
Since I was going to be analyzing Android malware source code samples, I wanted a tool that would automatically extract some useful information to aid in the analysis. The analysis can be found in this repository.
The proposed solution is a Python script that searches Android source code for the AndroidManifest.xml file and for Java or Kotlin code, and extracts information relevant to the analysis of the sample, such as the package name, permissions, imports and actions. It can print that information and save it to a file in different formats. It can also aggregate the results to generate frequency graphs. Optionally, it can annotate permissions, actions and imports using a database that contains a description and other information useful for the researcher.
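As an illustration of the manifest step described above, here is a minimal sketch of how the package name, permissions and actions could be read from an AndroidManifest.xml with the standard library. It is not taken from the project's code: the function name `parse_manifest` and the sample manifest are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace that prefixes android:* attributes in AndroidManifest.xml.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def parse_manifest(manifest_xml):
    """Return the package name, permissions and intent actions of a manifest.

    Hypothetical helper written for this README, not the tool's own parser.
    """
    root = ET.fromstring(manifest_xml)
    name_attr = ANDROID_NS + "name"
    permissions = {
        node.get(name_attr)
        for node in root.iter("uses-permission")
        if node.get(name_attr)
    }
    actions = {
        node.get(name_attr)
        for node in root.iter("action")
        if node.get(name_attr)
    }
    return {
        "package": root.get("package"),
        "permissions": sorted(permissions),
        "actions": sorted(actions),
    }


# Made-up manifest used only to exercise the sketch.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.sample">
    <uses-permission android:name="android.permission.READ_SMS" />
    <uses-permission android:name="android.permission.INTERNET" />
    <application>
        <receiver android:name=".BootReceiver">
            <intent-filter>
                <action android:name="android.intent.action.BOOT_COMPLETED" />
            </intent-filter>
        </receiver>
    </application>
</manifest>
"""

if __name__ == "__main__":
    print(parse_manifest(SAMPLE_MANIFEST))
```

Permissions and actions are deduplicated and sorted here so that results from different samples are easy to compare; the real script may order them differently.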
- Highly modular
- Highly customizable
- Highly flexible
- Extract Package names
- Extract Permissions
- Extract Actions
- Extract Imports
- Specify the information to be extracted
- Show Permissions information (description, deprecated, protection level, permission group, permission group description)
- Show Actions information (description)
- Show Imports information (description)
- Specify folder depth to analyze several projects at once
- Tree-like terminal output
- Hide terminal output
- Save results to TXT
- Save results to JSON
- Save frequency results to TXT
- Generate a graph of frequency results
- Specify the type of graph
- Limit the number of graph columns
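To make the import extraction and frequency features more concrete, the following sketch shows one possible approach: a regular expression that picks up Java and Kotlin import statements, and a counter that reports in how many projects each import appears. The names `extract_imports` and `frequency` are hypothetical, and counting once per project is an assumption; the tool may count differently. The regular expression also does not skip imports that appear inside comments.

```python
import re
from collections import Counter

# Matches Java and Kotlin import statements, for example:
#   import android.telephony.SmsManager;
#   import static java.lang.Math.max;
#   import kotlinx.coroutines.*
IMPORT_RE = re.compile(
    r"^\s*import\s+(?:static\s+)?([\w.]*\w(?:\.\*)?)", re.MULTILINE
)


def extract_imports(source):
    """Return the imported names found in Java or Kotlin source text."""
    return IMPORT_RE.findall(source)


def frequency(items_per_project, limit=None):
    """Count in how many projects each item appears, most frequent first.

    `limit` keeps only the top entries, similar in spirit to limiting
    the number of graph columns.
    """
    counter = Counter()
    for items in items_per_project:
        # Count each item once per project, however often it is repeated.
        counter.update(set(items))
    return counter.most_common(limit)


if __name__ == "__main__":
    java = "package a;\nimport android.telephony.SmsManager;\nimport java.net.URL;\n"
    kotlin = "import java.net.URL\nimport kotlinx.coroutines.*\n"
    per_project = [extract_imports(java), extract_imports(kotlin)]
    for name, count in frequency(per_project, limit=3):
        print(f"{count:>3}  {name}")
```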
- Python
- PIP
- Python libraries
- Project Folder
To install Python go to the Downloads web page. For Linux you can also install it using the terminal:
# Debian-based distros
$ sudo apt install python3
To install PIP you can follow its Installation Manual. PIP is used to install Python library packages that may be required by this project, like numpy or matplotlib.
$ pip install <python-library-package-name>
Lastly, download the code of the Project.
Make sure you comply with the Requirements.
The tool is a Python script whose features are selected through command-line arguments. The help argument ('-h', '--help') prints all options that can be used with the script, together with information about their use:
usage: capability-analyzer.py [-h] [-s {package,permissions,actions,imports}] [-d DEPTH] [-i] [-n] [-t TXT] [-j JSON] [-f FREQUENCY] [-g {barplot,horizontal_barplot}] [-l LIMIT] path
Analyze Android source code capabilities.
positional arguments:
  path
      path to the folder containing the source code. It can be a folder containing subfolders
options:
  -h, --help
      show this help message and exit
  -s {package,permissions,actions,imports}, --search {package,permissions,actions,imports}
      specifies what will be analyzed from the application code
  -d DEPTH, --depth DEPTH
      path depth to aggregate results. A depth of 1 aggregates the results to the selected folder, a depth of 2 aggregates the results to the immediate subfolders, etc.
  -i, --info
      append description information to the found capabilities
  -n, --no-print
      hide terminal output
  -t TXT, --txt TXT
      save results to <TXT>.txt
  -j JSON, --json JSON
      save results to <JSON>.json
  -f FREQUENCY, --frequency FREQUENCY
      save frequency results to <FREQUENCY>.txt
  -g {barplot,horizontal_barplot}, --graph {barplot,horizontal_barplot}
      generate a specific type of graph to graphically show the result
  -l LIMIT, --limit LIMIT
      only <LIMIT> number of columns will be shown on graphs
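The depth option is easier to follow with a small example. The sketch below shows one way the aggregation folder for a given file could be derived from the depth, following the rule stated in the help text (depth 1 is the selected folder, depth 2 its immediate subfolders, and so on). The function `aggregation_key` and the sample paths are hypothetical and not taken from the script. `PurePosixPath` is used only to keep the output the same on every platform.

```python
from pathlib import PurePosixPath


def aggregation_key(root, file_path, depth=1):
    """Return the folder under which results for file_path are aggregated.

    Hypothetical helper: depth 1 maps every file to `root`, depth 2 maps
    it to the immediate subfolder of `root` that contains it, and so on.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    root = PurePosixPath(root)
    relative = PurePosixPath(file_path).relative_to(root)
    # Folder components between the root and the file itself.
    folders = relative.parts[:-1]
    return str(root.joinpath(*folders[: depth - 1]))


if __name__ == "__main__":
    sample = "samples/familyA/app/src/Main.java"
    for depth in (1, 2, 3):
        print(depth, aggregation_key("samples", sample, depth))
```

With a layout such as samples/familyA and samples/familyB, a depth of 2 would therefore report each family separately in a single run.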
# Important project components
.
├── capability-analyzer.py - # Main script
├── capabilities.json - # Database of permissions, imports, etc.
├── logic - # Main logic of the script
│   ├── parser.py - # File scanning functions
│   └── utils.py - # Helper functions
├── input - # Read from files
│   └── read_from_file.py - # Reads data from files
└── output - # Display & save the result
    ├── graph.py - # Create graphs
    ├── print.py - # Print a tree with the results
    └── save_to_file.py - # Save the result into files
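The role of capabilities.json can be sketched as a simple lookup that attaches stored information to each extracted capability. The schema shown here (top-level "permissions" key, field names such as "protection_level") is an assumption made for illustration; the real file may be organised differently. The helper names `describe` and `annotate` are also hypothetical.

```python
import json

# Hypothetical excerpt; the real capabilities.json may use another layout.
SAMPLE_DB = json.loads("""
{
  "permissions": {
    "android.permission.READ_SMS": {
      "description": "Allows an application to read SMS messages.",
      "deprecated": false,
      "protection_level": "dangerous"
    }
  }
}
""")


def describe(database, kind, name):
    """Return the stored information for a capability, or an empty dict."""
    return database.get(kind, {}).get(name, {})


def annotate(database, kind, names):
    """Map every extracted name to whatever the database knows about it."""
    return {name: describe(database, kind, name) for name in names}


if __name__ == "__main__":
    found = ["android.permission.READ_SMS", "android.permission.INTERNET"]
    print(json.dumps(annotate(SAMPLE_DB, "permissions", found), indent=2))
```

Capabilities missing from the database simply come back with no extra information, so extraction never depends on the database being complete.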
- Add descriptions to the capabilities database