AI Security Analyzer

🤖 AI Security Analyzer is a powerful tool that leverages AI to automatically generate comprehensive security documentation for your projects, including security design, threat modeling, attack surface analysis, and more.

Overview

AI Security Analyzer is a Python-based tool that analyzes your project's codebase and automatically generates detailed security documentation. It supports multiple analysis types:

🔒 Security Design Documentation
🎯 Threat Modeling
🔍 Attack Surface Analysis
⚠️ Threat Scenarios
🌳 Attack Tree Analysis

The tool supports multiple project types and utilizes advanced language models (LLMs) to create insightful security documentation tailored to your project's specific needs.

Key Features

🔍 Intelligent Analysis: Automatically analyzes codebases for security considerations
📝 Multiple Document Types: Generates various security documentation types
🤖 Multi-LLM Support: Works with OpenAI, OpenRouter, Anthropic, and Google models
🔄 Project Type Support: Python, Go, Java, Android, JavaScript, and generic projects
📊 Mermaid Diagram Validation: Built-in validation for Mermaid diagrams
🎛️ Flexible Configuration: Extensive file filtering and customization options
🌐 Cross-Platform: Runs on Windows, macOS, and Linux

Prerequisites

Python 3.11
Node.js: Required for validating Mermaid diagrams in Markdown.
Poetry: For managing Python dependencies.

Installation

From Source

Clone the repository and install dependencies using the provided script:

git clone git@github.com:xvnpw/ai-security-analyzer.git
cd ai-security-analyzer
./build.sh  # Installs Python and Node.js dependencies
poetry run python ai_security_analyzer/app.py --help

Using Docker

You can run the application using Docker without installing Python or Node.js locally.

In PowerShell (Windows):

docker run -v C:\path\to\your\project:/target `
           -e OPENAI_API_KEY=$Env:OPENAI_API_KEY `
           ghcr.io/xvnpw/ai-security-analyzer:latest `
           dir -v -t /target -o /target/security_design.md

In Bash (Linux/macOS):

docker run -v ~/path/to/your/project:/target \
           -e OPENAI_API_KEY=$OPENAI_API_KEY \
           ghcr.io/xvnpw/ai-security-analyzer:latest \
           dir -v -t /target -o /target/security_design.md

Token Usage and Cost Management ⚠️

Understanding Token Consumption

In dir mode, this application may consume a significant number of tokens because of its workflow:

Each file is processed and sent to the LLM
Multiple rounds of analysis for comprehensive documentation
Additional tokens for markdown validation and fixes
Large codebases can lead to substantial token usage

Cost Control Best Practices 💰

Always Start with Dry Run

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    --dry-run

This will show you:

Total number of tokens to be processed
List of files that will be analyzed
No actual API calls will be made
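
To get a feel for the numbers before running the tool at all, the sketch below estimates token usage for a directory. It is not the tool's own counting logic: the `estimate_tokens` and `dry_run_report` helpers are hypothetical, and the 4-characters-per-token ratio is only a rule of thumb, so treat the result as an order-of-magnitude figure.

```python
from pathlib import Path

# Assumption: roughly 4 characters per token, a common rule of thumb.
# The real tool may count tokens with a model-specific tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for a piece of text (rounded up)."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def dry_run_report(root: str) -> tuple[list[str], int]:
    """List readable text files under root and sum their estimated tokens."""
    files: list[str] = []
    total = 0
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        files.append(path.relative_to(root).as_posix())
        total += estimate_tokens(text)
    return files, total
```

Calling `dry_run_report("/path/to/your/project")` returns the file list and the estimated total, which mirrors the two pieces of information the real `--dry-run` prints.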

Optimize File Selection
- Use --exclude to skip non-essential files:
```
--exclude "**/tests/**,**/docs/**,LICENSE,*.md"
```
- Focus on security-relevant files with --filter-keywords:
```
--filter-keywords "security,auth,crypto,password,secret,token"
```
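
The two filters above can be pictured with a small sketch. It is an illustration only: `is_excluded` and `contains_keyword` are hypothetical helpers, matching uses the standard library's `fnmatchcase` (where `*` also matches `/`), and case-insensitive keyword matching is an assumption. The tool's actual glob and keyword semantics may differ.

```python
from fnmatch import fnmatchcase


def is_excluded(rel_path: str, exclude: str) -> bool:
    """Return True if rel_path matches any comma-separated glob pattern."""
    patterns = [p.strip() for p in exclude.split(",") if p.strip()]
    return any(fnmatchcase(rel_path, p) for p in patterns)


def contains_keyword(text: str, keywords: str) -> bool:
    """Return True if text contains any comma-separated keyword.

    Assumption: matching is case-insensitive substring search.
    """
    words = [k.strip().lower() for k in keywords.split(",") if k.strip()]
    lowered = text.lower()
    return any(w in lowered for w in words)


EXCLUDE = "**/tests/**,**/docs/**,LICENSE,*.md"
KEYWORDS = "security,auth,crypto,password,secret,token"

print(is_excluded("src/tests/test_app.py", EXCLUDE))
print(contains_keyword("def check_password(user): ...", KEYWORDS))
```

Note that with `fnmatchcase` a pattern such as `**/tests/**` needs a path segment before `tests`, so a top-level `tests/` directory would not match in this sketch.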

Recommendations

In dir mode, start with --dry-run to assess token usage
Use file filtering options to reduce scope
Consider running on smaller, security-critical portions first
Test on smaller codebases before analyzing large projects
Keep track of your API usage limits and costs

Architecture

To help you understand how the application works, we've included application flow diagrams for each mode.

Application Flow for `dir` mode

stateDiagram-v2
    [*] --> Configure_Application
    Configure_Application --> Load_Project_Files
    Load_Project_Files --> Apply_Filters
    Apply_Filters --> Split_into_Chunks
    Split_into_Chunks --> Initial_Draft
    Initial_Draft --> Update_Draft
    Update_Draft --> Update_Draft: Process More Docs
    Update_Draft --> Validate_Markdown: All Docs Processed
    Validate_Markdown --> Editor: Invalid Markdown
    Editor --> Validate_Markdown: Fix Formatting
    Validate_Markdown --> [*]: Valid Markdown

The application follows these high-level steps:

Configure Application: Parses command-line arguments and sets up the configuration.
Load Project Files: Loads files from the specified target directory, applying include/exclude rules.
Apply Filters: Sorts and filters documents based on specified keywords and patterns.
Split into Chunks: Splits documents into smaller chunks that fit within the LLM's context window.
Create Initial Draft: Uses the LLM to generate an initial security document based on the first batch of documents.
Process More Docs: Iteratively updates the draft by processing additional document batches.
Validate Markdown: Checks the generated markdown for syntax and Mermaid diagram correctness.
Fix Formatting: If validation fails, uses the editor LLM to fix markdown formatting issues.
Completion: Finalizes the security documentation.
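
The chunking and iterative drafting steps above can be sketched as follows. This is a simplified illustration, not the tool's implementation: all function names are hypothetical, sizes are measured in characters as a stand-in for tokens, and the LLM is represented by a plain callable that takes the current draft and a batch of chunks.

```python
from typing import Callable

# Hypothetical LLM interface: (current_draft, batch_of_chunks) -> new draft.
LLM = Callable[[str, list[str]], str]


def split_into_chunks(text: str, chunk_size: int) -> list[str]:
    """Split text into pieces of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def batch_by_budget(chunks: list[str], budget: int) -> list[list[str]]:
    """Group chunks into batches whose combined length stays within budget."""
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for chunk in chunks:
        if current and used + len(chunk) > budget:
            batches.append(current)
            current, used = [], 0
        current.append(chunk)
        used += len(chunk)
    if current:
        batches.append(current)
    return batches


def build_draft(docs: list[str], chunk_size: int, budget: int, llm: LLM) -> str:
    """Create a draft from the first batch, then update it with each later batch."""
    chunks = [c for doc in docs for c in split_into_chunks(doc, chunk_size)]
    draft = ""
    for batch in batch_by_budget(chunks, budget):
        draft = llm(draft, batch)
    return draft
```

Here `chunk_size` plays the role of `--files-chunk-size` and `budget` the role of `--files-context-window`; each LLM call sees the draft so far plus one batch.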

Application Flow for `github` mode

stateDiagram-v2
    [*] --> Configure_Application
    Configure_Application --> Create_Initial_Draft
    Create_Initial_Draft --> Refine_Draft
    Refine_Draft --> Refine_Draft: More Refinements Needed
    Refine_Draft --> Validate_Markdown: All Refinements Done
    Validate_Markdown --> Editor: Invalid Markdown
    Editor --> Validate_Markdown: Fix Formatting
    Validate_Markdown --> [*]: Valid Markdown

The application follows these high-level steps:

Configure Application: Parses command-line arguments and sets up the configuration.
Create Initial Draft: Uses the LLM to generate an initial security document based on the GitHub repository URL.
Refine Draft: Iteratively refines the draft to improve its quality (number of iterations configurable via --refinement-count).
Validate Markdown: Checks the generated markdown for syntax and Mermaid diagram correctness.
Fix Formatting: If validation fails, uses the editor LLM to fix markdown formatting issues.
Completion: Finalizes the security documentation.

Application Flow for `file` mode

stateDiagram-v2
    [*] --> Configure_Application
    Configure_Application --> Load_File
    Load_File --> Create_Initial_Draft
    Create_Initial_Draft --> Refine_Draft
    Refine_Draft --> Refine_Draft: More Refinements Needed
    Refine_Draft --> Validate_Markdown: All Refinements Done
    Validate_Markdown --> Editor: Invalid Markdown
    Editor --> Validate_Markdown: Fix Formatting
    Validate_Markdown --> [*]: Valid Markdown

The application follows these high-level steps:

Configure Application: Parses command-line arguments and sets up the configuration.
Load File: Loads the specified file for analysis.
Create Initial Draft: Uses the LLM to generate an initial security document based on the file content.
Refine Draft: Iteratively refines the draft to improve its quality (number of iterations configurable via --refinement-count).
Validate Markdown: Checks the generated markdown for syntax and Mermaid diagram correctness.
Fix Formatting: If validation fails, uses the editor LLM to fix markdown formatting issues.
Completion: Finalizes the security documentation.
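
The refine and validate/fix loops shared by the github and file flows can be sketched like this. The function names and callables are hypothetical stand-ins for the agent LLM, the Markdown validator, and the editor LLM; `max_turns` mirrors the documented `--editor-max-turns-count` default of 3.

```python
from typing import Callable


def refine(draft: str, refine_once: Callable[[str], str], refinement_count: int) -> str:
    """Apply the refinement step refinement_count times."""
    for _ in range(refinement_count):
        draft = refine_once(draft)
    return draft


def validate_and_fix(
    draft: str,
    validate: Callable[[str], list[str]],
    fix: Callable[[str, list[str]], str],
    max_turns: int = 3,
) -> tuple[str, bool]:
    """Return (document, is_valid) after at most max_turns fix attempts."""
    for _ in range(max_turns):
        errors = validate(draft)
        if not errors:
            return draft, True
        draft = fix(draft, errors)
    # Out of turns: report whether the last fix happened to succeed.
    return draft, not validate(draft)
```

Bounding the number of editor turns keeps a document that cannot be repaired from consuming tokens indefinitely.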

Configuration

The application accepts various command-line arguments to tailor its behavior.

General Options

mode: Required. Operation mode (dir, github, file):
- dir: Analyze a local directory (sends all matching files from the directory to the LLM)
- github: Analyze a GitHub repository (uses the model's own knowledge of the repository to generate documentation)
- file: Analyze a single file
-h, --help: Show help message and exit.
-v, --verbose: Enable verbose logging.
-d, --debug: Enable debug logging.

Input/Output Options

-t, --target: Required. Target based on mode:
- For dir mode: Directory path to analyze
- For github mode: GitHub repository URL (must start with 'https://github.com/')
- For file mode: File path to analyze
-o, --output-file: Output file for the security documentation. Default is stdout.
-p, --project-type: For dir mode only. Type of project (python, generic, go, java, android, javascript). Default is python.
--exclude: For dir mode only. Comma-separated list of patterns to exclude from analysis using python glob patterns (e.g., LICENSE,**/tests/**).
--exclude-mode: For dir mode only. How to handle the exclude patterns (add appends them to the default excludes, override replaces the defaults). Default is add.
--include: For dir mode only. Comma-separated list of patterns to include in the analysis using python glob patterns (e.g., **/*.java).
--include-mode: For dir mode only. How to handle the include patterns (add appends them to the default includes, override replaces the defaults). Default is add.
--filter-keywords: For dir mode only. Comma-separated list of keywords. Only files containing these keywords will be analyzed.
--dry-run: For dir mode only. Perform a dry run. Prints configuration and list of files to analyze without making API calls.

Agent Configuration

--agent-provider: LLM provider for the agent (openai, openrouter, anthropic, google). Default is openai.
--agent-model: Model name for the agent. Default is gpt-4o.
--agent-temperature: Sampling temperature for the agent model (between 0 and 1). Default is 0.
--agent-preamble-enabled: Enable preamble in the output.
--agent-preamble: Preamble text added to the beginning of the output.
--agent-prompt-type: Prompt to use in agent (default: sec-design). Options are:
- sec-design: Generate a security design document for the project.
- threat-modeling: Perform threat modeling for the project.
- attack-surface: Perform attack surface analysis for the project.
- threat-scenarios: Perform threat scenarios analysis for the project using Daniel Miessler's prompt.
- attack-tree: Perform attack tree analysis for the project.
--refinement-count: For github and file modes only. Number of iterations to refine the generated documentation (default: 1). Higher values may produce more detailed and polished output but will increase token usage.
--files-context-window: For dir mode only. Maximum token size for LLM context window. Automatically determined if not set.
--files-chunk-size: For dir mode only. Chunk size in tokens for splitting files. Automatically determined if not set.

Editor Configuration

--editor-provider: LLM provider for the editor (openai, openrouter, anthropic, google). Default is openai.
--editor-model: Model name for the editor. Default is gpt-4o.
--editor-temperature: Sampling temperature for the editor model. Default is 0.
--editor-max-turns-count: Maximum number of attempts the editor will try to fix markdown issues. Default is 3.
--node-path: Path to the Node.js binary. Attempts to auto-detect if not provided.

Environment Variables

Set one of the following environment variables based on your chosen LLM provider:

OPENAI_API_KEY
OPENROUTER_API_KEY
ANTHROPIC_API_KEY
GOOGLE_API_KEY
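
A quick pre-flight check can catch a missing key before any analysis starts. The sketch below is an assumption-laden illustration: `missing_api_keys` is a hypothetical helper, and it assumes that the agent and the editor each need the key of their own provider, which matters when `--agent-provider` and `--editor-provider` differ.

```python
import os

# Provider names and variable names as listed in this README.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def missing_api_keys(agent_provider="openai", editor_provider="openai", environ=None):
    """Return the sorted names of required API key variables that are unset or empty."""
    env = os.environ if environ is None else environ
    needed = {PROVIDER_ENV_VARS[agent_provider], PROVIDER_ENV_VARS[editor_provider]}
    return sorted(name for name in needed if not env.get(name))


print(missing_api_keys("anthropic", "anthropic", environ={}))
```

An empty list means every key this sketch considers necessary is present.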

Usage Examples

Basic Usage Examples

Generate a security design document for a Python project (default):

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    -o security_design.md

Generate a threat model for a Python project:

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    -o threat_model.md \
    --agent-prompt-type threat-modeling

Analyze a GitHub repository:

poetry run python ai_security_analyzer/app.py \
    github \
    -t https://github.com/user/repo \
    -o security_analysis.md

Analyze a single file:

poetry run python ai_security_analyzer/app.py \
    file \
    -t examples/FLASK-o1-preview.md \
    -o attack_tree.md \
    --agent-prompt-type attack-tree

Advanced Configuration Examples

Custom file filtering with specific focus:

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    -o security_design.md \
    -p generic \
    --exclude "**/tests/**,**/docs/**" \
    --include "**/*.py,**/*.java" \
    --filter-keywords "security,auth,crypto,password"

This example:

Excludes test files and documentation (in addition to the default excludes)
Only includes Python and Java source files
Focuses on files containing security-related keywords

Using Anthropic's Claude model with custom temperature:

export ANTHROPIC_API_KEY=your_key_here
poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    -o security_design.md \
    --agent-provider anthropic \
    --agent-model claude-3-5-sonnet-20240620 \
    --agent-temperature 0.7 \
    --editor-provider anthropic \
    --editor-model claude-3-5-sonnet-20240620

Attack surface analysis with custom refinement count:

poetry run python ai_security_analyzer/app.py \
    github \
    -t https://github.com/user/repo \
    -o attack_surface.md \
    --agent-prompt-type attack-surface \
    --refinement-count 3

Using Google's Gemini model:

export GOOGLE_API_KEY=your_key_here
poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    -o security_design.md \
    --agent-provider google \
    --agent-model gemini-2.0-flash-thinking-exp \
    --agent-temperature 0 \
    --editor-provider google \
    --editor-model gemini-2.0-flash-thinking-exp

Project-Specific Examples

Java/Android project analysis:

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/android/project \
    -o security_design.md \
    --project-type android \
    --exclude "**/build/**,**/.gradle/**" \
    --include "**/*.xml"

JavaScript/Node.js project:

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/node/project \
    -o security_design.md \
    --project-type javascript \
    --exclude "**/node_modules/**" \
    --include "**/*.json" \
    --filter-keywords "auth,jwt,cookie,session"

Performance Optimization Examples

Dry run with token estimation:

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    --dry-run \
    --exclude "**/tests/**,**/docs/**" \
    --filter-keywords "security,auth"

Custom context window and chunk size:

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    -o security_design.md \
    --files-context-window 70000 \
    --files-chunk-size 50000

Verbose logging and debugging:

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    -o security_design.md \
    -v \
    -d

Output Customization Examples

Custom preamble for generated content:

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    -o security_design.md \
    --agent-preamble-enabled \
    --agent-preamble "# Security Analysis (AI Generated on $(date))"

Attack tree analysis with stdout output:

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    --agent-prompt-type attack-tree

Real World Examples

See the examples directory for real-world output, such as analyses of the Flask framework and the requests library.

Supported Project Types - for `dir` mode only

Python
Go
Java
Android
JavaScript
More to come...

To analyze a project type that is not listed, use the generic project type together with the --include, --include-mode, --exclude, and --exclude-mode options.

Example:

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    -o security_design.md \
    --project-type generic \
    --include "**/*.java"

Troubleshooting

Common Issues

Chunk Size Longer Than Specified

You may encounter a warning like:

langchain_text_splitters.base - WARNING - Created a chunk of size 78862, which is longer than the specified 70000

This warning means the text splitter produced a chunk larger than the configured chunk size, so some document chunks may exceed the LLM's context window. To resolve this, ensure that --files-chunk-size is lower than --files-context-window.

Example:

poetry run python ai_security_analyzer/app.py \
    dir \
    -t /path/to/your/project \
    --files-chunk-size 50000 \
    --files-context-window 70000
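
If you script runs of the tool, the relationship between the two settings can be checked up front. This is a minimal sketch with a hypothetical `check_chunk_settings` helper; it only encodes the rule stated above and does not reproduce any validation the tool itself may perform.

```python
def check_chunk_settings(chunk_size: int, context_window: int) -> None:
    """Raise ValueError unless 0 < chunk_size < context_window."""
    if chunk_size <= 0 or context_window <= 0:
        raise ValueError("chunk size and context window must be positive")
    if chunk_size >= context_window:
        raise ValueError(
            f"chunk size {chunk_size} must be lower than context window {context_window}"
        )


check_chunk_settings(chunk_size=50000, context_window=70000)  # passes silently
```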

Node.js Not Found

If you receive an error indicating that Node.js is not found:

FileNotFoundError: Node.js binary not found. Please install Node.js.

Ensure that Node.js is installed and added to your system's PATH, or provide the path using the --node-path option.

OpenAI API Key Not Set

If you get an error about OPENAI_API_KEY:

Error: OPENAI_API_KEY not set in environment variables.

Make sure you've set the OPENAI_API_KEY environment variable:

export OPENAI_API_KEY=your_openai_api_key

Supported LLM Providers

OpenAI - Industry standard.
OpenRouter - Multi-model gateway.
Anthropic - Claude models.
Google - Gemini models.

Contributing

Contributions are welcome! Please open issues and pull requests. Ensure that you follow the existing code style and include tests for new features.

License

This project is licensed under the MIT License. You are free to use, modify, and distribute this software as per the terms of the license.
