Skip to content

Enterprise-grade SQL Agent using PydanticAI and Azure OpenAI for secure natural language to SQL conversion with comprehensive safety validation

Notifications You must be signed in to change notification settings

abbaouiAchraf/advanced-sqlagent-cookbook

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

1 Commit
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ€– Advanced PydanticAI SQL Agent

Python 3.9+ License: MIT Docker PydanticAI

A production-ready, enterprise-grade SQL Agent that converts natural language questions into safe SQL queries using PydanticAI and Azure OpenAI. Built with comprehensive security mechanisms, type-safe responses, and multi-interface support.

✨ Key Features

πŸ›‘οΈ Security & Safety

  • Multi-layered SQL injection prevention with pattern matching and validation
  • Dangerous operation blocking (INSERT, UPDATE, DELETE, DROP, etc.)
  • Query complexity analysis and performance optimization
  • Rate limiting (30 queries/minute) with abuse prevention
  • Comprehensive audit logging with security monitoring

🎯 Type Safety & Validation

  • Type-safe responses with Pydantic models (Success, InvalidRequest, DangerousQuery)
  • Structured output validation ensuring consistent API responses
  • Input sanitization and query normalization
  • Comprehensive error handling with user-friendly messages

πŸ”§ Enterprise Ready

  • Azure OpenAI integration with retry logic and failover
  • Multi-database support (SQLite, PostgreSQL, MySQL, SQL Server)
  • Docker deployment with production-ready configurations
  • Connection pooling and performance optimization
  • Environment-based configuration with secrets management

πŸ–₯️ Multiple Interfaces

  • Interactive CLI with natural language processing
  • Streamlit web interface with real-time query visualization
  • REST API ready architecture for integration
  • Comprehensive examples and usage patterns

πŸ“Š Advanced Analytics

  • Structured analytics queries with pre-built templates
  • Dynamic query generation based on business metrics
  • Data visualization support with export capabilities
  • Sample data with realistic e-commerce scenarios

πŸš€ Quick Start

Option 1: Python Installation

Prerequisites

  • Python 3.9 or higher
  • Azure OpenAI account with deployment

Setup

# 1. Clone the repository
git clone https://github.com/abbaouiAchraf/advanced-sqlagent-cookbook.git
cd advanced-sqlagent-cookbook

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -e .

# 4. Configure environment
cp .env.example .env
# Edit .env with your Azure OpenAI credentials

# 5. Run the application
python main.py

Option 2: Docker Installation (Recommended)

Prerequisites

  • Docker and Docker Compose
  • Azure OpenAI credentials

Quick Start

# 1. Clone and configure
git clone https://github.com/abbaouiAchraf/advanced-sqlagent-cookbook.git
cd advanced-sqlagent-cookbook
cp .env.example .env
# Edit .env with your Azure OpenAI credentials

# 2. Start with Docker
docker-compose up --build

# 3. Access applications
# - Web Interface: http://localhost:8502
# - Interactive CLI: docker exec -it advanced-sql-agent python main.py

βš™οΈ Configuration

Required Environment Variables

# Azure OpenAI Configuration (Required)
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-api-key-here
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4
AZURE_OPENAI_API_VERSION=2024-06-01

# Database Configuration (Optional - defaults to SQLite)
DATABASE_URL=sqlite:///./sql_agent.db

# Security Configuration (Optional)
ENABLE_SAFETY_CHECKS=true
RATE_LIMIT_QPM=30
MAX_RETRIES=5

# Logging Configuration (Optional)
LOG_LEVEL=INFO

Database Options

# SQLite (Default - no setup required)
DATABASE_URL=sqlite:///./data/sql_agent.db

# PostgreSQL
DATABASE_URL=postgresql://username:password@localhost:5432/database_name

# SQL Server
DATABASE_URL=mssql+pyodbc://username:password@server/database?driver=ODBC+Driver+17+for+SQL+Server

# MySQL
DATABASE_URL=mysql+pymysql://username:password@localhost:3306/database_name

πŸ’» Usage Examples

Interactive CLI Mode

python main.py

# Choose option 1 for interactive mode
# Example questions:
# - "How many customers do we have?"
# - "What is the total revenue by country?"
# - "Show me the top 5 products by sales"
# - "Which customers haven't placed orders recently?"

Web Interface

streamlit run examples/streamlit_app.py
# Or with Docker: http://localhost:8502

# Features:
# - Real-time query execution
# - Visual results with charts
# - Query history and export
# - Security metrics dashboard

Programmatic Usage

from src import EnhancedPydanticSQLAgent, DatabaseManager, settings

# Initialize
db_manager = DatabaseManager(settings.database_url)
agent = EnhancedPydanticSQLAgent(db_manager, user_id="demo")

# Query
result = await agent.query("How many orders were placed this month?")

if isinstance(result, Success):
    print(f"SQL: {result.sql_query}")
    print(f"Data: {result.data}")
    print(f"Rows: {result.row_count}")

Sample Queries You Can Try

# Business Intelligence
"What's our total revenue this year?"
"Show me top customers by order value"
"Which products are running low on stock?"

# Analytics
"Average order value by country"
"Monthly sales trends"
"Customer retention analysis"

# Security Test (These will be blocked)
"DELETE FROM customers"  # πŸ›‘οΈ Blocked
"DROP TABLE orders"      # πŸ›‘οΈ Blocked

🐳 Docker Deployment

Development Environment

# Start all services with SQLite
docker-compose up --build

# Access points:
# - Web UI: http://localhost:8502
# - CLI: docker exec -it advanced-sql-agent python main.py

Production with PostgreSQL

# Start with PostgreSQL backend
docker-compose --profile postgres up -d --build

# Services:
# - PostgreSQL: localhost:5432
# - Web UI: http://localhost:8503
# - API ready for integration

Available Services

Service Port Description
sql-agent 8501 Interactive CLI application
streamlit-web 8502 Web interface (SQLite)
postgres 5432 PostgreSQL database
sql-agent-postgres 8503 Web interface (PostgreSQL)

For detailed Docker documentation, see docker/README.md.

πŸ“ Project Structure

Advanced-SqlAgent-cookbook/
β”œβ”€β”€ src/                          # Core application code
β”‚   β”œβ”€β”€ agents/                   # SQL Agent implementations
β”‚   β”‚   └── sql_agent.py         # Main PydanticAI agent
β”‚   β”œβ”€β”€ config/                   # Configuration management
β”‚   β”‚   β”œβ”€β”€ azure_config.py      # Azure OpenAI setup
β”‚   β”‚   └── settings.py          # Environment settings
β”‚   β”œβ”€β”€ database/                 # Database management
β”‚   β”‚   β”œβ”€β”€ connection.py        # Connection pooling
β”‚   β”‚   β”œβ”€β”€ manager.py           # Database operations
β”‚   β”‚   └── schema.py            # Schema introspection
β”‚   β”œβ”€β”€ models/                   # Pydantic models
β”‚   β”‚   β”œβ”€β”€ database_models.py   # Database schemas
β”‚   β”‚   β”œβ”€β”€ request_models.py    # Request validation
β”‚   β”‚   └── response_models.py   # Response types
β”‚   β”œβ”€β”€ safety/                   # Security & validation
β”‚   β”‚   β”œβ”€β”€ security.py          # Audit & rate limiting
β”‚   β”‚   └── validators.py        # SQL safety checks
β”‚   └── utils/                    # Utilities
β”‚       β”œβ”€β”€ helpers.py           # SQL formatting & analysis
β”‚       └── logging.py           # Logging configuration
β”œβ”€β”€ examples/                     # Usage examples
β”‚   β”œβ”€β”€ basic_usage.py           # Simple queries demo
β”‚   β”œβ”€β”€ advanced_analytics.py   # Analytics examples
β”‚   └── streamlit_app.py         # Web interface
β”œβ”€β”€ scripts/                      # Database initialization
β”‚   β”œβ”€β”€ init_database.py         # SQLite sample data
β”‚   └── init_postgres.sql        # PostgreSQL setup
β”œβ”€β”€ docker/                       # Docker documentation
β”‚   └── README.md                # Deployment guide
β”œβ”€β”€ tests/                        # Test suite
β”œβ”€β”€ main.py                       # Interactive application
β”œβ”€β”€ docker-compose.yml           # Multi-service setup
β”œβ”€β”€ Dockerfile                   # Container definition
β”œβ”€β”€ pyproject.toml               # Dependencies & config
└── .env.example                 # Environment template

πŸ—οΈ Architecture Overview

Core Components

  • EnhancedPydanticSQLAgent: Main agent using PydanticAI framework
  • DatabaseManager: Handles connections and query execution
  • SQLSafetyValidator: Multi-layer security validation
  • Response Models: Type-safe response handling (Success, InvalidRequest, DangerousQuery)

Security Layers

  1. Input Validation: Pydantic model validation
  2. Pattern Matching: Dangerous operation detection
  3. SQL Parsing: Syntax and structure validation
  4. Rate Limiting: Abuse prevention (30 queries/minute)
  5. Audit Logging: Security monitoring and tracking

Data Flow

User Question β†’ Input Validation β†’ Azure OpenAI β†’ SQL Generation β†’ 
Safety Validation β†’ Database Execution β†’ Response Formatting β†’ User

πŸ› οΈ Development

Setup Development Environment

# Install development dependencies
pip install -e .[dev]

# Run tests
pytest --cov=src --cov-report=term-missing

# Code formatting
black src/ tests/ examples/
isort src/ tests/ examples/

# Type checking
mypy src/

# Linting
ruff check src/ tests/ examples/

Sample Data

The project includes comprehensive sample data:

  • 15 customers from various countries
  • 25 products across 5 categories (Electronics, Footwear, Clothing, Furniture, Appliances)
  • 40 orders with realistic statuses and pricing
  • E-commerce relationships with proper foreign keys

🚨 Troubleshooting

Common Issues

Azure OpenAI Connection

# Error: "Azure OpenAI connection failed"
# Solution: Verify your .env configuration
cat .env | grep AZURE_OPENAI

# Check endpoint format (should end with /)
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/

Database Connection

# SQLite permission error
chmod 644 ./data/sql_agent.db

# PostgreSQL connection timeout
# Solution: Increase connection timeout in settings
CONNECTION_POOL_TIMEOUT=60

Docker Issues

# Port already in use
docker-compose down  # Stop existing services
lsof -i :8502        # Find process using port
kill -9 <PID>        # Kill the process

# Container build fails
docker system prune -a  # Clean Docker cache
docker-compose build --no-cache

Rate Limiting

# Error: "Rate limit exceeded"
# Solution: Wait 1 minute or adjust rate limit
RATE_LIMIT_QPM=60  # Increase from default 30

Getting Help

  1. Check the logs: docker-compose logs sql-agent
  2. Verify configuration: Ensure .env file has correct values
  3. Test database: Run python scripts/init_database.py
  4. Security metrics: Check python main.py β†’ option 1 β†’ security status

πŸ“š Additional Resources

Documentation

Examples and Tutorials

  • examples/basic_usage.py - Simple query examples
  • examples/advanced_analytics.py - Business intelligence queries
  • examples/streamlit_app.py - Web interface implementation

Security

  • SQL injection prevention patterns
  • Rate limiting implementation
  • Audit logging and monitoring
  • Query complexity analysis

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes and add tests
  4. Run the test suite: pytest
  5. Format code: black . && isort .
  6. Commit changes: git commit -m 'Add amazing feature'
  7. Push to branch: git push origin feature/amazing-feature
  8. Open a Pull Request

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

πŸ“§ Contact

ABBAOUI Achraf


⭐ Star this repository if you find it useful! ⭐

About

Enterprise-grade SQL Agent using PydanticAI and Azure OpenAI for secure natural language to SQL conversion with comprehensive safety validation

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published