SQLTraceBench is an innovative open-source project that transforms real SQL traces and database schemas into comprehensive, cross-database benchmark workloads. Our mission is to enable seamless performance comparison and validation across different database systems through intelligent trace analysis, schema conversion, and workload generation.
- Cross-Database Migration Challenges: Organizations struggle to validate performance when migrating between database systems (StarRocks ↔ ClickHouse, MySQL → TiDB, etc.)
- Lack of Real-World Benchmarks: Traditional benchmarks like TPC-H don't reflect your actual workload patterns
- Manual Effort in Performance Testing: Converting schemas and adapting queries across databases is time-intensive and error-prone
- Inconsistent Load Testing: Difficulty in generating realistic, parameterized workloads that mirror production traffic
SQLTraceBench addresses these pain points by:
✅ Automated Cross-Database Schema Conversion - Transform schemas between StarRocks, ClickHouse, Doris, MySQL, PostgreSQL, and more
✅ Intelligent SQL Trace Analysis - Parse real SQL traces and extract meaningful patterns
✅ Template-Based Workload Generation - Convert queries into parameterized templates with realistic data distributions
✅ Controllable Load Simulation - Adjust QPS, concurrency, hotspot ratios, and selectivity parameters
✅ Comprehensive Validation Framework - Compare generated benchmarks against original traces with detailed deviation analysis
- Multi-Database Support: StarRocks, ClickHouse, Doris, MySQL, PostgreSQL, TiDB, OceanBase, MongoDB
- Trace-Driven Analysis: Convert real SQL traces into reproducible benchmark workloads
- Schema Translation: Automated conversion of database schemas across different systems
- Parameterization Engine: Extract parameter distributions from real traces for realistic data generation
- Load Control: Fine-tune QPS, concurrency, and hotspot distribution
- Validation & Reporting: Comprehensive comparison between original and synthetic workloads
- Plugin Architecture: Extensible framework for adding new database support
- Data Synthesis: Generate realistic datasets based on actual data characteristics
- Performance Metrics: Track QPS distribution, latency percentiles, row counts, and hotspot coverage
- Deviation Analysis: Identify and minimize differences between real and synthetic workloads
- Integration Ready: Built-in support for existing benchmark tools and frameworks
SQLTraceBench follows a modular, plugin-based architecture designed for extensibility and maintainability. For detailed technical architecture, see our Architecture Documentation.
graph LR
A[SQL Traces + Schema] --> B[Parser Engine]
B --> C[Template Generator]
C --> D[Parameter Modeler]
D --> E[Schema Converter]
E --> F[Workload Generator]
F --> G[Benchmark Executor]
G --> H[Validation Reporter]
go install github.com/turtacn/SQLTraceBench/cmd/sql_trace_bench@latest
# Download from releases
curl -LO https://github.com/turtacn/SQLTraceBench/releases/latest/download/sql_trace_bench_linux_amd64.tar.gz
tar -xzf sql_trace_bench_linux_amd64.tar.gz
sudo mv sql_trace_bench /usr/local/bin/
git clone https://github.com/turtacn/SQLTraceBench.git
cd SQLTraceBench
make build
# Convert StarRocks traces to ClickHouse benchmark
sql_trace_bench convert \
--source-db starrocks \
--target-db clickhouse \
--schema ./examples/tpcc_schema.sql \
--traces ./examples/tpcc_traces.jsonl \
--output ./output/
# Generate synthetic workload with custom parameters
sql_trace_bench generate \
--template-dir ./output/templates/ \
--param-model ./output/param_model.json \
--qps 100 \
--duration 5m \
--hotspot-ratio 0.8 \
--output ./workload/
# Execute benchmark and validate results
sql_trace_bench run \
--workload-dir ./workload/ \
--db-config ./config/clickhouse.yaml \
--validate \
--report ./results/
Input Schema (TPC-C Example):
-- examples/tpcc_schema.sql
CREATE TABLE warehouse (
w_id INT PRIMARY KEY,
w_name VARCHAR(10),
w_street_1 VARCHAR(20),
w_city VARCHAR(20),
w_state CHAR(2),
w_zip CHAR(9),
w_tax DECIMAL(4,2),
w_ytd DECIMAL(12,2)
) ENGINE=OLAP
DISTRIBUTED BY HASH(w_id);
Input Trace:
{"timestamp": "2025-08-15T10:00:01Z", "query": "SELECT w_name, w_tax FROM warehouse WHERE w_id = 1", "execution_time_ms": 2.5, "rows_returned": 1}
{"timestamp": "2025-08-15T10:00:02Z", "query": "SELECT COUNT(*) FROM warehouse WHERE w_state = 'NY'", "execution_time_ms": 15.0, "rows_returned": 1}
Generated Output:
-- Output: ClickHouse Schema
CREATE TABLE warehouse (
w_id Int32,
w_name String,
w_street_1 String,
w_city String,
w_state FixedString(2),
w_zip FixedString(9),
w_tax Decimal(4,2),
w_ytd Decimal(12,2)
) ENGINE = MergeTree()
ORDER BY w_id;
Run make demo
to generate this demonstration or see demo/README.md for creating your own demo.
Database | Schema Conversion | Query Translation | Status |
---|---|---|---|
StarRocks | ✅ | ✅ | Stable |
ClickHouse | ✅ | ✅ | Stable |
Apache Doris | ✅ | ✅ | Beta |
MySQL | ✅ | ✅ | Beta |
PostgreSQL | ✅ | ✅ | Planning |
TiDB | ✅ | ✅ | Planning |
OceanBase | 🔄 | 🔄 | Development |
MongoDB | 🔄 | 🔄 | Planning |
We welcome contributions from the community! SQLTraceBench is built by developers, for developers.
- Fork the repository
- Create your feature branch (
git checkout -b feature/amazing-feature
) - Commit your changes (
git commit -m 'Add some amazing feature'
) - Push to the branch (
git push origin feature/amazing-feature
) - Open a Pull Request
# Clone and setup development environment
git clone https://github.com/turtacn/SQLTraceBench.git
cd SQLTraceBench
make setup-dev
# Run tests
make test
# Run linting
make lint
- 🔧 Database Plugins: Add support for new database systems
- 📊 Query Analyzers: Improve SQL parsing and template extraction
- 🎯 Load Generators: Enhance workload generation strategies
- 📚 Documentation: Help us improve docs and examples
- 🧪 Testing: Add test cases and improve test coverage
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
SQLTraceBench builds upon and integrates with several excellent open-source projects:
- StarRocks SQLTransformer for SQL translation capabilities
- ClickHouse TPC-DS for benchmark methodology
- ANTLR for SQL parsing infrastructure
- 💬 Discussions: GitHub Discussions
- 🐛 Issues: GitHub Issues
- 📧 Email: sqltracebench@turtacn.com
- 🌟 Star us on GitHub if SQLTraceBench helps you!