This repository contains the replication package for our empirical study analyzing performance evolution patterns in Java projects at the method level. Our research provides comprehensive insights into how code changes impact performance across 15 mature open-source Java projects.
Performance is a critical quality attribute in software development, yet the impact of method-level code changes on performance evolution remains poorly understood. We conducted a large-scale empirical study analyzing performance evolution in 15 mature open-source Java projects hosted on GitHub. Our analysis encompassed 739 commits containing 1,499 method-level code changes, using the Java Microbenchmark Harness (JMH) for precise performance measurement and rigorous statistical analysis to quantify both the significance and magnitude of performance variations.
Our study addresses four key research questions:
- RQ1: What are the patterns of performance changes in Java projects over time?
- RQ2: How do different types of code changes correlate with performance impacts?
- RQ3: How do developer experience and code change complexity relate to performance impact magnitude?
- RQ4: Are there significant differences in performance evolution patterns across different domains or project sizes?
Key findings from our analysis:
- 32.7% of method-level changes result in measurable performance impacts
- Performance regressions occur 1.3 times more frequently than improvements (18.5% vs 14.2%)
- No significant differences in performance impact distributions across code change categories
- Algorithmic changes demonstrate the highest improvement potential (25.6%) but carry substantial regression risk (33.9%)
- Senior developers produce more stable changes with fewer extreme variations
- Domain-size interactions reveal significant patterns, with web server + small projects exhibiting the highest performance instability (42.2%)
Repository structure:

```
├── jperfevo/                              # Main analysis package
│   ├── core/                              # Core analysis components
│   │   ├── agreement_analyzer.py          # Inter-rater agreement analysis (Cohen's κ)
│   │   ├── code_diff_generator.py         # Code difference visualization
│   │   ├── code_pair_generator.py         # Method pair extraction from Git history
│   │   ├── code_pair_inserter.py          # Database insertion utilities
│   │   ├── github_author_experience.py    # Developer experience quantification
│   │   ├── method_complexity_analyzer.py  # Code change complexity scoring
│   │   ├── method_mapper.py               # Method mapping between commit versions
│   │   └── performance_diff_significance.py  # Statistical significance testing
│   ├── models/                            # Data models
│   │   └── code_pair.py                   # Code pair data structure
│   ├── rq/                                # Research question analysis modules
│   │   ├── rq1.py                         # RQ1: Temporal performance patterns
│   │   ├── rq2.py                         # RQ2: Code change type analysis
│   │   ├── rq3.py                         # RQ3: Developer experience & complexity
│   │   └── rq4.py                         # RQ4: Domain and size analysis
│   └── services/                          # Utility services
│       ├── db_service.py                  # MongoDB database operations
│       └── similarity_service.py          # Code similarity analysis
├── jphb-performance-data/                 # Raw performance measurement data
│   ├── chronicle-core/                    # Performance data per project
│   ├── client-java/
│   ├── objenesis/
│   ├── protostuff/
│   └── performance_data.json              # JMH benchmark execution results
├── results/                               # Processed analysis results
│   ├── [project-name]/                    # Per-project analysis results
│   │   ├── author_experiences.json        # Developer experience scores
│   │   ├── method_complexities.json       # Code change complexity metrics
│   │   ├── method_mappings.json           # Method version mappings
│   │   └── labelings.json                 # Code change type classifications
│   ├── apm-agent-java/                    # 15 projects total with results
│   ├── chronicle-core/
│   ├── client-java/
│   ├── feign/
│   ├── hdrhistogram/
│   ├── jctools/
│   ├── jdbi/
│   ├── jersey/
│   ├── jetty/
│   ├── netty/
│   ├── objenesis/
│   ├── protostuff/
│   ├── simpleflatmapper/
│   └── zipkin/
├── projects.json                          # Study dataset configuration
├── statistics.json                        # Project statistics and metadata
├── requirements.txt                       # Python dependencies
└── README.md                              # This file
```
Our study analyzes 15 mature open-source Java projects across diverse domains:
| Project | Domain | KLOC | Method Changes | Benchmarks | Results Available |
|---|---|---|---|---|---|
| jetty | Web Server | 339.06 | 2,472 | 12,720 | ✓ |
| netty | Networking | 216.98 | 4,241 | 7,669 | ✓ |
| fastjson2 | Data Processing | 178.5 | 1,726 | 3,725 | ✓ |
| apm-agent-java | Monitoring | 80.22 | 891 | 2,984 | ✓ |
| jersey | Web Server | 158.91 | 310 | 781 | ✓ |
| simpleflatmapper | Data Processing | 51.79 | 911 | 1,969 | ✓ |
| protostuff | Data Processing | 42.29 | 448 | 1,354 | ✓ |
| jctools | System Programming | 31.48 | 339 | 1,042 | ✓ |
| jdbi | Data Processing | 28.49 | 1,266 | 1,919 | ✓ |
| client-java | Monitoring | 27.38 | 155 | 667 | ✓ |
| zipkin | Monitoring | 23.51 | 656 | 2,726 | ✓ |
| feign | Web Server | 17.42 | 351 | 1,384 | ✓ |
| chronicle-core | System Programming | 13.25 | 780 | 3,170 | ✓ |
| hdrhistogram | Monitoring | 8.89 | 158 | 317 | ✓ |
| objenesis | Testing | 2.69 | 107 | 784 | ✓ |
Total: 1,499 method-level changes across 739 commits
This replication package includes complete processed datasets, making it immediately usable for:
- ✓ Verification of statistical analyses without recomputation
- ✓ Extension studies using our methodology on new projects
- ✓ Comparative analyses across different project characteristics
- ✓ Educational use for performance analysis techniques
All results in the `results/` directory are derived from ~80 machine-days of computation, providing reviewers with immediate access to the complete study dataset.
Prerequisites:
- Python 3.8+
- Java 8+ (for benchmark execution)
- MongoDB (for data storage)
- Git (for repository cloning)
- Clone the repository:

  ```bash
  git clone https://github.com/mooselab/empirical-java-performance-evolution
  cd empirical-java-performance-evolution
  ```

- Create and activate a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables by creating a `.env` file in the root directory:

  ```
  DB_NAME=your_db_name
  DB_URL=your_mongodb_url
  CLOUD_DB_URL=your_cloud_mongodb_url  # Optional
  GITHUB_TOKEN=your_github_token       # For API access
  ```
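As a quick sanity check that the environment is wired up, the minimal sketch below reads the `.env` settings and connects to MongoDB. It assumes `python-dotenv` and `pymongo` are available (verify against `requirements.txt`); it is illustrative, not part of the analysis pipeline:

```python
import os

from dotenv import load_dotenv  # assumed dependency; check requirements.txt
from pymongo import MongoClient

load_dotenv()  # loads DB_NAME, DB_URL, etc. from the .env file

# Connect with the same variables the analysis scripts are configured with.
client = MongoClient(os.environ["DB_URL"])
db = client[os.environ["DB_NAME"]]
print("Collections:", db.list_collection_names())
```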
The repository includes processed results for immediate analysis. You can directly run research question analyses on the provided datasets:
```bash
# Analyze temporal performance patterns using processed data
python -m jperfevo.rq.rq1

# Examine code change type impacts
python -m jperfevo.rq.rq2

# Explore developer experience and complexity relationships
python -m jperfevo.rq.rq3

# Investigate domain and size patterns
python -m jperfevo.rq.rq4
```
- Microbenchmarking: Java Microbenchmark Harness (JMH) with 3 forks × 5 measurement iterations (15 measurements per benchmark in total)
- Instrumentation: Custom Java bytecode instrumentation for method-level metrics (see JIB)
- Statistical Analysis: Mann-Whitney U-test (p < 0.05) and Cliff's Delta effect size (|δ| ≥ 0.147)
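For intuition, here is a minimal sketch of this decision rule in Python. It is not the repository's implementation (see `jperfevo/core/performance_diff_significance.py` for that); the naive O(n·m) Cliff's delta and the sample timings are illustrative only:

```python
from scipy.stats import mannwhitneyu  # scipy is assumed to be installed


def cliffs_delta(before, after):
    """Cliff's delta: P(x > y) - P(x < y) over all (before, after) pairs."""
    greater = sum(1 for x in before for y in after if x > y)
    less = sum(1 for x in before for y in after if x < y)
    return (greater - less) / (len(before) * len(after))


def classify(before, after):
    """Label a change using the thresholds from the methodology above."""
    _, p_value = mannwhitneyu(before, after, alternative="two-sided")
    delta = cliffs_delta(before, after)
    if p_value >= 0.05 or abs(delta) < 0.147:
        return "no significant change"
    # Samples are execution times: lower values after the change = improvement.
    return "improvement" if delta > 0 else "regression"


# Example with made-up timings (seconds per operation):
print(classify(before=[1.20, 1.22, 1.19, 1.21, 1.23],
               after=[1.05, 1.07, 1.04, 1.06, 1.08]))  # -> improvement
```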
We categorize method-level changes into seven types:
- ALG: Algorithmic Change
- CF: Control Flow
- DS: Data Structure & Variable
- REF: Refactoring & Code Cleanup
- ER: Exception & Return Handling
- CON: Concurrency
- API: API/Library Call
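When working with these labels programmatically, a small enum keeps the abbreviations readable. The label strings below are hypothetical; consult `results/*/labelings.json` for the exact values stored in the dataset:

```python
from enum import Enum


class ChangeType(Enum):
    """The seven change categories above; string values are illustrative."""
    ALG = "Algorithmic Change"
    CF = "Control Flow"
    DS = "Data Structure & Variable"
    REF = "Refactoring & Code Cleanup"
    ER = "Exception & Return Handling"
    CON = "Concurrency"
    API = "API/Library Call"
```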
Multi-dimensional experience quantification based on:
- GitHub account age (20% weight)
- Project-specific contributions (30% weight)
- Total contributions across projects (25% weight)
- Code review participation (25% weight)
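The sketch below illustrates how such a weighted score can be combined. The weights come from the list above, while the min-max normalization against project-wide maxima is an assumption made for the example (see `jperfevo/core/github_author_experience.py` for the actual computation):

```python
# Weights from the list above; the normalization scheme is assumed.
WEIGHTS = {
    "account_age": 0.20,
    "project_contributions": 0.30,
    "total_contributions": 0.25,
    "review_participation": 0.25,
}


def experience_score(metrics, maxima):
    """Combine metrics, each min-max normalized to [0, 1], into one score."""
    return sum(
        weight * min(metrics[name] / maxima[name], 1.0)
        for name, weight in WEIGHTS.items()
        if maxima[name] > 0
    )


# Hypothetical contributor measured against project-wide maxima:
metrics = {"account_age": 8, "project_contributions": 120,
           "total_contributions": 900, "review_participation": 45}
maxima = {"account_age": 12, "project_contributions": 400,
          "total_contributions": 5000, "review_participation": 200}
print(f"Experience score: {experience_score(metrics, maxima):.3f}")
```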
The analysis generates comprehensive visualizations and statistics:
- Temporal patterns across project lifecycles
- Code change impact distributions by category
- Developer experience correlations with performance outcomes
- Domain-specific and size-based patterns
- Statistical significance tests for all major findings
Results are saved in the `plots/` directory, organized by research question.
This replication package provides complete processed datasets, including:
- Study analysis dataset: all study information, aggregated in `dataset/dataset.csv`
- Performance measurements: raw JMH benchmark execution results in `jphb-performance-data/`
- Method mappings: version tracking across 1,499 method changes in `results/*/method_mappings.json`
- Developer experience scores: quantified contributor expertise in `results/*/author_experiences.json`
- Code change complexity metrics: weighted complexity scores in `results/*/method_complexities.json`
- Code change classifications: manual labels with inter-rater agreement (κ = 0.96) in `results/*/labelings.json`
- Statistical test results: all significance tests and effect sizes, embedded in the analysis scripts
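A minimal sketch for loading these artifacts into Python follows; the file layout matches the list above, but the columns of `dataset.csv` and the JSON schemas are not documented here, so inspect them first:

```python
import json
from pathlib import Path

import pandas as pd  # assumed available via requirements.txt

# Aggregated study dataset.
df = pd.read_csv("dataset/dataset.csv")
print(df.shape)
print(df.columns.tolist())

# Per-project method mappings (one JSON file per project).
for mapping_file in sorted(Path("results").glob("*/method_mappings.json")):
    with mapping_file.open() as f:
        mappings = json.load(f)
    print(f"{mapping_file.parent.name}: {len(mappings)} entries")
```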
Each project directory in `results/` contains:

```
project-name/
├── author_experiences.json   # Developer experience quantification
├── method_complexities.json  # Code change complexity analysis
├── method_mappings.json      # Method version mappings with performance data
└── labelings.json            # Code change type classifications (when available)
```
We welcome contributions to improve the analysis tools and extend the study:
- Fork the repository
- Create a feature branch (`git checkout -b feature/improvement`)
- Commit your changes (`git commit -am 'Add improvement'`)
- Push to the branch (`git push origin feature/improvement`)
- Create a Pull Request
- Kaveh Shahedi - kaveh.shahedi@polymtl.ca
- Heng Li (Corresponding Author) - heng.li@polymtl.ca
Department of Computer Engineering and Software Engineering
Polytechnique Montréal, Canada
This project is licensed under the MIT License - see the LICENSE file for details.
- Natural Sciences and Engineering Research Council of Canada (NSERC) - Grant #RGPIN-2021-03900
- Open-source projects and their maintainers for providing high-quality benchmarks
- Research community for foundational work in performance analysis