An Empirical Study on Method-Level Performance Evolution in Open-Source Java Projects


This repository contains the replication package for our empirical study analyzing performance evolution patterns in Java projects at the method level. Our research provides comprehensive insights into how code changes impact performance across 15 mature open-source Java projects.

📖 Abstract

Performance is a critical quality attribute in software development, yet the impact of method-level code changes on performance evolution remains poorly understood. We conducted a large-scale empirical study analyzing performance evolution in 15 mature open-source Java projects hosted on GitHub. Our analysis encompassed 739 commits containing 1,499 method-level code changes, using Java Microbenchmark Harness (JMH) for precise performance measurement and rigorous statistical analysis to quantify both the significance and magnitude of performance variations.

🎯 Research Questions

Our study addresses four key research questions:

  • RQ1: What are the patterns of performance changes in Java projects over time?
  • RQ2: How do different types of code changes correlate with performance impacts?
  • RQ3: How do developer experience and code change complexity relate to performance impact magnitude?
  • RQ4: Are there significant differences in performance evolution patterns across different domains or project sizes?

πŸ” Key Findings

  • 32.7% of method-level changes result in measurable performance impacts
  • Performance regressions occur about 1.3 times as often as improvements (18.5% vs. 14.2%)
  • No significant differences in performance impact distributions across code change categories
  • Algorithmic changes demonstrate the highest improvement potential (25.6%) but carry substantial regression risk (33.9%)
  • Senior developers produce more stable changes with fewer extreme variations
  • Domain-size interactions reveal significant patterns, with web server + small projects exhibiting the highest performance instability (42.2%)

πŸ—οΈ Repository Structure

├── jperfevo/                          # Main analysis package
│   ├── core/                          # Core analysis components
│   │   ├── agreement_analyzer.py      # Inter-rater agreement analysis (Cohen's κ)
│   │   ├── code_diff_generator.py     # Code difference visualization
│   │   ├── code_pair_generator.py     # Method pair extraction from Git history
│   │   ├── code_pair_inserter.py      # Database insertion utilities
│   │   ├── github_author_experience.py # Developer experience quantification
│   │   ├── method_complexity_analyzer.py # Code change complexity scoring
│   │   ├── method_mapper.py           # Method mapping between commit versions
│   │   └── performance_diff_significance.py # Statistical significance testing
│   ├── models/                        # Data models
│   │   └── code_pair.py              # Code pair data structure
│   ├── rq/                           # Research question analysis modules
│   │   ├── rq1.py                    # RQ1: Temporal performance patterns
│   │   ├── rq2.py                    # RQ2: Code change type analysis
│   │   ├── rq3.py                    # RQ3: Developer experience & complexity
│   │   └── rq4.py                    # RQ4: Domain and size analysis
│   └── services/                     # Utility services
│       ├── db_service.py             # MongoDB database operations
│       └── similarity_service.py     # Code similarity analysis
├── jphb-performance-data/             # Raw performance measurement data
│   ├── chronicle-core/               # Performance data per project
│   ├── client-java/
│   ├── objenesis/
│   └── protostuff/
│       └── performance_data.json     # JMH benchmark execution results
├── results/                          # Processed analysis results
│   ├── [project-name]/              # Per-project analysis results
│   │   ├── author_experiences.json   # Developer experience scores
│   │   ├── method_complexities.json  # Code change complexity metrics
│   │   ├── method_mappings.json      # Method version mappings
│   │   └── labelings.json           # Code change type classifications
│   ├── apm-agent-java/              # 15 projects total with results
│   ├── chronicle-core/
│   ├── client-java/
│   ├── feign/
│   ├── hdrhistogram/
│   ├── jctools/
│   ├── jdbi/
│   ├── jersey/
│   ├── jetty/
│   ├── netty/
│   ├── objenesis/
│   ├── protostuff/
│   ├── simpleflatmapper/
│   └── zipkin/
├── projects.json                     # Study dataset configuration
├── statistics.json                   # Project statistics and metadata
├── requirements.txt                  # Python dependencies
└── README.md                         # This file

📊 Dataset

Our study analyzes 15 mature open-source Java projects across diverse domains:

| Project | Domain | KLOC | Method Changes | Benchmarks | Results Available |
|---|---|---|---|---|---|
| jetty | Web Server | 339.06 | 2,472 | 12,720 | ✅ |
| netty | Networking | 216.98 | 4,241 | 7,669 | ✅ |
| fastjson2 | Data Processing | 178.5 | 1,726 | 3,725 | ✅ |
| apm-agent-java | Monitoring | 80.22 | 891 | 2,984 | ✅ |
| jersey | Web Server | 158.91 | 310 | 781 | ✅ |
| simpleflatmapper | Data Processing | 51.79 | 911 | 1,969 | ✅ |
| protostuff | Data Processing | 42.29 | 448 | 1,354 | ✅ |
| jctools | System Programming | 31.48 | 339 | 1,042 | ✅ |
| jdbi | Data Processing | 28.49 | 1,266 | 1,919 | ✅ |
| client-java | Monitoring | 27.38 | 155 | 667 | ✅ |
| zipkin | Monitoring | 23.51 | 656 | 2,726 | ✅ |
| feign | Web Server | 17.42 | 351 | 1,384 | ✅ |
| chronicle-core | System Programming | 13.25 | 780 | 3,170 | ✅ |
| hdrhistogram | Monitoring | 8.89 | 158 | 317 | ✅ |
| objenesis | Testing | 2.69 | 107 | 784 | ✅ |

Total: 1,499 method-level changes across 739 commits

Data Completeness

This replication package includes complete processed datasets, making it immediately usable for:

  • ✅ Verification of statistical analyses without recomputation
  • ✅ Extension studies using our methodology on new projects
  • ✅ Comparative analyses across different project characteristics
  • ✅ Educational use for performance analysis techniques

All results in the results/ directory are derived from ~80 machine days of computation, providing reviewers with immediate access to the complete study dataset.

🚀 Setup and Installation

Prerequisites

  • Python 3.8+
  • Java 8+ (for benchmark execution)
  • MongoDB (for data storage)
  • Git (for repository cloning)

Installation

  1. Clone the repository:
git clone https://github.com/mooselab/empirical-java-performance-evolution
cd empirical-java-performance-evolution
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Set up environment variables by creating a .env file in the root directory (see the loading sketch below):
DB_NAME=your_db_name
DB_URL=your_mongodb_url
CLOUD_DB_URL=your_cloud_mongodb_url  # Optional
GITHUB_TOKEN=your_github_token       # For GitHub API access
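
The analysis services read this configuration from the environment. A minimal sketch of loading it in Python, assuming the common python-dotenv package (check requirements.txt for the project's actual dependency):

import os
from dotenv import load_dotenv  # assumed dependency; verify against requirements.txt

load_dotenv()  # reads key=value pairs from .env into the process environment
db_name = os.environ["DB_NAME"]
db_url = os.environ["DB_URL"]
github_token = os.getenv("GITHUB_TOKEN")  # optional; only needed for GitHub API calls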

📈 Usage

Quick Start with Processed Data

The repository includes processed results for immediate analysis. You can directly run research question analyses on the provided datasets:

# Analyze temporal performance patterns using processed data
python -m jperfevo.rq.rq1

# Examine code change type impacts
python -m jperfevo.rq.rq2

# Explore developer experience and complexity relationships
python -m jperfevo.rq.rq3

# Investigate domain and size patterns
python -m jperfevo.rq.rq4

🔬 Methodology

Performance Measurement

  • Microbenchmarking: Java Microbenchmark Harness (JMH) with 3 forks and 5 iterations (15 iterations in total)
  • Instrumentation: Custom Java bytecode instrumentation for method-level metrics (see JIB)
  • Statistical Analysis: Mann-Whitney U-test (p < 0.05) and Cliff's Delta effect size (|δ| ≥ 0.147); a sketch of this criterion follows
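
For readers applying the same criterion to their own measurements, here is a minimal sketch of the decision rule, assuming two lists of JMH sample times (one per method version). The function names are illustrative, not the package's actual API:

from scipy.stats import mannwhitneyu

def cliffs_delta(xs, ys):
    # Cliff's Delta: P(x > y) - P(x < y), estimated over all sample pairs
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))

def is_significant_change(before, after, alpha=0.05, min_delta=0.147):
    # Significant only if the U-test rejects at alpha AND the effect size is non-negligible
    _, p_value = mannwhitneyu(before, after, alternative="two-sided")
    return p_value < alpha and abs(cliffs_delta(before, after)) >= min_delta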

Code Change Classification

We categorize method-level changes into seven types:

  • ALG: Algorithmic Change
  • CF: Control Flow
  • DS: Data Structure & Variable
  • REF: Refactoring & Code Cleanup
  • ER: Exception & Return Handling
  • CON: Concurrency
  • API: API/Library Call

Developer Experience Scoring

Multi-dimensional experience quantification based on the following weighted components (a scoring sketch follows the list):

  • GitHub account age (20% weight)
  • Project-specific contributions (30% weight)
  • Total contributions across projects (25% weight)
  • Code review participation (25% weight)
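
A minimal sketch of this weighting, assuming each component has already been normalized to [0, 1]; the actual normalization lives in github_author_experience.py and may differ:

EXPERIENCE_WEIGHTS = {
    "account_age": 0.20,            # GitHub account age
    "project_contributions": 0.30,  # contributions to the studied project
    "total_contributions": 0.25,    # contributions across all projects
    "review_participation": 0.25,   # code review participation
}

def experience_score(components: dict) -> float:
    # Weighted sum of the four normalized experience components
    return sum(weight * components[name] for name, weight in EXPERIENCE_WEIGHTS.items())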

📊 Results and Visualizations

The analysis generates comprehensive visualizations and statistics:

  • Temporal patterns across project lifecycles
  • Code change impact distributions by category
  • Developer experience correlations with performance outcomes
  • Domain-specific and size-based patterns
  • Statistical significance tests for all major findings

Results are saved in the plots/ directory organized by research question.

🔄 Reproducibility

Data Availability

This replication package provides complete processed datasets including:

  • Study analysis dataset: Aggregated dataset with all study information in dataset/dataset.csv
  • Performance measurements: Raw JMH benchmark execution results in jphb-performance-data/
  • Method mappings: Version tracking across 1,499 method changes in results/*/method_mappings.json
  • Developer experience scores: Quantified contributor expertise in results/*/author_experiences.json
  • Code change complexity metrics: Weighted complexity scores in results/*/method_complexities.json
  • Code change classifications: Manual labels with inter-rater agreement (κ = 0.96) in results/*/labelings.json; a verification sketch follows this list
  • Statistical test results: All significance tests and effect sizes embedded in analysis scripts
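
The reported agreement can be re-derived from two raters' parallel label lists. A minimal sketch using scikit-learn's cohen_kappa_score (the labels below are illustrative only; the study's own computation lives in jperfevo/core/agreement_analyzer.py):

from sklearn.metrics import cohen_kappa_score

rater_a = ["ALG", "REF", "CF", "ALG", "API"]  # illustrative labels, not study data
rater_b = ["ALG", "REF", "DS", "ALG", "API"]
print(f"Cohen's kappa: {cohen_kappa_score(rater_a, rater_b):.2f}")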

Processed Results Structure

Each project in results/ contains:

project-name/
├── author_experiences.json    # Developer experience quantification
├── method_complexities.json   # Code change complexity analysis
├── method_mappings.json       # Method version mappings with performance data
└── labelings.json             # Code change type classifications (when available)
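
A minimal sketch for loading one project's processed artifacts, following the layout above (the JSON schemas inside each file are not documented here, so field access is left to the reader):

import json
from pathlib import Path

def load_project_results(project: str, base: str = "results") -> dict:
    # Collect every per-project JSON artifact that exists into a single dict
    artifacts = {}
    for name in ("author_experiences", "method_complexities",
                 "method_mappings", "labelings"):
        path = Path(base) / project / f"{name}.json"
        if path.exists():  # labelings.json is only present when available
            artifacts[name] = json.loads(path.read_text())
    return artifacts

data = load_project_results("objenesis")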

🤝 Contributing

We welcome contributions to improve the analysis tools and extend the study:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/improvement)
  3. Commit your changes (git commit -am 'Add improvement')
  4. Push to the branch (git push origin feature/improvement)
  5. Create a Pull Request

📧 Contact

Department of Computer Engineering and Software Engineering
Polytechnique Montréal, Canada

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Natural Sciences and Engineering Research Council of Canada (NSERC) - Grant #RGPIN-2021-03900
  • Open-source projects and their maintainers for providing high-quality benchmarks
  • Research community for foundational work in performance analysis
