Compile boost (minimal) from source, add more documentation, updated results #20
Open: HFTrader wants to merge 21 commits into rust-leipzig:master from HFTrader:master
Added more info about the newly added libraries.
…ng, cleanup spreadsheet script
- Fix typo in project name (RegexPeformance -> RegexPerformance)
- Fix inconsistent C++ standard (changed -std=c++11 to -std=c++20)
- Add Excel temporary files to .gitignore (.~lock.*#, *.tmp)
- Enhanced CMakeLists.txt with -mtune=native alongside existing -march=native
- Updated build_deps_simple.sh with -march=native -mtune=native for all dependencies
- Added comprehensive documentation for modern Clang 19.1.6 toolchain build process
- All 11 regex engines now built with CPU-specific optimizations
- Performance improvements observed across all regex engines
- Rust components automatically use -C target-cpu=native via Cargo
src/CMakeLists.txt:
- Fix RE2 linking with proper CMake target-based approach
- Use find_package(re2) and re2::re2 target instead of manual linking
- Resolve complex RE2/Abseil dependency issues

src/main.c:
- Add robust file loading with proper error handling and bounds checking
- Implement memory usage tracking and reporting via /proc/self/status
- Add comprehensive statistical analysis with outlier detection
- Include 95% confidence intervals and measurement stability indicators
- Add cross-engine result validation to detect discrepancies
- Implement JIT engine warmup cycles for better performance accuracy
- Enhanced CSV export with memory usage data
- Add memory vs speed analysis with trade-off calculations

src/main.h:
- Extend result structure with memory tracking fields
- Add statistical confidence interval fields
- Include function declarations for new utility functions

These changes significantly improve the reliability, accuracy, and analytical capabilities of the regex performance benchmarking tool.
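For readers unfamiliar with the /proc/self/status approach mentioned in the commit above, here is a minimal sketch of how such memory tracking could look on Linux. It is not the code from src/main.c: the function names `parse_status_kb` and `current_memory_kb` are hypothetical, and the choice of the `VmRSS` (current resident set) and `VmHWM` (peak resident set) fields is an assumption, since the commit message does not say which fields are read.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find a line of the form "<key>:  <value> kB" in the
 * text of /proc/self/status and return <value> in kB, or -1 if the key is
 * missing or has no numeric value. */
long parse_status_kb(const char *text, const char *key)
{
    assert(text != NULL && key != NULL);
    size_t key_len = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* Match the key only at the start of a line, followed by ':'. */
        if (strncmp(line, key, key_len) == 0 && line[key_len] == ':') {
            long kb;
            return sscanf(line + key_len + 1, "%ld", &kb) == 1 ? kb : -1;
        }
        line = strchr(line, '\n');   /* advance to the next line, if any */
        if (line != NULL)
            line++;
    }
    return -1;
}

/* Hypothetical wrapper: read /proc/self/status (Linux only) and look up one
 * field, e.g. "VmRSS" or "VmHWM". Returns -1 if the file cannot be read. */
long current_memory_kb(const char *key)
{
    char buf[8192];
    FILE *fp = fopen("/proc/self/status", "r");
    if (fp == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf - 1, fp);
    fclose(fp);
    buf[n] = '\0';
    return parse_status_kb(buf, key);
}
```

A benchmark could call `current_memory_kb("VmHWM")` before and after running an engine and record the difference alongside the timing. Keeping the parser separate from the file read makes it testable on systems without /proc.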
- Change default library inclusion from "local" to "system"
- Replace complex ExternalProject configurations with simple find_library calls
- Simplify build process to use pre-built dependencies from vendor/local/
- Add explicit git executable specification for better toolchain compatibility
- Remove legacy Boost, Hyperscan, Oniguruma, RE2, TRE, PCRE2, CTRE, and YARA build scripts
- Use modern CMake approach with locally built static libraries

This refactoring aligns with the new build_deps_simple.sh approach where dependencies are pre-built with native optimizations and discovered via CMake's find_library mechanism.
- Implement configurable timeout mechanism with 1-second default
- Add timeout checking to all regex engine timing loops
- Integrate Hyperscan build support in dependency script
- Fix YARA CMake configuration for proper linking
- Add command line option for timeout configuration (-t flag)
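The timeout check described in this commit can be illustrated with a short sketch. This is an assumption about the general shape of such a loop, not the PR's actual implementation: `bench_fn`, `run_with_timeout`, and `count_call` are hypothetical names, and the use of `clock_gettime(CLOCK_MONOTONIC)` as the clock source is also an assumption.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Monotonic wall-clock time in seconds (POSIX clock_gettime). */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical callback type: one matching pass of one regex engine. */
typedef void (*bench_fn)(void *ctx);

/* Run fn up to max_iters times, but stop early once timeout_s seconds have
 * elapsed. The check happens after each iteration, so at least one pass
 * always runs. Returns the number of completed iterations and stores the
 * total elapsed time in *elapsed_s. */
size_t run_with_timeout(bench_fn fn, void *ctx, size_t max_iters,
                        double timeout_s, double *elapsed_s)
{
    assert(fn != NULL && elapsed_s != NULL);
    double start = now_seconds();
    size_t done = 0;

    *elapsed_s = 0.0;
    while (done < max_iters) {
        fn(ctx);
        done++;
        *elapsed_s = now_seconds() - start;
        if (*elapsed_s >= timeout_s)
            break;               /* budget exhausted: report partial run */
    }
    return done;
}

/* Demo workload for illustration: counts how often it was called. */
void count_call(void *ctx)
{
    (*(size_t *)ctx)++;
}
```

With the 1-second default from the commit, a caller would pass `1.0` as `timeout_s` unless the `-t` flag overrides it. Returning the completed iteration count lets the caller compute a per-iteration time even when a slow engine is cut off early.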
- Remove hyperscan, oniguruma, re2, and tre from Git tracking
- Add vendor dependency directories to .gitignore
- Dependencies will be rebuilt by build scripts as needed
There was a lot of criticism on Reddit about Boost being ancient, so I added an ExternalProject to compile Boost from source. We neither download nor compile all of Boost, though: only libs/regex and a handful of its dependencies, so download and build time are kept to a minimum.
Added a note on what to install on Ubuntu 20.04 as guidance.
CMake supports multi-line commands, so I wrapped the vendor file at 80 columns for better readability without changing the commands themselves.
I got a top-end AWS C6i (Intel Ice Lake) machine for a couple of hours to run the benchmarks and added the results at the end as well.