✨[Feature] Evaluate using Profile-Guided Optimization (PGO) and Post-Link Optimization (PLO) #2511

zamazan4ik · 2023-12-02T13:55:53Z

Is your feature request related to a problem? Please describe.

Not a problem, but an idea for how TensorRT performance could be improved.

I have evaluated Profile-Guided Optimization (PGO) and Post-Link Optimization (PLO) on multiple projects. The results are available here. According to those tests, these optimizations improve performance in many cases and across many kinds of applications: compilers and interpreters, static analysis tools, databases, networking software, etc. Given this, I think optimizing TensorRT (its C++ part) with PGO and PLO would be a good idea.

Describe the solution you'd like

I can suggest the following things:

Perform PGO benchmarks on TensorRT. If they show improvements, add a note to the documentation about the possible performance gains from building TensorRT with PGO.
Provide an easier way (e.g. a build option) to build with PGO in the build scripts. This would help end users and maintainers, since they could optimize TensorRT for their own workloads.
Optimize the pre-built TensorRT binaries.
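
To make the build-option idea a bit more concrete, here is a minimal sketch of the three phases of an instrumentation-based PGO build with Clang. The helper names (`pgo_flags`, `merge_command`) and the `/tmp/pgo` directory are hypothetical and do not reflect TensorRT's actual build scripts. Only the Clang flags `-fprofile-generate` / `-fprofile-use` and the `llvm-profdata merge` command are existing tools. GCC uses similar flags but a different profile format, which this sketch does not cover.

```python
def pgo_flags(phase: str, profile_dir: str) -> list[str]:
    """Return the extra Clang flags for one phase of a PGO build.

    Hypothetical helper: a real build system would append these to its
    compiler and linker flags (how exactly is an assumption left open here).
    """
    if phase == "instrument":
        # Phase 1: build an instrumented binary that writes *.profraw
        # files into profile_dir when it runs.
        return [f"-fprofile-generate={profile_dir}"]
    if phase == "optimize":
        # Phase 3: rebuild, letting the compiler use the merged profile.
        return [f"-fprofile-use={profile_dir}/merged.profdata"]
    raise ValueError(f"unknown PGO phase: {phase!r}")


def merge_command(profile_dir: str, raw_profiles: list[str]) -> list[str]:
    """Phase 2: merge the raw profiles collected while running a workload."""
    return [
        "llvm-profdata",
        "merge",
        f"-output={profile_dir}/merged.profdata",
        *raw_profiles,
    ]


if __name__ == "__main__":
    # Print the steps instead of running them; the paths are placeholders.
    print("1. build with:", pgo_flags("instrument", "/tmp/pgo"))
    print("2. run a representative workload, then:",
          " ".join(merge_command("/tmp/pgo", ["/tmp/pgo/run1.profraw"])))
    print("3. rebuild with:", pgo_flags("optimize", "/tmp/pgo"))
```

The quality of the result depends on how representative the workload in step 2 is, which is why a build option that lets users collect their own profiles could be useful.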

Additional context

As an additional optimization step after PGO, I can suggest Post-Link Optimization (PLO) with a tool like LLVM BOLT. I think it is worth evaluating only after PGO has been integrated into TensorRT.
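
For illustration, a typical sampling-based BOLT workflow can be sketched as a list of commands. This follows the general pattern from the BOLT documentation (`perf record` with branch sampling, `perf2bolt`, then `llvm-bolt`). The function name `bolt_commands` and the binary path are hypothetical, and whether this applies cleanly to TensorRT's libraries is untested. Branch sampling with `-j any,u` needs hardware support such as LBR, and the binary generally has to be linked with `--emit-relocs` for function reordering to work.

```python
def bolt_commands(binary: str, workload_args: list[str]) -> list[list[str]]:
    """Return the command lines of a sampling-based BOLT pipeline.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    perf_data = f"{binary}.perf.data"
    fdata = f"{binary}.fdata"
    return [
        # 1. Sample taken branches in user space while running the workload.
        ["perf", "record", "-e", "cycles:u", "-j", "any,u",
         "-o", perf_data, "--", binary, *workload_args],
        # 2. Convert the perf samples into BOLT's profile format.
        ["perf2bolt", "-p", perf_data, "-o", fdata, binary],
        # 3. Rewrite the binary with an optimized code layout.
        ["llvm-bolt", binary, "-o", f"{binary}.bolt", f"-data={fdata}",
         "-reorder-blocks=ext-tsp", "-reorder-functions=hfsort",
         "-split-functions", "-split-all-cold", "-dyno-stats"],
    ]


if __name__ == "__main__":
    # "./my_app" and its argument are placeholders for a real workload.
    for cmd in bolt_commands("./my_app", ["--benchmark"]):
        print(" ".join(cmd))
```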

Below I have collected several PGO-related links (more PGO-related materials are available at https://github.com/zamazan4ik/awesome-pgo/).

Examples of how PGO optimization is integrated into other projects:

Rustc: a CI script for the multi-stage build
GCC:
- Official docs, section "Building with profile feedback" (even AutoFDO build is supported)
- A part in a "wonderful" configure script
Clang: Docs
Python:
- CPython: README
- Pyston: README
Go: Bash script
V8: Bazel flag
ChakraCore: Scripts
Chromium: Script
Firefox: Docs
- Thunderbird has PGO support too
PHP: Makefile command and old Centminmod scripts
MySQL: CMake script
YugabyteDB: GitHub commit
FoundationDB: Script
Zstd: Makefile
Foot: Scripts
Windows Terminal: GitHub PR
Pydantic-core: GitHub PR
file.d: GitHub PR
OceanBase: CMake flag

Here are some examples of how PGO is described in project documentation:

ClickHouse: https://clickhouse.com/docs/en/operations/optimizing-performance/profile-guided-optimization
Databend: https://databend.rs/doc/contributing/pgo
Vector: https://vector.dev/docs/administration/tuning/pgo/
Nebula: https://docs.nebula-graph.io/3.5.0/8.service-tuning/enable_autofdo_for_nebulagraph/
GCC: Official docs, section "Building with profile feedback" (even AutoFDO build is supported)
Clang:
- https://llvm.org/docs/HowToBuildWithPGO.html
- https://llvm.org/docs/AdvancedBuilds.html
tsv-utils: https://github.com/eBay/tsv-utils/blob/master/docs/BuildingWithLTO.md

Regarding LLVM BOLT integration, I have the following examples:

Rustc:
- Rustc itself (GitHub PR)
- LLVM in Rustc (Reddit)
CPython: GitHub PR
YDB: GitHub comment
Clang:
LDC: GitHub comment
HHVM, Proxygen and others: Facebook paper
NodeJS: Blog
Chromium: Blog
MySQL, MongoDB, memcached, Verilator: Paper

narendasan · 2023-12-02T20:54:22Z

Do you think this is more geared towards TensorRT itself or the PyTorch extension? This might be more relevant to open in https://github.com/nvidia/pytorch

zamazan4ik · 2023-12-02T21:06:32Z

> This might be more relevant to open in https://github.com/nvidia/pytorch

I get an HTTP 404 for this page. Does it have special access requirements, or is the link just wrong?

narendasan · 2023-12-02T21:47:17Z

Sorry, wrong URL: https://github.com/nvidia/tensorrt

zamazan4ik added the "feature request" label Dec 2, 2023

zamazan4ik assigned narendasan Dec 2, 2023
