Performance Profiling with Tracy
https://github.com/wolfpld/tracy
A real time, nanosecond resolution, remote telemetry, hybrid frame and sampling profiler for games and other applications.
![](https://user-images.githubusercontent.com/1389729/97106613-832f0100-16cb-11eb-8452-267e406bceb9.png)
We can't know how good/bad our performance is until we measure it.
Tracy is made up of two parts:
- The client, which you build into your program and which broadcasts your performance information.
- The server, an external program (available in the Tracy Release) which receives the information and allows you to analyze it.
If building on the command line, add `-DTRACY_ENABLE=ON` to your CMake configuration arguments. It will download the Tracy client and server, and build the Tracy client into `xi_map` for you.
If building from Visual Studio, select one of the `-Tracy` build configurations and build as normal.
1> Working directory: C:\ffxi\server\build\x64-Release-Tracy
1> [CMake] -- C:/ProgramData/chocolatey/bin/ccache.exe found and enabled
1> [CMake] -- CMAKE_SOURCE_DIR: C:/ffxi/server
1> [CMake] -- CMAKE_SIZEOF_VOID_P == 8: 64-bit build
1> [CMake] -- ENABLE_FAST_MATH: ON
1> [CMake] -- TRACY_ENABLE: ON
1> [CMake] -- Downloading Tracy development library
1> [CMake] x tracy-0.8.2/
1> [CMake] x tracy-0.8.2/.github/
1> [CMake] x tracy-0.8.2/.github/FUNDING.yml
1> [CMake] x tracy-0.8.2/.github/sponsor.png
1> [CMake] x tracy-0.8.2/.github/workflows/
...
1> [CMake] -- Downloading Tracy client
1> [CMake] x capture.exe
1> [CMake] x csvexport.exe
1> [CMake] x import-chrome.exe
1> [CMake] x Tracy.exe
1> [CMake] x update.exe
1> [CMake] -- Modifying C:/ffxi/server/ext/tracy/tracy-0.8.2/client/TracyProfiler.hpp
...
1> [CMake] -- Configuring done
1> [CMake] -- Generating done
1> [CMake] -- Build files have been written to: C:/ffxi/server/build/x64-Release-Tracy
`Tracy.exe` will be placed in your repo root.
Run your Tracy-enabled `xi_map.exe` and then launch `Tracy.exe`. You will see it connect and start profiling. You can launch `Tracy.exe` before or after `xi_map.exe`; the order doesn't matter.
It is usually better to wait until startup has completed before you attach Tracy, as the startup routine isn't a good indicator of the server's runtime performance.
Once connected, you should see something like this:
If you want to record a trace for later use, click on the Wifi symbol and you'll be given the option to save the current trace.
WARNING: Traces can be very large! Plan accordingly!
If you need to capture a trace without launching the GUI (on a remote VM, a resource-constrained system, etc.), Tracy comes with `capture.exe`.
You can capture a trace using a command line utility contained in the capture directory. To use it you may provide the following parameters:
- `-o output.tracy`: the file name of the resulting trace (required).
- `-a address`: the IP address (or a domain name) of the client application (uses localhost if not provided).
- `-p port`: the network port to use (optional).
- `-f`: force overwrite if the output file already exists.
- `-s seconds`: number of seconds to capture before automatically disconnecting (optional).
If no client is running at the given address, the server will wait until it can make a connection. During the capture, the utility prints progress information such as the frame and zone counts.
You can launch it from the command line:
PS C:\ffxi\server> .\capture.exe -o trace.tracy -f -s 60
Connecting to 127.0.0.1:8086...
Queue delay: 0 ns
Timer resolution: 100 ns
1.32 Kbps /138.5% = 0.00 Mbps | Tx: 41.34 MB | 330.28 MB | 1:32.9
Frames: 26
Time span: 1:32.9
Zones: 941,349
Elapsed time: 1:00.1
Saving trace... done!
Trace size 40.59 MB (24.26% ratio)
PS C:\ffxi\server>
You can open the resulting trace in the `Tracy.exe` GUI at a later time.
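For unattended captures (for example, from a scheduled job on a remote VM), the parameters listed above can be assembled by a small script. The following Python sketch is illustrative only: `build_capture_args` and `run_capture` are hypothetical helper names, and the default `./capture.exe` path is an assumption about where the utility lives on your system. It uses only the flags documented above.

```python
import subprocess
from typing import List, Optional


def build_capture_args(
    output: str,
    address: Optional[str] = None,
    port: Optional[int] = None,
    seconds: Optional[int] = None,
    overwrite: bool = False,
    exe: str = "./capture.exe",  # assumed location; adjust for your checkout
) -> List[str]:
    """Build an argument list from the documented capture flags."""
    args = [exe, "-o", output]          # -o is the only required flag
    if address is not None:
        args += ["-a", address]         # defaults to localhost when omitted
    if port is not None:
        args += ["-p", str(port)]
    if overwrite:
        args.append("-f")               # overwrite an existing output file
    if seconds is not None:
        args += ["-s", str(seconds)]    # disconnect automatically after N seconds
    return args


def run_capture(args: List[str]) -> int:
    """Run the capture utility and return its exit code (blocks until it exits)."""
    return subprocess.run(args, check=False).returncode


# Same flags as the PowerShell example above; only prints the command.
cmd = build_capture_args("trace.tracy", seconds=60, overwrite=True)
print(" ".join(cmd))
```

Passing the result to `run_capture` would start the actual capture. Keeping the argument building separate from execution makes it easy to log or review the command first.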
Searchable statistics are under the `Statistics` header, and log messages are under `Messages`. You can click, drag, and zoom around the main timeline window for information about what's going on. You can "re-attach" to the most active frames by clicking on the `Pause/Resume` header and using the options there.
If you click on the entries in the `Statistics` menu, you can drill down into that function and look at it in more detail.
Remember that there are a lot of things that can affect performance:
- Platform (Windows, Linux, OSX)
- Architecture (x86, x86_64)
- Type of build (Debug, RelWithDebInfo, Release, MinSizeRel)
- Compiler (MSVC, Clang, GCC)
- Your system specs (CPU Speed, Available Memory, Memory Latency, HDD R/W speed etc.)
- Other programs using your system's resources
- Virtualization/Containerization (VMWare, WSL, Docker)
If you're performing before/after testing, try as hard as you can to make sure the conditions are the same for both runs, and change as little as possible between runs. It is also helpful to take multiple readings, with many samples per reading, to get an accurate view of performance.
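As a rough illustration of why multiple readings matter, the sketch below compares two sets of timing samples and checks whether the difference stands out from the run-to-run noise. Everything here is an assumption for illustration: the helper names are hypothetical, the sample values are invented, and the "twice the larger standard deviation" threshold is a crude rule of thumb rather than a proper statistical test.

```python
from statistics import mean, stdev
from typing import Dict, Sequence, Tuple


def summarize(samples_ms: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) for one set of readings."""
    return mean(samples_ms), stdev(samples_ms)


def compare(before_ms: Sequence[float], after_ms: Sequence[float]) -> Dict[str, object]:
    """Compare two sets of readings taken under the same conditions."""
    before_mean, before_sd = summarize(before_ms)
    after_mean, after_sd = summarize(after_ms)
    delta = after_mean - before_mean
    noise = max(before_sd, after_sd)
    return {
        "before_mean": before_mean,
        "after_mean": after_mean,
        "delta": delta,
        "percent": 100.0 * delta / before_mean,
        # Crude heuristic: only trust differences well outside the noise.
        "exceeds_noise": abs(delta) > 2.0 * noise,
    }


# Invented example readings (milliseconds per tick), five runs each.
before = [10.0, 10.2, 9.8, 10.1, 9.9]
after = [8.0, 8.1, 7.9, 8.2, 7.8]
result = compare(before, after)
print(f"delta: {result['delta']:.2f} ms ({result['percent']:.1f}%), "
      f"exceeds noise: {result['exceeds_noise']}")
```

With a single reading per run, there would be no way to estimate the noise at all, so a small improvement and a lucky run would look the same.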
- Expensive pathing and navmesh access... all the time... every tick... every mob... everywhere...
- `parse` routine is slow