Handle dynamic TLS descriptors on aarch64 #1350

Open
wkozaczuk opened this issue Feb 27, 2025 · 0 comments · May be fixed by #1351

Comments

@wkozaczuk
Collaborator

Currently, the aarch64 port supports only static TLS descriptors. This means that thread-local variables in the application executable and its libraries can be accessed only if the ELF objects containing them are relocated during the initial dynamic linker loading phase. In other words, they must live in the memory allocated as part of the static TLS block.

However, some applications, such as Java, load their libraries dynamically (using dlopen()), and the TLS variables in those libraries must be accessed using dynamic TLS descriptors. On x86_64 such dynamic access is supported by __tls_get_addr(), but on aarch64 the default is to use TLS descriptors (see this paper for details: https://www.fsfla.org/~lxoliva/writeups/TLS/paper-lk2006.pdf).

wkozaczuk added a commit to wkozaczuk/osv that referenced this issue Feb 27, 2025
Currently, the aarch64 port supports only static TLS descriptors.
This means that thread-local variables in the application executable and its
libraries can be accessed only if the ELF objects containing them are relocated
during the initial dynamic linker loading phase. In other words, they must live
in the memory allocated as part of the static TLS block.

However, some applications, such as Java, load their libraries dynamically
(using dlopen()), and the TLS variables in those libraries must be accessed
using dynamic TLS descriptors. On x86_64 such dynamic access is supported by
__tls_get_addr(), but on aarch64 the default is to use TLS descriptors
(see this paper for details:
https://www.fsfla.org/~lxoliva/writeups/TLS/paper-lk2006.pdf).

To that end, this patch implements the necessary changes to the logic
that handles the `R_AARCH64_TLSDESC` relocations. Specifically, we modify
`object::arch_relocate_tls_desc()` to detect whether a dynamic or a static
TLS descriptor is needed. For the former, we set up the relocation entry
to use the new `__tlsdesc_dynamic` resolver function and a pointer to a
`module_and_offset` struct holding the module index and TLS offset. The address
of the `module_and_offset` struct is passed as the sole argument to the
resolver function when code accessing the relevant thread-local variable runs.
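
The choice between the two descriptor kinds can be pictured with the following minimal sketch. It is not the code from the patch: the `tls_descriptor` layout, the `in_static_tls` flag, the `static_block_offset` parameter, the `dynamic_args()` pool and both resolver stubs are hypothetical names introduced for illustration only. Only `module_and_offset` and the idea of a resolver plus a single argument come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Illustrative stand-in; the real struct in the patch may differ.
struct module_and_offset {
    std::size_t module;  // TLS module index of the ELF object
    std::size_t offset;  // offset of the variable inside that module's TLS block
};

using resolver_fn = std::uintptr_t (*)(std::uintptr_t arg);

// Hypothetical two-word descriptor that the relocation fills in:
// a resolver entry point and the single argument handed to it.
struct tls_descriptor {
    resolver_fn resolver;
    std::uintptr_t arg;
};

// Static case (stub): the argument already is the offset from the thread pointer.
inline std::uintptr_t tlsdesc_static(std::uintptr_t arg) { return arg; }

// Dynamic case (stub): only decodes its argument. Locating the per-thread
// block for mo->module is deliberately left out of this sketch.
inline std::uintptr_t tlsdesc_dynamic(std::uintptr_t arg)
{
    const auto* mo = reinterpret_cast<const module_and_offset*>(arg);
    return mo->offset;
}

// Stable storage for dynamic arguments; std::deque keeps element addresses
// valid across push_back, so descriptors can point into it.
inline std::deque<module_and_offset>& dynamic_args()
{
    static std::deque<module_and_offset> pool;
    return pool;
}

// Assumption: the caller already knows whether the object lives in the
// static TLS block (in_static_tls) and, if so, where its block starts.
inline tls_descriptor relocate_tls_desc(bool in_static_tls, std::size_t module,
                                        std::size_t offset,
                                        std::size_t static_block_offset)
{
    if (in_static_tls) {
        return {tlsdesc_static,
                static_cast<std::uintptr_t>(static_block_offset + offset)};
    }
    dynamic_args().push_back({module, offset});
    return {tlsdesc_dynamic,
            reinterpret_cast<std::uintptr_t>(&dynamic_args().back())};
}
```

Keeping the `module_and_offset` records in storage that outlives the relocation pass matters here, because each dynamic descriptor holds a raw pointer to its record for as long as the object stays loaded.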

We also add the implementation of the `__tlsdesc_dynamic` resolver
function in assembly (see tlsdesc.s). In essence, it implements
a fast path, taken when the TLS block has already been set up for the given
thread, and a slow path, taken when the TLS block has not been set up yet.
The slow path calls a new function, `__tls_dynamic_setup`, that allocates
a new TLS block. To access the address of the corresponding TLS block from
assembly, we define a simple DTV descriptor (see arch/aarch64/arch-cpu.hh)
that points to the `thread::_tls` variable of type `std::vector`.
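
The fast-path/slow-path split can be modelled in plain C++ roughly as follows. This is a sketch under stated assumptions, not the assembly from the patch: `thread_tls`, `tls_dynamic_setup`, `resolve_dynamic` and the `module_sizes` table are hypothetical names, and the block is simply zero-filled where a real loader would also copy the module's TLS initialization image. A real TLS descriptor resolver also returns an offset relative to the thread pointer, whereas this model returns an absolute address for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Illustrative stand-in for the resolver argument.
struct module_and_offset {
    std::size_t module;
    std::size_t offset;
};

// Hypothetical per-thread state: one slot per TLS module, null until first use.
struct thread_tls {
    std::vector<std::unique_ptr<char[]>> blocks;
};

// Slow path: allocate a zero-filled block for the module on first access.
inline char* tls_dynamic_setup(thread_tls& t,
                               const std::vector<std::size_t>& module_sizes,
                               std::size_t module)
{
    if (t.blocks.size() <= module) {
        t.blocks.resize(module + 1);
    }
    // make_unique<char[]>(n) value-initializes, so the block starts zeroed.
    t.blocks[module] = std::make_unique<char[]>(module_sizes.at(module));
    return t.blocks[module].get();
}

// Resolver model: take the fast path if the block exists, otherwise set it up.
inline char* resolve_dynamic(thread_tls& t,
                             const std::vector<std::size_t>& module_sizes,
                             const module_and_offset& mo)
{
    char* block = mo.module < t.blocks.size() ? t.blocks[mo.module].get()
                                              : nullptr;
    if (block == nullptr) {
        block = tls_dynamic_setup(t, module_sizes, mo.module);  // slow path
    }
    return block + mo.offset;  // fast path ends here as well
}
```

After the first access from a given thread, every later access to the same module takes only the bounds check and the null check, which is why the fast path is worth writing by hand in assembly.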

Finally, we modify misc-tls.cc to measure the performance of dynamic
TLS access when the variable is located in a `dlopen()`-ed shared library.
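
A per-iteration measurement of this kind could look like the sketch below. It is not the contents of misc-tls.cc: the `ns_per_iteration` helper and the `touch_*` functions are hypothetical, and the `dlopen()`-ed library case is omitted because it needs a separate shared object. The variable names mirror the labels in the test output.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// volatile keeps the compiler from folding the loop into a single add.
volatile std::uint64_t var_global = 0;
thread_local volatile std::uint64_t var_tls = 0;

// Runs body() the given number of times and returns the mean cost in ns.
template <typename Body>
double ns_per_iteration(Body body, std::uint64_t iterations)
{
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        body();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}

inline void touch_global() { var_global = var_global + 1; }
inline void touch_tls() { var_tls = var_tls + 1; }
```

Calling `ns_per_iteration(touch_tls, n)` with a large `n` amortizes the clock overhead; a variable in a `dlopen()`-ed library would be timed the same way through a function exported by that library.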

Based on this test run on an Apple M2 Mac Mini, dynamic TLS access
is roughly 60% slower than static access (2.92 ns vs. 1.79 ns per iteration):

```
OSv v0.57.0-276-g8e57effc
eth0: 192.168.122.15
Booted up in 20.64 ms
Cmdline: /tests/misc-tls.so
var_global iteration (ns): 2.09624
var_tls iteration (ns): 1.79472
var_lib_tls iteration (ns): 2.92301
```

Fixes cloudius-systems#1350

Signed-off-by: Waldemar Kozaczuk <jwkozaczuk@gmail.com>
@wkozaczuk wkozaczuk linked a pull request Feb 27, 2025 that will close this issue