Prevent GCC using FP registers for pointer storage on arm64 Linux #193

Merged
szegedi merged 2 commits into main from szegedi/general-regs-only on Jan 24, 2025

@szegedi szegedi commented Jan 23, 2025

What does this PR do?:
Prohibits the use of arm64 floating-point (FP) registers in a few specific functions. For some reason, GCC compiling for arm64 on Linux will sometimes use FP registers to store and load pointers.

Motivation:
We see evidence of customer crashes on arm64 Linux, and the only thing they have in common is that they always seem to happen in code that invoked a method after loading the invoked method's `this` pointer into x0 from a floating-point register. We speculate that GCC does this as an optimization (hey, more free registers!); as long as it only stores and loads 64-bit values, that's fine. The crashes might be caused by some function invoked further down the call chain not restoring these registers.

We can't prohibit the use of floating-point registers across all of our code (that would completely disable FP arithmetic, which pops up in unusual places, such as the float load factor in inlined std::map constructors), but we identified all methods where GCC employed this "FP reg as pointer storage" tactic and marked each of them with __attribute__((target("general-regs-only"))) to prohibit it from using FP regs in them.
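As a rough illustration of the per-function marking described above, the sketch below wraps the attribute in a macro and applies it to a single pointer-chasing function. The macro name `GENERAL_REGS_ONLY`, the `Node` struct, and `SumIds` are hypothetical and not taken from this PR. The preprocessor guard is an assumption that limits the attribute to GCC on arm64, so other toolchains compile the function unchanged.

```cpp
#include <cstdint>

// Hypothetical helper macro: expands to the attribute only for GCC on arm64,
// and to nothing elsewhere.
#if defined(__aarch64__) && defined(__GNUC__) && !defined(__clang__)
#define GENERAL_REGS_ONLY __attribute__((target("general-regs-only")))
#else
#define GENERAL_REGS_ONLY
#endif

// Illustrative linked-list node; stands in for any pointer-heavy structure.
struct Node {
  int64_t id;
  Node* next;
};

// Pure integer and pointer work, so it compiles with FP registers prohibited.
GENERAL_REGS_ONLY
int64_t SumIds(const Node* head) {
  int64_t sum = 0;
  for (const Node* n = head; n != nullptr; n = n->next) {
    sum += n->id;
  }
  return sum;
}
```

Because the restriction is scoped to one function, the rest of the translation unit can still use floating-point arithmetic.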

Since some of these methods did in fact manipulate floating-point values, either directly or indirectly, we also had to either move those parts out of the functions (such as ContextsByNode creation, because of the aforementioned inlined float load factor in the std::map constructor, or the creation of v8::Number instances from ints) or convert async_id from double to long, as it will always be an integral value.
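The "move the FP parts out" approach can be sketched roughly as follows. This is not the PR's code: `BoxedNumber`, `NewNumberFromInt`, `RecordAsyncId`, and both macros are hypothetical stand-ins. The idea shown is that the restricted function only passes integers and pointers, while the int-to-double conversion lives in a separate, non-inlined helper that is allowed to use FP registers.

```cpp
#include <cstdint>

// Hypothetical macros; the attribute is only applied for GCC on arm64.
#if defined(__aarch64__) && defined(__GNUC__) && !defined(__clang__)
#define GENERAL_REGS_ONLY __attribute__((target("general-regs-only")))
#else
#define GENERAL_REGS_ONLY
#endif

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

// Stand-in for a heap-allocated number object holding a double.
struct BoxedNumber {
  double value;
};

// Unrestricted helper: the int-to-double conversion happens here.
// NOINLINE keeps that FP code out of the restricted caller.
NOINLINE BoxedNumber* NewNumberFromInt(int64_t v) {
  return new BoxedNumber{static_cast<double>(v)};
}

// Restricted function: handles only an integer argument and a pointer result.
GENERAL_REGS_ONLY
BoxedNumber* RecordAsyncId(int64_t asyncId) {
  return NewNumberFromInt(asyncId);
}
```

Note that the helper's signature deliberately contains no floating-point parameters or return value, since passing a double across the call would itself require an FP register in the caller. The caller owns the returned pointer and must `delete` it.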

How to test the change?:
We'll release a dev build from this PR, download it, and check whether the use of FP registers is gone. In local testing on a Linux arm64 VM, GCC sadly doesn't reproduce the issue.

JIRA: PROF-11193

@szegedi szegedi requested a review from nsavoire as a code owner January 23, 2025 14:08

github-actions bot commented Jan 23, 2025

Overall package size

Self size: 9.56 MB
Deduped: 9.93 MB
No deduping: 9.93 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| source-map | 0.7.4 | 226 kB | 226 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| p-limit | 3.1.0 | 7.75 kB | 13.78 kB |
| delay | 5.0.0 | 11.17 kB | 11.17 kB |
| node-gyp-build | 3.9.0 | 8.81 kB | 8.81 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

@szegedi szegedi added the semver-patch Bug or security fixes, mainly label Jan 23, 2025

pr-commenter bot commented Jan 23, 2025

Benchmarks

Benchmark execution time: 2025-01-23 15:10:46

Comparing candidate commit 532062b in PR branch szegedi/general-regs-only with baseline commit e28fc06 in branch main.

Found 2 performance improvements and 0 performance regressions! Performance is the same for 89 metrics, 29 unstable metrics.

scenario:profiler-idle-no-wall-profiler-18

  • 🟩 cpu_user_time [-4.460ms; -0.774ms] or [-8.947%; -1.553%]

scenario:profiler-idle-with-wall-profiler-18

  • 🟩 cpu_user_time [-9.046ms; -2.754ms] or [-12.393%; -3.773%]

  array = Array::New(isolate);
- contextsByNode[sample] = {array, 1};
+ (*contextsByNode)[sample] = {array, 1};
Author

Is this clean enough, or is there a better idiom to call the [] operator on the target of a pointer? I can also replace it with

contextsByNode->emplace(sample, (struct NodeInfo){array, 1});
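For comparison, here is a small sketch of the options for writing through a pointer to a map. The `NodeInfo` fields, the `int` key, and the `Demo` function are hypothetical stand-ins rather than the project's real types. One behavioral difference worth noting: `operator[]` followed by assignment overwrites an existing entry, while `emplace` leaves an existing entry untouched; `insert_or_assign` (C++17) overwrites like `operator[]` does.

```cpp
#include <string>
#include <unordered_map>

// Stand-in types; the real NodeInfo and key type are not shown in this thread.
struct NodeInfo {
  std::string label;
  int hitcount;
};
using ContextsByNode = std::unordered_map<int, NodeInfo>;

void Demo(ContextsByNode* contextsByNode) {
  // Dereference, then index: inserts key 1, or overwrites it if present.
  (*contextsByNode)[1] = {"a", 1};

  // Same operation spelled as an explicit member call.
  contextsByNode->operator[](1) = {"b", 2};

  // emplace does nothing when the key already exists, so key 1 keeps {"b", 2}.
  contextsByNode->emplace(1, NodeInfo{"c", 3});

  // insert_or_assign (C++17) inserts key 2, and would overwrite it if present.
  contextsByNode->insert_or_assign(2, NodeInfo{"d", 4});
}
```

`NodeInfo{...}` is the standard C++ spelling of the temporary; the `(struct NodeInfo){...}` form is a C compound literal that GCC accepts in C++ as an extension.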

@@ -24,7 +24,7 @@ namespace dd {
 v8::Local<v8::Value> TranslateTimeProfile(
     const v8::CpuProfile* profile,
     bool includeLineInfo,
-    ContextsByNode* contextsByNode = nullptr,
+    std::shared_ptr<ContextsByNode> contextsByNode = nullptr,

nit: you could have kept a raw pointer for contextsByNode in translate-time-profile.cc/hh

Author

Yeah, but figured why not be consistent. If I already have a safe shared pointer around I might as well use it.

@szegedi szegedi merged commit 7b9cc3b into main Jan 24, 2025
63 checks passed
@szegedi szegedi deleted the szegedi/general-regs-only branch January 24, 2025 10:39
szegedi added a commit that referenced this pull request Jan 24, 2025
* Prohibit use of floating-point registers where gcc on arm64 Linux tends to use them to store pointers
@szegedi szegedi mentioned this pull request Jan 24, 2025
szegedi added a commit that referenced this pull request Jan 27, 2025
* Prohibit use of floating-point registers where gcc on arm64 Linux tends to use them to store pointers
Labels
semver-patch Bug or security fixes, mainly
2 participants