Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default runtime-id tag #415

Merged
merged 2 commits into from
Jul 11, 2024
Merged

Default runtime-id tag #415

merged 2 commits into from
Jul 11, 2024

Conversation

r1viollet
Copy link
Collaborator

@r1viollet r1viollet commented Jul 5, 2024

What does this PR do?

Provide a default runtime-id tag

Motivation

Ensure that the timeline data can be discovered for native.

Additional Notes

The runtime_id should come from tracing
When we do not find it, we should still generate a default runtime-id.

How to test the change?

I have a minor test to check the format.
I did not add an end to end test for now.

@r1viollet
Copy link
Collaborator Author

image

@r1viollet r1viollet changed the title Default runtime_id tag Default runtime-id tag Jul 5, 2024
@r1viollet r1viollet force-pushed the r1viollet/uuid_generation branch from 4f101d6 to 8899207 Compare July 5, 2024 07:43
@r1viollet r1viollet marked this pull request as ready for review July 5, 2024 07:44
@pr-commenter
Copy link

pr-commenter bot commented Jul 5, 2024

Benchmark results for collatz

Parameters

Baseline Candidate
config baseline candidate
profiler-version ddprof 0.17.1+15af4f13.38380914 ddprof 0.17.1+00eeec48.38912182

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.

See unchanged results
scenario Δ mean execution_time
scenario:ddprof -S bench-collatz --preset cpu_only collatz_runner.sh same

@pr-commenter
Copy link

pr-commenter bot commented Jul 5, 2024

Benchmark results for BadBoggleSolver_run

Parameters

Baseline Candidate
config baseline candidate
profiler-version ddprof 0.17.1+15af4f13.38380914 ddprof 0.17.1+00eeec48.38912182

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.

See unchanged results
scenario Δ mean execution_time
scenario:ddprof -S bench-bad-boggle-solver BadBoggleSolver_run work 1000 same

src/uuid.cc Outdated Show resolved Hide resolved
src/uuid.cc Outdated Show resolved Hide resolved
src/uuid.cc Outdated Show resolved Hide resolved
src/uuid.cc Outdated Show resolved Hide resolved
src/uuid.cc Outdated Show resolved Hide resolved
sanchda
sanchda previously approved these changes Jul 5, 2024
Copy link
Collaborator

@sanchda sanchda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You make this observation in a comment, but expanding on it here for no particular reason.

As far as I know, runtime-id is used to fingerprint a precise instance of a service. The noted strategy works perfectly fine for the other profilers, since they are indeed attached to a single instance of a runtime and they can hook into runtime faculties to change the runtime-id if that changes (e.g., atfork handlers).

On the other hand, this concept is slightly problematic for ddprof due to the way we track process lineage/lifetime. The noted procedure is more of a profiler-id. I just want to note this because our concept of what it means to be a runtime-id was greatly influenced by how the existing profilers work, but this will absolutely need to be revisited in the future.

My take:

  • This solution is fine because it's needed for timeline
  • A more durable solution will be computed in the backend by hashing PID, container-id, and host. Why? Because this will uniquely identify processes, and that's what matters.
  • Instead of PID, a better solution might be to compute something like comm name parity. In other words, if the parent/child PID sequence looks like bash -> bash -> python -> bash -> apache, the change sequence is 0 -> 0 -> 1 -> 2 -> 3 instead of 0 -> 0 -> 1 -> 0 -> 2. Something like that, at least. Why? Because the affected profilers can start in the middle of execution and may be restarted, so having a stable characterization is valuable.

In conclusion:

  • This PR is fine, the current situation isn't our fault and the average ddprof user is basically just starting the profiler once anyway
  • Some careful thought should be put into adapting this procedure for a profiler with scope of more than one service, and I think that would require some work from the backend.
  • LGTM aside from the minor fixups in a separate PR. Approving now because I have nothing novel to add and I trust those comments will be addressed.
  • Thank you!

@r1viollet r1viollet force-pushed the r1viollet/uuid_generation branch 2 times, most recently from ce42d47 to df90080 Compare July 8, 2024 07:35
@r1viollet r1viollet requested a review from nsavoire July 8, 2024 07:42
@r1viollet
Copy link
Collaborator Author

You make this observation in a comment, but expanding on it here for no particular reason.

As far as I know, runtime-id is used to fingerprint a precise instance of a service. The noted strategy works perfectly fine for the other profilers, since they are indeed attached to a single instance of a runtime and they can hook into runtime faculties to change the runtime-id if that changes (e.g., atfork handlers).

On the other hand, this concept is slightly problematic for ddprof due to the way we track process lineage/lifetime. The noted procedure is more of a profiler-id. I just want to note this because our concept of what it means to be a runtime-id was greatly influenced by how the existing profilers work, but this will absolutely need to be revisited in the future.

My take:

  • This solution is fine because it's needed for timeline
  • A more durable solution will be computed in the backend by hashing PID, container-id, and host. Why? Because this will uniquely identify processes, and that's what matters.
  • Instead of PID, a better solution might be to compute something like comm name parity. In other words, if the parent/child PID sequence looks like bash -> bash -> python -> bash -> apache, the change sequence is 0 -> 0 -> 1 -> 2 -> 3 instead of 0 -> 0 -> 1 -> 0 -> 2. Something like that, at least. Why? Because the affected profilers can start in the middle of execution and may be restarted, so having a stable characterization is valuable.

In conclusion:

  • This PR is fine, the current situation isn't our fault and the average ddprof user is basically just starting the profiler once anyway
  • Some careful thought should be put into adapting this procedure for a profiler with scope of more than one service, and I think that would require some work from the backend.
  • LGTM aside from the minor fixups in a separate PR. Approving now because I have nothing novel to add and I trust those comments will be addressed.
  • Thank you!

Thanks for sharing. I agree, this is far from perfect.
It is true that we could focus on the reconciliation within the backend. Though I think there are also cases where there are several runtime-ids for a given PID (thinking of .NET for instance).
There could be a way we can establish some standard IPC mechanisms with the runtimes. We will experiment on this next quarter.

@r1viollet r1viollet force-pushed the r1viollet/uuid_generation branch 4 times, most recently from 5c62fab to 64cd1d6 Compare July 9, 2024 09:32
Provide a default runtime_id
This is to make it easier to browse the timeline data
@r1viollet r1viollet force-pushed the r1viollet/uuid_generation branch from 64cd1d6 to 00eeec4 Compare July 11, 2024 07:39
src/uuid.cc Outdated
Comment on lines 13 to 34
std::random_device rd;
std::array<int, std::mt19937::state_size> seed_data;
std::generate_n(seed_data.data(), seed_data.size(), std::ref(rd));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
std::mt19937 gen(seq);
std::uniform_int_distribution<int> dis(0, 15);
std::uniform_int_distribution<int> dis2(8, 11);
for (int i = 0; i < 12; ++i) {
data[i] = dis(gen);
}

// Set the version to 4 (UUID version 4)
data[k_version_position] = k_version;
for (int i = 13; i < 16; ++i) {
data[i] = dis(gen);
}
// variant
data[16] = dis2(gen);

for (int i = 17; i < 32; ++i) {
data[i] = dis(gen);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::random_device rd;
std::array<int, std::mt19937::state_size> seed_data;
std::generate_n(seed_data.data(), seed_data.size(), std::ref(rd));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
std::mt19937 gen(seq);
std::uniform_int_distribution<int> dis(0, 15);
std::uniform_int_distribution<int> dis2(8, 11);
for (int i = 0; i < 12; ++i) {
data[i] = dis(gen);
}
// Set the version to 4 (UUID version 4)
data[k_version_position] = k_version;
for (int i = 13; i < 16; ++i) {
data[i] = dis(gen);
}
// variant
data[16] = dis2(gen);
for (int i = 17; i < 32; ++i) {
data[i] = dis(gen);
}
static std::random_device rd;
using int_type = std::random_device::result_type;
for (int i = 0; i < 16; i += sizeof(int_type)) {
uint32_t x = rd();
memcpy(&data[i], &x, sizeof(int_type));
}
// variant must be 10xxxxxx
data[8] &= 0xBF;
data[8] |= 0x80;
// version must be 0100xxxx
data[6] &= 0x4F;
data[6] |= 0x40;
return uuid;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to review the proposal during a live session.

[[nodiscard]] int get_version() const { return data[k_version_position]; }
[[nodiscard]] std::string to_string() const;
// We could make this smaller, as it is hexadecimal and 32 characters
// but we are keeping one byte per element for simplicity
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how this simplifies things, I would say either store chars and keeps it 32 bytes or store numbers and make it 16 bytes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it easier to reason on the values of every element.

@r1viollet r1viollet merged commit e68d321 into main Jul 11, 2024
2 checks passed
@r1viollet r1viollet deleted the r1viollet/uuid_generation branch July 11, 2024 12:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants