Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Julia uses start_task as the top frame #292

Merged
merged 1 commit into from
Aug 2, 2023
Merged

Julia uses start_task as the top frame #292

merged 1 commit into from
Aug 2, 2023

Conversation

aristotelhs
Copy link
Contributor

Julia uses the start_task as an indicator for the top frame. Adding this to the list of successful unwinding symbols.

Copy link
Collaborator

@sanchda sanchda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL! Thank you for the contribution!

Are stacks rooted on start_task in any way special? Anything we should be considering for the future?

@aristotelhs
Copy link
Contributor Author

I am not an expert in Julia Lang (maybe @vchuravy can provide a better overview). What I have seen is that JuliaLang uses Tasks (https://github.com/JuliaLang/julia/blob/master/src/task.c) to achieve concurrent programming and parallelism, which resemble a bit the Go routines (or Windows fibers) . Each task that executes has this symbol at the top of the frame. There might be some extra nuances that the compiler does regarding adding some extra fake stack frames, but at least if we reach this then we can assume that we have unwinded the stack correctly.
In our profiles we had a lot of [Incomplete] profiles because the unwinder tried to go past this symbol and was failing obviously, so at least now we can distinguish between cases that the profiles are too deep (resulting in [Incomplete]) and cases that we did have a successful unwinding.

@aristotelhs
Copy link
Contributor Author

Is it normal for dd-gitlab/report_gitlab_CI_status take that long?

@vchuravy
Copy link

vchuravy commented Aug 2, 2023

The other thing we will need to check if this is sufficient.

We might need to treat interpreter frames specially.

@sanchda
Copy link
Collaborator

sanchda commented Aug 2, 2023

Is it normal for dd-gitlab/report_gitlab_CI_status take that long?
Uh, no. Investigating.

JuliaLang uses Tasks ... to achieve concurrent programming
Ah, indeed. I actually asked the question to segue into a discussion on this.

In other languages supported by this tool (Rust, C++), the use of concurrency falls into a general pattern.

  1. An operation is dispatched.
  2. That operation is later scheduled and executed by some kind of async/concurrent computing/threading/whatever agent or runtime.

ddprof does a fine job unwinding such operations when they are observed on-CPU (... uh, minus the fact that frame assemblies which aren't localized to the kernel-recognized "stack" are problematic for us at present), with the caveat that the context we provide is 100% aligned with 2, with no details from 1 (that aren't already placed on the stack by the language somehow). When the operation is sufficiently generic, it becomes impossible to associate that concurrent operation with its dispatch site--you lose the ability to say "foo() called async_bar(), so the total compute utilization of foo() is cost_of(foo) + cost_of(async_bar)".

So my questions (and please forgive my ignorance, coming from a background in scientific computing, my experience with Julia was much-loved and wonderful, but predated the 1.0 release)

  1. Does this problem of not being able to associate tasks to the context of their dispatch sound like a big deal?
  2. Do you think you might be seeing this in your current application(s)?
  3. Any other thoughts you might have on rationalizing how tasks fit into the overall flamegraph in a more competent way.

@sanchda
Copy link
Collaborator

sanchda commented Aug 2, 2023

Is it normal for dd-gitlab/report_gitlab_CI_status take that long?

OK, I figured it out. One of the requirements for our gitlab workflow is that the incoming branch can't be from a fork. It isn't clear to me whether this can be changed, or if so how to do it.

I went ahead and created #293 to verify that this is the case and to run it through CI. If CI comes back clear, I'll force-merge this PR (since the single base commit is the same). Obviously I don't anticipate any errors.

Sorry about the speed bump, and thanks again for the contribution!

@sanchda
Copy link
Collaborator

sanchda commented Aug 2, 2023

Looks like GitHub is smart enough to notice the test against the merged branch corresponds to a passage on this one. I'm going to go ahead and hit the big green button, since that way I can keep track of CI right away, but in the future please feel free to press it yourself if everything is good--no need to wait for a maintainer. :)

@sanchda sanchda merged commit 1af1111 into DataDog:main Aug 2, 2023
@aristotelhs aristotelhs deleted the aristotelhs/julia_top_stack_frame branch August 3, 2023 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants