support exceptions #50

Closed
c-cube opened this issue Jan 11, 2024 · 2 comments · Fixed by #63
Labels
enhancement New feature or request

Comments

@c-cube
Member

c-cube commented Jan 11, 2024

https://opentelemetry.io/docs/specs/semconv/exceptions/exceptions-spans/

(i.e. record an exception as a span event named `exception`, with `exception.type`, `exception.message` and `exception.stacktrace` attributes)

@c-cube c-cube added the enhancement New feature or request label Jan 11, 2024
@tatchi
Contributor

tatchi commented Sep 18, 2024

Would something like the snippet below be reasonable?

```ocaml
let[@inline] record_exception (scope : t) (exn : unit -> exn) : unit =
  if Collector.has_backend () then (
    let ev =
      Event.make "exception"
        ~attrs:
          [
            "message", `String (Printexc.to_string (exn ()));
            "type", `String (Printexc.exn_slot_name (exn ()));
            "stacktrace", `String (Printexc.get_backtrace ());
          ]
    in
    scope.events <- ev :: scope.events
  )
```
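
A hypothetical call site would record the exception from an exception handler and then re-raise, roughly like this (the `guarded` wrapper is purely illustrative):

```ocaml
(* Illustrative only: run [f], and if it raises, record the exception on the
   span's scope before re-raising. *)
let guarded (scope : t) (f : unit -> 'a) : 'a =
  try f ()
  with e ->
    record_exception scope (fun () -> e);
    raise e
```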

semgrep-ci bot pushed a commit to semgrep/semgrep that referenced this issue Sep 18, 2024
completes SAF-1569

Originally I just wanted to report why semgrep-core was killed in the
traces, and send any traces that may have been mid-flight when it
happened. Since the OTel spec [doesn't support partial
traces](open-telemetry/opentelemetry-specification#373),
I decided to simply raise the kill signal as an error, like we already do
when we receive Ctrl-C. OTel libraries are supposed to record
[exceptions in spans and then exit
them](https://opentelemetry.io/docs/specs/semconv/exceptions/exceptions-spans/),
but it turns out the OCaml OTel library [does not do
this](imandra-ai/ocaml-opentelemetry#50). So this entire time, if there was
*ever* an error in a span, including file and memory timeouts, we would never
send the trace!

So this PR makes it so that whenever an error is caught we exit the span (and
then re-raise it), and we ourselves add the error attributes the library is
*supposed* to add (sketched below).

So now we can see which spans have errors, along with their error message and
stacktrace, and we actually receive the traces.
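
Conceptually, the wrapping described above looks something like the following sketch. Here `set_attrs` and `exit_span` are illustrative stand-ins for whatever the tracing layer exposes, not semgrep's actual API:

```ocaml
(* Sketch only: run [f] inside a span; on any exception, attach the error
   attributes ourselves, exit the span, then re-raise with the original
   backtrace. [set_attrs] and [exit_span] are hypothetical hooks. *)
let with_error_recording ~set_attrs ~exit_span span f =
  try f span
  with e ->
    (* Capture the backtrace immediately, before any other computation. *)
    let bt = Printexc.get_raw_backtrace () in
    set_attrs span
      [
        "error", `Bool true;
        "exception.message", `String (Printexc.to_string e);
        "exception.type", `String (Printexc.exn_slot_name e);
        "exception.stacktrace", `String (Printexc.raw_backtrace_to_string bt);
      ];
    exit_span span;
    Printexc.raise_with_backtrace e bt
```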

## Test plan
```bash
semgrep --pro --trace --config p/default --trace-endpoint semgrep-dev --max-memory 10 --interfile-timeout 1 --timeout 1 --timeout-threshold=1
# wait a bit
# then ctrl c or kill -15 <pid>
```
Then check Jaeger to see that the traces have error info, and that traces are
still sent after Ctrl-C or kill:

[example](https://jaeger-dev2.corp.semgrep.dev/trace/2a86f622e267e7164c29ea0dd37c5286)

synced from Pro e8114ab0dccd079b51a0381e4168d5383ab87bbe
semgrep-ci bot pushed a commit to semgrep/semgrep that referenced this issue Sep 19, 2024
nmote pushed a commit to semgrep/semgrep that referenced this issue Sep 19, 2024
@c-cube
Member Author

c-cube commented Sep 23, 2024

I think this should rather look like:

```ocaml
let record_exception (scope : t) (exn : exn) (bt : Printexc.raw_backtrace) : unit =
  if ...
```

The reason is that the exception is never going to be lazy; and as far as I know, `bt` has to be captured as the very first thing when the exception occurs, or else the stack frames might be clobbered by intermediate computations.
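
Concretely, a minimal sketch combining that signature with the body proposed above (reusing the same `Collector.has_backend`, `Event.make` and `scope.events` pieces, and the attribute keys from the linked semantic conventions) could look like:

```ocaml
(* Sketch, not the final API: the exception and backtrace are taken eagerly,
   and the backtrace is the one captured at the raise site. *)
let record_exception (scope : t) (exn : exn) (bt : Printexc.raw_backtrace) :
    unit =
  if Collector.has_backend () then (
    let ev =
      Event.make "exception"
        ~attrs:
          [
            "exception.message", `String (Printexc.to_string exn);
            "exception.type", `String (Printexc.exn_slot_name exn);
            "exception.stacktrace",
              `String (Printexc.raw_backtrace_to_string bt);
          ]
    in
    scope.events <- ev :: scope.events
  )
```

A caller would capture the backtrace first thing in its handler, e.g. `record_exception scope e (Printexc.get_raw_backtrace ())`, before doing anything else.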
