[MLOB-1524] feat(llmobs): Introduce LLM Observability SDK #4742

sabrenner · 2024-09-30T20:27:50Z

What does this PR do?

Introduces an LLM Observability SDK into the tracer. The SDK has its own tracing API that extends the core tracer, plus span tagging and processing functionality for flushing LLM Observability spans to LLM Observability.

A corp/public documentation link will be added here once the SDK has been released and docs published, for visibility.

The following PRs have been merged into the branch for this PR, and have been reviewed and approved:

(config) #4696 APM: @ida613 LLMObs: @Yun-Kim
(writer) #4699 APM: @rochdev LLMObs: @Yun-Kim
(tagger) #4718 APM: @rochdev LLMObs: @Yun-Kim
(span processing) #4738 APM: @rochdev LLMObs: @lievan
(sdk api) #4773 APM: @rochdev LLMObs: @Yun-Kim @lievan @Kyle-Verhoog

Submitting OpenAI spans will be a direct follow-up PR.

Motivation

Introduce core LLM Observability SDK logic into the tracer.


github-actions · 2024-09-30T20:28:34Z

Overall package size

Self size: 7.8 MB
Deduped: 64.66 MB
No deduping: 65 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/native-appsec | 8.2.1 | 19.18 MB | 19.19 MB |
| @datadog/native-iast-taint-tracking | 3.2.0 | 13.9 MB | 13.91 MB |
| @datadog/pprof | 5.4.1 | 9.76 MB | 10.13 MB |
| protobufjs | 7.2.5 | 2.77 MB | 5.16 MB |
| @datadog/native-iast-rewriter | 2.5.0 | 2.51 MB | 2.65 MB |
| @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB |
| @datadog/native-metrics | 2.0.0 | 898.77 kB | 1.3 MB |
| @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB |
| import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB |
| msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| lru-cache | 7.18.3 | 133.92 kB | 133.92 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB |
| semver | 7.6.3 | 95.82 kB | 95.82 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| ignore | 5.3.1 | 51.46 kB | 51.46 kB |
| int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB |
| shell-quote | 1.8.1 | 44.96 kB | 44.96 kB |
| istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB |
| rfdc | 1.3.1 | 25.21 kB | 25.21 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| path-to-regexp | 0.1.10 | 6.38 kB | 6.38 kB |
| module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

pr-commenter · 2024-10-09T21:14:25Z

Benchmarks

Benchmark execution time: 2024-10-29 18:43:40

Comparing candidate commit 845c840 in PR branch sabrenner/llmobs-sdk-release with baseline commit c53c395 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 260 metrics, 6 unstable metrics.

packages/dd-trace/src/llmobs/channels.js

packages/dd-trace/src/llmobs/constants.js

packages/dd-trace/src/config.js

packages/dd-trace/test/llmobs/sdk/index.spec.js

Kyle-Verhoog

reviewed it to the best of my noob js and mlobs ability. Great job Sam 👏 🚢

Let's try to get some tests into the shared testing framework ASAP to assert the similarity with the Python interface!

Kyle-Verhoog · 2024-10-29T02:31:54Z

packages/dd-trace/src/llmobs/tagger.js

+        this._setTag(span, key, data)
+      } else {
+        try {
+          this._setTag(span, key, JSON.stringify(data))


lol we definitely gotta make a pitch to APM to better support this in the future
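The fallback pattern in the snippet above (store strings directly, otherwise try `JSON.stringify`) can be sketched in isolation. This is only an illustration, not the tagger's actual code: the helper name `toTagValue` and the placeholder text are assumptions, although the commit log does mention adding a default value for unserializable data.

```javascript
// Hypothetical helper, not the SDK's implementation: turn arbitrary data
// into a string tag value, falling back to a placeholder on failure.
const UNSERIALIZABLE = '[Unserializable value]' // assumed placeholder text

function toTagValue (data) {
  if (typeof data === 'string') return data // strings are stored as-is
  try {
    const json = JSON.stringify(data)
    // JSON.stringify returns undefined for functions, symbols and undefined
    return json === undefined ? UNSERIALIZABLE : json
  } catch (err) {
    // Circular references and BigInt values make JSON.stringify throw
    return UNSERIALIZABLE
  }
}

const circular = {}
circular.self = circular

console.log(toTagValue('hello'))
console.log(toTagValue({ role: 'user', content: 'hi' }))
console.log(toTagValue(circular))
```

Returning a placeholder instead of rethrowing keeps one bad value from dropping the whole span, at the cost of losing that value's content.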

Kyle-Verhoog · 2024-10-29T02:34:14Z

packages/dd-trace/src/llmobs/tagger.js

+  }
+
+  // any public-facing LLMObs APIs using this tagger should not soft fail
+  // auto-instrumentation should soft fail
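One way to read the two comments above: the same validation either throws (public-facing API) or only logs a warning (auto-instrumentation). A minimal sketch of that split follows. The names `handleFailure` and `tagMetadata`, and the `softFail` and `warn` options, are hypothetical and are not taken from the tagger itself.

```javascript
// Hypothetical sketch of a soft-fail / hard-fail switch.
function handleFailure (message, { softFail = false, warn = console.warn } = {}) {
  if (softFail) {
    warn(message) // auto-instrumentation path: log and carry on
    return
  }
  throw new Error(message) // public API path: surface the misuse to the caller
}

// Example validator that uses the switch. Returns the tags unchanged
// when the metadata is rejected in soft-fail mode.
function tagMetadata (tags, metadata, options) {
  const isPlainObject =
    metadata !== null && typeof metadata === 'object' && !Array.isArray(metadata)
  if (!isPlainObject) {
    handleFailure('Metadata must be a plain object.', options)
    return tags
  }
  return { ...tags, metadata }
}

const warnings = []
const softResult = tagMetadata({}, 'bad', { softFail: true, warn: (m) => warnings.push(m) })
console.log(softResult, warnings)
```

Calling `tagMetadata({}, 'bad')` without `softFail` throws instead, which is the behavior the first comment asks of public-facing APIs.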


codecov · 2024-10-29T14:11:20Z

Codecov Report

Attention: Patch coverage is 11.80258% with 411 lines in your changes missing coverage. Please review.

Project coverage is 70.23%. Comparing base (fd0f570) to head (9ac9172).
Report is 35 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| packages/dd-trace/src/llmobs/sdk.js | 10.84% | 148 Missing ⚠️ |
| packages/dd-trace/src/llmobs/tagger.js | 4.08% | 141 Missing ⚠️ |
| packages/dd-trace/src/llmobs/util.js | 3.19% | 91 Missing ⚠️ |
| packages/dd-trace/src/llmobs/noop.js | 6.45% | 29 Missing ⚠️ |
| packages/dd-trace/src/proxy.js | 71.42% | 2 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #4742      +/-   ##
==========================================
+ Coverage   68.58%   70.23%   +1.65%     
==========================================
  Files          12      329     +317     
  Lines         818    14775   +13957     
==========================================
+ Hits          561    10377    +9816     
- Misses        257     4398    +4141


* [MLOB-1540] add llmobs configuration to global tracer config (#4696)
  * add llmobs config
* [MLOB-1555] LLM Observability writers (#4699)
  * LLM Observability writers
* [MLOB-1556] LLM Observability tagger (#4718)
  * LLM Observability tagger
* [MLOB-1560] LLMObs Span Processor (#4738)
  * span processor
  * tests
  * remove agent exporter log and do not stringify tags
  * remove llmobs from exporter tests
  * add in default unserializable value
  * review comments
  * warning log for metric
  * todo-ify
  * remove some duplicate logic
  * decouple llmobs span processing with a channel
  * use a static weakmap to store llmobs tags/annotations instead of span tags
  * do not register span in map if it does not have an llmobs span kind
  * span is passed on an object from sp publisher
  * re-clarify TODOs
  * only send span in publish
  * log multiple warnings and return conditional undefined
  * update error logic
* [MLOB-1561] LLM Observability SDK API (#4773)
  * wip
  * type definitions
  * active + try/catch eval metric writer append
  * test ts
  * use tagger map and processor as a channel subscriber
  * change decorate and add in dev changes
  * try some api changes
  * add decorate to noop
  * fix breaking proxy tests
  * experimental decorators for TS docs
  * api changes, fix unit + e2e tests
  * try removing global log mocks
  * add some util tests
  * remove logger mocks
  * add module tests + do not enable when not specified
  * fix eval metric integration test
  * wip
  * memoize getFunctionArguments
  * move any subscriber and global writer to the module enablement level instead of sdk
  * should fix TS tests
  * add ts integration test and fix decorator
  * devex for ts versions
  * add noop typescript test
  * remove startSpan
  * remove unneeded change
  * dedup decorator code
  * Update index.d.ts
    Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
  * map metrics names
  * change validKind to validateKind and throw
  * tagger for metrics follow-up
  * review feedback
  * add some tests for not auto-annotating in certain cases

  ---------

  Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
* hard fail instead of soft fail, except for `wrap` span name
* add ml-observability codeowners
* resolve ts test
* update auto-annotation check
* tagger can soft fail
* using custom ASL instance and scope activation
* fix test comments and remove
* address review comments
* remove llmobs.apiKey config, only rely on global
* fix evaulations test
* make llmobs storage accessible

---------

Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
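Two items in the commit list above describe a design: annotations live in a WeakMap keyed by span rather than in span tags, and span processing is decoupled from the tracer through a channel. The sketch below illustrates that shape under stated assumptions. It uses Node's built-in `diagnostics_channel` module rather than the `dc-polyfill` package, and the channel name, the `annotate` and `finish` helpers, and the span objects are all hypothetical.

```javascript
const dc = require('node:diagnostics_channel')

// Hypothetical channel name, chosen for this example only.
const CHANNEL_NAME = 'example:llmobs:span:finish'
const spanFinishCh = dc.channel(CHANNEL_NAME)

// Annotations are keyed by the span object itself, so an entry can be
// garbage collected together with its span.
const annotations = new WeakMap()

function annotate (span, kind, tags = {}) {
  if (!kind) return // spans without an LLMObs kind are never registered
  annotations.set(span, { kind, ...tags })
}

// Subscriber side: the processor only sees what the channel delivers.
const processed = []
dc.subscribe(CHANNEL_NAME, ({ span }) => {
  const tags = annotations.get(span)
  if (!tags) return // not an LLMObs span, nothing to process
  processed.push({ name: span.name, ...tags })
})

// Publisher side: the tracer publishes the finished span and knows
// nothing about who consumes it.
function finish (span) {
  if (spanFinishCh.hasSubscribers) spanFinishCh.publish({ span })
}

const llmSpan = { name: 'chat' }
const plainSpan = { name: 'http.request' }
annotate(llmSpan, 'llm', { model: 'example-model' })
annotate(plainSpan, undefined)
finish(llmSpan)
finish(plainSpan)
console.log(processed)
```

Because `publish` delivers synchronously, `processed` already holds the single LLM span entry when it is logged, and the plain span is skipped because it was never registered in the map.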

sabrenner and others added 4 commits September 18, 2024 14:18

[MLOB-1540] add llmobs configuration to global tracer config (#4696)

54c8eec

add llmobs config

[MLOB-1555] LLM Observability writers (#4699)

5b215f6

LLM Observability writers

[MLOB-1556] LLM Observability tagger (#4718)

feeeb89

LLM Observability tagger

Merge branch 'master' of github.com:DataDog/dd-trace-js into sabrenne…

bddfa3d

…r/llmobs-sdk-release

sabrenner added do-not-merge/WIP semver-minor labels Sep 30, 2024

datadog-datadog-prod-us1 bot reviewed Sep 30, 2024

.github/workflows/llmobs.yml (resolved)

.github/workflows/llmobs.yml (resolved)

.github/workflows/llmobs.yml (resolved)

Merge branch 'master' into sabrenner/llmobs-sdk-release

7ba7ab8

sabrenner mentioned this pull request Oct 16, 2024

[MLOB-1561] LLM Observability SDK API #4773

Merged

sabrenner and others added 3 commits October 16, 2024 15:42

Merge branch 'master' into sabrenner/llmobs-sdk-release

b6452ad

Merge branch 'master' into sabrenner/llmobs-sdk-release

4370228

sabrenner marked this pull request as ready for review October 24, 2024 16:02

sabrenner requested a review from a team as a code owner October 24, 2024 16:02

sabrenner changed the title ~~[MLOB-1524] feat(llmobs): introduce LLM Observability SDK~~ [MLOB-1524] feat(llmobs): Introduce LLM Observability SDK Oct 24, 2024

sabrenner added 3 commits October 24, 2024 14:58

hard fail instead of soft fail, except for wrap span name

a76f5ad

add ml-observability codeowners

96aa49a

resolve ts test

d1ee649

sabrenner removed the do-not-merge/WIP label Oct 24, 2024

sabrenner added 3 commits October 28, 2024 11:33

update auto-annotation check

18b0491

tagger can soft fail

adf8d67

using custom ASL instance and scope activation

5b4f88b

rochdev reviewed Oct 28, 2024

packages/dd-trace/src/llmobs/channels.js (outdated, resolved)

rochdev reviewed Oct 28, 2024

packages/dd-trace/src/llmobs/constants.js (outdated, resolved)

rochdev requested changes Oct 28, 2024

Kyle-Verhoog previously approved these changes Oct 29, 2024

fix test comments and remove

05533cb

address review comments

9ac9172

sabrenner dismissed Kyle-Verhoog’s stale review via 9ac9172 October 29, 2024 14:09

sabrenner added 3 commits October 29, 2024 11:30

remove llmobs.apiKey config, only rely on global

3e6945b

fix evaulations test

cc6fec5

make llmobs storage accessible

845c840

rochdev approved these changes Oct 29, 2024

sabrenner merged commit 1c0958e into master Oct 29, 2024
205 checks passed

sabrenner deleted the sabrenner/llmobs-sdk-release branch October 29, 2024 19:12
