From 7033a9cd6ba5dfb5e42f83ec22a3b059f8f41722 Mon Sep 17 00:00:00 2001
From: Joe McIlvain <joe.eli.mac@gmail.com>
Date: Thu, 30 May 2024 13:23:57 -0700
Subject: [PATCH] feat: add optional `additionalData` field to `KurtResult`
 (final stream event)

This is the first step toward supporting parallel tool calls (see issue #33).

This commit adds a new optional field to `KurtResult` called `additionalData`
which can hold an array of additional structured data entries, of the same
type `D` that is used for the mandatory `data` field.

This solution has a few nice properties:

- it is a non-breaking change, both for applications and for LLM-specific adapters

- it extends the general case of any method that returns a structured data result
  (i.e. it applies to both `generateStructuredData` and `generateWithOptionalTools`
  without needing to create new methods that overlap partially with those)

- it gives a type-level guarantee of at least one data entry, so that the
  application doesn't need defensive code for the case of zero entries.

- it is easy for applications to opt out of: if they want to handle only the
  first tool call, they can read only the `data` field and ignore the
  `additionalData` field. (Note that if I had gone the path of having
  multi-value methods separate from the current single-value methods, I would
  have needed to update the single-value methods to silently drop all
  additional tool calls. That seems less ideal than explicitly giving all of
  the tool calls to the application and letting it decide whether to drop them.)

You could argue that it would be "nicer" to have all the tool entries in one array,
but I think the above benefits are reason enough to prefer this design.
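
As a rough sketch of how an application could still get a single array when it
wants one, the helper below merges the two fields. The `KurtResult` type here
is a reduced local copy containing only the two fields touched by this commit,
and `allEntries` and `WeatherQuery` are hypothetical names used for
illustration; none of them are part of this change.

```typescript
// Reduced local copy of the type, for illustration only: just the two
// fields this commit is concerned with.
type KurtResult<D = undefined> = {
  data: D
  additionalData?: D[]
}

// Hypothetical helper (not part of Kurt): gather every entry into a
// non-empty tuple, so the caller keeps the "at least one" guarantee.
function allEntries<D>(result: KurtResult<D>): [D, ...D[]] {
  return [result.data, ...(result.additionalData ?? [])]
}

// Hypothetical structured data type for a tool call.
type WeatherQuery = { city: string }

// A result with a single entry: `additionalData` is simply absent.
const single: KurtResult<WeatherQuery> = { data: { city: "Oslo" } }

// A result with parallel tool calls: entries beyond the first go in
// `additionalData`.
const parallel: KurtResult<WeatherQuery> = {
  data: { city: "Oslo" },
  additionalData: [{ city: "Lima" }],
}

for (const query of allEntries(parallel)) {
  console.log(query.city)
}
```

An application that only cares about the first tool call can skip a helper
like this entirely and read `result.data` directly.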

Note that after this commit is merged, the next step is to update the adapters
to start handling the multi-value case, making use of this new field.
That work is what will fully resolve issue #33.
---
 packages/kurt/src/KurtStream.ts | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)
diff --git a/packages/kurt/src/KurtStream.ts b/packages/kurt/src/KurtStream.ts
index 3426dfc..7f851c9 100644
--- a/packages/kurt/src/KurtStream.ts
+++ b/packages/kurt/src/KurtStream.ts
@@ -43,8 +43,31 @@ export type KurtResult<D = undefined> = {
    *
    * It will never be `undefined` for type instantiations wherein the
    * `D` type parameter doesn't include the `undefined` type as a possibility.
+   *
+   * Note that an LLM may sometimes generate more than one structured data
+   * entry. This field holds the first entry; any entries generated beyond
+   * the first are available in the `additionalData` field.
    */
   data: D
+
+  /**
+   * Additional structured data entries generated by the underlying LLM.
+   *
+   * This mechanism is used to represent "parallel tool calls" supported by
+   * some LLMs, which helps optimize for the case where the LLM has enough
+   * information to specify multiple tool calls in one step.
+   *
+   * The first call/entry generated by the LLM will be in the `data` field,
+   * and this field will only hold further entries generated beyond the first.
+   *
+   * This field is optional, because not all LLMs support parallel calls.
+   *
+   * The first entry is a separate field so that the application can have a
+   * strongly-typed guarantee that at least one data entry will be generated.
+   * Also, an application that doesn't want to support parallel calls can
+   * easily ignore this field and only look at the `data` field.
+   */
+  additionalData?: D[]
 }
 
 /**