From b43476108eae117677336c0e01e13aef99725249 Mon Sep 17 00:00:00 2001
From: Travis Wilson <35748617+trrwilson@users.noreply.github.com>
Date: Tue, 5 Nov 2024 10:58:24 -0800
Subject: [PATCH] README updates for beta.2 (#73)

---
 dotnet/samples/README.md                   | 102 ++++++++++++++++++---
 dotnet/samples/console-from-file/README.md |   6 +-
 2 files changed, 92 insertions(+), 16 deletions(-)

diff --git a/dotnet/samples/README.md b/dotnet/samples/README.md
index 2509a2b..50c835b 100644
--- a/dotnet/samples/README.md
+++ b/dotnet/samples/README.md
@@ -4,7 +4,7 @@ This folder contains samples that use the `/realtime` API with the OpenAI .NET S
 
 | | |
 |---|---|
-| Last updated for | Azure.AI.OpenAI.2.1.0-beta.1 |
+| Last updated for | Azure.AI.OpenAI.2.1.0-beta.2 |
 
 ## General patterns
 
@@ -19,6 +19,8 @@ AzureOpenAIClient topLevelClient = new(
 RealtimeConversationClient client = topLevelClient.GetRealtimeConversationClient("my-gpt-4o-realtime-preview-deployment");
 ```
 
+If connecting to OpenAI's `/v1/realtime` endpoint, use `OpenAIClient` in place of `AzureOpenAIClient` or construct a `RealtimeConversationClient` directly. All other usage is identical.
+
 ### Session setup
 
 To begin a `/realtime` session, call `StartConversationSessionAsync()` on a configured `RealtimeConversationClient` instance. Note that `RealtimeConversationSession` implements `IDisposable` and consider employing the `using` keyword to ensure prompt connection cleanup.
@@ -45,6 +47,10 @@ ConversationSessionOptions options = new()
 await session.ConfigureSessionAsync(options);
 ```
 
+**Input audio transcription** (an approximation of what was said in user-provided input audio) is not enabled by default; to enable it, populate the `InputTranscriptionOptions` property as above.
+
+By default, **turn detection** will use server voice activity detection (VAD). To disable this or customize the behavior of server VAD, provide a value to the `TurnDetectionOptions` property -- `ConversationTurnDetectionOptions.CreateDisabledTurnDetectionOptions()` will provide an instance that turns VAD off, enabling push-to-talk or a custom client-side VAD implementation to be used.
+
 ### Sending data
 
 **Audio**:
@@ -53,41 +59,111 @@ For simplicity, samples here will often use a "fire and forget" pattern with an
 
 ```csharp
 using Stream audioInputStream = File.OpenRead("..\\audio_hello_world.wav");
-_ = session.SendAudioAsync(audioInputStream);
+_ = session.SendInputAudioAsync(audioInputStream);
 ```
 
-This `Stream`-based method will automatically read and chunk data from the stream 
+This `Stream`-based method will automatically read and chunk data from the stream. If finer granularity or otherwise push-style control is needed, the `SendInputAudioAsync(BinaryData)` method signature can be used to send chunks individually.
+
+**Text and other non-audio data**:
 
-**Text**:
+Text input, tool responses, conversation history, and other information are supplied to the session via the `AddItemAsync()` method. The `ConversationItem` type provides various static factory methods to instantiate items including role-based chat messages and function tool outputs, among others. For example:
 
-Text input, tool responses, conversation history, and other information are supplied to the session via the `AddItemAsync()` method. The `ConversationItem` type provides various static factory methods to instantiate items including role-based chat messages and function tool outputs, among others. 
+- `ConversationItem.CreateUserMessage()` creates a user-role conversation item reflecting one or more content parts that can feature text input.
+- `ConversationItem.CreateFunctionCallOutput()` creates a conversation item that responds to a received function call.
+- `ConversationItem.CreateAssistantMessage()` and `ConversationItem.CreateFunctionCall()` facilitate the creation of items that form or restore a conversation history.
+
+```csharp
+await session.AddItemAsync(
+    ConversationItem.CreateUserMessage(["Hello, assistant! Can you help me today?"]));
+```
 
 **Manual messages**
 
-Only a subset of the full `/realtime` protocol is currently represented; if sending an explicit message is desired, the generic `conversation.SendMessageAsync(data)` allows an arbitrary message to be sent:
+If sending an explicit message is desired, the generic `session.SendCommandAsync(BinaryData)` allows an arbitrary message to be sent:
 
 ```csharp
-await conversation.SendMessageAsync(BinaryData.FromString("""
+await session.SendCommandAsync(BinaryData.FromString("""
 {
-  "event": "create_conversation",
-  "label": "my_second_conversation"
+  "type": "session.update",
+  "session": {
+  }
 }
 """);
 ```
 
 ### Receiving data
 
-Incoming message receipt is pumped via the `IAsyncEnumerable<ConversationUpdate>` provided by `session.ReceiveUpdatesAsync()`. In addition to being downcastable into derived types that encapsulate command-specific data, each `ConversationUpdate` also exposes a generic `BinaryData` instance via the `GetRawContent()` method, which will provide the direct JSON payload present in the message.
+Incoming message receipt is pumped via the `IAsyncEnumerable<ConversationUpdate>` provided by `session.ReceiveUpdatesAsync()`. Each incoming `ConversationUpdate` has an enumerated `Kind` value that maps directly to a WebSocket server event type (like `session.created`) and, depending on the type, each update will be downcastable to a derived type of `ConversationUpdate` with additional data specific to the event.
+
+As an example: upon connection, the server sends a `session.created` event that's surfaced as a `ConversationSessionStartedUpdate` via `ReceiveUpdatesAsync()`. That update exposes a `SessionStarted` enumeration value on its `Kind` property and is accessible via downcast:
 
 ```csharp
 await foreach (ConversationUpdate update in conversation.ReceiveUpdatesAsync())
 {
-    Console.WriteLine(message.GetRawContent().Content.ToString());
+    // update.Kind == ConversationUpdateKind.SessionStarted (session.created)
     if (update is ConversationSessionStartedUpdate sessionStartedUpdate)
     {
-        // ...
+        Console.WriteLine($"New session started, id = {sessionStartedUpdate.SessionId}");
     }
 }
 ```
 
-`ConversationUpdate` also exposes a `Kind` property with a enum value that directly maps to an associated WebSocket command `type`.
\ No newline at end of file
+**Session-wide updates**
+
+The following all provide information pertaining to the session itself or to the shared information persisted across responses in the session:
+
+| Derived type | Kind value(s) | WebSocket event | Description |
+|---|---|---|---|
+| `ConversationSessionStartedUpdate` | `SessionStarted` | `session.created` | Raised upon successful connection. Provides *default* session configuration values that do not reflect any changes made via `ConfigureSessionAsync()`. |
+| `ConversationSessionConfiguredUpdate` | `SessionConfigured` | `session.updated` | Raised upon receipt of a `session.update` command via `ConfigureSessionAsync()`. Provides *updated* session configuration values reflecting the requested changes. Response-level changes will take effect beginning with the next response. |
+| `ConversationInputSpeechStartedUpdate` | `InputSpeechStarted` | `input_audio_buffer.speech_started` | With server-side voice activity detection enabled (the default), this is raised when speech is detected in the audio provided via `SendInputAudioAsync()`. |
+| `ConversationInputSpeechFinishedUpdate` | `InputSpeechFinished` | `input_audio_buffer.speech_stopped` | With server-side voice activity detection enabled (the default), this is raised when active speech is no longer detected in the audio provided via `SendInputAudioAsync()`. |
+| `ConversationInputAudioCommittedUpdate` | `InputAudioCommitted` | `input_audio_buffer.committed` | Raised when input audio is committed as conversation input. This will occur automatically when server-side voice activity detection is enabled, upon end of speech detection. Without server VAD, an explicit call to `CommitInputAudioAsync()` is required. |
+| `ConversationInputAudioClearedUpdate` | `InputAudioCleared` | `input_audio_buffer.cleared` | Raised when input audio is cleared via a call to `ClearInputAudioAsync()`. |
+| `ConversationRateLimitsUpdate` | `RateLimitsUpdated` | `rate_limits.updated` | Periodically raised to reflect the latest rate limit information for tokens and requests. |
+
+**Response-level updates**
+
+| Derived type | Kind value(s) | WebSocket event | Description |
+|---|---|---|---|
+| `ConversationResponseStartedUpdate` | `ResponseStarted` | `response.created` | Raised when the model begins generating a new response, snapshotting current input state. This occurs automatically with end of speech when server voice activity detection is enabled and can be requested manually via `StartResponseAsync()`. |
+| `ConversationResponseFinishedUpdate` | `ResponseFinished` | `response.done` | Raised when all response data is complete. |
+
+**Item-level updates**
+
+| Derived type | Kind value(s) | WebSocket event | Description |
+|---|---|---|---|
+| `ConversationItemCreatedUpdate` | `ItemCreated` | `conversation.item.created` | |
+| `ConversationItemDeletedUpdate` | `ItemDeleted` | `conversation.item.deleted` | |
+| `ConversationItemTruncatedUpdate` | `ItemTruncated` | `conversation.item.truncated` | |
+| `ConversationInputTranscriptionFinishedUpdate` | `InputTranscriptionFinished` | `conversation.item.input_audio_transcription.completed` | |
+| `ConversationInputTranscriptionFailedUpdate` | `InputTranscriptionFailed` | `conversation.item.input_audio_transcription.failed` | |
+
+**Item streaming updates**
+
+| Derived type | Kind value(s) | WebSocket event | Description |
+|---|---|---|---|
+| `ConversationItemStreamingStartedUpdate` | `ItemStreamingStarted` | `response.output_item.added` | Received when a new output item is opened for the response and begins receiving streamed information. This will be followed by some number of `ConversationItemStreamingPartDeltaUpdate` instances providing the streamed data before a `ConversationItemStreamingFinishedUpdate` signals the end of all streamed incremental information. |
+| `ConversationItemStreamingFinishedUpdate` | `ItemStreamingFinished` | `response.output_item.done` | Received when a new output item has finished receiving all streamed information. Includes the accumulated data of the delta updates. |
+| `ConversationItemStreamingPartDeltaUpdate` | * | * | This update is received when incremental streamed data is available for an in-progress response output item. It combines several server event types, with the specific payload inferable from which properties are populated or the value of `Kind` on the update. Some streamed conversation items can consist of multiple content parts; in this situation, the `ContentPartIndex` will distinguish between inner content parts and individual `ConversationItemStreamingPartFinishedUpdate` instances will be raised per content part. |
+| | `ItemContentPartStarted` | `response.content_part.added` | |
+| | `ItemStreamingPartAudioDelta` | `response.audio.delta` | |
+| | `ItemStreamingPartAudioTranscriptionDelta` | `response.audio_transcript.delta` | |
+| | `ItemStreamingPartTextDelta` | `response.text.delta` | |
+| | `ItemStreamingFunctionCallArgumentsDelta` | `response.function_call_arguments.delta` | |
+| `ConversationItemStreamingPartFinishedUpdate` | * | * | Received when an individual component of a streamed conversation item, such as a content part, has finished receiving all streamed data. In many circumstances, using the superset of information available in `ConversationItemStreamingFinishedUpdate` is adequate; this update simply provides finer granularity in cases where multiple item components are streamed. |
+| | `ItemStreamingFunctionCallArgumentsFinished` | `response.function_call_arguments.done` | |
+| | `ItemContentPartFinished` | `response.content_part.done` | |
+
+**Raw/protocol update usage**
+
+In addition to being downcastable into derived types that encapsulate command-specific data, each `ConversationUpdate` also exposes a generic `BinaryData` instance via the `GetRawContent()` method, which will provide the direct JSON payload present in the message.
+
+```csharp
+await foreach (ConversationUpdate update in conversation.ReceiveUpdatesAsync())
+{
+    Console.WriteLine(update.GetRawContent().ToString());
+}
+```
+
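+Together with the use of `SendCommandAsync(BinaryData)`, this allows `RealtimeConversationSession` to be treated as a low-level WebSocket message client for `/realtime`.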
\ No newline at end of file
diff --git a/dotnet/samples/console-from-file/README.md b/dotnet/samples/console-from-file/README.md
index 9c931ba..5e36507 100644
--- a/dotnet/samples/console-from-file/README.md
+++ b/dotnet/samples/console-from-file/README.md
@@ -54,12 +54,12 @@ A `/realtime` connection session is managed via the `RealtimeConversationSession
 
 Calling `AddItemAsync()` on `RealtimeConversationSession` allows adding non-audio (e.g. text) content as well as establishing conversation history or few-shot examples for model inference to use. As demonstrated further in the sample, this method is also the mechanism used to provide responses to tool calls.
 
-`RealtimeConversationSession`'s `SendAudioAsync(Stream)` method will automatically chunk and transmit audio data from a provided stream. Alternatively, the `SendAudioAsync(BinaryData)` method allows individual audio message transmissions. Because commands are sent and received in parallel, it's not necessary to `await` or otherwise block on audio transmission; the sample application goes directly into the message receipt processing.
+`RealtimeConversationSession`'s `SendInputAudioAsync(Stream)` method will automatically chunk and transmit audio data from a provided stream. Alternatively, the `SendInputAudioAsync(BinaryData)` method allows individual audio message transmissions. Because commands are sent and received in parallel, it's not necessary to `await` or otherwise block on audio transmission; the sample application goes directly into the message receipt processing.
 
-`RealtimeConversationSession`'s `ReceiveUpdatesAsync()` method provides an `IAsyncEnumerable` of `ConversationUpdate` instances, each representing a single received command from the `/realtime` endpoint. The `ConversationUpdateKind` enumeration on the `UpdateKind` property of the `ConversationUpdate` type maps directly to the corresponding `type` in the wire protocol; these, in turn, also have a down-cast, concrete derived type of the abstract `ConversationUpdate`, e.g. `ConversationResponseStartedUpdate` for `response.created` and `ConversationItemFinishedUpdate` for `conversation.item.done`. These down-cast types can be cast via `as` or `is` to gain access to command-specific data, e.g. `(update as ConversationAudioTranscriptDeltaUpdate).Delta`.
+`RealtimeConversationSession`'s `ReceiveUpdatesAsync()` method provides an `IAsyncEnumerable` of `ConversationUpdate` instances, each representing a single received command from the `/realtime` endpoint. The `ConversationUpdateKind` enumeration on the `Kind` property of the `ConversationUpdate` type maps directly to the corresponding `type` in the wire protocol; these, in turn, also have a down-cast, concrete derived type of the abstract `ConversationUpdate`, e.g. `ConversationResponseStartedUpdate` for `response.created`.
 
 ## Advanced use
 
-The strongly typed surface for `RealtimeConversationSession` is under active development and may not adequately expose all details of the wire protocol, particularly as commands continue to evolve. It supports passthrough use of request messages via `SendCommandAsync(BinaryData)` (allowing arbitrary JSON to be sent) and the raw JSON of each message may be retrieved by serializing each `ConversationUpdate` instance via `System.ClientModel.Primitives.ModelReaderWriter.Write(update)`. In this manner, `RealtimeConversationSession` may be treated as a low-level WebSocket message client for `/realtime`.
+The strongly typed surface for `RealtimeConversationSession` is under active development and may not yet accurately reflect every detail of the wire protocol. It supports passthrough use of request messages via `SendCommandAsync(BinaryData)` (allowing arbitrary JSON to be sent), and the raw JSON of each message may be retrieved from each `ConversationUpdate` instance via `ConversationUpdate.GetRawContent()` or by serializing it with `System.ClientModel.Primitives.ModelReaderWriter.Write(update)`. In this manner, `RealtimeConversationSession` may be treated as a low-level WebSocket message client for `/realtime`.
 
 For direct observability of WebSocket traffic as it's sent and received, `RealtimeConversationClient` provides `OnSendingCommand` and `OnReceivingCommand` event handlers.
\ No newline at end of file