[Response Ops][Actions] Allow streaming responses from Generative AI connector #161676
Conversation
Force-pushed 978a274 to c258e00, then c258e00 to fa085b4.
Pinging @elastic/response-ops (Team:ResponseOps)

If I'm understanding this PR correctly, it seems like there's a
```typescript
await actionsClient.execute({
  actionId: request.body.connector_id,
  params: {
    subAction: 'test',
```
I'm not sure if I follow why we need to call `test` here? What does `test` do vs `run`?
Typically we expect the run API to be used for scheduled actions (actions run via task manager) while the test API should be used for ad hoc action executions that are run immediately and not via a scheduled task.
OK! just to be sure: if I want a streaming response, I should use test, and there are no other implications other than that it looks weird? I've also got a followup question: what happens when we start calling more endpoints, or add more LLMs? Do we all handle that in this connector?
> if I want a streaming response, I should use test, and there are no other implications other than that it looks weird?

Yes, that's correct! You should use the `test` subaction.

> what happens when we start calling more endpoints, or add more LLMs

The Generative AI connector is currently basically a pass-through to the third-party APIs: the body that you pass to the connector executor is the body that's passed to the API, so it's up to the calling function to format the body correctly for the connector provider. If we want to add a new endpoint for an existing provider, the connector allows you to specify the API URL in the connector config. If we want to add more LLMs, we would need to update the connector to allow for a new provider.
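The pass-through behaviour described above can be sketched roughly as follows. This is an illustrative, self-contained approximation: `buildOpenAiBody`, `ConnectorExecuteParams`, and the parameter shape are assumptions for the sketch, not the actual Kibana API.

```typescript
// Sketch of the pass-through idea: the caller builds the provider-specific
// request body, and the connector forwards it unchanged to the configured
// API URL. These names are illustrative, not real Kibana types.
interface ConnectorExecuteParams {
  subAction: string;
  subActionParams: { body: string };
}

const buildOpenAiBody = (prompt: string): string =>
  JSON.stringify({
    model: 'gpt-3.5-turbo',
    messages: [{ role: 'user', content: prompt }],
  });

// The connector itself does not interpret this body; formatting it
// correctly for the provider is the caller's responsibility.
const params: ConnectorExecuteParams = {
  subAction: 'test',
  subActionParams: { body: buildOpenAiBody('Hello!') },
};
```

This is why adding a new provider means updating the connector itself, while a new endpoint for an existing provider only needs a different API URL in the connector config.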
x-pack/plugins/stack_connectors/server/connector_types/gen_ai/lib/utils.ts (outdated, resolved)
This looks great @ymao1, thank you very much :) just have a couple of questions
…nto gen-ai-streaming-response
A little concerned about us returning the response from the service directly, so posting a review early - just made it a bit of the way through before stumbling on the `pipeStreamingResponse` thread ...
```diff
@@ -49,8 +58,8 @@ export class GenAiConnector extends SubActionConnector<GenAiConfig, GenAiSecrets
     this.registerSubAction({
       name: SUB_ACTION.TEST,
-      method: 'runApi',
+      method: 'executeApi',
       schema: GenAiRunActionParamsSchema,
```
This would be a migration issue if these connectors were used in queued actions, but they aren't, so this seems fine. It would, though, be a problem for ZDT changes, which we don't have to worry about yet.
x-pack/plugins/stack_connectors/server/connector_types/gen_ai/lib/utils.ts (outdated, resolved)
```typescript
  });
  return response.data;
}

public async executeApi({
```
What do you think about naming this method more descriptively? `streamApi`? `execute` seems to be a synonym for `run`, and it is not immediately apparent what the difference is.
How about just `testApi` since the subaction is `test`? `streamApi` seems deceptive since you can request a streaming response or not.
If it would be less confusing, I can add a new subaction for this stream functionality and leave the test subaction the same as before
Updated in 09b0a32
appreciate that, thank you!
LGTM, left a question about the `dont-compress-this` header.
```typescript
export const pipeStreamingResponse = (response: AxiosResponse<IncomingMessage>) => {
  response.data.headers = {
    ['Content-Type']: 'dont-compress-this',
```
Are we leaving this in? It can't be copied to the eventual outgoing request, I assume - not sure if it's a special directive to the core http framework, or what. Also, not sure that these sorts of things can't be compressed, but there do tend to be issues with compression and streaming APIs generally.
IIRC, it will be compressed by hapi if it recognizes the content type. A bit of a hack, but it works. Maybe there's a less quirky string that we can use, though.
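The hack being discussed can be sketched in a self-contained way like this. Note this is an approximation for illustration, not the exact Kibana source: the real function types its argument with axios's `AxiosResponse<IncomingMessage>`, which is stubbed out here so the sketch stands alone.

```typescript
// Local stand-ins so the sketch compiles without axios/Node type imports.
type StreamLike = { headers: Record<string, string> };
type AxiosResponseLike<T> = { data: T };

// hapi compresses responses whose Content-Type it recognizes, which
// buffers and breaks a chunked streaming body. Overwriting the header
// with a value hapi does not recognize opts the stream out of compression.
const pipeStreamingResponse = (response: AxiosResponseLike<StreamLike>) => {
  response.data.headers = {
    ['Content-Type']: 'dont-compress-this',
  };
  return response.data;
};

const stream = pipeStreamingResponse({ data: { headers: {} } });
```

The magic string is only a signal to the server framework; it never needs to be meaningful to the downstream client, which is why an intentionally unrecognizable value works.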
Heh, cool. Let's add a comment in the code, as this will confuse someone in the future!
Added in ecec58c
Thanks!
💚 Build Succeeded
cc @ymao1
LGTM, thanks for the enhancement. It may be a good idea to also update this documentation with the new method: docs/management/connectors/action-types/gen-ai.asciidoc. Nice work!

I'll discuss this with @lcawl and create a separate PR for the docs if necessary. Thanks!
…sponse-stream`. (#162609)

## Summary

Follow up to #161676. Since we added support for handling SSE-streams returned by OpenAI APIs to `@kbn/ml-response-stream` in #162335, this updates the "Gen AI Streaming Response" developer example to make use of the custom hook `useFetchStream()` from `@kbn/ml-response-stream` instead of its inline code to process the stream on the client.

It also improves the refresh behaviour: previously, once a prompt was entered and the accordion opened, every keystroke in the prompt textarea would cause a new request to the AI endpoint. This update adds a `Refresh prompt` to only do a new request on this user action. Support for cancelling requests when unmounting the `StreamingResponse` component was also added.

![gen-ai-streaming-0001](https://github.com/elastic/kibana/assets/230104/af81d3ac-cc03-4550-9aa5-8685c15304cd)

### Checklist

- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Resolves #159598
Summary
This PR modifies the `test` subaction of the Generative AI connector to accept a `stream` parameter (default: `false`) that allows for a streaming response.

The Generative AI connector is basically a pass-through to the OpenAI/Azure OpenAI APIs, where the `stream` parameter is passed in via the body of the request. This means that with the existing connector, users could specify `stream: true` in the body, which would lead to unexpected results when the action is unprepared to return streaming results. This PR sanitizes the body that is passed in the `run` subaction to prevent the `stream` parameter from being set to `true`, and explicitly sets the `stream` parameter for the `test` subaction.

In order to test the streaming response, I created an example plugin that prompts users to create a Generative AI connector if one does not exist and then executes actions using the connector with `stream` set to `true`. This borrows liberally from @dgieselaar's existing work from #158678

Jul-12-2023.19-16-43.mp4
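The body-sanitization idea described in the summary might look roughly like the following. This is a hedged sketch: `sanitizeBody` and its signature are assumptions for illustration, not the actual implementation in the connector.

```typescript
// Force the `stream` flag in a caller-supplied JSON body, so the `run`
// subaction can never opt into streaming even if the caller sets
// `stream: true`, while `test` can set the flag explicitly.
const sanitizeBody = (body: string, stream: boolean): string => {
  const parsed = JSON.parse(body) as Record<string, unknown>;
  return JSON.stringify({ ...parsed, stream });
};

// `run` subaction: streaming is always disabled, whatever the caller sent.
const runBody = sanitizeBody('{"model":"gpt-3.5-turbo","stream":true}', false);
```

The point of forcing rather than merely validating the flag is that the connector, not the caller, decides whether the action framework is prepared to handle a streaming response.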
To Verify
Stream Response