This repository has been archived by the owner on Jul 15, 2024. It is now read-only.

[STORY] Enable Caikit Python API library #15

Closed
heyselbi opened this issue Oct 2, 2023 · 11 comments

@heyselbi

heyselbi commented Oct 2, 2023

Create a Caikit Python library so that Caikit can be accessed from a notebook. It would be a wrapper around grpcio/requests for the API, and it could be pip-installed in the notebook.

Task includes:

  • Develop an abstraction layer (a Python client API) to access the Caikit core APIs from a notebook. Could we do a REST interface with Swagger enablement? The KServe APIs already support Swagger.
  • Look into generating pb2 files at runtime. Is this an option? Is there an impact on inferencing performance?
  • Create Caikit Python API documentation focused on the data scientist persona.
  • Long term: can we make these APIs compatible with the OpenAI APIs and the Hugging Face TGI APIs?

Related issues:

@heyselbi heyselbi converted this from a draft issue Oct 2, 2023
@heyselbi heyselbi moved this from New/Backlog to To-do/Groomed in ODH Model Serving Planning Oct 2, 2023
@heyselbi
Author

heyselbi commented Oct 2, 2023

@danielezonca

danielezonca commented Oct 2, 2023

Note: the code of the example is still WIP so check with @guimou before doing code changes

@guimou
Member

guimou commented Oct 2, 2023 via email

@guimou
Member

guimou commented Oct 2, 2023

A quick note on "Look into generating pb2 files during runtime. Is this an option? Is there an impact on inferencing performance?".
The bad news is that the Python grpc-reflection package does not implement on-the-fly stub generation. It's built into Java and available in Go through a third-party package, but there is nothing for Python. You can only retrieve the proto files. Of course you could make a shell call to protoc, but that's really dirty, and you'd have to bundle protoc as well, for all architectures.
At the moment it's easier to keep the pb2 files...
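
One possible middle ground, assuming the grpcio-tools package is acceptable as a dependency: it ships protoc inside the wheel and exposes it as grpc_tools.protoc.main, so the stubs can be generated in-process without a shell call. The sketch below is only an illustration. The helper names (build_protoc_args, generate_stubs) are hypothetical, and it assumes the .proto files have already been written to a local directory.

```python
import glob
import os


def build_protoc_args(proto_dir, out_dir, include_dirs=()):
    """Build the argv list for grpc_tools.protoc.main (hypothetical helper)."""
    proto_files = sorted(glob.glob(os.path.join(proto_dir, "*.proto")))
    if not proto_files:
        raise FileNotFoundError(f"no .proto files found in {proto_dir}")
    # The first element plays the role of the program name, as in sys.argv.
    args = ["grpc_tools.protoc", f"-I{proto_dir}"]
    args += [f"-I{d}" for d in include_dirs]
    args += [f"--python_out={out_dir}", f"--grpc_python_out={out_dir}"]
    return args + proto_files


def generate_stubs(proto_dir, out_dir):
    """Generate *_pb2.py and *_pb2_grpc.py in-process; needs grpcio-tools."""
    # Imported lazily so the module still loads when grpcio-tools is absent.
    import grpc_tools
    from grpc_tools import protoc

    # grpcio-tools bundles the well-known protobuf types in this directory.
    well_known = os.path.join(os.path.dirname(grpc_tools.__file__), "_proto")
    os.makedirs(out_dir, exist_ok=True)
    exit_code = protoc.main(
        build_protoc_args(proto_dir, out_dir, include_dirs=[well_known])
    )
    if exit_code != 0:
        raise RuntimeError(f"protoc failed with exit code {exit_code}")
```

This still pays the generation cost once per process and adds grpcio-tools to the install footprint, so shipping pre-generated pb2 files may well remain the simpler option.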

@guimou
Member

guimou commented Oct 4, 2023

This is also an interesting avenue. A wrapper around different serving providers to allow direct use of OpenAI API: https://github.com/BerriAI/litellm

@krrishdholakia

Hey @guimou - I'm the co-maintainer of litellm. Happy to help out via PR. What's the problem you're hoping to solve here with litellm?

@z103cb

z103cb commented Oct 23, 2023

@Xaenalt and @heyselbi,
@vaibhavjainwiz and I have met and discussed how we think we should tackle this issue (I am also summarising what we discussed on Slack).

The library requirements:

  1. Creating a connection, specify HTTP or gRPC, optionally provide a cert to trust (required for gRPC with self-signed certs, there's more we can discuss about potentially fixing that upstream too)
  2. Calling TextGenerationTaskPredict and the streaming version on that http/grpc connection, with inputs like fn_name(text="sometext")
  3. Ensuring that we can pass other keyword args to it, like min_new_tokens, max_new_tokens, etc
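
To make these three requirements concrete, here is a rough sketch of what the connection settings and the keyword-argument handling could look like. Everything in it is an assumption made for illustration: the names ConnectionConfig and build_text_generation_payload are hypothetical, and the JSON layout (model_id, inputs, parameters) is a guess at a plausible HTTP request body, not a confirmed Caikit schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConnectionConfig:
    """Hypothetical connection settings covering requirement 1."""

    host: str
    port: int
    protocol: str = "grpc"  # "http" or "grpc"
    ca_cert_path: Optional[str] = None  # optional certificate to trust

    def __post_init__(self):
        if self.protocol not in ("http", "grpc"):
            raise ValueError(f"unsupported protocol: {self.protocol!r}")


def build_text_generation_payload(model_id, text, **generation_params):
    """Build a request body for the HTTP path (assumed JSON layout).

    Extra keyword arguments such as min_new_tokens or max_new_tokens are
    passed through untouched, which covers requirements 2 and 3.
    """
    if not text:
        raise ValueError("text must be a non-empty string")
    payload = {"model_id": model_id, "inputs": text}
    if generation_params:
        payload["parameters"] = dict(generation_params)
    return payload
```

For example, build_text_generation_payload("my-model", "sometext", max_new_tokens=20) would put max_new_tokens under "parameters" without the client having to know the full list of generation options in advance.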

The implementation:

  1. The library will be housed in a new repo under the Caikit organisation, tentatively named Caikit-nlp-client.
  2. In the library we will provide a mechanism to create static _pb2.py files from the .proto files. These will provide the serialisation mechanisms and the gRPC client (stub).
  3. The .proto files will be generated by executing:
     RUNTIME_LIBRARY=caikit_nlp python -m caikit.runtime.dump_services $grpc_interface_dir
  4. The _pb2.py files will be generated from the .proto files with the Python code generator (the following example command line might not be 100% accurate):
     python -m grpc_tools.protoc -I./grpc/ --python_out=. --pyi_out=. --grpc_python_out=. grpc/*.proto
  5. On top of the generated Python code we will write a client class to provide a simple and straightforward way to make the gRPC calls to the NLP service.
  6. For now we plan on using the generated _pb2.py DTOs (requests and responses) as the model for the HTTP client. I am not 100% sure that using those objects for the HTTP client would work (a cursory search on Google and Stack Overflow indicates that it is possible). I will need to prototype that to make sure it works.
  7. Provide some automated tests:

  • Spin up the Caikit NLP server on the test machine (no certificates) with at least two simple models running
  • Connect to the running server
  • Exercise the endpoints, focusing on error conditions (the unhappy path).
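
As a rough illustration of the client class from step 5, the sketch below wraps a generated stub behind a single generate() call. It is not the actual implementation. TextGenerationClient and make_channel are hypothetical names, the request class is whatever message type the generated _pb2.py module provides, and the "mm-model-id" metadata key used to select the model is an assumption. Only TextGenerationTaskPredict comes from the requirements above.

```python
def make_channel(target, ca_cert_path=None):
    """Open a gRPC channel, secured when a CA certificate is given."""
    import grpc  # imported lazily; requires the grpcio package

    if ca_cert_path is None:
        return grpc.insecure_channel(target)
    with open(ca_cert_path, "rb") as cert_file:
        credentials = grpc.ssl_channel_credentials(
            root_certificates=cert_file.read()
        )
    return grpc.secure_channel(target, credentials)


class TextGenerationClient:
    """Thin wrapper over a generated stub (hypothetical sketch)."""

    def __init__(self, stub, request_cls):
        # stub: instance of the generated *_pb2_grpc stub class.
        # request_cls: generated request message class from *_pb2.
        self._stub = stub
        self._request_cls = request_cls

    def generate(self, model_id, text, **kwargs):
        if not text:
            raise ValueError("text must be a non-empty string")
        # Extra keyword args (max_new_tokens, ...) go into the request.
        request = self._request_cls(text=text, **kwargs)
        # Assumption: the server selects the model from this metadata key.
        metadata = [("mm-model-id", model_id)]
        return self._stub.TextGenerationTaskPredict(request, metadata=metadata)
```

Because the stub is injected, the unhappy-path tests can start with a fake stub that raises or records its arguments, before the tests against a running Caikit NLP server are in place.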

@heyselbi heyselbi moved this from To-do/Groomed to In Progress in ODH Model Serving Planning Oct 24, 2023
@heyselbi heyselbi assigned z103cb and unassigned danielezonca Oct 24, 2023
@heyselbi heyselbi removed the rhods-2.5 label Nov 2, 2023
@vaibhavjainwiz
Member

Implement insecure HTTP client
https://github.com/vaibhavjainwiz/caikit-nlp-client/pull/31

@dtrifiro

dtrifiro commented Nov 6, 2023

Initial implementation (wip): opendatahub-io/caikit-nlp-client#1

@z103cb

z103cb commented Nov 7, 2023

@heyselbi I think we should still keep this open, but I will defer to your better judgement.

@z103cb z103cb reopened this Nov 7, 2023
@github-project-automation github-project-automation bot moved this from Done to New/Backlog in ODH Model Serving Planning Nov 7, 2023
@heyselbi heyselbi moved this from New/Backlog to In Progress in ODH Model Serving Planning Nov 7, 2023
@heyselbi heyselbi changed the title Enable Caikit Python API library [STORY] Enable Caikit Python API library Nov 16, 2023
@dtrifiro

The first version (0.0.2) was released on PyPI: https://pypi.org/project/caikit-nlp-client/. See https://github.com/opendatahub-io/caikit-nlp-client/releases for releases.

@github-project-automation github-project-automation bot moved this from In Progress to Done in ODH Model Serving Planning Nov 17, 2023