ADR 0010 ODH/Caikit/TGIS integration #20
IMO the Lucidchart diagram should be included in this ADR itself.
It's linked in "other docs" and in the "references" section. Is there a way to include it more generally?
Co-authored-by: Anish Asthana <anishasthana1@gmail.com>
I'd recommend exporting the diagram to SVG, committing the SVG file alongside the ADR, and including the SVG in the ADR markdown file, as suggested by @anishasthana. Also, as Lucidchart is being retired at Red Hat, I'd recommend exporting the diagram in VSDX format as well, so it can be imported into other diagramming tools like draw.io.
| ---------------- | ------------------------------------------------------------------------------------------------------------------------------ |
| Date | 2023-Sept-13 |
| Scope | OpenDataHub and Caikit/TGIS integration architecture |
| Status | Accepted |
Why is this saying accepted already? :-)
## Non-Goals
## How
How will Caikit be deployed on a k8s cluster? Will it be additional pods/services that get deployed to the cluster?
Will it have its own controller/operator? Will it have its own CRDs that it manages? If so, what will they do?
How will it be integrated with the ODH Operator? Will it be a new component that the DSC will need to deploy?
Will it be able to function if a user has only deployed Ray or KServe and not both?
What is the relationship between a Caikit CR and the Ray/KServe objects? Will it be like a DSPA, where an instance of Caikit will need to be deployed in every Data Science Project?
Is there something that a user needs to do to make the Caikit SDK work with Ray/KServe, or will all of the compatibility be handled on the user's end (e.g. Elyra handles 100% of the translation from an "Elyra Pipeline" to a kfp-tekton compatible pipeline in the running notebook, so dsp never needs to "understand" Elyra)?
Will anything be required by the Ray or KServe stacks to get Caikit to function or will it slot in on top of them as they exist today?
Some sort of rough architecture diagram would probably be very helpful here.
## How
Users will have a few ways to interact with the software stack. Caikit will be used both as a backend software runtime, which is used by the Caikit SDK that users can code against to create their models. These models can be trained in Ray using the Caikit runtime stack as the training backend on the nodes. Caikit will also be integrated as a serving runtime under KServe. All of these components can be interacted with using the standard OpenShift APIs, creating CRs in OpenShift, etc. Additionally, Caikit will also expose an API that can run on the cluster, allowing for several convenience features such as moving a model between training and serving, as well as some tracking. These features will be implemented in the same manner, creating CRs and calling OpenShift APIs.
Caikit will be used both as
This sentence only lists one option. The second option probably got moved into a separate sentence while it was being edited so "both" no longer makes sense here.
## How
Users will have a few ways to interact with the software stack. Caikit will be used both as a backend software runtime, which is used by the Caikit SDK that users can code against to create their models. These models can be trained in Ray using the Caikit runtime stack as the training backend on the nodes. Caikit will also be integrated as a serving runtime under KServe. All of these components can be interacted with using the standard OpenShift APIs, creating CRs in OpenShift, etc. Additionally, Caikit will also expose an API that can run on the cluster, allowing for several convenience features such as moving a model between training and serving, as well as some tracking. These features will be implemented in the same manner, creating CRs and calling OpenShift APIs.
creating CRs in OpenShift
What CRs? What will they do?
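For illustration only: the ADR text does not say which CRs are meant. Since it describes Caikit as a serving runtime under KServe, one plausible candidate is a KServe `InferenceService`. The sketch below builds such a manifest as a plain Python dict. The `apiVersion`, `kind`, and `spec.predictor.model` fields follow KServe's v1beta1 API; the runtime name, model format name, namespace, and storage URI are hypothetical placeholders, not values taken from this PR.

```python
import json


def build_inference_service(name, namespace, runtime, storage_uri, model_format="caikit"):
    """Return a KServe v1beta1 InferenceService manifest as a plain dict.

    All argument values are supplied by the caller; nothing here is
    taken from the ADR itself.
    """
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictor": {
                "model": {
                    # Model format that the serving runtime must support.
                    "modelFormat": {"name": model_format},
                    # Name of the ServingRuntime expected to exist in the namespace.
                    "runtime": runtime,
                    # Location of the model artifacts.
                    "storageUri": storage_uri,
                }
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical values, chosen only to make the example concrete.
    manifest = build_inference_service(
        name="example-model",
        namespace="example-project",
        runtime="caikit-tgis-runtime",
        storage_uri="s3://example-bucket/models/example-model",
    )
    print(json.dumps(manifest, indent=2))
```

A manifest like this could be applied with `oc apply -f` after dumping it to a file. Whether the integration also introduces Caikit-specific CRDs is exactly what the question above asks, and this sketch does not answer that.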
## What
This ADR describes the architecture of the joint IBM-RedHat integration of ODH and Caikit/TGIS into the AI stack.
Are we good mentioning company ascription here?
It doesn't seem relevant for ODH.
This PR was closed because it has been stale for 21+7 days with no activity.
Overview of the architecture and diagram of the ODH+Caikit+TGIS architecture
Description
How Has This Been Tested?
Merge criteria: