This repository has been archived by the owner on Jul 15, 2024. It is now read-only.
Is your feature request related to a problem? Please describe.
With the new modelcar feature of KServe, it is possible to access model data directly from within an OCI image without downloading or copying it. However, since the model is injected as a sidecar, the startup order of the containers is non-deterministic.
Since the modelcar container creates the symbolic link /mnt/models to point to the model stored within that image, the path /mnt/models might not exist when the caikit transformer container starts. This will trigger the line at caikit/caikit/runtime/model_management/model_manager.py, line 105 in 4b42d37.
Describe the solution you'd like
When lazy loading is enabled via lazy_load_local_models, the runtime should wait a certain amount of time for the path (/mnt/models) to come up before giving up. Choosing that wait time can be tricky: if the model image still needs to be loaded into the node's OCI runtime, it is pulled from the registry, which might take quite some time. But if it is already loaded, the path is delayed by only a matter of seconds.
Additional context
More about the modelcar approach can be found here:
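To make the request under "Describe the solution you'd like" more concrete, here is a minimal sketch of the kind of wait that is meant. It is not caikit code: the function name wait_for_path and the parameters timeout and poll_interval are hypothetical, and how they would map to runtime configuration is left open. The sketch simply polls for the path until a deadline passes.

```python
import os
import time


def wait_for_path(path, timeout=60.0, poll_interval=0.5):
    """Poll until `path` exists or `timeout` seconds have elapsed.

    Hypothetical helper for illustration only; not part of caikit.
    Returns True if the path appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        # os.path.exists follows symlinks, so a link whose target is not
        # there yet still counts as missing.
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A caller could use it as `if not wait_for_path("/mnt/models", timeout=300): ...` and only report the model directory as missing once the wait has expired. A generous timeout covers the case where the image still has to be pulled, while the short poll interval keeps the delay small when the image is already on the node.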