
Incorporate runtime into model configuration #285

Merged 17 commits into main on Jan 11, 2024

Conversation

@kthui (Contributor) commented Nov 3, 2023

Related PRs:

Previously, the runtime for each backend was determined automatically when the backend was loaded. This PR adds the ability for users to choose which runtime to use for each model via the model configuration. For example:

runtime: "libtriton_tensorrt.so"
runtime: "model.py"
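For context, a hedged sketch of how the new field might appear inside a complete config.pbtxt; the model name and other field values below are illustrative, and only "runtime" is the field this PR adds:

```
name: "my_pytorch_model"              # illustrative model name
backend: "pytorch"
default_model_filename: "model.pt"
# New in this PR: explicitly select the runtime used to load the model
runtime: "libtriton_pytorch.so"
```

If "runtime" is omitted, autocomplete fills it in as described below.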

To retain backward compatibility, the logic previously used to determine the runtime is incorporated into autocomplete, which will attempt to fill in the "runtime" field if it is left empty. The core will load exactly the runtime specified in the "runtime" field of the model configuration, after any applicable autocomplete.

As part of the change, the autocomplete logic will fill in the Python-backend-based runtime for backends that are clearly Python-backend based, for example, the vLLM backend.

If a backend provides both a C++ and a Python-backend-based runtime, for example, the PyTorch backend, autocomplete will look at the default model filename in the model configuration to try to determine which of the two is more appropriate. For example, if the default model filename for the PyTorch backend is "model.pt", the C++ runtime will be selected; if the default model filename is "model.py", the Python-backend-based runtime will be selected. In any case, autocomplete will not alter the runtime if it is explicitly provided by the user.
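The selection logic above can be sketched as follows. This is a hypothetical Python illustration, not the actual implementation (which lives in Triton core's C++ sources); the function and parameter names are invented for clarity:

```python
PYTHON_RUNTIME = "model.py"  # Python-backend-based runtime, per the PR description


def autocomplete_runtime(config, backend_lib_exists):
    """Fill in the 'runtime' field if empty, mimicking the legacy selection.

    config            -- dict standing in for the model configuration
    backend_lib_exists -- callable reporting whether a C++ runtime library
                          is available for the backend (an assumption here)
    """
    runtime = config.get("runtime", "")
    if runtime:
        # A user-specified runtime is never altered by autocomplete.
        return runtime
    backend = config.get("backend", "")
    filename = config.get("default_model_filename", "")
    if filename.endswith(".py"):
        # e.g. PyTorch backend with default_model_filename "model.py"
        return PYTHON_RUNTIME
    cpp_lib = f"libtriton_{backend}.so"
    if backend_lib_exists(cpp_lib):
        # e.g. PyTorch backend with default_model_filename "model.pt"
        return cpp_lib
    # Clearly Python-backend-based backends (e.g. vLLM) fall through here.
    return PYTHON_RUNTIME
```

For instance, a PyTorch model with default filename "model.pt" would resolve to "libtriton_pytorch.so", while "model.py" would resolve to the Python-backend-based runtime.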

@kthui kthui requested a review from nnshah1 November 14, 2023 22:30
src/constants.h
constexpr char kPyTorchBackend[] = "pytorch";

constexpr char kPythonFilename[] = "model.py";
constexpr char kPythonBackend[] = "python";

constexpr char kVLLMBackend[] = "vllm";
A reviewer (Contributor) commented:

Do we need these for any new python based backend? Would be good to avoid that if possible - we shouldn't need to update the constants if someone adds a backend in the future

@kthui (Contributor, Author) replied:

We do not need them. constexpr char kVLLMBackend[] = "vllm"; is removed, and such backends are handled as custom backends.

@kthui force-pushed the jacky-python-based-pytorch branch 2 times, most recently from df1f9ff to 0d89719 on November 28, 2023.
@rmccorm4 (Contributor) previously approved these changes on Jan 10, 2024, commenting:
Very clean implementation 🚀

Please make sure to run a relatively full pipeline that tests some custom backends, L0_backend_python, PYTORCH:TEST, etc. before merging, not just the standard L0.

Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
@kthui kthui requested a review from rmccorm4 January 10, 2024 23:57
@kthui kthui merged commit 7363cfe into main Jan 11, 2024
1 check passed
@kthui kthui deleted the jacky-python-based-pytorch branch January 11, 2024 17:11