Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
Per the issue I ran into (#15804), which directed me to PR #14236, I suggest we add `--mmproj-device` as an argument, to be consistent with how the rest of llama.cpp is configured. This should be available in both `llama-cli` and `llama-server`.
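To illustrate, the flag would let the projector's device selection live alongside the other per-run arguments instead of an env var. A sketch of the intended usage (the `--mmproj-device` flag is the proposal itself, not yet implemented; model paths and the `Vulkan1` device name are examples from my setup):

```shell
# Today: device selection for the multimodal projector relies on an env var
MTMD_BACKEND_DEVICE=Vulkan1 llama-server -m model.gguf --mmproj mmproj.gguf

# Proposed: a first-class argument, consistent with existing flags like --device
llama-server -m model.gguf --mmproj mmproj.gguf --mmproj-device Vulkan1
```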
Motivation
Users may want to specify different devices for different configs, and managing an env var is not ideal, nor is it consistent with how similar features are already set up within llama.cpp. In my setup, my more powerful, newer GPU is recognized as `Vulkan1` and an older one as `Vulkan0`.
I should add that I'm actually having trouble getting the existing environment variable `MTMD_BACKEND_DEVICE` to work. I try setting it to `Vulkan1` or `1`, but llama-server does nothing with it and just uses the default `Vulkan0`. As a result, vision inference tk/s performance takes a dive, with `Vulkan0` being the bottleneck.
Possible Implementation
No response