Replies: 3 comments 5 replies
-
Also, with the way llama.cpp is moving, it keeps accumulating other features: CLIP and other embeddings, TTS with an audio encoder/decoder... it's just a matter of time before VAEs and diffusion sampling become desired features of llama.cpp.
-
Reimplementing from scratch should definitely be avoided, in my opinion. I think koboldcpp is well positioned here.
-
The new Qwen-Image will use Qwen2.5-VL as its text/image encoder. The future seems to be moving toward vision-language models, and more models will natively support instruction-based image editing. We should definitely rely on llama.cpp to keep up with this trend.
-
With models like Lumina 2.0 and HiDream I1, the future of diffusion models seems to be using autoregressive LLMs (GPTs) as text encoders, for example Google's Gemma2 for Lumina and Meta's Llama3 for HiDream. These models are already very well supported in llama.cpp, so I'm wondering what the best way to support them would be.
Should llama.cpp be included as a submodule? (This could maybe help T5 run better on GPU too.) Or should sdcpp re-implement these models from scratch?
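For concreteness, here is a rough sketch of what the submodule route could look like from the sdcpp side: run the prompt through a GGUF language model via llama.cpp's C API and take the per-token hidden states as conditioning, in place of the T5/CLIP encoder output. The function name `encode_prompt` and the model path are made up for illustration, and llama.cpp's API drifts between releases, so treat the exact signatures as assumptions to be checked against the current `llama.h`.

```cpp
// Hypothetical sdcpp-side glue (not actual sdcpp code): turn a prompt into
// per-token hidden states using llama.cpp. API names follow llama.h as of
// recent releases; check the current header, since signatures change often.
#include "llama.h"

#include <string>
#include <vector>

std::vector<float> encode_prompt(const char * model_path, const std::string & prompt) {
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_load_model_from_file(model_path, mparams);

    llama_context_params cparams = llama_context_default_params();
    cparams.n_ctx        = 2048;                    // enough for typical prompts
    cparams.embeddings   = true;                    // we want hidden states, not logits
    cparams.pooling_type = LLAMA_POOLING_TYPE_NONE; // one vector per token, T5-style
    llama_context * ctx = llama_new_context_with_model(model, cparams);

    // tokenize; buffer sized to the worst case of one token per byte plus specials
    std::vector<llama_token> tokens(prompt.size() + 8);
    const int n_tok = llama_tokenize(model, prompt.c_str(), (int) prompt.size(),
                                     tokens.data(), (int) tokens.size(),
                                     /*add_special=*/true, /*parse_special=*/true);
    tokens.resize(n_tok);

    // single-sequence batch, with embeddings requested for every token
    llama_batch batch = llama_batch_init(n_tok, 0, 1);
    for (int i = 0; i < n_tok; ++i) {
        batch.token[i]     = tokens[i];
        batch.pos[i]       = i;
        batch.n_seq_id[i]  = 1;
        batch.seq_id[i][0] = 0;
        batch.logits[i]    = true;
    }
    batch.n_tokens = n_tok;
    llama_decode(ctx, batch);

    // [n_tok, n_embd] row-major; this is what the diffusion model would consume
    const int n_embd = llama_n_embd(model);
    const float * embd = llama_get_embeddings(ctx);
    std::vector<float> out(embd, embd + (size_t) n_tok * n_embd);

    llama_batch_free(batch);
    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return out;
}
```

The appeal of this route is that quantization, GPU offload, and support for new encoder architectures (Gemma2, Llama3, eventually Qwen2.5-VL) would come from upstream for free; the cost is tracking a fast-moving API, which is the same trade-off koboldcpp already manages.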