Could you please add NUMA support to the application?
Essentially, this is just the "--numa" option from llama.cpp.
This allows models to run on multiprocessor (Linux-based) systems with a significant speedup.
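For reference, llama.cpp exposes this as a plain runtime flag. A minimal sketch of an invocation might look like the following; the model path and thread count are placeholders, not values from the request:

```sh
# Hypothetical example: run llama.cpp with NUMA optimizations enabled
# (model path and thread count are illustrative placeholders)
./main -m ./models/model.gguf -t 32 --numa
```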
With the release of DeepSeek, those of us running on CPU with EPYC chips would see a noticeable performance benefit from this. I've been testing locally by launching koboldcpp under numactl to force NUMA-aware scheduling (discussed in the llama.cpp repo here: ggerganov#1437); a sketch of that workaround follows below.
Having this feature available as a flag in koboldcpp, similar to llama.cpp, would be quite helpful.
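As a stopgap, wrapping the launch in numactl works today. The memory policy and koboldcpp arguments below are illustrative assumptions, not the exact command from my tests:

```sh
# Assumed workaround: interleave memory allocations across all NUMA nodes
# so no single node's memory controller becomes the bottleneck
# (model path and thread count are illustrative)
numactl --interleave=all python koboldcpp.py --model ./models/model.gguf --threads 32
```

A built-in flag would still be preferable, since the runtime can make finer-grained decisions (e.g. pinning threads near the memory they touch) than a blanket interleave policy applied from outside.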