Could you please add NUMA support to the application?
Essentially, this is just the "--numa" option from llama.cpp.
This allows models to run on multiprocessor (Linux-based) systems with a significant speedup.
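For reference, llama.cpp exposes this as a plain runtime flag. A minimal sketch of an invocation might look like the following; the model path and thread count are placeholders, not values from the request:

```sh
# Hypothetical example: run llama.cpp with NUMA optimizations enabled
# (model path and thread count are illustrative placeholders)
./main -m ./models/model.gguf -t 32 --numa
```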
With the release of DeepSeek, those of us running on CPU with EPYC chips would see a noticeable performance benefit from this. I've been testing locally by launching koboldcpp under numactl to force NUMA-aware scheduling (discussed in the llama.cpp repo here: ggerganov#1437); a sketch of that workaround follows below.
Having this feature available as a flag in koboldcpp, similar to llama.cpp, would be quite helpful.
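As a stopgap, wrapping the launch in numactl works today. The memory policy and koboldcpp arguments below are illustrative assumptions, not the exact command from my tests:

```sh
# Assumed workaround: interleave memory allocations across all NUMA nodes
# so no single node's memory controller becomes the bottleneck
# (model path and thread count are illustrative)
numactl --interleave=all python koboldcpp.py --model ./models/model.gguf --threads 32
```

A built-in flag would still be preferable, since the runtime can make finer-grained decisions (e.g. pinning threads near the memory they touch) than a blanket interleave policy applied from outside.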