Replies: 1 comment
-
It sounds like you might be hitting your GPU compute/VRAM limits. Unfortunately, there isn't an easy fix for this other than getting a bigger GPU or using a smaller model. I think the Neuro STT and TTS models take up roughly 5-6 GB of VRAM, so try to find an LLM that fits in whatever you have left.
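For sizing that budget, one way is to query free VRAM directly before picking an LLM. A minimal sketch using PyTorch (assuming a single CUDA GPU and that `torch` is installed; `nvidia-smi` gives the same numbers from the command line):

```python
import torch

# Free/total VRAM on the first CUDA device, in bytes.
free_bytes, total_bytes = torch.cuda.mem_get_info(0)
free_gb = free_bytes / 1024**3
total_gb = total_bytes / 1024**3

# Rough budget: per the comment above, assume ~6 GB goes to the STT/TTS
# models once loaded; whatever remains is what the LLM weights + KV cache
# have to fit into.
STT_TTS_RESERVE_GB = 6.0  # assumption, not a measured value

print(f"VRAM: {free_gb:.1f} GiB free of {total_gb:.1f} GiB total")
print(f"Estimated headroom for the LLM: ~{max(free_gb - STT_TTS_RESERVE_GB, 0):.1f} GiB "
      "(if STT/TTS are not yet loaded)")
```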
-
Hey, any advice on increasing the tokens per second when using this?
My typical rate is around 8 t/s when using the webui with TTS; however, when using Neuro I'm only getting around 1 t/s, often dropping to 0.2 t/s.
Any help / pointers would be appreciated!