This project uses the awesome whisper.cpp project to transcribe your microphone input and send it as a chatbox message to VRChat. It is a quick adaptation of the stream example from whisper.cpp and currently uses tinyosc to build the OSC messages.
Still a work in progress, but it already kinda works on Linux.
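
To give a rough idea of what "building the OSC messages" involves, here is a minimal sketch in Python using only the standard library. It is not the code this project uses (that goes through tinyosc); the function names are made up for illustration, and the `/chatbox/input` address, the "send immediately" boolean and UDP port 9000 are assumptions about VRChat's OSC input that you should check against the VRChat documentation.

```python
import socket


def osc_pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC strings require."""
    padded_len = (len(data) // 4 + 1) * 4
    return data.ljust(padded_len, b"\0")


def build_chatbox_message(text: str, send_now: bool = True) -> bytes:
    """Encode a message for the (assumed) VRChat address /chatbox/input."""
    address = osc_pad(b"/chatbox/input")
    # Type tags: one string argument, then a boolean.
    # OSC booleans (T/F) live in the type tag only and carry no payload bytes.
    type_tags = osc_pad(b",s" + (b"T" if send_now else b"F"))
    return address + type_tags + osc_pad(text.encode("utf-8"))


def send_chatbox(text: str, host: str = "127.0.0.1", port: int = 9000) -> None:
    """Send the message over UDP; host and port are assumed defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_chatbox_message(text), (host, port))
```

Calling `send_chatbox("hello from whisper")` while VRChat is running with OSC enabled should then show the text in the chatbox, if the assumptions above hold.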
- Easy to use and set up on Windows
- Easy to compile using Zig as a build system
- Download the appropriate models from whisper.cpp and copy them to your `$PWD/models`.
- Install Zig and build it using `zig build -Drelease-fast=true` (tested with `zig 0.9.1`).
- Run it, e.g. using `zig build run -Drelease-fast=true -- -m ./models/ggml-tiny.en.bin -t 10 --step 1100 --length 5000`.
- Get cross-compiling to Windows to work
- More post-processing of chat output, since VRChat throttles chatbox messages
- Filter out repetition and non-voice tokens, and simply throttle the output somehow
- Transcriptions can also disappear too quickly.
- Port it to Zig and clean it up
- Consider moving from SDL to miniaudio or something else
- GUI?
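
The post-processing items above are not implemented yet. As a rough sketch of one possible approach (in Python for brevity, not taken from this project), the filter below strips bracketed annotations, skips exact repeats and enforces a minimum interval between messages. The class name, the 1.5 second interval and the idea that non-voice output looks like `[BLANK_AUDIO]` or `(music)` are all assumptions for illustration.

```python
import re
import time


class ChatboxThrottle:
    """Hypothetical filter: cleans transcriptions and rate-limits what gets sent."""

    def __init__(self, min_interval=1.5, clock=time.monotonic):
        self.min_interval = min_interval  # seconds between messages (assumed value)
        self.clock = clock                # injectable so the logic can be tested
        self.last_sent_at = None
        self.last_text = ""

    @staticmethod
    def clean(text):
        # Drop bracketed or parenthesized annotations such as "[BLANK_AUDIO]".
        text = re.sub(r"\[[^\]]*\]|\([^)]*\)", "", text)
        # Collapse the leftover whitespace.
        return " ".join(text.split())

    def filter(self, text):
        """Return the cleaned text if it should be sent now, otherwise None."""
        cleaned = self.clean(text)
        if not cleaned or cleaned == self.last_text:
            return None  # nothing left to say, or an exact repeat
        now = self.clock()
        if self.last_sent_at is not None and now - self.last_sent_at < self.min_interval:
            return None  # too soon after the previous message
        self.last_sent_at = now
        self.last_text = cleaned
        return cleaned
```

Note that this sketch simply drops throttled text instead of queueing it, so a real version would probably want to keep the latest pending transcription and send it once the interval has passed.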
Whispering Tiger looks really cool, but I haven't tried it yet. I hope this project can be a bit more lightweight, though it probably won't be as accurate or fast, since it's CPU-only and has some other limitations.