This project was built for the Google Cloud x MLB™ Hackathon – Building with Gemini Models.
This project may also serve as my backup plan as a CS grad if the job market goes south. Inspired by the AI VTuber Neuro-sama.
Melby-sama (MLB... Melby... get it?) is an AI MLB VTuber and streamer powered by Google's multimodal AI. And now, the segue to our sponsor: the Google Cloud x MLB Hackathon, powered by Gemini!
| Video Demo | Melby-sama chilling |
|---|---|
| Melby-sama reacts to YouTube live chat | Live Stream Demo |
You can install Poetry by following the official installation guide, or with pip:
pip install poetry
poetry config virtualenvs.in-project true # to create the virtual environment in the project directory
Create a `.env` file at the root of the project with the following variables:
GEMINI_API_KEY=
SPEECH_KEY=
SPEECH_REGION=
You can get your own `GEMINI_API_KEY` at Google AI Studio.
You can get the `SPEECH_KEY` and `SPEECH_REGION` by following the steps below:
- Sign up for an Azure free account at https://azure.microsoft.com/free/cognitive-services.
- Create a Speech Services resource in the Azure Portal.
- Get the `SPEECH_KEY` and `SPEECH_REGION` from the resource.
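
Before launching, it can help to confirm that the `.env` file actually contains all three variables. The following is a minimal sketch using only the Python standard library. It is not the project's own loading code (which may well use a dotenv library), and the helper names `parse_env_file` and `missing_keys` are hypothetical.

```python
from pathlib import Path

# The three variables this README asks for.
REQUIRED_KEYS = ("GEMINI_API_KEY", "SPEECH_KEY", "SPEECH_REGION")


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or have an empty value."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")  # assumes the script is run from the project root
    if not env_path.exists():
        print("No .env file found at the project root.")
    else:
        missing = missing_keys(parse_env_file(env_path))
        print("Missing or empty:", missing if missing else "none")
```

Run from the project root, this prints which of the three variables still need a value. It only handles plain `KEY=VALUE` lines, not the full dotenv syntax.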
At the root of the project, run:
poetry run python src/main.py
- Download and launch VTube Studio.
- Optional (advanced): Route the model's output audio into the microphone input of VTube Studio via Voicemeeter Banana and VB Cable.
- Add VTube Studio as a Game Capture source.
- Add `src/temp/subtitles.txt` as a Text source.
- If you previously set up Voicemeeter and VB Cable in Step 4.2, you'll need to configure it here too.
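
Because the Text source reads `src/temp/subtitles.txt` from disk, whatever writes that file should avoid leaving it half-written while it is being read. The sketch below shows one common way to do that: write to a temporary file, then swap it into place. It is an illustration only, not the project's actual subtitle code, and `write_subtitle` is a hypothetical helper name. On Windows the swap can fail if another program holds the file open, so a retry or a plain overwrite may be needed there.

```python
import os
import tempfile
from pathlib import Path


def write_subtitle(text, path="src/temp/subtitles.txt"):
    """Replace the subtitle file's contents in a single swap."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file in the same directory, then swap it in,
    # so a reader does not see a partially written file.
    fd, tmp_name = tempfile.mkstemp(dir=str(target.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return target


if __name__ == "__main__":
    write_subtitle("Welcome to the stream!")
```

Each call overwrites the previous subtitle, so the Text source always shows only the latest line.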