Gemini Engineer

Description

Inspired by:

An AI assistant built on Gemini 2.0 Flash that supports real-time audio and text interaction.

Tested on WSL2 Ubuntu 20.04

Important: use headphones.

This script uses the system default audio input and output, which often don't include echo cancellation. Without headphones, the microphone can pick up the model's own voice from the speakers and the model ends up interrupting itself.

Web Interface

The mode option works as follows:

TEXT: press the mic button and start talking; Gemini answers in text.
AUDIO: press the mic button and start talking; Gemini answers by voice.
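
As a rough illustration of how the two modes could translate into a session setting, here is a minimal sketch. It is not taken from this repository: the `Mode` enum and `build_live_config` helper are hypothetical names, and the `response_modalities` key is an assumption about how the Gemini live session config selects the response type. Check the actual code in src/ for what the project does.

```python
from enum import Enum


class Mode(str, Enum):
    """The two response modes offered by the web interface."""
    TEXT = "TEXT"
    AUDIO = "AUDIO"


def build_live_config(mode: str) -> dict:
    """Map a UI mode string to a session config dict (hypothetical helper).

    The "response_modalities" key is an assumption about the live session
    config; this function only builds a plain dict and calls no API.
    """
    try:
        selected = Mode(mode.strip().upper())
    except ValueError:
        raise ValueError(f"Unknown mode {mode!r}; expected TEXT or AUDIO") from None
    return {"response_modalities": [selected.value]}


if __name__ == "__main__":
    print(build_live_config("text"))
    print(build_live_config("AUDIO"))
```

In both modes the input is the same (speech from the mic); only the requested response type changes, so the mode can be reduced to a single config value.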

Installation

sudo apt install libasound2-plugins

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and setup
git clone https://github.com/mett29/gemini-engineer.git
cd gemini-engineer
uv sync
source .venv/bin/activate

# Run web interface
python main.py

You can also run the CLI directly:

python src/gemini_engineer.py
