
Spotty: Voice Interface for Boston Dynamics Spot

A natural language interface for the Spot robot with vision, navigation, and contextual awareness.

Spotty Logo

Video Demo

Spotty Demo Video

More videos are available on Google Drive:

  • Visual Question Answering
  • Navigating to kitchen
  • Navigating to KUKA robot
  • Navigating to trash can

Features

Voice Control

  • Wake word activation ("Hey Spot")
  • Speech-to-text via OpenAI Whisper
  • Text-to-speech responses
  • Conversation memory
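
A rough sketch of how such a voice loop can be wired up with pvporcupine (wake word), the OpenAI Whisper API (speech-to-text), and OpenAI TTS; the file names, wake-word model path, and recording logic below are placeholders rather than the project's actual code:

# Sketch of the voice loop: wake word -> record -> Whisper STT -> TTS reply.
# Paths and parameters are illustrative only.
import os
import struct
import wave

import pvporcupine
import pyaudio
from openai import OpenAI

client = OpenAI()  # uses OPENAI_API_KEY
porcupine = pvporcupine.create(
    access_key=os.environ["PICOVOICE_ACCESS_KEY"],
    keyword_paths=["assets/hey_spot.ppn"],  # placeholder wake-word model
)
pa = pyaudio.PyAudio()
mic = pa.open(rate=porcupine.sample_rate, channels=1, format=pyaudio.paInt16,
              input=True, frames_per_buffer=porcupine.frame_length)

def record_utterance(path, seconds=4):
    # Grab a few seconds of audio after the wake word fires.
    n_frames = int(seconds * porcupine.sample_rate / porcupine.frame_length)
    frames = [mic.read(porcupine.frame_length) for _ in range(n_frames)]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(porcupine.sample_rate)
        wav.writeframes(b"".join(frames))

while True:
    pcm = struct.unpack_from("h" * porcupine.frame_length, mic.read(porcupine.frame_length))
    if porcupine.process(pcm) >= 0:  # "Hey Spot" detected
        record_utterance("utterance.wav")
        text = client.audio.transcriptions.create(
            model="whisper-1", file=open("utterance.wav", "rb")).text
        reply = client.audio.speech.create(model="tts-1", voice="alloy",
                                           input=f"You said: {text}")
        reply.write_to_file("reply.mp3")  # play back through the robot's speaker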

Navigation

  • GraphNav integration with waypoint labeling
  • Location-based commands ("Go to kitchen")
  • Object search and navigation
  • Automatic scene understanding via GPT-4o-mini + CLIP
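
The automatic labeling step can be thought of as zero-shot classification of a waypoint snapshot against a list of room prompts with CLIP. A minimal sketch using the Hugging Face openai/clip-vit-base-patch32 checkpoint (snapshot path and prompts are illustrative; the project's own logic lives in scripts/label_waypoints.py):

# Sketch: pick the most likely room label for a waypoint image with CLIP (zero-shot).
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

prompts = ["a photo of a kitchen", "a photo of a hallway", "a photo of an office"]
image = Image.open("waypoint_snapshot.jpg")  # illustrative file name

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=1)

label = prompts[probs.argmax().item()]
print(f"Waypoint label: {label} ({probs.max().item():.2f})")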

Vision

  • Scene description and visual Q&A
  • Object detection and environment mapping
  • Multimodal RAG system for location context
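
The location-context RAG boils down to embedding per-waypoint scene descriptions and retrieving the closest match for a spoken query. A minimal sketch with OpenAI embeddings and FAISS (the descriptions and embedding model name are placeholders, not the project's database schema):

# Sketch: index waypoint descriptions and retrieve the best match for a query.
import faiss
import numpy as np
from openai import OpenAI

client = OpenAI()

descriptions = [
    "kitchen with a coffee machine and a sink",
    "hallway with a fire extinguisher",
    "office with a KUKA robot arm",
]

def embed(texts):
    # Unit-normalized embeddings so inner product equals cosine similarity.
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    vecs = np.array([d.embedding for d in resp.data], dtype="float32")
    faiss.normalize_L2(vecs)
    return vecs

vectors = embed(descriptions)
index = faiss.IndexFlatIP(vectors.shape[1])
index.add(vectors)

query = embed(["where can I get coffee?"])
scores, ids = index.search(query, k=1)
print(descriptions[ids[0][0]], scores[0][0])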

Architecture

  • UnifiedSpotInterface: Main orchestrator
  • GraphNav Interface: Map recording and navigation
  • Audio Interface: Wake word detection, speech processing
  • RAG Annotation: Location/object knowledge base
  • Vision System: Camera processing and interpretation

Uses OpenAI GPT-4o-mini, Whisper, and TTS, together with CLIP and a FAISS vector database.
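
How these pieces might fit together, shown as a purely hypothetical skeleton (class and method names below are illustrative, not the actual UnifiedSpotInterface API):

# Hypothetical skeleton of the orchestrator; names are illustrative only.
class UnifiedSpotInterface:
    def __init__(self, audio, graph_nav, vision, rag):
        self.audio = audio          # wake word + STT/TTS
        self.graph_nav = graph_nav  # GraphNav map recording and navigation
        self.vision = vision        # camera capture and scene interpretation
        self.rag = rag              # location/object knowledge base (FAISS)

    def run(self):
        while True:
            command = self.audio.listen()  # blocks until "Hey Spot" + utterance
            if "go to" in command.lower():
                waypoint = self.rag.resolve(command)  # phrase -> labeled waypoint
                self.graph_nav.navigate_to(waypoint)
            elif "see" in command.lower():
                self.audio.say(self.vision.describe_scene())
            else:
                self.audio.say("Sorry, I did not catch that.")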

Setup

Prerequisites

  • Boston Dynamics Spot robot
  • Python 3.8+
  • Boston Dynamics SDK
  • OpenAI and Picovoice API keys

Installation

git clone https://github.com/vocdex/SpottyAI.git
cd SpottyAI
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
pip install -r requirements-optional.txt
pip install -e .

export OPENAI_API_KEY="your_key"
export PICOVOICE_ACCESS_KEY="your_key"

Map Setup

  1. Record environment map:

    python scripts/recording_command_line.py --download-filepath /path/to/map ROBOT_IP
  2. Auto-label waypoints:

    python scripts/label_waypoints.py --map-path /path/to/map --label-method clip --prompts kitchen hallway office
  3. Create RAG database:

    python scripts/label_with_rag.py --map-path /path/to/map --vector-db-path /path/to/database --maybe-label
  4. Visualize setup:

    python scripts/visualize_map.py --map-path /path/to/map --rag-path /path/to/database

Run

python main_interface.py

Usage

Voice Commands:

  • "Hey Spot" → activate
  • "Go to the kitchen" → navigate to location
  • "Find a chair(a mug, a plant, etc.)" → search and navigate to object
  • "What do you see?" → describe surroundings
  • "Stand up" / "Sit down" → basic robot control

Configuration

  • spotty/audio/system_prompts.py - Assistant personality
  • spotty/vision/vision_assistant.py - Vision settings
  • spotty/utils/robot_utils.py - Robot connection

Custom wake words: create one at https://console.picovoice.ai/ and update KEYWORD_PATH in spotty/__init__.py
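
In spotty/__init__.py, that constant then points at the downloaded .ppn file, for example (path is illustrative):

# spotty/__init__.py (illustrative path; point it at your downloaded .ppn file)
KEYWORD_PATH = "assets/hey_spot_custom.ppn"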

Project Structure

spotty/
├── assets/             # Maps, databases, wake words
├── scripts/            # Setup utilities
├── spotty/
│   ├── annotation/     # Waypoint tools
│   ├── audio/          # Speech processing
│   ├── mapping/        # Navigation
│   ├── vision/         # Computer vision
│   └── utils/          # Shared utilities
└── main_interface.py   # Entry point

Pre-recorded assets (maps, voice activation model, RAG database): download from Google Drive

License

MIT License

Disclaimer

This project is an open-source implementation of a Boston Dynamics demo and is not affiliated with Boston Dynamics. It is intended for educational and research purposes only. Use at your own risk.

Acknowledgements

  • Big thanks to the Automatic Control Chair at FAU for providing the robot for my semester project
  • Many thanks to Boston Dynamics’ engineers for their work on the Spot SDK
  • HuggingFace, OpenAI, Facebook Research

