
Gemini + LLMs for Robots (Tiago Pal)

YouTube Video

Problem Statement

Controlling robots through natural language instructions is a complex task that requires integrating advanced AI models with robotic systems. This project aims to simplify robot control by leveraging Gemini and LLaVA models to interpret and execute natural language commands, making robotic interaction more intuitive and accessible.

Requirements

  1. Python 3.8
  2. ROS Noetic
  3. Tiago Pal robot
  4. Simulation environment (Gazebo)
  5. Flask
  6. Google Cloud credentials for Vertex AI
  7. Ollama and LLaVA model

Architecture

Architecture Diagram

  • User Interface: A web-based interface built with Flask to input commands.

  • LLM Models: Integration of Gemini, Ollama, and LLaVA for generating and interpreting commands.

  • Robot Control: ROS-based control of the Tiago Pal robot, including movement, arm manipulation, and sensory feedback (a minimal end-to-end sketch follows this list).
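
A rough sketch of how these components could connect: a single Flask endpoint hands the text command to an LLM and publishes the resulting velocity to the robot's base controller. The /command route, the ask_llm helper, and the /mobile_base_controller/cmd_vel topic are illustrative assumptions, not the project's actual app.py:

    # Illustrative bridge between the Flask UI and ROS (not the project's actual code).
    import rospy
    from flask import Flask, request, jsonify
    from geometry_msgs.msg import Twist

    app = Flask(__name__)
    rospy.init_node("llm_bridge", disable_signals=True)  # let Flask own the main loop
    cmd_pub = rospy.Publisher("/mobile_base_controller/cmd_vel", Twist, queue_size=1)

    def ask_llm(command):
        """Placeholder for a Gemini/LLaVA call that maps text to a velocity."""
        msg = Twist()
        if "forward" in command.lower():
            msg.linear.x = 0.2  # m/s, a gentle default speed
        return msg

    @app.route("/command", methods=["POST"])
    def handle_command():
        data = request.get_json(silent=True) or {}
        text = data.get("command", "")
        cmd_pub.publish(ask_llm(text))
        return jsonify({"status": "sent", "command": text})

    if __name__ == "__main__":
        app.run()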

Installation

ROS

Install ROS Noetic on your system by following the TIAGo installation tutorial on the ROS wiki:

https://wiki.ros.org/Robots/TIAGo/Tutorials/Installation/InstallUbuntuAndROS

Tiago and Simulation

Install the Tiago Pal robot simulation packages by following the TIAGo simulation tutorial on the ROS wiki:

https://wiki.ros.org/Robots/TIAGo/Tutorials/Installation/Testing_simulation

Launch the Tiago Pal simulation:

    roslaunch tiago_gazebo tiago_gazebo.launch public_sim:=true

Gemini

Install the Vertex AI Python SDK:

    pip install google-cloud-aiplatform

Set up your Google Cloud credentials:

    export GOOGLE_APPLICATION_CREDENTIALS=<path_to_your_credentials_file.json>

Initialize Vertex AI:

    import vertexai
    vertexai.init(project="YOUR_PROJECT_ID", location="YOUR_REGION")
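
Once initialized, the Gemini model can be queried through the SDK's generative models interface. A minimal sketch, assuming the gemini-1.5-flash model name and a hypothetical prompt (adapt to the models available in your project):

    from vertexai.generative_models import GenerativeModel

    # Assumes vertexai.init(...) has already run as shown above.
    model = GenerativeModel("gemini-1.5-flash")  # model name is an assumption
    response = model.generate_content("List the steps to pick up a cup from a table.")
    print(response.text)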

Ollama and LLaVA

To install Ollama:

    curl -fsSL https://ollama.com/install.sh | sh

Install the Ollama Python package:

    pip install ollama

This project currently uses llava-llama3 as its vision-language model (in theory you can swap in another capable VLM); the code for using Gemini 1.5 as a VLM has been removed temporarily. Pull and run the model with:

    ollama run llava-llama3
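
A rough sketch of querying llava-llama3 from Python via the ollama package; the image path and prompt are placeholders for a frame captured from the robot's camera:

    import ollama

    # "camera_frame.jpg" is a placeholder for an image saved from the robot's camera.
    response = ollama.chat(
        model="llava-llama3",
        messages=[{
            "role": "user",
            "content": "Describe the objects in front of the robot.",
            "images": ["camera_frame.jpg"],
        }],
    )
    print(response["message"]["content"])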

Usage

Running the Simulation and Flask App

Launch the ROS simulation:

    cd ~/tiago_public_ws
    source devel/setup.bash
    roslaunch tiago_gazebo tiago_gazebo.launch public_sim:=true

Start the Flask app:

    export FLASK_APP=app.py
    flask run

Open your web browser and navigate to http://127.0.0.1:5000 to access the control interface.

Interacting with the Robot

Use the web interface to input commands. The Flask app will process these commands using Gemini, Ollama, and LLaVA.

The robot will execute the commands, providing feedback on each action.

Example Commands

  • "Move forward"

  • "Pick up the object"

  • "Extend arm"

  • "Rotate head left"

Monitoring and Feedback System

The app provides real-time feedback on the robot's actions, ensuring each step is completed before proceeding to the next. Check the console output for detailed logs and any error messages.
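
One common way to ensure a step has finished before the next one starts is to send goals through actionlib and block on the result. A hedged sketch using the standard move_base action (this assumes the navigation stack is running; the project's own feedback loop may differ):

    import actionlib
    import rospy
    from move_base_msgs.msg import MoveBaseAction, MoveBaseGoal

    rospy.init_node("wait_for_step_example")
    client = actionlib.SimpleActionClient("move_base", MoveBaseAction)
    client.wait_for_server()

    # Send a goal one metre ahead in the map frame and wait for it to finish.
    goal = MoveBaseGoal()
    goal.target_pose.header.frame_id = "map"
    goal.target_pose.header.stamp = rospy.Time.now()
    goal.target_pose.pose.position.x = 1.0
    goal.target_pose.pose.orientation.w = 1.0

    client.send_goal(goal)
    client.wait_for_result()  # block until this step completes
    rospy.loginfo("Step finished with state %d", client.get_state())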
