Long-term Human-Robot Collaboration (HRC) is crucial for developing flexible manufacturing systems and for integrating companion robots into daily human environments over extended periods. However, sustaining such collaborations requires overcoming challenges such as accurately understanding human intentions, maintaining robustness in noisy and dynamic environments, and adapting to diverse user behaviors. This paper presents a novel multimodal and hierarchical framework that addresses these challenges and facilitates efficient and robust long-term HRC. In particular, the proposed multimodal framework integrates visual observations with speech commands, enabling intuitive, natural, and flexible interaction between humans and robots. Additionally, our hierarchical approach to human detection and intention prediction significantly enhances the system's robustness, allowing robots to better understand human behaviors. This proactive understanding enables robots to take timely and appropriate actions based on predicted human intentions. We deploy the proposed multimodal hierarchical framework on a KINOVA GEN3 robot and conduct extensive user studies in real-world long-term HRC experiments.
This is the official code repo for the paper Robustifying Human-Robot Collaboration through a Hierarchical and Multimodal Framework. The video demo can be found at https://www.youtube.com/watch?v=2kkANN9ueVY.
- KINOVA GEN3 robot arm
- OAK-D Lite RGB-D camera
- Microphone
- Ubuntu == 20.04
- Python == 3.9
- ROS (Noetic, the ROS 1 distribution for Ubuntu 20.04)
To set up the environment, run:
conda env create -n hrc -f env_ubuntu2004.yml
conda activate hrc
For the speech model and scorer, download [deepspeech-0.9.3-models.pbmm] and [deepspeech-0.9.3-models.scorer] from https://github.com/mozilla/DeepSpeech/releases, and put both files into the ./speech directory.
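As a quick sanity check that the downloaded model and scorer load correctly, you can run a short script along the lines of the sketch below. This is not part of the repo: the WAV file path is a placeholder, and DeepSpeech 0.9.x expects 16 kHz, 16-bit mono PCM audio.

```python
import wave
import numpy as np
from deepspeech import Model

# Paths assume the models were placed in ./speech as described above.
model = Model("speech/deepspeech-0.9.3-models.pbmm")
model.enableExternalScorer("speech/deepspeech-0.9.3-models.scorer")

# "test.wav" is a placeholder; use any 16 kHz, 16-bit mono recording.
with wave.open("test.wav", "rb") as wav:
    audio = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)

print(model.stt(audio))  # print the transcription
```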
For the robot controller, follow the setup instructions at https://github.com/intelligent-control-lab/robot_controller_ros.
To activate the robot controller, run the following (roscore and the rqt GUI in their own terminals):
roscore                                   # start the ROS master if it is not already running
source devel/setup.bash                   # source the robot_controller_ros workspace
roslaunch kinova kinova_bringup.launch    # bring up the KINOVA GEN3 driver
rosrun rqt_gui rqt_gui                    # open the rqt GUI
Once the robot is activated, you should see the robot arm moving.
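If you want to confirm from Python that the controller is publishing robot state, a minimal rospy subscriber like the sketch below can be used. This is not part of the repo, and the topic name /joint_states is an assumption; check `rostopic list` for the actual topic used by the Kinova bringup.

```python
import rospy
from sensor_msgs.msg import JointState

def on_joint_state(msg):
    # Print the current joint positions reported by the robot driver.
    rospy.loginfo("joint positions: %s", [round(p, 3) for p in msg.position])

rospy.init_node("hrc_joint_state_check")
# "/joint_states" is an assumed topic name and may differ in your setup.
rospy.Subscriber("/joint_states", JointState, on_joint_state)
rospy.spin()
```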
python controller/receiver.py # start the robot receiver process, which waits for incoming visual and audio signals
# Note: the three digits appended to the task id (e.g., 001 in [task_id001]) are required!
python run.py --show --task [task_id001] # open the camera and start the HRC pipeline
python speech/speech_recognize.py # start the speech recognition program and send speech commands to the robot
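For reference, the general pattern for streaming microphone audio into DeepSpeech looks roughly like the sketch below. This is only an illustration of the streaming API, not the actual speech/speech_recognize.py; the PyAudio parameters and chunk size are assumptions.

```python
import numpy as np
import pyaudio
from deepspeech import Model

model = Model("speech/deepspeech-0.9.3-models.pbmm")
model.enableExternalScorer("speech/deepspeech-0.9.3-models.scorer")

# Open the microphone at 16 kHz mono, the sample rate DeepSpeech 0.9.x expects.
pa = pyaudio.PyAudio()
mic = pa.open(format=pyaudio.paInt16, channels=1, rate=16000,
              input=True, frames_per_buffer=1024)

stream = model.createStream()
try:
    print("Listening... press Ctrl+C to stop.")
    while True:
        chunk = np.frombuffer(mic.read(1024), dtype=np.int16)
        stream.feedAudioContent(chunk)
        print("\r" + stream.intermediateDecode(), end="")
except KeyboardInterrupt:
    print("\nFinal transcript:", stream.finishStream())
finally:
    mic.stop_stream()
    mic.close()
    pa.terminate()
```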
@article{yu2024robustify,
  title   = {Robustifying Long-term Human-Robot Collaboration through a Hierarchical and Multimodal Framework},
  author  = {Peiqi Yu and Abulikemu Abuduweili and Ruixuan Liu and Changliu Liu},
  journal = {arXiv preprint arXiv:2411.15711},
  year    = {2024}
}