- Hamed Rahimi (ISIR, Sorbonne University)
- Adil Bahaj (International University of Rabat)
- Mouad Abrini (ISIR, Sorbonne University)
- Mahdi Khoramshahi (ISIR, Sorbonne University)
- Mounir Ghogho (International University of Rabat)
- Mohamed Chetouani (ISIR, Sorbonne University)
User-VLM 360° is a personalized vision-language model that enhances human-robot interaction by integrating user-aware tuning and bias-aware optimization. It adapts in real time using multimodal signals and mitigates bias through preference optimization. The framework is validated on multiple benchmarks and through real-world deployment on the Pepper robot.
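As a quick illustration, the sketch below shows what inference could look like through the Hugging Face `transformers` API. The model ID, file name, and prompt are hypothetical placeholders, not the released interface:

```python
# Minimal inference sketch. MODEL_ID is a hypothetical placeholder;
# the released checkpoint may use a different name and interface.
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

MODEL_ID = "user-vlm/user-vlm-360"  # hypothetical Hugging Face model ID

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForVision2Seq.from_pretrained(MODEL_ID)

# The user-facing image carries the visual-linguistic signals used for
# user-aware personalization; the question is an ordinary VQA prompt.
image = Image.open("user_face.jpg")
prompt = "What activity would you recommend to this user?"

inputs = processor(images=image, text=prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```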
- User-aware Tuning: Real-time adaptation of interactions using visual-linguistic signals.
- Bias Mitigation: Bias-aware preference optimization that keeps user personalization fair and ethical.
- 360° Socio-Emotive Interaction Dataset: Annotated with demographic, emotional, and relational metadata (an illustrative record follows this list).
- State-of-the-Art Performance: Achieves up to +35.3% F1 in personalized VQA and +47.5% F1 in facial feature understanding, with a 15% bias reduction and 30× speedup over baselines.
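For concreteness, a single annotated record from the dataset might look like the following. All field names and values here are illustrative assumptions; consult the released dataset for the actual schema:

```python
# Hypothetical record from the 360° Socio-Emotive Interaction Dataset.
# Field names are illustrative, not the published schema.
record = {
    "image": "frames/user_0042.jpg",   # user-facing camera frame
    "question": "How is this user feeling right now?",
    "answer": "She seems mildly frustrated; a short break may help.",
    "demographics": {"age_group": "25-34", "gender": "female"},
    "emotion": "frustration",
    "relation": "caregiver-patient",   # relational metadata
}
```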
User-VLM 360° outperforms baseline models in user-aware personalization, facial feature understanding, and multimodal reasoning, achieving up to a 2× improvement in ROUGE-1 F1 on user-centric VQA tasks.
User-VLM 360° enhances fairness, improving ROUGE-1 and BERTScore while mitigating bias through DPO tuning.
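DPO tuning requires no separate reward model: the policy is trained to prefer an unbiased response over a biased one relative to a frozen reference copy of itself. Below is a minimal sketch of the standard DPO loss (Rafailov et al., 2023); the paper's actual preference data and β value are assumptions here:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO objective. Each argument is the summed log-probability
    of a response under the trainable policy or the frozen reference model.
    For bias mitigation, "chosen" is the unbiased answer and "rejected"
    is the biased one."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the log-sigmoid margin between unbiased and biased responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```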
User-VLM 360° achieves up to a 30× reduction in FLOPs, significantly improving computational efficiency without compromising performance.
If you use User-VLM 360° in your research, please cite:
@article{rahimi2025uservlm,
  author  = {Hamed Rahimi and Adil Bahaj and Mouad Abrini and Mahdi Khoramshahi and Mounir Ghogho and Mohamed Chetouani},
  title   = {User-VLM 360°: Personalized Vision Language Models with User-aware Tuning for Social Human-Robot Interactions},
  journal = {arXiv preprint arXiv:<ARXIV_PAPER_ID>},
  year    = {2025}
}