diff --git a/sections/2024/main/MMSP-P6.md b/sections/2024/main/MMSP-P6.md
new file mode 100644
index 0000000..c6981ed
--- /dev/null
+++ b/sections/2024/main/MMSP-P6.md
@@ -0,0 +1,45 @@
+# ICASSP-2024-Papers
+
+
+## Human-Centric Multimedia
+
+![Section Papers](https://img.shields.io/badge/Section%20Papers-5-42BA16) ![Preprint Papers](https://img.shields.io/badge/Preprint%20Papers-4-b31b1b) ![Papers with Open Code](https://img.shields.io/badge/Papers%20with%20Open%20Code-2-1D7FBF) ![Papers with Video](https://img.shields.io/badge/Papers%20with%20Video-0-FF0000)
+
+| **Title** | **Repo** | **Paper** | **Video** |
+|-----------|:--------:|:---------:|:---------:|
+| Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness | [![GitHub Page](https://img.shields.io/badge/GitHub-Page-159957.svg?style=flat)](https://youngseng.github.io/FreeTalker/) [![GitHub](https://img.shields.io/github/stars/YoungSeng/FreeTalker?style=flat)](https://github.com/YoungSeng/FreeTalker/) | [![IEEE Xplore](https://img.shields.io/badge/IEEE-10447978-E4A42C.svg)](https://ieeexplore.ieee.org/document/10447978) [![arXiv](https://img.shields.io/badge/arXiv-2401.03476-b31b1b.svg)](https://arxiv.org/abs/2401.03476) | :heavy_minus_sign: |
+| Enhancing Expressiveness in Dance Generation via Integrating Frequency and Music Style Information | [![GitHub](https://img.shields.io/github/stars/thuhcsi/ExpressiveBailando?style=flat)](https://github.com/thuhcsi/ExpressiveBailando) | [![IEEE Xplore](https://img.shields.io/badge/IEEE-10448469-E4A42C.svg)](https://ieeexplore.ieee.org/document/10448469) [![arXiv](https://img.shields.io/badge/arXiv-2403.05834-b31b1b.svg)](https://arxiv.org/abs/2403.05834) | :heavy_minus_sign: |
+| Modality Dropout for Multimodal Device Directed Speech Detection using Verbal and Non-Verbal Features | :heavy_minus_sign: | [![IEEE Xplore](https://img.shields.io/badge/IEEE-10446421-E4A42C.svg)](https://ieeexplore.ieee.org/document/10446421) [![arXiv](https://img.shields.io/badge/arXiv-2310.15261-b31b1b.svg)](http://arxiv.org/abs/2310.15261) | :heavy_minus_sign: |
+| Audio-Visual Child-Adult Speaker Classification in Dyadic Interactions | :heavy_minus_sign: | [![IEEE Xplore](https://img.shields.io/badge/IEEE-10447515-E4A42C.svg)](https://ieeexplore.ieee.org/document/10447515) [![arXiv](https://img.shields.io/badge/arXiv-2310.01867-b31b1b.svg)](https://arxiv.org/abs/2310.01867) | :heavy_minus_sign: |
+| Long-Term Social Interaction Context: The Key to Egocentric Addressee Detection | :heavy_minus_sign: | [![IEEE Xplore](https://img.shields.io/badge/IEEE-10447323-E4A42C.svg)](https://ieeexplore.ieee.org/document/10447323) | :heavy_minus_sign: |