STProDis-JaEn: Speech Translation with Prosodic Disambiguation for Japanese-English Languages.
This dataset contains 475 Japanese speech recordings, recorded by 4 different speakers, for Japanese-to-English speech translation. It aims to provide a resource for resolving syntactic ambiguity in Japanese speech, where the same sentence can carry different meanings, by using prosodic features such as pitch and pauses.
If you find our work useful in your research, please cite the following paper:
@inproceedings{tranlow,
title={Low-Resource Japanese-English Speech-to-Text Translation Leveraging Speech-Text Unified-model Representation Learning},
author={Tran, Tu Dinh and Sakti, Sakriani},
booktitle={Proceedings of INTERSPEECH Satellite Workshop of the Special Interest Group on Under-resourced Languages (SIGUL)},
year={2023}
}