data_stprodis_jaen

STProDis-JaEn: Speech Translation with Prosodic Disambiguation for Japanese-English Languages.

This dataset contains 475 Japanese speech recordings for Japanese-to-English translation, recorded by 4 different speakers. It aims to provide a resource for resolving syntactic ambiguity in Japanese speech, where the meaning changes with prosodic features such as pitch and pauses.

Citation

If you find our work useful in your research, please cite the following paper:

@inproceedings{tranlow,
  title={Low-Resource Japanese-English Speech-to-Text Translation Leveraging Speech-Text Unified-model Representation Learning},
  author={Tran, Tu Dinh and Sakti, Sakriani},
  booktitle={Proceedings of INTERSPEECH Satellite Workshop of the Special Interest Group on Under-resourced Languages (SIGUL)},
  year={2023}
}

Folders and files

F01
F02
M01
M02
LICENSE
README.md
test.txt
train.txt
translations.txt
valid.txt
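
As a quick sanity check after cloning, the sketch below counts the files under each of the four folders listed above and the non-empty lines in each split file. It is only a minimal sketch under stated assumptions: the internal format of `train.txt`, `valid.txt`, and `test.txt` is not documented here, so the code treats them as plain UTF-8 text and does not parse their contents. The helper names `read_lines` and `summarize` are hypothetical and are not part of the dataset.

```python
from pathlib import Path

# Folder and file names taken from the repository listing.
SPEAKER_DIRS = ("F01", "F02", "M01", "M02")
SPLIT_FILES = ("train.txt", "valid.txt", "test.txt")


def read_lines(path: Path) -> list[str]:
    """Return the non-empty lines of a UTF-8 text file, stripped of whitespace."""
    with path.open(encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def summarize(root: Path) -> dict[str, int]:
    """Count files under each listed folder and non-empty lines in each split file.

    Assumption: split files are plain text; their line format is not interpreted.
    Missing folders or files are reported with a count of 0.
    """
    counts: dict[str, int] = {}
    for name in SPEAKER_DIRS:
        folder = root / name
        if folder.is_dir():
            counts[name] = sum(1 for p in folder.rglob("*") if p.is_file())
        else:
            counts[name] = 0
    for name in SPLIT_FILES:
        split = root / name
        counts[name] = len(read_lines(split)) if split.is_file() else 0
    return counts
```

For example, `summarize(Path("data_stprodis_jaen"))` returns a dictionary keyed by folder and split-file name, assuming the repository was cloned into a local directory of that name.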

ha3ci-lab/data_stprodis_jaen
