
Commit

P&C Docs (#5068) (#5069)
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
github-actions[bot] and jubick1337 authored Oct 4, 2022
1 parent e0cc6b7 commit a71712b
Showing 5 changed files with 429 additions and 9 deletions.
2 changes: 1 addition & 1 deletion docs/source/nlp/models.rst
@@ -8,7 +8,7 @@ NeMo's NLP collection supports provides the following task-specific models:
.. toctree::
:maxdepth: 1

- punctuation_and_capitalization
+ punctuation_and_capitalization_models
token_classification
joint_intent_slot
text_classification
7 changes: 7 additions & 0 deletions docs/source/nlp/nlp_all.bib
@@ -170,4 +170,11 @@ @inproceedings{koehnetal2007moses
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/P07-2045",
pages = "177--180",
}

+ @article{sunkara2020multimodal,
+ title={Multimodal Semi-supervised Learning Framework for Punctuation Prediction in Conversational Speech},
+ author={Monica Sunkara, Srikanth Ronanki, Dhanush Bekal, Sravan Bodapati, Katrin Kirchhoff},
+ journal={arXiv preprint arXiv:2008.00702},
+ year={2020}
+ }
8 changes: 0 additions & 8 deletions docs/source/nlp/punctuation_and_capitalization.rst
@@ -3,14 +3,6 @@
Punctuation and Capitalization Model
====================================

- Automatic Speech Recognition (ASR) systems typically generate text with no punctuation and capitalization of the words.
- There are two issues with non-punctuated ASR output:
-
- - it could be difficult to read and understand
- - models for some downstream tasks, such as named entity recognition, machine translation, or text-to-speech, are
- usually trained on punctuated datasets and using raw ASR output as the input to these models could deteriorate their
- performance

Quick Start Guide
-----------------
