Sonosco (from Latin sonus, "sound", and nōscō, "I know, recognize") is a library for training and deploying deep speech recognition models.
The goal of this project is to enable fast, repeatable and structured training of deep
automatic speech recognition (ASR) models, as well as to provide a transcription server (REST API & frontend)
for trying out the trained models.
Additionally, we provide interfaces to ROS so that the models can be used with
the anthropomimetic robot Roboy.
The easiest way to use Sonosco's functionality is via pip:
pip install sonosco
Note: Sonosco requires Python 3.6 or higher.
For reliability, we recommend using an environment virtualization tool such as virtualenv or conda.
Clone the repository and install dependencies:
# Clone the repo and cd inside it
git clone https://github.com/Roboy/sonosco.git && cd sonosco
# Create a virtual python environment to not pollute the global setup
python -m venv venv
# Activate the virtual environment
source venv/bin/activate
# Install normal requirements
pip install -r requirements.txt
# Link your local sonosco clone into your virtual environment
pip install -e .
Now you can check out the Getting Started tutorials to train a model or use
the transcription server.
Get hold of our new fully trained models from the latest release! Try out the LAS model for the best performance. Then point the runner script at the folder containing the model, as shown below.
You can get the Docker image from Docker Hub under yuriyarabskyy/sonosco-inference:1.0. Just run
cd server && ./run.sh yuriyarabskyy/sonosco-inference:1.0
to pull and start the server, or optionally build your own image by executing the following commands.
cd server
# Build the docker image
./build.sh
# Run the built image
./run.sh sonosco_server
You can also specify the path to your own models by writing
./run.sh <image_name> <path/to/models>
Open http://localhost:5000 in Chrome. You should be able to add models for performing transcription by clicking on the plus button. Once the models are added, record your own voice by clicking on the record button. You can replay and transcribe with the corresponding buttons.
You can get pretrained models from the release tab in this repository.
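If you prefer to script against the server instead of using the frontend, a request along the following lines should work. This is only a sketch: the endpoint path and field name below are assumptions for illustration and may differ from the actual REST API.

# Sketch: calling the transcription server directly with Python.
# ASSUMPTION: the "/transcribe" route and the "audio" field name are
# hypothetical; check the server code for the real endpoints.
import requests

with open("recording.wav", "rb") as f:
    response = requests.post(
        "http://localhost:5000/transcribe",  # hypothetical endpoint
        files={"audio": f},                  # recorded audio file
    )
response.raise_for_status()
print(response.text)  # expected to contain the transcription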
The project is split into four interrelated parts:
For data processing, scripts are provided to download and preprocess some publicly available speech recognition datasets. Additionally, we provide scripts and functions to create manifest files (i.e. catalog files) for your own data and to merge existing manifest files into one.
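To make the manifest idea concrete, here is a minimal sketch that builds one with the Python standard library. The column layout (one audio path and transcript path per line) is an assumption based on common ASR tooling; verify it against sonosco's data scripts.

# Sketch: creating a manifest (catalog) file for your own data.
# ASSUMPTION: one "path_to_wav,path_to_transcript" line per sample,
# as in DeepSpeech-style manifests; the exact format sonosco expects
# may differ.
from pathlib import Path

data_dir = Path("my_dataset")  # hypothetical folder of .wav/.txt pairs
with open("my_manifest.csv", "w") as manifest:
    for wav in sorted(data_dir.glob("**/*.wav")):
        transcript = wav.with_suffix(".txt")
        if transcript.exists():
            manifest.write(f"{wav},{transcript}\n")

Merging existing manifests then essentially amounts to concatenating such files.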
These manifest files can then be used to easily train and evaluate an ASR model. We provide several ASR model architectures, such as LAS, TDS and DeepSpeech2, but individual PyTorch models can also be designed and trained.
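To illustrate what an individual PyTorch model can look like, the sketch below defines a tiny recurrent acoustic model over spectrogram frames. It is a generic torch.nn example, not sonosco's actual model interface; all names and sizes are placeholders.

# Sketch: a minimal custom PyTorch ASR model (generic example, not the
# actual sonosco model interface).
import torch
import torch.nn as nn

class TinySpeechModel(nn.Module):
    def __init__(self, n_features=161, n_classes=29):
        super().__init__()
        # Bidirectional GRU over spectrogram frames, DeepSpeech-style
        self.rnn = nn.GRU(n_features, 256, num_layers=2,
                          batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 256, n_classes)  # per-frame character logits

    def forward(self, x):
        # x: (batch, time, features) -> (batch, time, n_classes)
        out, _ = self.rnn(x)
        return self.fc(out)

model = TinySpeechModel()
logits = model(torch.randn(4, 100, 161))  # dummy batch of spectrograms
print(logits.shape)  # torch.Size([4, 100, 29])

Frame-level logits like these are typically trained with a CTC loss.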
The trained model can then be used in a transcription server, which consists of a REST API as well as a simple Vue.js frontend to transcribe voice recorded with a microphone and to compare the transcription results of different models (which can be downloaded from our GitHub repository).
Furthermore, we provide example code showing how to use different ASR models with ROS, in particular with the Roboy ROS interfaces (i.e. topics & messages).
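As a rough sketch of the ROS side, a node could subscribe to an audio topic and publish transcriptions. The node and topic names below, and the use of std_msgs/String, are placeholders; the actual Roboy interfaces define their own topics and message types.

# Sketch: a ROS node bridging incoming audio to an ASR model.
# ASSUMPTION: all topic names and message types here are placeholders.
import rospy
from std_msgs.msg import String

def on_audio(msg):
    # A real node would run the ASR model on the incoming audio here
    transcript = "transcribed: " + msg.data  # placeholder for inference
    pub.publish(transcript)

rospy.init_node("sonosco_asr_bridge")  # hypothetical node name
pub = rospy.Publisher("/roboy/transcript", String, queue_size=10)
rospy.Subscriber("/roboy/audio", String, on_audio)
rospy.spin()  # process callbacks until shutdown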
Check our Documentation to learn more!