Our project proposes a new way of interacting with the operating system, one that prioritizes improving the user experience through voice commands. It recognizes spoken language, draws meaningful conclusions from it, and responds accordingly. Unlike the traditional approach, which relies heavily on physical input, our proposed system offers an alternative method of interaction through voice. Although we are developing a voice-based system, traditional physical input remains available, so the user can experience the best of both worlds.
- Custom wake word detection
- Natural Language Understanding
- Ability to launch applications
- Launch custom scripts
- Play music and movies from the folder specified
- Usage analysis
For effective and efficient embedding of speech recognition into the Linux operating system, we employ a multi-module approach comprising Assistant, Coordinator, and Skill modules. These modules determine how voice data is collected, processed, and evaluated. The system operates in two phases: the Assistant-Coordinator (primary) phase and the Coordinator-Skill-Synthesis (secondary) phase. The primary phase transcribes the voice data into corresponding intents. The secondary phase maps intents to their corresponding skills and provides feedback in the form of speech or raw data.
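The secondary phase's intent-to-skill mapping can be pictured as a simple dispatch table. A minimal sketch follows; the skill names, intent labels, and entity format here are illustrative assumptions, not Hera's actual identifiers:

```python
# Illustrative sketch of a Coordinator routing intents to Skill handlers.
# Intent labels ("launch_app", "play_media") and entity keys are assumptions.

def launch_application(entities):
    # A Skill module would start the requested application here.
    return f"Launching {entities.get('app', 'application')}"

def play_media(entities):
    # A Skill module would play a file from the user-specified folder here.
    return f"Playing {entities.get('title', 'media')}"

SKILLS = {
    "launch_app": launch_application,
    "play_media": play_media,
}

def dispatch(intent, entities):
    """Route a classified intent to its skill; the returned string
    stands in for the feedback handed to the synthesis step."""
    skill = SKILLS.get(intent)
    if skill is None:
        return "Sorry, I didn't understand that."
    return skill(entities)
```

In this picture, the primary phase would produce the `intent` and `entities` arguments, and the synthesis step would speak the returned string back to the user.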
Python 3.7 is required for the dependencies. Check your Python version by running:

```bash
python --version
```
We recommend installing Hera in a separate virtual environment:

```bash
sudo apt install python3-venv
python3 -m venv env
source env/bin/activate
```
```bash
git clone https://github.com/HeyHera/Hera.git
cd Hera
pip install -r requirements.txt
```
Models for wake word detection and intent classification are included in the repository itself. The other models need to be downloaded and placed in the right directories.
- vosk model for Automatic Speech Recognition: download a model of your choice, move it to Hera/vosk-models/, and specify the model path in Hera/automatic_speech_recognition_script.py
- Entity Extraction Model: unzip it and place it (all folders) inside Hera/nlu/entity_extraction/output/
- nix-TTS model: download it and place it inside Hera/tts/nix/models/
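Before launching, it may help to verify that each downloaded model landed where the steps above expect. A minimal sketch using only the standard library; the helper name is ours, and the directory list simply mirrors the paths above:

```python
from pathlib import Path

# Model directories the setup steps above expect, relative to the repo root.
EXPECTED_DIRS = [
    "vosk-models",
    "nlu/entity_extraction/output",
    "tts/nix/models",
]

def missing_model_dirs(repo_root):
    """Return the expected model directories that are absent or empty."""
    root = Path(repo_root)
    missing = []
    for rel in EXPECTED_DIRS:
        d = root / rel
        if not d.is_dir() or not any(d.iterdir()):
            missing.append(rel)
    return missing
```

Running `missing_model_dirs(".")` from the Hera checkout would list any model directory that still needs to be populated.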
```bash
python app.py
```
- Ahammed Siraj K K
```bash
# Record 10 seconds of CD-quality audio from capture device hw:0,0
# to test the microphone
arecord -f cd -d 10 --device="hw:0,0" /tmp/test-mic.wav
# Play the recording back
aplay /tmp/test-mic.wav
```