Code using your voice
You can try a live demo of Speech2Code here: https://pedrooaugusto.github.io/speech-to-code/webapp
You can also check this video on how to solve the FizzBuzz problem using Speech2Code: https://www.youtube.com/watch?v=I71ETEeqa5E
(For this demo, the app was ported to the web so that it runs directly in the browser.)
Speech2Code is an application that lets you code using just voice commands. Instead of using the keyboard to write code in the code editor like a caveman, you can just express in natural language what you wish to do, and it will be automatically written, as code, in the code editor.
With Speech2Code, instead of using the mouse and keyboard to navigate to line 42 of a file, you can just say "line 42", "go to line 42" or even "please go to line 42". It's also possible to say things like:
- "new variable answer equals the string john was the eggman string"
  becomes: let answer = "john was the eggman"
- "call function max with arguments variable answer and expression gap plus number 42 on namespace Math"
  becomes: Math.max(answer, gap + 42) // 'gap' can be replaced later by an actual value
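The "line 42" variants mentioned above could be recognised along these lines. This is only a minimal sketch: it uses a regular expression as a stand-in, not the parser Spoken actually uses, and the function name parseGoToLine is hypothetical. It also assumes the speech service returns the number as digits.

```typescript
// Hypothetical helper (not part of Spoken): returns the target line number,
// or null when the phrase is not a "go to line" command.
// Assumes the transcript contains the number as digits, e.g. "42".
function parseGoToLine(phrase: string): number | null {
  // Optional "please", optional "go to", then "line <digits>".
  const pattern = /^(?:please\s+)?(?:go\s+to\s+)?line\s+(\d+)$/i;
  const match = pattern.exec(phrase.trim());
  return match ? Number(match[1]) : null;
}

console.log(parseGoToLine("line 42"));
console.log(parseGoToLine("please go to line 42"));
console.log(parseGoToLine("new variable answer"));
```

All three phrasings from the text map to the same number, while an unrelated phrase yields null so that another command parser can try it.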
This project can be divided into 3 main modules:
- Webapp, Server and Client: responsible for the application UI, capturing audio and transforming audio into text.
- Spoken: responsible for testing whether a given phrase is a valid voice command and for extracting important information from it (parsing).
- Spoken VSCode Extension: a Visual Studio Code extension that receives commands to manipulate VSCode. It is through this extension that Speech2Code is able to control Visual Studio Code.
Those modules interact as follows:
flowchart TB
A[fab:fa-microsoft MS Azure Speech to Text] <-->|HTTP/Sockets| B(Server)
B <--> |HTTP| C(Client)
B --> |Serves| E(Webapp)
C <--> |Inter Process-Communication| D(VS Code Extension)
E --- |NPM Dependency| F(Spoken)
C --- |NPM Dependency| F(Spoken)
D <--> G(Visual Studio Code)
style B fill:white,stroke:gold,stroke-width:2px
style C fill:white,stroke:gold,stroke-width:2px
style D fill:white,stroke:gold,stroke-width:2px
style E fill:white,stroke:gold,stroke-width:2px
style F fill:white,stroke:gold,stroke-width:2px
Voice commands are transformed into text using the Azure Speech to Text service and later parsed by Spoken, which uses several pushdown automata to extract information from the text.
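To give a feel for what a pushdown automaton adds over a plain state machine, here is a toy sketch. The vocabulary ("open paren", "close paren", "open bracket", "close bracket") and the function name recognize are assumptions made up for this illustration; Spoken's real grammars are different and much larger. The point is only that a stack lets the recogniser track arbitrarily nested structure in a spoken phrase.

```typescript
type Bracket = "paren" | "bracket";

interface RecognizeResult {
  accepted: boolean; // true when every "open" was closed by a matching "close"
  maxDepth: number;  // deepest nesting level seen before stopping
}

// Toy pushdown automaton over a hypothetical vocabulary (not Spoken's grammar).
// "open <kind>" pushes onto the stack, "close <kind>" pops and must match;
// every other word is ordinary content and leaves the stack untouched.
function recognize(phrase: string): RecognizeResult {
  const words = phrase.trim().split(/\s+/);
  const stack: Bracket[] = [];
  let maxDepth = 0;

  for (let i = 0; i < words.length; i++) {
    const word = words[i];
    if (word !== "open" && word !== "close") continue;

    const kind = words[i + 1];
    if (kind !== "paren" && kind !== "bracket") {
      return { accepted: false, maxDepth }; // "open"/"close" without a known kind
    }
    i++; // consume the kind word as well

    if (word === "open") {
      stack.push(kind);
      maxDepth = Math.max(maxDepth, stack.length);
    } else if (stack.pop() !== kind) {
      return { accepted: false, maxDepth }; // mismatched or unexpected close
    }
  }
  // Accept only if nothing is left open.
  return { accepted: stack.length === 0, maxDepth };
}

console.log(recognize("open paren gap plus open bracket 42 close bracket close paren"));
console.log(recognize("open paren gap close bracket"));
```

The first phrase is accepted with a nesting depth of 2; the second is rejected because the closing word does not match what is on top of the stack.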
Currently, Speech2Code only supports voice commands for the JavaScript language; a list of all those commands can be found here. All commands can be said in both English and Portuguese HU3BR.
Speech2Code was designed to work with any IDE that implements its interface; this is usually done through plugins and extensions. Currently, it supports Visual Studio Code and CodeMirror.
For example, the voice command "call function fish with two arguments" will eventually call editor.write(...), where editor can be any IDE/editor such as vscode, codemirror or sublime, each with a different implementation of write(...). The only common thing is that calling that function will write something in the currently open file, no matter the IDE. Here you have an example of different implementations of the same function: VSCode.write(...) x CodeMirror.write(...)
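The idea of one write(...) call with editor-specific implementations can be sketched as follows. Everything here is illustrative: the Editor interface, the two in-memory editor classes, the callFunction helper and the arg1, arg2 placeholder names are assumptions, not the actual Speech2Code interface or what the real command writes.

```typescript
// Hypothetical minimal interface; the real Speech2Code editor interface is richer.
interface Editor {
  write(text: string): void;
}

// Stand-in editor that keeps the whole file as one string.
class StringBufferEditor implements Editor {
  buffer = "";
  write(text: string): void {
    this.buffer += text;
  }
}

// Stand-in editor that keeps the file as an array of lines.
class LineArrayEditor implements Editor {
  lines: string[] = [""];
  write(text: string): void {
    const [first, ...rest] = text.split("\n");
    this.lines[this.lines.length - 1] += first; // continue the current line
    this.lines.push(...rest);                   // start new lines after each "\n"
  }
}

// Editor-agnostic command handler: it only knows about Editor.write.
// The placeholder argument names are an assumption for this sketch.
function callFunction(editor: Editor, name: string, argCount: number): void {
  const args = Array.from({ length: argCount }, (_, i) => `arg${i + 1}`).join(", ");
  editor.write(`${name}(${args})`);
}

const a = new StringBufferEditor();
const b = new LineArrayEditor();
callFunction(a, "fish", 2);
callFunction(b, "fish", 2);
console.log(a.buffer, b.lines);
```

The command handler never needs to know which editor it is talking to, which is what allows support for a new IDE to be added through a plugin or extension.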
The connection between VSCode and Speech2Code is done through a custom VSCode extension and Inter-Process Communication.
First, install all the required dependencies with:
node scripts.js install
Then, you can start the server with:
./run.sh
A web-based demo of Speech2Code will then be accessible at: http://localhost:3000/webapp
Finally, if you wish to start the actual application, run the following (make sure that VSCode is running before doing so):
npm --prefix client start
Don't forget to edit server/.env with your Azure Speech to Text API keys.
Non code-like material produced in the creation of this project: