SoundSwallower web demo application

This is a simple demo application that shows how to integrate SoundSwallower speech recognition in a web page. You can try it out at https://dhdaines.github.io/soundswallower-demo/

Compilation

To compile, run:

npm run build

To start, run:

npm start

There is some magic to getting this to work so pay attention. The Webpack configuration has a few extra elements that allow loading the SoundSwallower module (in particular the WebAssembly part, compiled with Emscripten) to succeed. Look at these parts of the config variable in particular:

    plugins: [
        // Just copy the damn WASM because webpack can't recognize
        // Emscripten modules.
        new CopyPlugin({
            patterns: [
            { from: "node_modules/soundswallower/soundswallower.wasm*",
              to: "[name][ext]"},
            // And copy the model files too.  FIXME: Not sure how
            // this will work with require("soundswallower/model")
            { from: modelDir,
              to: "model"},
        ],
    module: {
        rules: [
            { test: /\.(dict|gram)$/i,
              type: "asset/resource",
              generator: {
                  // Don't mangle the names of dictionaries or grammars
                  filename: "model/[name][ext]"
              }
            },
        ],
    },
    // Eliminate webpack's node junk when using webpack
    resolve: {
        fallback: {
            crypto: false,
            fs: false,
            path: false,
        },
    },
    node: {
        global: false,
        __filename: false,
        __dirname: false,
    },

Implementation

The audio capture is implemented in a separate worker, an AudioWorkletNode to be precise, because ... well, because, for the moment, it doesn't seem possible to get live PCM audio any other way. This seems to be the general consensus among developers, as the Web Audio API is not at all designed for this use case, but the MediaStream recording API, which would be the logical choice, isn't really either (although it is actually pretty good for recording entire clips/files).

By contrast, we do the speech recognition in the main thread. For the simple grammars in the demo, it should be more than fast enough.

We also do endpointing, though the implementation is subject to change as the API is a little bit clunky.

Name		Name	Last commit message	Last commit date
Latest commit History 93 Commits
.github/workflows		.github/workflows
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
webpack.config.cjs		webpack.config.cjs

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SoundSwallower web demo application

Compilation

Implementation

About

Releases

Packages

Languages

License

dhdaines/soundswallower-demo

Folders and files

Latest commit

History

Repository files navigation

SoundSwallower web demo application

Compilation

Implementation

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages