Transcription available via the microphone icon within a mobile phone's standard keyboard

techiaith/whisper-mobile-mic

Mewnbwn leferydd drwy'r meicroffon o fewn bysellfwrdd arferol eich ffôn symudol

Addaswyd y cod WhisperInput yma i arbrofi gyda chynnwys model adnabod lleferydd Cymraeg all-lein gyda meicroffon sydd ar gael o fewn bysellfwrdd arferol eich ffôn symudol. Dim ond ffonau Android sydd yn bosib eu hehangu ar y moment.

Golygir hyn bod modd lefaru eich neges yn lle ei theipio o fewn unrhyw un app ar eich dyfais e.e. WhatsApp, Facebook Messenger, Outlook a gmail.

Yn anffodus, nid yw'r pecyn yn ddigon sefydlog ar gyfer defnydd pob dydd gan ddefnyddwyr. Mae'n aml yn crashio ac yn rhy araf yn rhoi trawsgrifiad.

Speech input via the microphone within your mobile's keyboard

This is a modified version of WhisperInput for experimenting with offline Welsh speech recognition models, using the microphone available within your mobile phone's usual keyboard. Only Android phones are supported at the moment.

This means that you can speak your message instead of typing it in any app on your device, e.g. WhatsApp, Facebook Messenger, Outlook and Gmail.

Unfortunately, the package is not stable enough for everyday use by users. It crashes too often and transcribes too slowly.


WhisperInput

Offline voice input panel & keyboard with punctuation for Android, experimental, powered by Whisper AI & Kõnele components.

Voice input is supported in English by default. Multilingual input is possible with a different speech model; see Installation.

Features

  • Works as a voice keyboard (input method editor), a voice input panel, or an assistant app.
  • On-device speech recognition, offline.
  • Auto-start, auto-stop, audio cue option.

Usage tips

  • To set the app as a web search assistant (long press Home button to open voice input), open the app -> Settings gear icon -> Recognition services (system UI). A system menu should open for selecting the assistant app, for example in Samsung UI it's Device assistance app. Select Whisper Input in the list.
  • You can switch your keyboard to Whisper Input voice keyboard in the system keyboard list. Click the keyboard icon when done to switch back.

Installation

Requirements: Java, Android SDK.

Initial setup:

  • Put your keystore.jks in the project's root folder; it is used to sign the app.
  • Create a signing.properties in the project's root folder with keystore.jks credentials:
signingStoreLocation=../keystore.jks
signingStorePassword=<keystore.jks password>
signingKeyAlias=<keystore.jks alias>
signingKeyPassword=<keystore.jks key password>
  • (Optional) To replace the included English-only speech model with a bigger or multilingual one, replace ggml-tiny.en.bin in the assets/models folder with another .bin model from the whisper.cpp model list. Models without .en. in the name are multilingual. Tiny or base size models are recommended. Note: a multilingual model is expected to perform worse in English than the .en. model of similar size.
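
The signing.properties step in the initial setup above can also be scripted. The following is a minimal sketch and not part of the project: the function name write_signing_properties and the environment variables KEYSTORE_PASSWORD, KEY_ALIAS and KEY_PASSWORD are assumptions made for illustration. Only the four property names and the ../keystore.jks location come from the list above.

```shell
#!/bin/sh
# Sketch: write signing.properties into the given folder (default: current folder).
# KEYSTORE_PASSWORD, KEY_ALIAS and KEY_PASSWORD are hypothetical variable names.
write_signing_properties() {
  target_dir="${1:-.}"
  # Refuse to write a half-filled file if any credential is missing.
  if [ -z "$KEYSTORE_PASSWORD" ] || [ -z "$KEY_ALIAS" ] || [ -z "$KEY_PASSWORD" ]; then
    echo "Set KEYSTORE_PASSWORD, KEY_ALIAS and KEY_PASSWORD first" >&2
    return 1
  fi
  cat > "$target_dir/signing.properties" <<EOF
signingStoreLocation=../keystore.jks
signingStorePassword=$KEYSTORE_PASSWORD
signingKeyAlias=$KEY_ALIAS
signingKeyPassword=$KEY_PASSWORD
EOF
}
```

Call write_signing_properties from the project's root folder after setting the three variables. Values containing backslashes or other characters that are special in .properties files would need escaping, which this sketch does not attempt.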

Run:

git clone https://github.com/alex-vt/WhisperInput.git
cd WhisperInput/
./gradlew assembleRelease

Install app/build/outputs/apk/release/app-release.apk on Android device.
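
One possible way to do the install step is over adb from the Android SDK platform tools. This is a sketch under assumptions: it expects adb to be on the PATH and a device to be connected with USB debugging enabled, and the function name install_apk is made up for illustration. The APK path is the one given above.

```shell
#!/bin/sh
# Sketch: install the release APK on a connected device using adb.
APK_PATH="app/build/outputs/apk/release/app-release.apk"

install_apk() {
  apk="${1:-$APK_PATH}"
  # Fail early with a clear message if the build output is missing.
  if [ ! -f "$apk" ]; then
    echo "APK not found: $apk (run ./gradlew assembleRelease first)" >&2
    return 1
  fi
  # -r replaces an already installed copy of the app.
  adb install -r "$apk"
}
```

Run install_apk from the project's root folder after the build has finished.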

Development

Points of interest:

Possible issues:

  • Permissions handling. The permission model from the Kõnele components may be falling behind the requirements of newer Android versions, which may result in a failure to record voice. A workaround is to allow the permissions manually in Android's app settings.
  • The voice model may take a few seconds to load, so the first recording may be inconsistent. Ideally, the ready status of the model at WhisperRecognitionModel.kt:35 should be reflected in the recording UI.
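
The manual permission workaround described above can probably also be applied over adb rather than through the settings UI. This is a hedged sketch: the function name grant_mic_permission is hypothetical, and the app's package name is not stated in this README, so it has to be passed in by the caller. Running adb shell pm list packages on a connected device lists the installed package names.

```shell
#!/bin/sh
# Sketch: grant the microphone permission to an installed app over adb.
# The package name is an input because this README does not state it.
grant_mic_permission() {
  package="$1"
  if [ -z "$package" ]; then
    echo "usage: grant_mic_permission <package-name>" >&2
    return 1
  fi
  # RECORD_AUDIO is a runtime permission, so pm grant can set it.
  adb shell pm grant "$package" android.permission.RECORD_AUDIO
}
```

This only covers the microphone permission; any other permission the app requests would still need to be allowed separately.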

Underlying projects

  • Kõnele, a voice inputs integration app for Android, by Kaljurand, Apache-2.0 license
  • speechutils, a voice recognition library for Android, by Kaljurand, Apache-2.0 license
  • whisper.cpp, a performance-tuned build of a speech model, by ggerganov, MIT license
  • whisper, a speech model, by OpenAI, MIT license

License

MIT license
