zjn0505 edited this page Aug 13, 2024 · 50 revisions

Speech

temi's speech flow has four main components: wakeup, ASR (Automatic Speech Recognition), NLP (Natural Language Processing), and TTS (Text to Speech).

temi's SDK gives developers tools to use, customize, and listen to each of these components.


API Overview

Return Method Description
void speak(TtsRequest ttsRequest) Ask temi to speak (play TTS)
void cancelAllTtsRequests() Cancel all TTS requests
void wakeup() Wake up temi
String getWakeupWord() Get the current wake-up word
void askQuestion(String question) temi speaks actively and waits for the user's reply
void finishConversation() Finish a conversation (stop recording for ASR)
void startDefaultNlu(String text, SttLanguage sttLanguage) Trigger the default NLU service
boolean setTtsVoice(TtsVoice ttsVoice) Set TTS voice, speed, and pitch
TtsVoice getTtsVoice() Get TTS voice, speed, and pitch
int setAsrLanguages(List<SttLanguage> languages) Set system ASR languages

Interface Description
TtsListener TTS status listener
WakeupWordListener Wake-up event listener
AsrListener ASR result listener
ConversationViewAttachesListener Conversation view attaches listener
OnConversationStatusChangedListener Listener for status changes of the Conversation view
OnTtsVisualizerWaveFormDataChangedListener Listener for wave form data changes of the TTS audio visualizer
OnTtsVisualizerFftDataChangedListener Listener for FFT data changes of the TTS audio visualizer

Model Description
TtsRequest TTS request instance
TtsVoice TTS voice configuration
Gender TTS voice gender
SttLanguage ASR / STT languages

Methods

speak()

Use this method to make temi speak the text carried by the ttsRequest parameter.

From version 134, if the TtsRequest.id is the same, the request will be queued when there is an ongoing TTS.

  • Parameters

    Parameter Type Description
    ttsRequest TtsRequest A TtsRequest object that carries the text to be spoken.
  • Prototype

    void speak(TtsRequest ttsRequest);
  • Required permissions

    None.

  • Support from

    0.10.36


cancelAllTtsRequests()

Stops the TTS request currently being processed and empties the queue.

  • Prototype

    void cancelAllTtsRequests();
  • Required permissions

    None.

  • Support from


wakeup()

Use this method to trigger temi's wakeup programmatically.

In version 132, this method can take an optional list of languages for temi to listen for. The languages set here temporarily override setAsrLanguages() for the current wake-up session. At most 3 extra languages can be set.

In version 133, there is a new overload, wakeup(SttRequest sttRequest).

  • Parameters

    Parameter Type Description
    languages List<SttLanguage> Languages for STT; pass an empty list to use the system language. Added in 1.132.0; works with version 132.
    Parameter Type Description
    sttRequest SttRequest Contains languages and some experimental parameters. Added in 133.
  • Prototype

    void wakeup(List<SttLanguage> languages);
    
    void wakeup(SttRequest sttRequest);
  • Required permissions

    None.

  • Support from

    0.10.49


getWakeupWord()

Use this method to get temi's current wake-up word.

  • Return

    Type Description
    String Wake-up word
  • Prototype

    String getWakeupWord();
  • Required permissions

    None.

  • Support from

    0.10.49


askQuestion()

Use this method to let temi actively speak to the user and wait for the user to answer.

In version 133, there is an overload that takes a TtsRequest and an SttRequest parameter.

  • Parameters

    Parameter Type Description
    question String The text to be spoken
    Parameter Type Description
    question TtsRequest The text to be spoken; supports all TtsRequest parameters. Added in version 133.
    sttRequest SttRequest Controls the STT session that follows the TTS, similar to wakeup(SttRequest sttRequest). Defaults to null. Added in version 133.
  • Prototype

    void askQuestion(String question);
    
    void askQuestion(TtsRequest question, SttRequest sttRequest);
  • Required permissions

    None.

  • Support from

    0.10.63

  • Recommendation

    A custom dialog flow can be built by combining this method with AsrListener. For details, refer to the sample code.
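
As a rough illustration of such a dialog flow, the sketch below asks a yes/no question, re-asks on an unusable reply, and finishes the conversation once an answer arrives. It is not SDK code: the `Speech` interface is a hypothetical stand-in for the two Robot calls involved (`askQuestion()` and `finishConversation()`), and the question strings are invented. In a real skill, `onAsrResult` would be invoked from your AsrListener callback.

```java
import java.util.Locale;

/** Hypothetical yes/no dialog; all names here are illustrative, not part of the SDK. */
final class YesNoDialogSketch {
    /** Stand-in for the two Robot calls this flow needs. */
    interface Speech {
        void askQuestion(String question);
        void finishConversation();
    }

    private final Speech speech;
    private Boolean answer; // null until the user gives a usable reply

    YesNoDialogSketch(Speech speech) {
        this.speech = speech;
    }

    /** Starts the dialog by asking the question. */
    void start() {
        speech.askQuestion("Do you want to continue?");
    }

    /** Call this from your AsrListener callback with the recognized text. */
    void onAsrResult(String asrResult) {
        String reply = asrResult.trim().toLowerCase(Locale.ROOT);
        if (reply.equals("yes") || reply.equals("no")) {
            answer = reply.equals("yes");
            speech.finishConversation(); // stop recording for ASR
        } else {
            speech.askQuestion("Please answer yes or no."); // ask again
        }
    }

    Boolean answer() {
        return answer;
    }
}
```

Keeping the robot calls behind a small interface makes the flow easy to exercise without a robot; on the device, the interface would simply delegate to the Robot instance.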


finishConversation()

Use this method to finish the conversation (Stop recording for ASR).

  • Prototype

    void finishConversation();
  • Required permissions

    None.

  • Support from

    0.10.63


startDefaultNlu()

Use this method to trigger the system's default natural language understanding (NLU). If you want to trigger system skills such as weather or music from within your skill, you can pass in "What's the weather today" or "Play Music" as the parameter.

From version 1.133.0, you can specify a language so that NLU recognition runs in that target language.

  • Parameters

    Parameter Type Description
    text String Natural language text to be processed
    sttLanguage SttLanguage Language of NLU. Defaults to SttLanguage.SYSTEM. Added in 1.133.0.
  • Prototype

    void startDefaultNlu(String text, SttLanguage sttLanguage);
  • Required permissions

    Selected Kiosk

  • Support from

    0.10.70

  • Note

    This interface can only be called once every 5 seconds.
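
One way to respect this limit on the caller's side is a small throttle that refuses calls made too soon after the last accepted one. The helper below is a hypothetical sketch, not part of the SDK; it assumes the 5 seconds are measured from the previous accepted call, which the note above does not state explicitly. The clock is injected so the logic can be tested without waiting.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper: allows at most one call per interval. */
final class NluThrottle {
    private final long intervalMillis;
    private final LongSupplier clock;
    private boolean hasFired = false;
    private long lastFiredAt = 0L;

    NluThrottle(long intervalMillis, LongSupplier clock) {
        this.intervalMillis = intervalMillis;
        this.clock = clock;
    }

    /** Returns true and records the call if the interval has elapsed. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        if (hasFired && now - lastFiredAt < intervalMillis) {
            return false; // too soon after the last accepted call
        }
        hasFired = true;
        lastFiredAt = now;
        return true;
    }

    public static void main(String[] args) {
        NluThrottle throttle = new NluThrottle(5_000L, System::currentTimeMillis);
        if (throttle.tryAcquire()) {
            // In a real skill, the startDefaultNlu(...) call would go here.
            System.out.println("NLU call allowed");
        }
        System.out.println("Immediate retry allowed: " + throttle.tryAcquire());
    }
}
```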


setTtsService()

Use this method to configure your own TTS service. Once it is configured correctly, temi's TTS requests are handled by this service. If you want to use temi's original TTS service, call the speak() method.

  • Parameters

    Parameter Type Description
    ttsService ITtsService An instance of a class that implements ITtsService. If the passed parameter is null, the TTS service will be unbound.
  • Prototype

    void setTtsService(ITtsService ttsService);
  • Required permissions

    com.robotemi.sdk.metadata.OVERRIDE_TTS declared in the manifest to override original TTS

  • Support from

    0.10.77


publishTtsStatus()

Use this method to publish the TTS status from your TTS service to temi.

  • Parameters

    Parameter Type Description
    ttsRequest TtsRequest Current instance of TtsRequest (must include the status)
  • Prototype

    void publishTtsStatus(TtsRequest ttsRequest);
  • Required permissions

    com.robotemi.sdk.metadata.OVERRIDE_TTS declared in the manifest to override original TTS

  • Support from

    0.10.77


setTtsVoice()

Set TTS voice, speed and pitch. Only available for temi Global version.

  • Parameters

    Parameter Type Description
    ttsVoice TtsVoice TtsVoice configuration
  • Return

    Type Description
    boolean true if set is successful
  • Prototype

    boolean setTtsVoice(TtsVoice ttsVoice);
  • Required permissions

    SETTINGS

  • Support from

    1.129.0


getTtsVoice()

Get TTS voice, speed and pitch. Only available for temi Global version.

  • Return

    Type Description
    TtsVoice current TTS voice configuration
  • Prototype

    TtsVoice getTtsVoice();
  • Support from

    1.129.0


setAsrLanguages()

Changes the ASR languages. The setting persists while this kiosk app is running. At most 3 extra languages can be set.

These languages can be temporarily overridden by wakeup().

  • Return

    Type Description
    int 0 OK, -1 invalid, 403 no permission
  • Prototype

    int setAsrLanguages(List<SttLanguage> languages);
  • Required permissions

    KIOSK

  • Support from

    1.132.0
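
The return codes listed above can be turned into readable messages for logging. The helper below is a hypothetical sketch (the class and method names are not part of the SDK); it only encodes the three values from the return table and treats anything else as unknown.

```java
/** Hypothetical helper that names the setAsrLanguages() return codes. */
final class AsrLanguageResult {
    private AsrLanguageResult() {}

    static String describe(int code) {
        switch (code) {
            case 0:
                return "OK";
            case -1:
                return "invalid";
            case 403:
                return "no permission";
            default:
                return "unknown (" + code + ")"; // not listed in the table
        }
    }

    public static void main(String[] args) {
        // In a real skill, the argument would be the value returned by setAsrLanguages(...).
        System.out.println(describe(403));
    }
}
```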


Interfaces

TtsListener

Set your context to implement this listener and add the override method to get TTS status changes.

Prototype

package com.robotemi.sdk;

interface Robot.TtsListener {}

Abstract methods

  • Parameters

    Parameter Type Description
    ttsRequest TtsRequest TTS request object that holds its text content and status
  • Prototype

    void onTtsStatusChanged(TtsRequest ttsRequest);

Method for adding listener

  • Parameters

    Parameter Type Description
    listener TtsListener An instance of a class that implements this interface
  • Prototype

    void addTtsListener(TtsListener listener);

Method for removing listener

  • Parameters

    Parameter Type Description
    listener TtsListener An instance of a class that implements this interface
  • Prototype

    void removeTtsListener(TtsListener listener);
  • Support from

    0.10.36


WakeupWordListener

Set your context to implement this listener and add the override method to get wake word value when triggered by the user.

Prototype

package com.robotemi.sdk;

interface Robot.WakeupWordListener {}

Abstract methods

  • Parameters

    Parameter Type Description
    wakeupWord String The wakeup word used to trigger
    direction int
    • 0 - temi was triggered from the front
    • 90 - temi was triggered from the left
    • 180 - temi was triggered from the back
    • 270 - temi was triggered from the right
    • 555 - cannot detect wakeup direction
  • Prototype

    void onWakeupWord(String wakeupWord, int direction);
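
The direction values listed above can be mapped to labels inside the callback. The helper below is a hypothetical sketch, not SDK code; it covers only the five documented values and reports anything else as unexpected.

```java
/** Hypothetical helper that names the direction passed to onWakeupWord(). */
final class WakeupDirection {
    private WakeupDirection() {}

    static String describe(int direction) {
        switch (direction) {
            case 0:
                return "front";
            case 90:
                return "left";
            case 180:
                return "back";
            case 270:
                return "right";
            case 555:
                return "undetected"; // temi could not detect the direction
            default:
                return "unexpected value " + direction;
        }
    }

    public static void main(String[] args) {
        // In a real skill, this would be called from onWakeupWord(wakeupWord, direction).
        System.out.println("Triggered from: " + describe(90));
    }
}
```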

Method for adding listener

  • Parameters

    Parameter Type Description
    listener WakeupWordListener An instance of a class that implements this interface
  • Prototype

    void addWakeupWordListener(WakeupWordListener listener);

Method for removing listener

  • Parameters

    Parameter Type Description
    listener WakeupWordListener An instance of a class that implements this interface
  • Prototype

    void removeWakeupWordListener(WakeupWordListener listener);
  • Support from

    0.10.36


AsrListener

Set your context to implement this listener and add the override method to get the ASR result.

From 132 version, both the text and language will be returned.

Prototype

package com.robotemi.sdk;

interface Robot.AsrListener {}

Abstract methods

  • Parameters

    Parameter Type Description
    asrResult String The ASR result
    sttLanguage SttLanguage The language of the ASR result (Added in 132)
  • Prototype

    void onAsrResult(String asrResult, SttLanguage sttLanguage);

Method for adding listener

  • Parameters

    Parameter Type Description
    listener AsrListener An instance of a class that implements this interface
  • Prototype

    void addAsrListener(AsrListener listener);

Method for removing listener

  • Parameters

    Parameter Type Description
    listener AsrListener An instance of a class that implements this interface
  • Prototype

    void removeAsrListener(AsrListener listener);
  • Support from

    0.10.53


ConversationViewAttachesListener

Set your context to implement this listener and add the override method to listen if the conversation view attaches.

Prototype

package com.robotemi.sdk;

interface Robot.ConversationViewAttachesListener {}

Abstract methods

  • Parameters

    Parameter Type Description
    isAttached boolean true means the conversation view attaches, false otherwise
  • Prototype

    void onConversationAttaches(boolean isAttached);

Method for adding listener

  • Parameters

    Parameter Type Description
    listener ConversationViewAttachesListener An instance of a class that implements this interface
  • Prototype

    void addConversationViewAttachesListenerListener(ConversationViewAttachesListener listener);

Method for removing listener

  • Parameters

    Parameter Type Description
    listener ConversationViewAttachesListener An instance of a class that implements this interface
  • Prototype

    void removeConversationViewAttachesListenerListener(ConversationViewAttachesListener listener);
  • Support from

    0.10.36


OnConversationStatusChangedListener

Set your context to implement this interface and override its abstract method to listen to the status and text changes of the Conversation layer.

Note: Only the selected Kiosk App declared in the AndroidManifest.xml file to override the original conversation layer UI can receive the related callback data.

Prototype

package com.robotemi.sdk.listeners;

interface OnConversationStatusChangedListener {}

Static constants

All constants here are only for the status of Conversation layer.

Constant Type Value Description
IDLE int 0 Idle, no user interaction
LISTENING int 1 Listening to the user's voice
THINKING int 2 Doing NLP
SPEAKING int 3 Playing TTS

Abstract methods

  • Parameters

    Parameter Type Description
    status int Status of Conversation layer
    text String Text of Conversation layer
  • Prototype

    void onConversationStatusChanged(int status, String text);
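
To show how a skill might turn this callback into UI state, the sketch below keeps a caption that follows the status. It is illustrative only: the class is hypothetical, the constants are copied from the table above rather than imported from the SDK, and the caption wording (including showing the callback text only while speaking) is an assumption.

```java
/** Hypothetical caption holder driven by conversation status callbacks. */
final class ConversationBanner {
    // Values copied from the constants table above.
    static final int IDLE = 0;
    static final int LISTENING = 1;
    static final int THINKING = 2;
    static final int SPEAKING = 3;

    private String caption = "";

    /** Same shape as onConversationStatusChanged(int status, String text). */
    void onConversationStatusChanged(int status, String text) {
        switch (status) {
            case LISTENING:
                caption = "Listening...";
                break;
            case THINKING:
                caption = "Thinking...";
                break;
            case SPEAKING:
                caption = (text == null) ? "" : text; // show what is being spoken
                break;
            case IDLE:
            default:
                caption = ""; // nothing to show
                break;
        }
    }

    String caption() {
        return caption;
    }
}
```

A real implementation would implement OnConversationStatusChangedListener and update a view instead of a string field.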

Method for adding listener

  • Parameters

    Parameter Type Description
    listener OnConversationStatusChangedListener An instance of a class that implements this interface
  • Prototype

    void addOnConversationStatusChangedListener(OnConversationStatusChangedListener listener);

Method for removing listener

  • Parameters

    Parameter Type Description
    listener OnConversationStatusChangedListener An instance of a class that implements this interface
  • Prototype

    void removeOnConversationStatusChangedListener(OnConversationStatusChangedListener listener);
  • Support from

    0.10.72


OnTtsVisualizerWaveFormDataChangedListener

Set your context to implement this interface and override its abstract method to listen to the wave form data changes of the TTS audio visualizer.

Prototype

package com.robotemi.sdk.listeners;

interface OnTtsVisualizerWaveFormDataChangedListener {}

Abstract methods

  • Parameters

    Parameter Type Description
    waveForm byte[] Wave form data
  • Prototype

    void onTtsVisualizerWaveFormDataChanged(byte[] waveForm);

Method for adding listener

  • Parameters

    Parameter Type Description
    listener OnTtsVisualizerWaveFormDataChangedListener An instance of a class that implements this interface
  • Prototype

    void addOnTtsVisualizerWaveFormDataChangedListener(OnTtsVisualizerWaveFormDataChangedListener listener);

Method for removing listener

  • Parameters

    Parameter Type Description
    listener OnTtsVisualizerWaveFormDataChangedListener An instance of a class that implements this interface
  • Prototype

    void removeOnTtsVisualizerWaveFormDataChangedListener(OnTtsVisualizerWaveFormDataChangedListener listener);
  • Support from

    0.10.72


OnTtsVisualizerFftDataChangedListener

Set your context to implement this interface and override its abstract method to listen to the FFT data changes of the TTS audio visualizer.

Prototype

package com.robotemi.sdk.listeners;

interface OnTtsVisualizerFftDataChangedListener {}

Abstract methods

  • Parameters

    Parameter Type Description
    fft byte[] fft data
  • Prototype

    void onTtsVisualizerFftDataChanged(byte[] fft);

Method for adding listener

  • Parameters

    Parameter Type Description
    listener OnTtsVisualizerFftDataChangedListener An instance of a class that implements this interface
  • Prototype

    void addOnTtsVisualizerFftDataChangedListener(OnTtsVisualizerFftDataChangedListener listener);

Method for removing listener

  • Parameters

    Parameter Type Description
    listener OnTtsVisualizerFftDataChangedListener An instance of a class that implements this interface
  • Prototype

    void removeOnTtsVisualizerFftDataChangedListener(OnTtsVisualizerFftDataChangedListener listener);
  • Support from

    0.10.72


ITtsService

Implement this interface and override the abstract methods, and use method setTtsService() to bind the TTS service.

Prototype

package com.robotemi.sdk.voice;

interface ITtsService {}

Abstract methods

speak()

temi will call this method indirectly to play TTS.

  • Parameters

    Parameter Type Description
    ttsRequest TtsRequest Pending TtsRequest instance
  • Prototype

    void speak(TtsRequest ttsRequest);

cancel()

temi will call this method indirectly to cancel(stop) current TTS.

  • Prototype

    void cancel();

pause()

temi will call this method indirectly to pause current TTS.

  • Prototype

    void pause();

resume()

temi will call this method indirectly to resume current TTS.

  • Prototype

    void resume();
  • Support from

    0.10.77


Models

TtsRequest

Request object passed to temi. It contains all the information temi needs in order to speak, and lets the skill track its request.

Prototype

package com.robotemi.sdk;

class TtsRequest {}

Subclass

  • Status

    • Statuses currently in use

      Status Description
      STARTED Start playing
      COMPLETED Finish playing
      ERROR Errors occurred while playing
      NOT_ALLOWED Play is not allowed
    • Prototype

      enum Status {
          PENDING,
          PROCESSING,
          STARTED,
          COMPLETED,
          ERROR,
          NOT_ALLOWED;
      }
  • Language

    • Currently supported TTS languages

      Language Description
      SYSTEM(0) Follow system
      EN_US(1) English (United States)
      ZH_CN(2) Chinese (Mandarin, Simplified)
      ZH_HK(3) Chinese (Cantonese, Traditional)
      ZH_TW(4) Chinese (Taiwanese Mandarin)
      TH_TH(5) Thai (Thailand)
      HE_IL(6) Hebrew (Israel)
      KO_KR(7) Korean (Korea)
      JA_JP(8) Japanese (Japan)
      ID_ID(10) Indonesian (Indonesia)
      DE_DE(11) German (Germany)
      FR_FR(12) French (France)
      FR_CA(13) French (Canada)
      PT_BR(14) Portuguese (Brazil)
      AR_EG(15) Arabic (Egypt)
      RU_RU(18) Russian (Russia)
      IT_IT(19) Italian (Italy)
      PL_PL(20) Polish (Poland)
      ES_ES(21) Spanish (Spain)
      CA_ES(22) Catalan (Spain) (supported from 130 version)
      HI_IN(23) Hindi (supported from 130 version)
      ET_EE(24) Estonian (supported from 131 version)
      TR_TR(25) Turkish (supported from 131 version)
      EN_IN(26) English (India) (supported from 133 version)
      MS_MY(27) Malay (supported from 134 version)
      VI_VN(28) Vietnamese (supported from 134 version)
      EL_GR(29) Greek (supported from 134 version)
    • Prototype

      enum Language {
          SYSTEM(0),
          EN_US(1),
          ZH_CN(2),
          ZH_HK(3),
          ZH_TW(4),
          TH_TH(5),
          HE_IL(6),
          KO_KR(7),
          JA_JP(8),
          IN_ID(9),
          ID_ID(10),
          DE_DE(11),
          FR_FR(12),
          FR_CA(13),
          PT_BR(14),
          AR_EG(15),
          AR_AE(16),
          AR_XA(17),
          RU_RU(18),
          IT_IT(19),
          PL_PL(20),
          ES_ES(21),
          CA_ES(22),
          HI_IN(23),
          ET_EE(24),
          TR_TR(25),
          EN_IN(26),
          MS_MY(27),
          VI_VN(28),
          EL_GR(29);
      }

Attributes

Attribute Type Description
id UUID Unique ID that identifies each TTS request
speech String The text to be spoken
packageName String Skill package name, so that temi knows who made the request
status Status Status of the request
isShowOnConversationLayer boolean Whether the conversation line is shown when temi speaks the text. Note: Only relevant for 'Hey temi' assistant skills
language int Language
showAnimationOnly boolean true if you want to show a face animation while the speech is ongoing.
This only works if there is an assigned interaction animation in temi Settings;
otherwise it will just display the text on screen without a face animation.
Setting this to true overrides isShowOnConversationLayer when that value is false.
cached boolean true if you want this TTS to be cached. Defaults to false.
If a cache exists, the text is spoken offline.
This is useful for fixed sentences, such as those kept in strings.xml (supported from version 129).

Static methods

Create a TtsRequest object and pass it to speak(TtsRequest ttsRequest) method to play TTS. Only speech is mandatory. The other parameters are optional.

  • Parameters

    Parameter Type Description
    speech String The text to be spoken
    isShowOnConversationLayer boolean Defaults to true
    language Language Defaults to Language.SYSTEM
    showAnimationOnly boolean Defaults to false
    cached boolean Defaults to false
  • Return

    Type Description
    TtsRequest TTS request object created by this method
  • Prototype

    static TtsRequest create(String speech, boolean isShowOnConversationLayer);

TtsVoice

TTS voice configuration.

Prototype

package com.robotemi.sdk.voice.model;

class TtsVoice {}

Attributes

Attribute Type Description
gender Gender Only FEMALE and MALE can be used as the parameter
speed float 0.5 - 2.0, stepping by 0.1, default 1.0
pitch int -10 - 10, stepping by 1, default 0
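
Before building a TtsVoice, a skill may want to normalize user-supplied values into the documented ranges. The helper below is a hypothetical sketch and not part of the SDK; whether the SDK itself rejects or clamps out-of-range values is not stated here, so this only applies the ranges and step sizes from the table above.

```java
/** Hypothetical helper that keeps speed and pitch inside the documented ranges. */
final class TtsVoiceRanges {
    private TtsVoiceRanges() {}

    /** Clamps to 0.5 - 2.0 and rounds to the nearest 0.1 step. */
    static float clampSpeed(float speed) {
        float clamped = Math.max(0.5f, Math.min(2.0f, speed));
        return Math.round(clamped * 10f) / 10f;
    }

    /** Clamps to -10 - 10; pitch already moves in whole steps. */
    static int clampPitch(int pitch) {
        return Math.max(-10, Math.min(10, pitch));
    }

    public static void main(String[] args) {
        System.out.println(clampSpeed(3.0f));
        System.out.println(clampPitch(-25));
    }
}
```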

Gender

TTS voice gender.

Prototype

package com.robotemi.sdk.constants;

enum class Gender {
  FEMALE, MALE, UNKNOWN
}

SttLanguage

  • Currently supported ASR languages

    Language Description
    SYSTEM(0) Follow system
    EN_US(1) English (United States)
    ZH_CN(2) Chinese (Mandarin, Simplified)
    JA_JP(3) Japanese (Japan)
    KO_KR(4) Korean (Korea)
    ZH_HK(5) Chinese (Cantonese, Traditional)
    ZH_TW(6) Chinese (Taiwanese Mandarin)
    DE_DE(7) German (Germany)
    TH_TH(8) Thai (Thailand)
    IN_ID(9) Indonesian (Indonesia)
    PT_BR(10) Portuguese (Brazil)
    AR_EG(11) Arabic (Egypt)
    FR_CA(12) French (Canada)
    FR_FR(13) French (France)
    ES_ES(14) Spanish (Spain)
    CA_ES(15) Catalan (Spain)
    IW_IL(16) Hebrew (Israel)
    IT_IT(17) Italian (Italy)
    ET_EE(18) Estonian
    TR_TR(19) Turkish
    HI_IN(20) Hindi, added in 1.133.0 version
    EN_IN(21) English (India), added in 1.133.0 version
    MS_MY(22) Malay, added in 134 version
    VI_VN(23) Vietnamese, added in 134 version
    RU_RU(24) Russian, added in 134 version
    EL_GR(25) Greek, added in 134 version

Prototype

enum SttLanguage {
    SYSTEM(0),
    EN_US(1),
    ZH_CN(2),
    JA_JP(3),
    KO_KR(4),
    ZH_HK(5),
    ZH_TW(6),
    DE_DE(7),
    TH_TH(8),
    IN_ID(9),
    PT_BR(10),
    AR_EG(11),
    FR_CA(12),
    FR_FR(13),
    ES_ES(14),
    CA_ES(15),
    IW_IL(16),
    IT_IT(17),
    ET_EE(18),
    TR_TR(19),
    HI_IN(20),
    EN_IN(21),
    MS_MY(22),
    VI_VN(23),
    RU_RU(24),
    EL_GR(25);
}

Override original voice flow

Override the NLP

Steps

  • Add the following <meta-data>s under the <application> element to AndroidManifest.xml file:

    <!-- Kiosk mode is required -->
    <meta-data android:name="@string/metadata_kiosk" android:value="true" />
    
    <meta-data android:name="@string/metadata_override_nlu" android:value="true" />
  • Listen to the ASR and access your own NLP service in its callback method.

  • Operate in Launcher: Settings > Kiosk > Select your skill. Or request to be the selected Kiosk App by method Robot.getInstance().requestToBeKioskApp().

Required permissions

Selected Kiosk

Support from

0.10.63


Override the ASR

Steps

  • Add the following <meta-data>s under the <application> element to AndroidManifest.xml file:

    <!-- Kiosk mode is required -->
    <meta-data android:name="@string/metadata_kiosk" android:value="true" />
    
    <meta-data android:name="@string/metadata_override_stt" android:value="true" />
  • Listen to the wake-up event and access your own ASR service in its callback method.

  • Operate in Launcher: Settings > Kiosk > Select your skill. Or request to be the selected Kiosk App by method Robot.getInstance().requestToBeKioskApp().

Required permissions

Selected Kiosk

Support from

0.10.70


Override the Conversation layer

Steps

  • Add the following <meta-data>s under the <application> element to AndroidManifest.xml file:

    <!-- Kiosk mode is required -->
    <meta-data android:name="@string/metadata_kiosk" android:value="true" />
    
    <meta-data android:name="@string/metadata_override_conversation_layer" android:value="true" />
  • Listen to OnConversationStatusChangedListener and draw the UI according to the data (status, text) from its callback method.

  • Operate in Launcher Settings > App > Kiosk > Select your skill. Or request to be the selected Kiosk App by method Robot.getInstance().requestToBeKioskApp().

Required permissions

Selected Kiosk

Support from

0.10.72


Override the TTS

Steps

  • Add the following <meta-data>s under the <application> element to AndroidManifest.xml file:

    <!-- Kiosk mode is required -->
    <meta-data android:name="@string/metadata_kiosk" android:value="true" />
    
    <meta-data android:name="@string/metadata_override_tts" android:value="true" />
  • Implement interface ITtsService and override methods speak(), cancel(), pause(), and resume(). temi will call the corresponding method to fulfill TTS requests.

  • Use method setTtsService() to bind the instance of the class that implemented ITtsService in the previous step.

  • [Very important] Use method publishTtsStatus() to publish the TTS status to temi. temi needs these statuses to drive its own responses (UI, Sequence steps).

  • Operate in Launcher Settings > App > Kiosk > Select your skill. Or request to be the selected Kiosk App by method Robot.getInstance().requestToBeKioskApp().

  • Refer to Sample for details.
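
The status-publishing step can be sketched without the SDK as follows. Everything in this block is a stand-in: `Request` and `Status` mimic TtsRequest and its Status enum, `StatusPublisher` plays the role of the publishTtsStatus() call, `Synthesizer` represents your own TTS engine, and `Service.speak()` corresponds to the speak() method of a class implementing ITtsService. It also assumes playback is synchronous, which a real engine usually is not; there, COMPLETED would be published from the engine's completion callback.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, SDK-free sketch of the status flow for a custom TTS service. */
final class CustomTtsSketch {
    /** Stand-in for TtsRequest.Status (subset used here). */
    enum Status { STARTED, COMPLETED, ERROR }

    /** Stand-in for TtsRequest: just the text and a mutable status. */
    static final class Request {
        final String speech;
        Status status;

        Request(String speech) {
            this.speech = speech;
        }
    }

    /** Stand-in for the call to publishTtsStatus(). */
    interface StatusPublisher {
        void publish(Request request);
    }

    /** Stand-in for your own TTS engine. */
    interface Synthesizer {
        void play(String text) throws Exception;
    }

    /** Plays the role of a class implementing ITtsService. */
    static final class Service {
        private final Synthesizer synthesizer;
        private final StatusPublisher publisher;

        Service(Synthesizer synthesizer, StatusPublisher publisher) {
            this.synthesizer = synthesizer;
            this.publisher = publisher;
        }

        void speak(Request request) {
            update(request, Status.STARTED);
            try {
                synthesizer.play(request.speech);
                update(request, Status.COMPLETED);
            } catch (Exception e) {
                update(request, Status.ERROR);
            }
        }

        /** Sets the status on the request, then reports it. */
        private void update(Request request, Status status) {
            request.status = status;
            publisher.publish(request);
        }
    }

    public static void main(String[] args) {
        List<Status> seen = new ArrayList<>();
        Service service = new Service(
                text -> System.out.println("playing: " + text),
                request -> seen.add(request.status));
        service.speak(new Request("Hello"));
        System.out.println(seen); // prints [STARTED, COMPLETED]
    }
}
```

The point of the sketch is the ordering: every request reports STARTED first and then exactly one of COMPLETED or ERROR, with the status set on the request before it is published.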

Required permissions

Selected Kiosk

Support from

0.10.77
