ASR configuration

Spokestack is designed to support multiple speech recognition providers so you can decide which is right for your use case. Support varies by mobile platform, however, so this page gathers the information in one place to make the choice as easy as possible for your app.

Supported ASR providers by platform

Provider Android iOS
Android ASR (on-device)
Apple ASR (on-device)
Azure Speech Services
Google Cloud

Configuration

ASR providers require different kinds of configuration, usually API keys but sometimes runtime components. This configuration takes place when you first build a Spokestack SpeechPipeline; the sections below list the configuration each provider needs on each platform, along with some usage notes.

On Android, primitive configuration properties are set via a call to setProperty(propertyName, value) on the speech pipeline’s builder (or on a SpeechConfig object supplied to it); on iOS, they’re set as fields of a SpeechConfiguration object.


Android ASR

Android

No API keys or configuration properties are required, but a Context (android.content.Context) object must be added to the SpeechPipeline’s builder via the setAndroidContext() method. See the javadoc for AndroidSpeechRecognizer for more information.

Device compatibility

Android’s native ASR support is device-dependent. For production apps targeting broad compatibility, we recommend testing for its availability by calling SpeechRecognizer.isRecognitionAvailable(context) and having a fallback option in place in case it returns false.
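As a rough illustration of the fallback recommendation, the sketch below picks a provider at startup. AsrChooser and its Provider enum are hypothetical names, not part of Spokestack, and treating Google Cloud as the fallback is an assumption made only because it is the other Android provider described on this page. On a device, the boolean argument would come from SpeechRecognizer.isRecognitionAvailable(context); it is passed in here so the logic can run without the Android SDK.

```java
/** Hypothetical helper (not part of Spokestack): chooses an ASR provider at startup. */
final class AsrChooser {
    enum Provider { ANDROID_ASR, GOOGLE_CLOUD, NONE }

    /**
     * @param androidAsrAvailable on a device, the result of
     *        android.speech.SpeechRecognizer.isRecognitionAvailable(context)
     * @param googleCredentials   JSON credentials string, or null if none is configured
     */
    static Provider choose(boolean androidAsrAvailable, String googleCredentials) {
        if (androidAsrAvailable) {
            // Prefer the on-device recognizer when the device reports support.
            return Provider.ANDROID_ASR;
        }
        if (googleCredentials != null && !googleCredentials.trim().isEmpty()) {
            // Assumed fallback: a cloud provider that has been configured.
            return Provider.GOOGLE_CLOUD;
        }
        // Nothing usable; the app should disable voice input or inform the user.
        return Provider.NONE;
    }
}
```

An app might call AsrChooser.choose(...) once before building its pipeline and configure the builder according to the result. Note that a true result from the availability check does not guarantee recognition will succeed, as the Moto X entry in the chart below shows.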

This chart lists physical devices on which it has been tested by either the Spokestack team or our community. If you have a device that is not listed, please try it out and submit a PR with your results!

Device API Level ASR working?
Moto X (2nd Gen) 22 *
Lenovo TB-X340F tablet 27
Pixel 1 29
Pixel 3 XL 29
Pixel 3a 29
Pixel 4 29

* ASR fails consistently with a SERVER_ERROR, which seems to indicate that the server used by the device manufacturer to handle these requests is no longer operational.

iOS

N/A


Apple ASR

Android

N/A

iOS

None required! 🎉


Azure Speech Services

Android
iOS

N/A (for now)


Google Cloud

Android
  • google-credentials (string): A JSON-serialized string containing Google account credentials. See Google’s documentation for more information.
  • locale (string): A BCP-47 tag identifying the language that should be used for speech recognition (example: “en-US”). See Google’s documentation for a list of supported codes.
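
The two properties above could be collected and sanity-checked before they reach the pipeline builder. The following is a minimal sketch, assuming a hypothetical helper class named GoogleAsrProperties that is not part of Spokestack; only the property names google-credentials and locale come from this page. It checks that the locale is a well-formed BCP-47 tag using the Java standard library; it does not check that Google supports the language, and it does not inspect the credentials beyond requiring a non-empty string.

```java
import java.util.IllformedLocaleException;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not part of Spokestack): collects Google Cloud ASR properties. */
final class GoogleAsrProperties {
    static Map<String, Object> build(String credentialsJson, String localeTag) {
        if (credentialsJson == null || credentialsJson.trim().isEmpty()) {
            throw new IllegalArgumentException("google-credentials must be a non-empty JSON string");
        }
        if (localeTag == null || localeTag.isEmpty()) {
            throw new IllegalArgumentException("locale must be a BCP-47 tag such as en-US");
        }
        String normalized;
        try {
            // Locale.Builder rejects ill-formed tags and normalizes case ("en-us" becomes "en-US").
            normalized = new Locale.Builder().setLanguageTag(localeTag).build().toLanguageTag();
        } catch (IllformedLocaleException e) {
            throw new IllegalArgumentException("locale is not a well-formed BCP-47 tag: " + localeTag, e);
        }
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("google-credentials", credentialsJson);
        props.put("locale", normalized);
        return props;
    }
}
```

Each entry of the returned map can then be passed to setProperty(propertyName, value) on the speech pipeline’s builder. A common mistake this catches is writing the locale with an underscore (“en_US”), which is a Java-style locale name rather than a BCP-47 tag.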
iOS

N/A (for now)