
copyright: years: 2017, 2023 lastupdated: "2023-01-05"


Configuring speech recognition for Google Cloud Speech API

For Google Cloud Speech API, you can change the default configuration of the RecognitionConfig API. For example, you can toggle profanity filtering, change the language, or add speech context. You need to specify Cloud Speech API configuration only if you want to change the behavior from the service defaults.

To change the default configuration, you can define settings in either of the following ways:

  • As Docker environment variables, directly in the deployment configuration
  • As JSON properties in a separate JSON file

Creating a separate JSON file enables you to define more fields, in particular the speech context. If a field is defined in both places, the value that is specified in the JSON file takes precedence.
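The precedence rule can be pictured as a simple dictionary merge. The following is a conceptual sketch only, not the adapter's actual code: the two dictionaries and their values are hypothetical stand-ins for settings that come from environment variables and from the JSON file.

```python
# Conceptual sketch only: this is not the Speech to Text Adapter's code.
# Hypothetical settings that were defined through environment variables.
env_settings = {"languageCode": "en-US", "profanityFilter": True}

# Hypothetical contents of the separate JSON file. Speech context is an
# example of a field that is defined here.
file_settings = {
    "languageCode": "es-ES",
    "speechContexts": [{"phrases": ["Voice Gateway"]}],
}

# In a dict merge, later keys win, so values from the JSON file override
# values from the environment when a field is defined in both places.
effective = {**env_settings, **file_settings}
print(effective)
```

In this sketch, `languageCode` resolves to the value from the JSON file, while `profanityFilter`, which is defined only in the environment, is kept.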

Configuring Google Cloud Speech API in the deployment configuration

To configure Google Cloud Speech API as part of the Speech to Text Adapter deployment, define the GOOGLE_SPEECH environment variables. For a full list of configuration environment variables, see Speech to Text Adapter environment variables.

Configuring Google Cloud Speech API in a JSON file

  1. Create a recognitionConfig.json file, and define fields from the RecognitionConfig API in JSON format. The stt-adapter folder in the sample.voice.gateway GitHub repository contains a sample recognitionConfig.json file that you can use to get started.

    Important: The fields for the RecognitionConfig API must be specified in camel-case format in the recognitionConfig.json file. For example, for the language_code field, specify languageCode instead.

{ "languageCode": "es-ES" }


  **Note:** The following fields for `RecognitionConfig` in the Cloud Speech API can't be modified because they have fixed values that are used by the Speech to Text Adapter.
  * `encoding`
  * `sample_rate_hertz`

1. In the configuration for the `stt.adapter` container, mount the `recognitionConfig.json` file on a volume and reference the file location in the `GOOGLE_SPEECH_RECOGNITION_CONFIG` environment variable.

  For example, on Docker:
  ```yaml
stt.adapter:
  ...
  environment:
    - GOOGLE_APPLICATION_CREDENTIALS=/stt-adapter/credentials/google-service-account.json
    - GOOGLE_SPEECH_RECOGNITION_CONFIG=/stt-adapter/recognitionConfig.json
  volumes:
    - "/path/to/credentials/google-service-account.json:/stt-adapter/credentials/google-service-account.json"
    - "./recognitionConfig.json:/stt-adapter/recognitionConfig.json"
  ```

What to do next

After you change the configuration, redeploy Voice Gateway with the Speech to Text Adapter for your change to take effect.
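
Before you redeploy, a quick check of the recognitionConfig.json content can catch the two mistakes described earlier: field names that are not in camel case, and attempts to set the fields that the adapter fixes. The following is a minimal sketch, not part of Voice Gateway. The function names `to_camel_case` and `check_recognition_config` are hypothetical, and the check assumes that the fixed fields in camel case are `encoding` and `sampleRateHertz`.

```python
import json

# Fields with fixed values, written in camel case (assumed from the note
# about encoding and sample_rate_hertz).
FIXED_FIELDS = {"encoding", "sampleRateHertz"}


def to_camel_case(name):
    """Convert a snake_case name such as language_code to languageCode."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def check_recognition_config(text):
    """Return a list of problems found in the top-level keys of the JSON text."""
    config = json.loads(text)
    problems = []
    for key in config:
        if "_" in key:
            problems.append(f"{key}: use {to_camel_case(key)} instead")
        if to_camel_case(key) in FIXED_FIELDS:
            problems.append(f"{key}: fixed by the adapter, cannot be modified")
    return problems


# Example usage with inline JSON text; in practice, read the file content.
print(check_recognition_config('{"languageCode": "es-ES"}'))
print(check_recognition_config('{"language_code": "es-ES", "encoding": "LINEAR16"}'))
```

The first call returns an empty list because the configuration is valid. The second call reports one problem for the snake_case name and one for the fixed `encoding` field.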