
copyright: years: 2018 lastupdated: "2018-09-21"


Configuring MRCPv2 speech synthesizer services

As an alternative to IBM® Text to Speech, you can configure your Voice Gateway deployment to connect with a third-party speech synthesizer service by using an MRCPv2 connection.

Configuring an MRCPv2 vocalizer

  1. Clone or download the sample.voice.gateway repository on GitHub.

  2. Go to the directory where you cloned the sample.voice.gateway repository on your machine, and open the mrcp/ directory, which contains the following files:

    • docker-compose.yml - Basic configuration of Voice Gateway with MRCPv2
    • tenantConfiguration.json - JSON configuration file
  3. Open the unimrcpConfig/unimrcpclient.xml configuration file. In the server-ip field, specify the IP address of the MRCPv2 server. In the ext-ip field, specify the external IP address of the machine where the Media Relay container is running.

  4. In the docker-compose.yml file, mount the unimrcpclient.xml file to the Media Relay container.

  5. In the tenantConfiguration.json file, specify an MRCPv2 provider for speech synthesis by setting the providerType parameter in your tts configuration to mrcpv2. You can include more configuration fields to further customize your deployment.

"tts": { "providerType": "mrcpv2" }

  {:codeblock}

  **Remember**: If you don't specify a `providerType`, Voice Gateway uses the `watson` provider type by default.

### Example: Single provider that uses MRCPv2

```json
{
  "tts": {
    "providerType": "mrcpv2",
    "config": {
      "mrcpv2ProfileID": "MRCP #1",
      "speakHeaders": {
        "Voice-Age": "25",
        "Voice-Gender": "neutral"
      }
    },
    "cacheTimeToLive": 0
  }
}
```


Example: Multiple provider configuration that uses MRCPv2

The following example shows a single provider in the multiple-provider JSON format. Unlike the single-provider configuration, the multiple-provider configuration has providerSelectionPolicy and providers at the top level of the tts object.

{
  "tts": {
    "providerSelectionPolicy": "sequential",
    "providers": [
      {
        "name": "mrcp-synthesizer-primary",
        "providerType": "mrcpv2",
        "config": {
          "mrcpv2ProfileID": "MRCP #1",
          "speakHeaders": {
            "Speech-Language": "en-US",
            "Voice-Gender": "neutral"
          }
        }
      }
    ]
  }
}
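
As a rough illustration of a providers list with more than one entry, the following sketch adds a second MRCPv2 provider. The name mrcp-synthesizer-secondary and the profile ID "MRCP #2" are hypothetical and would presumably need to match a profile that is defined in your unimrcpclient.xml file. How Voice Gateway chooses between the entries depends on providerSelectionPolicy; this sketch only reuses the sequential value from the example above and does not assume any particular failover behavior.

```json
{
  "tts": {
    "providerSelectionPolicy": "sequential",
    "providers": [
      {
        "name": "mrcp-synthesizer-primary",
        "providerType": "mrcpv2",
        "config": {
          "mrcpv2ProfileID": "MRCP #1"
        }
      },
      {
        "name": "mrcp-synthesizer-secondary",
        "providerType": "mrcpv2",
        "config": {
          "mrcpv2ProfileID": "MRCP #2"
        }
      }
    ]
  }
}
```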

Text to Speech configuration parameters

The top-level Voice Gateway text-to-speech configuration uses the same parameters whether you configure Text to Speech or an MRCPv2 speech synthesizer.

Table 1. Parameters that can be used for both MRCPv2 speech synthesizer services and Text to Speech services

| Parameter | Value | Description |
|-----------|-------|-------------|
| `providerType` | string | Defines the type of the speech provider: mrcpv2 or watson. Defaults to watson. |
| `credentials` | Credentials | Required if you use Text to Speech, including in a mixed-provider configuration. Not required for MRCPv2 synthesizer services. |
| `config` | WatsonTextToSpeechConfig/MrcpSynthesizerConfig | Required. Defines the configuration for the specified text to speech provider. |
| `connectionTimeout` | float | Optional. Time in seconds that Voice Gateway waits to establish a socket connection with the Text to Speech or MRCPv2 synthesizer service. If the time is exceeded, Voice Gateway reattempts to connect with the service. If the service still can't be reached, the call fails. Version 1.0.0.5 and later. |
| `requestTimeout` | float | Optional. Time in seconds that Voice Gateway waits to establish a speech synthesis session with the Text to Speech or MRCPv2 synthesizer service. If the time is exceeded, Voice Gateway reattempts to connect with the service. If the service still can't be reached, the call fails. Version 1.0.0.5 and later. |
| `jitterBufferDelay` | | The amount of time in milliseconds to buffer before playing back audio from the service. This buffer accounts for any jitter in the streaming audio. |
| `cacheTimeToLive` | | The time in hours to cache responses from the synthesizer service to improve playback response time. When enabled, all responses are cached unless they are excluded in the Watson Assistant dialog. |
| `providers[]` | string | Optional. A list of speech providers. |
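
The following sketch shows where these top-level parameters might sit in a single-provider MRCPv2 configuration, next to providerType and config, in the same way that cacheTimeToLive appears in the single-provider example. All numeric values are illustrative assumptions rather than recommended or default settings: 5 seconds for each timeout, a 60-millisecond jitter buffer, and a 24-hour cache.

```json
{
  "tts": {
    "providerType": "mrcpv2",
    "config": {
      "mrcpv2ProfileID": "MRCP #1"
    },
    "connectionTimeout": 5,
    "requestTimeout": 5,
    "jitterBufferDelay": 60,
    "cacheTimeToLive": 24
  }
}
```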

MRCPv2 Synthesizer Configuration

The following configuration parameters are specific to the mrcpv2 provider configuration.

Table 2. Configuration parameters specific to MRCPv2 speech synthesizers

| Parameter | Format | Description |
|-----------|--------|-------------|
| `mrcpv2ProfileID` | string | Profile ID for the MRCPv2 client. The original configuration document in XML can be found in the MRCPv2 Client configuration manual. |
| `speakHeaders` | JSON object | Collection of name/value pairs that are used as headers for the SPEAK request. |
| `speakBody` | JSON object | Specifies the content body of the SPEAK request. It has two fields: `contentType`, the content type of the request body (for example, text/uri or application/ssml+xml), and `body`, the body sent in the request. Text utterances that come from Watson Assistant are wrapped in an XML document. Using this field overrides the body of the SPEAK request that is sent to the MRCPv2 synthesizer. |
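
As a minimal sketch of the speakBody parameter, the following configuration sends a fixed SSML document as the body of the SPEAK request. Placing speakBody inside config is assumed from its listing alongside mrcpv2ProfileID and speakHeaders, and the SSML text itself is a made-up example. Because this field overrides the body of the SPEAK request, a static body like this one is probably only appropriate where the same prompt should always be played.

```json
{
  "tts": {
    "providerType": "mrcpv2",
    "config": {
      "mrcpv2ProfileID": "MRCP #1",
      "speakBody": {
        "contentType": "application/ssml+xml",
        "body": "<?xml version=\"1.0\"?><speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\" xml:lang=\"en-US\">Thank you for calling.</speak>"
      }
    }
  }
}
```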