copyright: years: 2019, 2023 lastupdated: "2023-04-18"
Monitoring
Monitoring and alerting via log messages
On every call, IBM® Voice Gateway logs information such as the end status of the call, reasons for failure, and other details. You can monitor these logs and build alerts on the log messages to proactively check the state of the application. We typically recommend monitoring call failures and call quality, as described below.
Call failures
You can determine the number of calls that failed via the CWSGW0158E log message code and the number of calls received via the CWSGW0003I log message code. Using a log aggregation system, such as Splunk or LogDNA, you can build a query that counts the instances of CWSGW0158E and divides by the number of CWSGW0003I instances to determine the percentage of calls that failed within a time frame, and then create an alert on that percentage. A recommended alert threshold is 5% or more of calls failing within 15 minutes.
Call quality
You can determine the number of calls that had call quality issues via the CWSMR0134W log message code and the number of calls received via the CWSGW0003I log message code. Using a log aggregation system, such as Splunk or LogDNA, you can build a query that counts the instances of CWSMR0134W and divides by the number of CWSGW0003I instances to determine the percentage of calls that experienced a call quality issue within a time frame, and then create an alert on that percentage. A recommended alert threshold is 5% or more of calls affected within 15 minutes.
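The ratio described in the two sections above can be sketched in Python. This is a minimal illustration, not a replacement for a query in your log aggregation system: it assumes you already have the log lines for one 15-minute window as plain strings, and it only checks whether a message code appears in each line. The function names and the `threshold_percent` parameter are hypothetical.

```python
from collections import Counter

# Message codes taken from the sections above.
CALL_RECEIVED = "CWSGW0003I"
CALL_FAILED = "CWSGW0158E"
CALL_QUALITY_WARNING = "CWSMR0134W"


def count_codes(log_lines):
    """Count how many lines mention each of the three message codes."""
    counts = Counter()
    for line in log_lines:
        for code in (CALL_RECEIVED, CALL_FAILED, CALL_QUALITY_WARNING):
            if code in line:
                counts[code] += 1
    return counts


def rate_percent(counts, code):
    """Percentage of received calls that logged `code` (0.0 if no calls)."""
    received = counts[CALL_RECEIVED]
    if received == 0:
        return 0.0
    return 100.0 * counts[code] / received


def should_alert(counts, code, threshold_percent=5.0):
    """True when the rate for `code` reaches the alert threshold."""
    return rate_percent(counts, code) >= threshold_percent
```

Calling `should_alert(counts, CALL_FAILED)` and `should_alert(counts, CALL_QUALITY_WARNING)` on the counts for the same window covers both recommended alerts.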
For a full reference of system messages, see the Voice Gateway System Messages page.
Monitoring via Prometheus metrics
The Voice Gateway monitoring feature provides a REST API that administrators can use to retrieve metrics.
- Formats
- REST endpoint
- Connecting to data stores and monitoring tools
- Prometheus format details
- JSON format details
- Metrics
- Configuration
Formats
The metrics endpoint provides two output formats. The format that is used for each response depends on the HTTP accept header of the corresponding request.
Prometheus text format
- A representation of the metrics that is compatible with the Prometheus monitoring tool. This format is returned for requests that have a text/plain accept header.
JSON format
- A JSON representation of the metrics. This format is returned for requests that have an application/json accept header.
REST endpoint
The following table describes the monitoring endpoint that provides the metrics.
Endpoints | Request type | Supported formats | Description |
---|---|---|---|
/metrics/application | GET | JSON, Prometheus | Returns Voice Gateway metrics. |
Connecting to data stores and monitoring tools
You can connect the Voice Gateway metrics to tools and stacks that can analyze and monitor the metric information. By default, the /metrics/application endpoint returns data in a format that is compatible with Prometheus. To connect the Voice Gateway server to Prometheus, configure Prometheus to use the http://host:http_port/metrics/application or https://host:https_port/metrics/application endpoint. The JSON format can be used by other metrics collection tools that understand JSON.
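For a custom collector, the request can be sketched with the Python standard library. The sketch below only builds the request, choosing the accept header for the wanted format. The base URL passed in is a placeholder for your own host and port, and the helper name `build_metrics_request` is hypothetical.

```python
import urllib.request

# Accept header values for the two output formats described under Formats.
ACCEPT_BY_FORMAT = {
    "prometheus": "text/plain",
    "json": "application/json",
}


def build_metrics_request(base_url, fmt="prometheus"):
    """Build a GET request for /metrics/application in the given format.

    base_url is a placeholder such as "http://host:http_port".
    """
    url = base_url.rstrip("/") + "/metrics/application"
    headers = {"Accept": ACCEPT_BY_FORMAT[fmt]}
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the returned object to `urllib.request.urlopen` sends the request. If metrics authentication is enabled in your deployment, the request also needs credentials, which this sketch does not cover.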
Prometheus format details
The Prometheus text format is based on the 0.0.4 exposition format described in the Prometheus documentation. Where available, metadata is provided for each metric. The # HELP line contains a description of the metric. Any tags present in the metadata are provided as Prometheus labels. The metric's unit is appended at the end of the metric name for gauges and histograms.
A gauge is represented by a single value. The following example shows how a gauge named vg_max_conversation_latency, with seconds as the unit of measurement, would be displayed for the 123456789 tenant:
# TYPE application_vg_max_conversation_latency_seconds gauge
# HELP application_vg_max_conversation_latency_seconds Maximum conversation latency per monitoring interval
application_vg_max_conversation_latency_seconds{tenant_id="123456789"} 7.049
The following example shows the generated text format for the vg_max_calls gauge, which has per_second as its unit. Because this metric is not tenant-specific, it has no tenant_id label.
# TYPE application_vg_max_calls_per_second gauge
# HELP application_vg_max_calls_per_second Maximum calls per second per monitoring interval
application_vg_max_calls_per_second 1
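To read this output without a Prometheus server, the sample lines can be parsed with a small script. The following is a simplified sketch that handles only the shapes shown in the examples above: comment lines starting with #, an optional label set in braces, and a numeric value. It assumes there are no trailing timestamps and no escaped quotes inside label values, so it is not a complete parser for the exposition format. The name `parse_samples` is hypothetical.

```python
import re

# metric_name, optional {labels}, then a single value.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return a list of (name, labels, value) tuples from metrics text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the # TYPE / # HELP metadata lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # Not a shape this sketch understands.
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples
```

Each tuple keeps the labels as a dictionary, so tenant-specific samples can be filtered by their tenant_id value.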
JSON format details
The JSON format returns the metrics as a JSON object. Each metric appears as a key-value pair: the key is the metric name, including any labels, and the value is the metric value.
{
"vg_max_tts_latency_seconds{tenant_id=\"123456789\"}": 0.528,
"vg_max_conversation_latency_seconds{tenant_id=\"123456789\"}": 7.049,
"vg_max_stt_latency_seconds{tenant_id=\"123456789\"}": 0,
"vg_max_calls_per_second": 1,
"vg_max_concurrent_calls{tenant_id=\"123456789\"}": 1
}
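Because the labels are embedded in the JSON keys, a consumer usually has to split each key into a metric name and its labels. The sketch below groups the values by tenant, assuming keys always have the shape shown above (a name, optionally followed by labels in braces). The function name `group_by_tenant` is hypothetical, and metrics without a tenant_id label are collected under the key None.

```python
import json
import re

# A key is a metric name, optionally followed by {label="value",...}.
KEY_RE = re.compile(r'^(?P<name>[^{]+)(?:\{(?P<labels>.*)\})?$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def group_by_tenant(json_text):
    """Return {tenant_id or None: {metric_name: value}} from a JSON response."""
    grouped = {}
    for key, value in json.loads(json_text).items():
        match = KEY_RE.match(key)
        if match is None:
            continue  # Key does not look like a metric name.
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        tenant = labels.get("tenant_id")  # None for non-tenant metrics.
        grouped.setdefault(tenant, {})[match.group("name")] = value
    return grouped
```

For the example response above, this yields one entry for tenant 123456789 with four metrics and one entry under None that holds vg_max_calls_per_second.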
Metrics
Key | Value |
---|---|
vg_max_calls | Maximum calls per second per monitoring interval. |
Tenant-specific metrics
Key | Value |
---|---|
vg_max_concurrent_calls | Maximum concurrent calls per monitoring interval. |
vg_max_conversation_latency | Maximum Watson Assistant service latency per monitoring interval, in seconds. |
vg_max_tts_latency | Maximum Text to Speech Service latency per monitoring interval, in seconds. |
vg_max_stt_latency | Maximum Speech to Text Service latency per monitoring interval, in seconds. |
Configuration
Use the following environment variables to configure the monitoring feature.
Key | Value |
---|---|
METRICS_SAMPLING_INTERVAL | See Configuration environment variables for Voice Gateway. |
ENABLE_METRICS_AUTH | See Configuration environment variables for Voice Gateway. |
HTTP_HOST | See Configuration environment variables for Voice Gateway. |
ADMIN_USERNAME | See Configuration environment variables for Voice Gateway. |
ADMIN_PASSWORD | See Configuration environment variables for Voice Gateway. |