Resolving problem: MQTT client does not connect

Resolve the problem of an MQTT client program failing to connect to the telemetry (MQXR) service.

Before you begin

Is the problem at the server, at the client, or with the connection? Have you written your own MQTT v3 protocol-handling client, or an MQTT client app that uses the C or Java WebSphere® MQTT clients?

Run the verification application supplied with WebSphere MQ Telemetry on the server, and check that the telemetry channel and telemetry (MQXR) service are running correctly. Then transfer the verification application to the client, and run the verification application there.

About this task

There are a number of reasons why an MQTT client might not connect to the telemetry server, or why you might conclude that it has not connected.

Procedure

  1. Consider what inferences can be drawn from the reason code that the telemetry (MQXR) service returned to MqttClient.Connect. What type of connection failure is it?
    Option Description
    REASON_CODE_INVALID_PROTOCOL_VERSION

    Make sure that the socket address corresponds to a telemetry channel, and you have not used the same socket address for another broker.

    REASON_CODE_INVALID_CLIENT_ID

    Check that the client identifier is no longer than 23 bytes, and contains only characters from the range: A-Z, a-z, 0-9, './_%

    REASON_CODE_INVALID_DESTINATION

    Check that the client identifier is not the same as the queue manager name.

    REASON_CODE_SERVER_CONNECT_ERROR

    Check that the telemetry (MQXR) service and the queue manager are running normally. Use netstat to check that the socket address is not allocated to another application.

    If you have written an MQTT client library rather than use one of the libraries provided by IBM® WebSphere MQ Telemetry, look at the CONNACK return code.

    From these three errors you can infer that the client has connected to the telemetry (MQXR) service, but the service has found an error.
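
The client identifier rule and the CONNACK check in this step can be sketched in Java as follows. This is a minimal illustration, not part of the WebSphere MQ Telemetry client libraries: the class name ConnectChecks and its method names are hypothetical, an empty identifier is assumed to be invalid, and the return-code descriptions follow the MQTT v3.1 CONNACK definitions rather than any text returned by the telemetry (MQXR) service.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of any IBM or MQTT client library.
final class ConnectChecks {
    // Limit and extra characters taken from the REASON_CODE_INVALID_CLIENT_ID guidance.
    private static final int MAX_CLIENT_ID_BYTES = 23;
    private static final String EXTRA_CHARS = "'./_%";

    /** Returns true if the identifier passes the length and character checks. */
    static boolean isValidClientId(String id) {
        // Assumption: a null or empty identifier is treated as invalid.
        if (id == null || id.isEmpty()) {
            return false;
        }
        if (id.getBytes(StandardCharsets.UTF_8).length > MAX_CLIENT_ID_BYTES) {
            return false;
        }
        for (int i = 0; i < id.length(); i++) {
            char c = id.charAt(i);
            boolean allowed = (c >= 'A' && c <= 'Z')
                    || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9')
                    || EXTRA_CHARS.indexOf(c) >= 0;
            if (!allowed) {
                return false;
            }
        }
        return true;
    }

    /** Describes a raw 4-byte MQTT v3.1 CONNACK packet. */
    static String describeConnack(byte[] packet) {
        // Byte 0: packet type 2 in the high nibble; byte 1: remaining length 2;
        // byte 2: reserved; byte 3: return code.
        if (packet == null || packet.length != 4
                || (packet[0] & 0xF0) != 0x20 || packet[1] != 0x02) {
            return "Not a CONNACK packet";
        }
        int code = packet[3] & 0xFF;
        switch (code) {
            case 0: return "Connection accepted";
            case 1: return "Connection refused: unacceptable protocol version";
            case 2: return "Connection refused: identifier rejected";
            case 3: return "Connection refused: server unavailable";
            case 4: return "Connection refused: bad user name or password";
            case 5: return "Connection refused: not authorized";
            default: return "Reserved return code " + code;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidClientId("sensor_01/kitchen"));
        System.out.println(isValidClientId("an-identifier-with-hyphens"));
        System.out.println(describeConnack(new byte[] {0x20, 0x02, 0x00, 0x02}));
    }
}
```

Validating the identifier before calling the client library helps separate a malformed identifier from the other causes of a rejected connection.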

  2. Consider what inferences can be drawn from the reason codes that the client produces when the telemetry (MQXR) service does not respond:
    Option Description
    REASON_CODE_CLIENT_EXCEPTION
    REASON_CODE_CLIENT_TIMEOUT

    Look for an FDC file at the server; see Server-side logs. When the telemetry (MQXR) service detects the client has timed out, it writes a first-failure data capture (FDC) file. It writes an FDC file whenever the connection is unexpectedly broken.

    The telemetry (MQXR) service might not have responded to the client before the timeout at the client expired. The WebSphere MQ Telemetry Java client hangs only if the application has set an indefinite timeout. Otherwise, the client throws one of these exceptions when the timeout set for MqttClient.Connect expires with an undiagnosed connection problem.

    Unless you find an FDC file that correlates with the connection failure, you cannot infer that the client tried to connect to the server:

    1. Confirm that the client sent a connection request.

      Check the TCP/IP request with a tool such as tcpmon, available from https://java.net/projects/tcpmon

    2. Does the remote socket address used by the client match the socket address defined for the telemetry channel?

      The default file persistence class in the Java SE MQTT client supplied with IBM WebSphere MQ Telemetry creates a folder with the name: clientIdentifier-tcphostNameport or clientIdentifier-sslhostNameport in the client working directory. The folder name tells you the hostName and port used in the connection attempt; see Client-side log files.

    3. Can you ping the remote server address?
    4. Does netstat on the server show the telemetry channel is running on the port the client is connecting to?
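
As a complement to ping and netstat, the following Java sketch tests whether a TCP connection can be opened to the socket address that the client uses. It is an illustration only: the class name PortProbe and the method canConnect are hypothetical, and the defaults of localhost and port 1883 are assumptions that you replace with the host name and port defined for your telemetry channel. A successful probe shows only that something is listening on the port, not that the listener is a telemetry channel.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper; not part of any IBM or MQTT client library.
final class PortProbe {
    /**
     * Returns true if a TCP connection to host:port is established
     * within timeoutMillis; returns false on refusal, timeout, or
     * an unresolvable host name.
     */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; pass the telemetry channel host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1883;
        boolean reachable = canConnect(host, port, 5000);
        System.out.println(host + ":" + port + (reachable ? " is reachable" : " is not reachable"));
    }
}
```

Run the probe from the client platform. If it fails there but succeeds on the server platform, the problem is more likely to be in the network or a firewall than in the telemetry channel.
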
  3. Check whether the telemetry (MQXR) service found a problem in the client request.

    The telemetry (MQXR) service writes errors it detects into mqxr.log, and the queue manager writes errors into AMQERR01.LOG; see

  4. Attempt to isolate the problem by running another client.
    • Run the MQTT sample application using the same telemetry channel.
    • Run the wmqttSample GUI client to verify the connection. Get wmqttSample by downloading SupportPac IA92.
      Note: Older versions of IA92 do not include the MQTT v3 Java client library.

    Run the sample programs on the server platform to eliminate uncertainties about the network connection, then run the samples on the client platform.

  5. Other things to check:
    1. Are tens of thousands of MQTT clients trying to connect at the same time?

      Telemetry channels have a queue to buffer a backlog of incoming connections. Connections are processed at a rate in excess of 10,000 a second. The size of the backlog buffer is configurable using the telemetry channel wizard in IBM WebSphere MQ Explorer. Its default size is 4096. Check that the backlog has not been configured to a low value.

    2. Are the telemetry (MQXR) service and queue manager still running?
    3. Has the client connected to a high availability queue manager that has switched its TCP/IP address?
    4. Is a firewall selectively filtering outbound or return data packets?