Setting up the netmon.cf file on a TCP/IP network

In a Db2® pureScale® environment that runs on a TCP/IP over Ethernet network, if you are using a private network, you must manually set up one or more pingable IP addresses in the netmon.cf configuration file. Reliable Scalable Cluster Technology (RSCT) requires the netmon.cf file to monitor the network and determine whether the interfaces are pingable. (On a public network, the Db2 installer updates this file automatically.)

Starting from V11.1.4.4, the procedures documented on this page are no longer required because the adapter port liveliness test has been enhanced and automated. Some restrictions apply; refer to technote #0733765 for details.

Before you begin

The examples in this topic are based on a Db2 pureScale environment that is set up with two CFs and two members.

Procedure

To set up the netmon.cf configuration file:

  1. Stop the domain:
    1. Log in to one of the cluster hosts as root.
    2. Retrieve the cluster manager domain name.
      /home/instname/sqllib/bin/db2cluster -cm -list -domain
    3. Stop the domain.
      /home/instname/sqllib/bin/db2cluster -cm -stop -domain domainname -force 
  2. Set up the configuration file netmon.cf for each host in the cluster:
    1. Log in to the host as root.
    2. Determine which IP address to enter into each member's netmon.cf configuration file.
      • On AIX® operating systems, to check the communication adapter ports and the associated destination IP subnet, run the netstat command on the member host. For example:
        netstat -rn
        Routing tables
        Destination        Gateway           Flags   Refs     Use  If   Exp  Groups
        
        Route Tree for Protocol Family 2 (Internet):
        default            9.26.51.1         UG       21  15309923 en0      -      -
        9.26.51.0          9.26.51.163       UHSb      0         0 en0      -      -   =>
        9.26.51/24         9.26.51.163       U        15  70075017 en0      -      -
        9.26.51.163        127.0.0.1         UGHS     32   1505251 lo0      -      -
        9.26.51.255        9.26.51.163       UHSb      0       945 en0      -      -
        10.1.5.0           10.1.5.13         UHSb      0         0 en1      -      -   =>
        10.1.5/24          10.1.5.13         U       519 3031889427 en1      -      -
        10.1.5.13          127.0.0.1         UGHS      0    347651 lo0      -      -
        10.1.5.255         10.1.5.13         UHSb      0         3 en1      -      -
        127/8              127.0.0.1         U        10    734058 lo0      -      -
        
        Route Tree for Protocol Family 24 (Internet v6):
        ::1%1              ::1%1             UH        2   2463710 lo0      -      -
        The column "If" lists the adapters on the current host. Choose the adapter that corresponds to the target communication adapter port. In this example, "en1" is the target Ethernet private network adapter. The corresponding IP addresses in the first column show the target IP subnet to be used in the next step. In this case, the IP subnet is "10.1.5.0".
      • On Linux® operating systems, to check the communication adapter ports and the associated destination IP subnet, run the route command on the member host. For example:
        /sbin/route | grep -v link-local
        Member 0
        [root@host3]# route | grep -v link-local
        Kernel IP routing table
        Destination     Gateway         Genmask         Flags Metric Ref Use Iface
        192.168.1.0     *               255.255.255.0   U     0      0   0   eth0
        9.26.92.0       *               255.255.254.0   U     0      0   0   eth2
        default         9.26.92.1       0.0.0.0         UG    0      0   0   eth2
        The last column (with column name "Iface") lists the adapters on the current host. Choose the adapter that corresponds to the target communication adapter port. In this example, "eth0" is the target Ethernet private network adapter. The corresponding IP addresses in the first column show the target IP subnet to be used in the next step. In this case, the IP subnet is "192.168.1.0".

      On most hosts, the same adapters are attached to the same subnet, so the /var/ct/cfg/netmon.cf files are identical for all the hosts in the cluster. However, this might not always be the case. For example, AIX configurations on LPARs can have more complex network configurations, and each /var/ct/cfg/netmon.cf file can be different.
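
As a minimal sketch of the Linux lookup described above, the following shell function prints the destination subnet that is routed through a given adapter. The function name subnet_for_iface is hypothetical, and the sketch assumes the Linux route output layout shown in the example (destination in the first column, adapter name in the last column). It does not handle the AIX netstat layout.

```shell
# Hypothetical helper: print the destination subnet(s) routed through one adapter.
# Reads Linux `route`-style output on stdin; $1 is the adapter name (for example eth0).
subnet_for_iface() {
    awk -v ifc="$1" '
        # Keep rows whose last column is the adapter; skip the header and the default route.
        $NF == ifc && $1 != "default" && $1 != "Destination" { print $1 }
    '
}
```

For example, `/sbin/route | grep -v link-local | subnet_for_iface eth0` would print the private subnet for eth0 on a host whose routing table resembles the one in the example.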

    3. Using the IP subnet, identify the IP interface that was created, in the same subnet, on the switch that the current host connects to. In the Linux example, assuming that the IP interface on the switch has the IP address 192.168.1.2, this address is used in the entry that is added to the member configuration file /var/ct/cfg/netmon.cf.
      For example, for Member 0 (host3), the following entry is added:
      !REQD eth0 192.168.1.2
      
      Where:
      • token1 - !REQD is a required keyword
      • token2 - eth0 (or en1) is the Ethernet private network interface name on the local host
      • token3 - 192.168.1.2 is the external pingable IP address that is assigned to the interface created on the switch
      The following is an example of what the full configuration file /var/ct/cfg/netmon.cf looks like for Member 0 (host3):
      !REQD eth2 9.26.92.1
      !REQD eth0 192.168.1.2
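
The entry format above can be produced with a small shell function. This is a sketch only, not part of the Db2 or RSCT tooling: the function name add_netmon_entry is hypothetical, and it takes the target file as a parameter so that it can be tried on a scratch copy before touching /var/ct/cfg/netmon.cf. It appends the entry only if an identical line is not already present.

```shell
# Hypothetical helper: append "!REQD <interface> <ip>" to a netmon.cf-style file
# unless exactly the same line already exists.
# $1 = file path, $2 = local interface name, $3 = pingable IP address on the switch.
add_netmon_entry() {
    netmon_file=$1
    netmon_entry=$(printf '!REQD %s %s' "$2" "$3")
    # grep -x matches the whole line, -F treats the entry as a fixed string, -q is quiet.
    if [ -f "$netmon_file" ] && grep -qxF -- "$netmon_entry" "$netmon_file"; then
        return 0
    fi
    printf '%s\n' "$netmon_entry" >> "$netmon_file"
}
```

For example, `add_netmon_entry /tmp/netmon.cf.test eth0 192.168.1.2` writes the Member 0 entry to a scratch file; running it a second time leaves the file unchanged.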
      
  3. After all the netmon.cf files are updated, the domain must be restarted:
    1. Log in to one of the cluster hosts as root.
    2. Restart the domain.
      /home/instname/sqllib/bin/db2cluster -cm -start -domain domainname
  4. Verify that all adapters are stable by running the lssrc command:
    lssrc -ls cthats
    The output is similar to the following:
    [root@coralm234 ~]# lssrc -ls cthats
    Subsystem         Group            PID     Status
     cthats           cthats           31938   active
    Network Name   Indx Defd  Mbrs  St   Adapter ID      Group ID
    CG1            [ 0] 3     3     S    192.168.1.234   192.168.1.234
    CG1            [ 0] eth0             0x46d837fd      0x46d83801
    HB Interval = 0.800 secs. Sensitivity = 4 missed beats
    Ping Grace Period Interval = 60.000 secs.
    Missed HBs: Total: 0 Current group: 0
    Packets sent    : 560419 ICMP 0 Errors: 0 No mbuf: 0
    Packets received: 537974 ICMP 0 Dropped: 0
    NIM's PID: 31985
    CG2            [ 1] 4     4     S    9.26.93.226     9.26.93.227
    CG2            [ 1] eth2             0x56d837fc      0x56d83802
    HB Interval = 0.800 secs. Sensitivity = 4 missed beats
    Ping Grace Period Interval = 60.000 secs.
    Missed HBs: Total: 0 Current group: 0
    Packets sent    : 515550 ICMP 0 Errors: 0 No mbuf: 0
    Packets received: 615159 ICMP 0 Dropped: 0
    NIM's PID: 31997
      2 locally connected Clients with PIDs:
     rmcd( 32162) hagsd( 32035)
      Dead Man Switch Enabled:
         reset interval = 1 seconds
         trip  interval = 67 seconds
         Watchdog module in use: softdog
      Client Heartbeating Enabled. Period: 6 secs. Timeout: 13 secs.
      Configuration Instance = 1322793087
      Daemon employs no security
      Segments pinned: Text Data Stack.
      Text segment size: 650 KB. Static data segment size: 1475 KB.
      Dynamic data segment size: 2810. Number of outstanding malloc: 1165
      User time 32 sec. System time 26 sec.
      Number of page faults: 0. Process swapped out 0 times.
      Number of nodes up: 4. Number of nodes down: 0.
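
The check in step 4 can be partly scripted. The following is a rough sketch that assumes the output layout shown above and assumes that the value S in the St column indicates a stable network. The function name check_cthats_stable is hypothetical. It reads the lssrc output on standard input, reports any network whose St value is not S, and returns a nonzero status if it finds one or if it finds no network rows at all.

```shell
# Hypothetical helper: flag networks whose St column is not "S".
# Reads `lssrc -ls cthats` output on stdin.
check_cthats_stable() {
    awk '
        /\[ *[0-9]+\]/ {
            # Drop the "[ n]" index so the remaining fields line up regardless of spacing.
            sub(/\[ *[0-9]+\]/, "")
            # Status row after the index is removed: name Defd Mbrs St AdapterID GroupID.
            if (NF == 6 && $2 ~ /^[0-9]+$/) {
                seen++
                if ($4 != "S") { print $1 ": state " $4; bad++ }
            }
        }
        END { if (bad > 0 || seen == 0) exit 1 }
    '
}
```

For example, `lssrc -ls cthats | check_cthats_stable && echo "all networks stable"` prints the message only when every network row reports S.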