Setting up the netmon.cf file on a TCP/IP network
In a Db2® pureScale® environment that runs on a TCP/IP over Ethernet network, if you
are using a private network, you must manually set up one or more pingable IP addresses
in the netmon.cf configuration file.
Reliable Scalable Cluster Technology (RSCT) requires the netmon.cf file to
monitor the network and to determine whether the interfaces are pingable. For a private network, this
file must be set up manually. (On a public network, the Db2 installer updates
this file automatically.)
Starting from
V11.1.4.4, the procedure documented on this page is no longer required because the adapter port liveliness
test has been enhanced and automated. Some restrictions apply; refer to technote#0733765 for
details.
Before you begin
Procedure
To set up the netmon.cf configuration file:
- Stop the domain:
- Log in to one of the cluster hosts as root.
- Retrieve the cluster manager domain name.
/home/instname/sqllib/bin/db2cluster -cm -list -domain
- Stop the domain.
/home/instname/sqllib/bin/db2cluster -cm -stop -domain domainname -force
- Set up the configuration file netmon.cf for
each host in the cluster:
- Log in to the host as root.
- Determine which IP address to enter into each member's netmon.cf configuration
file.
- On AIX® operating systems,
to check the communication adapter ports and the associated destination
IP subnet, run the netstat command on the member
host. For example:
netstat -rn
Routing tables
Destination        Gateway           Flags   Refs        Use  If    Exp  Groups

Route Tree for Protocol Family 2 (Internet):
default            9.26.51.1         UG        21   15309923  en0     -     -
9.26.51.0          9.26.51.163       UHSb       0          0  en0     -     -   =>
9.26.51/24         9.26.51.163       U         15   70075017  en0     -     -
9.26.51.163        127.0.0.1         UGHS      32    1505251  lo0     -     -
9.26.51.255        9.26.51.163       UHSb       0        945  en0     -     -
10.1.5.0           10.1.5.13         UHSb       0          0  en1     -     -   =>
10.1.5/24          10.1.5.13         U        519 3031889427  en1     -     -
10.1.5.13          127.0.0.1         UGHS       0     347651  lo0     -     -
10.1.5.255         10.1.5.13         UHSb       0          3  en1     -     -
127/8              127.0.0.1         U         10     734058  lo0     -     -

Route Tree for Protocol Family 24 (Internet v6):
::1%1              ::1%1             UH         2    2463710  lo0     -     -

The "If" column lists the adapters on the current host. Choose the adapter that corresponds to the target communication adapter port. In this example, "en1" is the target Ethernet private network adapter. The corresponding IP addresses in the first column show the target IP subnet to be used in the next step. In this case, the IP subnet is "10.1.5.0".
- On Linux® operating systems,
to check the communication adapter ports and the associated destination
IP subnet, run the route command on the member
host. For example:
/sbin/route | grep -v link-local
Member 0
[root@host3]# route | grep -v link-local
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     *               255.255.255.0   U     0      0        0 eth0
9.26.92.0       *               255.255.254.0   U     0      0        0 eth2
default         9.26.92.1       0.0.0.0         UG    0      0        0 eth2

The last column (with column name "Iface") lists the adapters on the current host. Choose the adapter that corresponds to the target communication adapter port. In this example, "eth0" is the target Ethernet private network adapter. The corresponding IP addresses in the first column show the target IP subnet to be used in the next step. In this case, the IP subnet is "192.168.1.0".
On most hosts, the same adapters are attached to the same subnet and the /var/ct/cfg/netmon.cf files are identical for all the hosts in the cluster. However, this might not be the case. For example, AIX configurations on LPARs can have more complex network configurations and each /var/ct/cfg/netmon.cf file can be different.
- With the IP subnet, use the IP interface that was created on
the switch that the current host connects to and that is on the same IP subnet.
In the Linux example, assuming that
the IP interface on the switch has the IP address 192.168.1.2, an entry for this address
is added to the member configuration file /var/ct/cfg/netmon.cf.
For example, for Member 0 (host3), the following entry is added:
!REQD eth0 192.168.1.2
Where:
- token1 - !REQD is the required entity
- token2 - eth0 (or en1) is the Ethernet private network interface name on the local host
- token3 - 192.168.1.2 is the external pingable IP address that is assigned to the interface created on the switch.
The following is an example of what the full configuration file /var/ct/cfg/netmon.cf looks like for Member 0 (host3):
!REQD eth2 9.26.92.1
!REQD eth0 192.168.1.2
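The subnet lookup and the netmon.cf entry in the Linux example can be sketched as a few shell helpers. This is a minimal sketch, not part of Db2 or RSCT: the function names subnet_for_iface, netmon_entry, and add_netmon_entry are hypothetical, the parsing assumes the eight-column route output shown in the Linux example, and the address check is only a loose dotted-quad format test.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not Db2 or RSCT commands.

# Print the destination subnet(s) routed through one interface,
# reading Linux `route` output from stdin.
subnet_for_iface() {
    awk -v iface="$1" '
        # Route rows have 8 columns; skip the header and the default route.
        NF == 8 && $NF == iface && $1 != "default" { print $1 }
    '
}

# Print one netmon.cf entry after a loose dotted-quad format check.
netmon_entry() {
    if ! printf '%s\n' "$2" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
        echo "invalid IPv4 address: $2" >&2
        return 1
    fi
    printf '!REQD %s %s\n' "$1" "$2"
}

# Append the entry to the given file unless an identical line already exists.
add_netmon_entry() {
    entry=$(netmon_entry "$2" "$3") || return 1
    grep -qxF -e "$entry" "$1" 2>/dev/null || printf '%s\n' "$entry" >> "$1"
}
```

With these definitions, `/sbin/route | grep -v link-local | subnet_for_iface eth0` would print the subnet for eth0, and `add_netmon_entry /var/ct/cfg/netmon.cf eth0 192.168.1.2`, run as root while the domain is stopped, would add the entry from the example once. Reading the route output from stdin keeps the helper usable on saved output from another host.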
- After all the netmon.cf files are
updated, the domain must be restarted:
- Log in to one of the cluster hosts as root.
- Restart the domain.
/home/instname/sqllib/bin/db2cluster -cm -start -domain domainname
- Verify that all adapters are stable by running the lssrc command:
lssrc -ls cthats
The output is similar to the following:
[root@coralm234 ~]# lssrc -ls cthats
Subsystem         Group            PID     Status
cthats            cthats           31938   active
Network Name   Indx Defd  Mbrs  St   Adapter ID      Group ID
CG1            [ 0]    3     3  S    192.168.1.234   192.168.1.234
CG1            [ 0] eth0             0x46d837fd      0x46d83801
HB Interval = 0.800 secs. Sensitivity = 4 missed beats
Ping Grace Period Interval = 60.000 secs.
Missed HBs: Total: 0 Current group: 0
Packets sent    : 560419 ICMP 0 Errors: 0 No mbuf: 0
Packets received: 537974 ICMP 0 Dropped: 0
NIM's PID: 31985
CG2            [ 1]    4     4  S    9.26.93.226     9.26.93.227
CG2            [ 1] eth2             0x56d837fc      0x56d83802
HB Interval = 0.800 secs. Sensitivity = 4 missed beats
Ping Grace Period Interval = 60.000 secs.
Missed HBs: Total: 0 Current group: 0
Packets sent    : 515550 ICMP 0 Errors: 0 No mbuf: 0
Packets received: 615159 ICMP 0 Dropped: 0
NIM's PID: 31997
2 locally connected Clients with PIDs:
rmcd( 32162) hagsd( 32035)
Dead Man Switch Enabled:
reset interval = 1 seconds
trip interval = 67 seconds
Watchdog module in use: softdog
Client Heartbeating Enabled. Period: 6 secs. Timeout: 13 secs.
Configuration Instance = 1322793087
Daemon employs no security
Segments pinned: Text Data Stack.
Text segment size: 650 KB. Static data segment size: 1475 KB.
Dynamic data segment size: 2810. Number of outstanding malloc: 1165
User time 32 sec. System time 26 sec.
Number of page faults: 0. Process swapped out 0 times.
Number of nodes up: 4. Number of nodes down: 0.
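The stability check on the lssrc output can also be scripted. The following is a rough sketch under two assumptions: that an "S" in the St column marks a stable group, as in the sample output, and that each group summary row has the form "name [index] Defd Mbrs St ...". The function name unstable_groups is hypothetical and is not an RSCT command.

```shell
#!/bin/sh
# Sketch only: unstable_groups is a hypothetical helper, not an RSCT command.
# Reads `lssrc -ls cthats` output on stdin, prints the name of every group
# whose St column is not "S", and returns non-zero if any were found.
unstable_groups() {
    awk '
        # Summary rows look like: CG1 [ 0] 3 3 S <adapter ID> <group ID>
        /\][ \t]+[0-9]+[ \t]+[0-9]+[ \t]+[A-Z][ \t]/ {
            line = $0
            sub(/^.*\][ \t]+/, "", line)   # drop the "CG1 [ 0]" prefix
            split(line, f)                 # f[1]=Defd f[2]=Mbrs f[3]=St
            if (f[3] != "S") { print $1; bad = 1 }
        }
        END { exit bad + 0 }
    '
}
```

For example, `lssrc -ls cthats | unstable_groups` would print nothing and return 0 for the output shown above, because both CG1 and CG2 report "S".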