Troubleshooting
Problem
db2start completes successfully but with a warning message "libnuma: Warning: /sys not mounted or invalid."
Symptom
$ db2start
libnuma: Warning: /sys not mounted or invalid. Assuming one node: No such file or directory
2017-12-14 14:28:44 0 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.
db2diag.log:
2017-12-19-15.00.09.447117+480 E135107179E487 LEVEL: Warning (OS)
PID : 104183 TID : 140737016641312 PROC : db2star2
INSTANCE: db2inst1 NODE : 000
HOSTNAME: myhost1
FUNCTION: DB2 UDB, oper system services, sqloKADetermineNUMASupport, probe:50
CALLED : OS, -, open
OSERR : -1 "Unknown error 18446744073709551615"
DATA #1 : Hexdump, 65171 of 4028196206 bytes
Object not dumped: Address: 0x0000000000000000 Size: 4028196206 Reason: Address is NULL
Cause
db2start loads libnuma.so.1 and calls numa_node_to_cpus() as follows:
libHandle = dlopen( "/usr/lib64/libnuma.so.1", RTLD_NOW );
funcv1 = (dlvsym(libHandle, "numa_node_to_cpus", "libnuma_1.1"));
sysrc = (*funcv1)( 0, (long unsigned int*)&cpuSetNode, sizeof(cpuSetNode) ) ;
In this case, the call to numa_node_to_cpus() fails.
Environment
Red Hat Linux
Diagnosing The Problem
db2trc shows sqloKADetermineNUMASupport fails:
7047 | | | sqloKADetermineNUMASupport entry
7048 | | | | OSSHLibrary::load entry
7049 | | | | OSSHLibrary::load data [probe 10]
7050 | | | | OSSHLibrary::load exit
7051 | | | | OSSHLibrary::getFuncAddress entry
7052 | | | | OSSHLibrary::getFuncAddress data [probe 10]
7053 | | | | OSSHLibrary::getFuncAddress data [probe 100]
7054 | | | | OSSHLibrary::getFuncAddress exit
7055 | | | | OSSHLibrary::getFuncAddress entry
7056 | | | | OSSHLibrary::getFuncAddress data [probe 10]
7057 | | | | OSSHLibrary::getFuncAddress data [probe 20]
7058 | | | | OSSHLibrary::getFuncAddress data [probe 30]
7059 | | | | OSSHLibrary::getFuncAddress data [probe 100]
7060 | | | | OSSHLibrary::getFuncAddress exit
7061 | | | | pdLogSysRC entry
7062 | | | | | pdIsDiagLevelOk entry
<skipped>
7111 | | | sqloKADetermineNUMASupport SYSTEM ERROR [probe 50]
7112 | | | | | | sqloclose entry
7113 | | | | | | sqloclose exit
7114 | | | | | | sqloSigMask entry
7115 | | | | | | sqloSigMask exit
7116 | | | | | | sqloSigMask entry
7117 | | | | | | sqloSigMask exit
7118 | | | | | pdLogInternal exit
7119 | | | | pdLogSysRC exit
7111 SYSTEM ERROR DB2 UDB oper system services sqloKADetermineNUMASupport fnc (5.3.15.1286.0.50)
pid 104183 tid 140737016641312 cpid -1 node 0 probe 50
Func.Called: open
System Errno: 0
bytes 504
Data1 (PD_TYPE_DIAG_LOG_REC,488) Diagnostic log record:
2017-12-19-15.00.09.447117+480 E135107179E487 LEVEL: Warning (OS)
PID : 104183 TID : 140737016641312 PROC : db2star2
INSTANCE: db2inst1 NODE : 000
HOSTNAME: myhost1
FUNCTION: DB2 UDB, oper system services, sqloKADetermineNUMASupport, probe:50
CALLED : OS, -, open
OSERR : -1 "Unknown error 18446744073709551615"
DATA #1 : Hexdump, 65171 of 4028196206 bytes
Object not dumped: Address: 0x0000000000000000 Size: 4028196206 Reason: Address is NULL
To find out why sqloKADetermineNUMASupport fails, collect a Linux system call trace with strace:
$ cat start.sh
echo 'My process ID = ' $$
read -p 'Enter to run db2start ...' temp
echo 'Run db2start ...'
/home/db2inst1/sqllib/adm/db2start
Session 1:
chmod a+x start.sh
./start.sh
#sample output:
$ ./start.sh
My process ID = 7200
Enter to run db2start ...
#Note the process ID; in this example it is 7200.
Session 2 (as the root user):
strace -o db2stat.strace.out -f -p 7200
Note: replace 7200 with the actual process ID reported in session 1.
Session 1:
Press the Enter key to start db2start.
#As soon as db2start returns
Session 2:
Press Ctrl+C to stop strace.
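Because strace follows every child process, the resulting file can be large. The following is a minimal sketch for narrowing it down, not part of the original procedure. The function name failed_opens is hypothetical, and the pattern assumes the default strace line format, in which a failed call ends with "= -1 ENOENT".

```shell
#!/bin/sh
# Hypothetical helper: list the open()/openat() calls in an strace output
# file that failed with ENOENT. The trace file is the first argument.
failed_opens() {
    grep -E 'open(at)?\(.*\) = -1 ENOENT' "$1"
}

# Example (file name taken from the strace command above):
#   failed_opens db2stat.strace.out | grep '/sys/'
```

Filtering the result on '/sys/' limits the list to sysfs paths, which is where libnuma looks for the NUMA topology.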
The Linux trace shows that the numa_node_to_cpus() call fails at the following open():
68475 open("/sys/devices/system/node/node0/cpumap", O_RDONLY) = -1 ENOENT (No such file or directory)
68475 write(2, "libnuma: Warning: ", 18) = 18
68475 write(2, "/sys not mounted or invalid. Ass"..., 73) = 73
68475 write(2, "\n", 1)
The missing file /sys/devices/system/node/node0/cpumap appears to be the cause of the failure.
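The presence of the file can also be checked directly, without a trace. The following is a minimal sketch under stated assumptions: the function name check_cpumap is hypothetical, and the optional argument for the sysfs root exists only so that the function can be tried against a test directory. On a real system, call it without arguments so that it inspects /sys.

```shell
#!/bin/sh
# Hypothetical helper: for every NUMA node directory under the given sysfs
# root (default: /sys), report whether its cpumap file is readable.
check_cpumap() {
    root="${1:-/sys}"
    found=0
    for node in "$root"/devices/system/node/node[0-9]*; do
        [ -d "$node" ] || continue
        found=1
        if [ -r "$node/cpumap" ]; then
            echo "OK      $node/cpumap"
        else
            echo "MISSING $node/cpumap"
        fi
    done
    # Having no node directories at all is also worth reporting.
    [ "$found" -eq 1 ] || echo "NO NODE DIRECTORIES under $root/devices/system/node"
}

# Example: check_cpumap
```

A MISSING line for node0 matches the ENOENT seen in the trace. The output is useful information to pass on to Linux support.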
Resolving The Problem
The problem is outside of Db2. Contact your Linux support to find out why /sys/devices/system/node/node0/cpumap does not exist. Before doing so, check Red Hat Bugzilla bug 998678:
https://bugzilla.redhat.com/show_bug.cgi?id=998678
Product: Db2 for Linux, UNIX and Windows. Component: Operating System / Hardware - Other OS/Hardware. Platform: Linux. Versions: 10.5, 11.1.
Document Information
Modified date:
07 December 2022
UID
swg22013988