Technical Blog Post
Abstract
SOLARIS: INCORRECT DB2 STACK TRACE WHEN STATIC FUNCTION
Body
Db2 stack traces on Solaris can sometimes be misleading: the function name shown for a frame is not always the name of the function that was actually called. This can happen when a 'static' function is called, because 'static' functions do not have 'global' entries in the symbol table.

For example, suppose a source file contains functions like this:

    int function1() { ... }
    static int function2() { ... }
    int function3() { ... }

If some function calls 'function2()', the db2 stack trace will show 'function1()' instead.

Let's take a quick example with a stack trace from a 'trap' file that was generated on Solaris. The return addresses are not printed in the stack trace, but if we add them (from the rawstack) we get this:

    ...
    0xFFFFFFFF73D6E238 SsqldLongDataLength() + 0x278
    0xFFFFFFFF73D5E060 sqldBuildCSO() + 0x1f0
    0xFFFFFFFF73D5E3CC sqldBuildAndLockCSO() + 0x84
    0xFFFFFFFF73D6019C sqldGetXMLDocument() + 0x1adc
    0xFFFFFFFF73D686B4 sqldRowFetch() + 0x1b4c
    0xFFFFFFFF76BABB78 sqlriFetch() + 0x1e8
    0xFFFFFFFF76D035A4 sqlrita() + 0x684
    0xFFFFFFFF76D68250 sqlrihsjn() + 0xb78
    ...

The function that does not fit here is 'sqldGetXMLDocument()'. That function is never called from 'sqldRowFetch()' and does not call 'sqldBuildAndLockCSO()'. So let's see what happened.

Using the return address shown for 'sqldGetXMLDocument()', we locate the library it belongs to and its offset within that library:

    FFFFFFFF70400000 172032K r-x-- /vbs/engn/lib/libdb2e.so.1

    Offset in library = 0xFFFFFFFF73D6019C - 0xFFFFFFFF70400000 = 0x396019C

Now we dump the symbol table for the library:

    nm -tx -v ~/sqllib/lib64/libdb2e.so.1 > libdb2e.so.1.nm

Walking the global symbols, we find that our offset falls between two 'GLOB' entries, 'sqldGetXMLDocument()' and 'sqldDataFetch()', so the name 'sqldGetXMLDocument()' is picked:

    [63584] |0x000000000395e6c0|0x0000000000000294|FUNC |GLOB |0x3 |11 |sqldGetXMLDocument()
    [5641]  |0x000000000395e970|0x000000000000323c|FUNC |LOCL |0x3 |11 |sqldReturnData()
    [5642]  |0x0000000003961bd8|0x0000000000001410|FUNC |LOCL |0x3 |11 |sqldDirectFetch()
    [5643]  |0x0000000003963000|0x0000000000000580|FUNC |LOCL |0x3 |11 |sqldSamplingFetch()
    [5644]  |0x0000000003963598|0x000000000000050c|FUNC |LOCL |0x3 |11 |sqldFromListFetch()
    [5645]  |0x0000000003963ac0|0x0000000000000d40|FUNC |LOCL |0x3 |11 |sqldRIDlistFetch()
    [63799] |0x0000000003964818|0x0000000000001310|FUNC |GLOB |0x3 |11 |sqldDataFetch()

Because only global symbols are considered, and our offset lies between 0x000000000395e6c0 and 0x0000000003964818, it is wrongly assumed to be part of 'sqldGetXMLDocument()'. In fact 'sqldGetXMLDocument()' is only 0x294 bytes long, and the offset 0x396019C falls inside the local ('LOCL') symbol 'sqldReturnData()', which starts at 0x000000000395e970 and is 0x323c bytes long.

One way to confirm the real name is to use a debugger to look at the assembly instruction at the address listed for the calling frame, 'sqldRowFetch()':

    (dbx) examine 0xFFFFFFFF73D686B4/i
    0xffffffff73d686b4: sqldRowFetch+0x1b4c: call sqldReturnData ! 0xffffffff73d5e970

Here we clearly see the call to 'sqldReturnData()', which is the correct function name.
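The symbol-table lookup described above can also be reproduced offline from the saved 'nm' output. The following is only a minimal sketch, not how db2 itself resolves names: the helper names 'parse_nm' and 'resolve' are hypothetical, the rows are the ones quoted in this post, and the parser assumes the pipe-separated 'nm' layout shown here. It resolves the same offset twice, once against 'GLOB' symbols only and once with 'LOCL' symbols included, and flags a match whose offset exceeds the symbol's size.

```python
# Sketch under stated assumptions: rows are copied from the nm output in this
# post; parse_nm() and resolve() are illustrative helpers, not Db2 or Solaris APIs.

NM_ROWS = """\
[63584] |0x000000000395e6c0|0x0000000000000294|FUNC |GLOB |0x3 |11 |sqldGetXMLDocument()
[5641] |0x000000000395e970|0x000000000000323c|FUNC |LOCL |0x3 |11 |sqldReturnData()
[5642] |0x0000000003961bd8|0x0000000000001410|FUNC |LOCL |0x3 |11 |sqldDirectFetch()
[5643] |0x0000000003963000|0x0000000000000580|FUNC |LOCL |0x3 |11 |sqldSamplingFetch()
[5644] |0x0000000003963598|0x000000000000050c|FUNC |LOCL |0x3 |11 |sqldFromListFetch()
[5645] |0x0000000003963ac0|0x0000000000000d40|FUNC |LOCL |0x3 |11 |sqldRIDlistFetch()
[63799] |0x0000000003964818|0x0000000000001310|FUNC |GLOB |0x3 |11 |sqldDataFetch()
"""

# Values taken from the trap file and the address map quoted in the post.
RETURN_ADDRESS = 0xFFFFFFFF73D6019C
LIBRARY_BASE = 0xFFFFFFFF70400000
OFFSET = RETURN_ADDRESS - LIBRARY_BASE


def parse_nm(text):
    """Return FUNC symbols as (value, size, bind, name), sorted by value."""
    symbols = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Expected columns: index, value, size, type, bind, other, shndx, name.
        if len(fields) != 8 or fields[3] != "FUNC":
            continue
        symbols.append((int(fields[1], 16), int(fields[2], 16), fields[4], fields[7]))
    return sorted(symbols)


def resolve(symbols, offset, globals_only):
    """Pick the nearest symbol at or below offset.

    Returns (name, distance_from_symbol_start, offset_is_within_symbol_size),
    or None if no symbol starts at or below the offset.
    """
    best = None
    for value, size, bind, name in symbols:
        if globals_only and bind != "GLOB":
            continue
        if value <= offset:
            best = (value, size, name)  # symbols are sorted, so the last hit wins
    if best is None:
        return None
    value, size, name = best
    delta = offset - value
    return name, delta, delta < size


SYMBOLS = parse_nm(NM_ROWS)

if __name__ == "__main__":
    for globals_only in (True, False):
        name, delta, inside = resolve(SYMBOLS, OFFSET, globals_only)
        scope = "GLOB only" if globals_only else "GLOB + LOCL"
        verdict = "within symbol size" if inside else "beyond symbol size, name is suspect"
        print(f"{scope}: {name} + {delta:#x} ({verdict})")
```

With the rows from this post, the global-only lookup reports 'sqldGetXMLDocument() + 0x1adc', which is beyond that function's 0x294-byte size, while the lookup that includes local symbols reports 'sqldReturnData() + 0x182c'. Comparing the frame offset with the symbol size is a cheap first check for a mislabeled frame; the debugger step remains the way to confirm the actual call target.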
UID
ibm13286077