IBM Support

How to recognize and correct ccmdb check errors in Rational Synergy with Informix On-Line 7.x, 9.x, 10.x and 11.x database server

Question & Answer


Question

How do you recognize and correct ccmdb check errors in IBM Rational Synergy with IBM Informix On-Line 7.x, 9.x, 10.x and 11.x database server?

Cause

The IBM Rational Synergy ccmdb check command performs a database consistency check on the IBM Informix database server which hosts the Rational Synergy database. It should be run nightly, either separately or as part of the ccmdb backup, and its output should be reviewed for errors.

This technote lists some of the more common errors reported by ccmdb check and suggests how they may be corrected. You are not expected to fix database errors on your own. Use this document on the advice of Rational Client Support (RCS), to make their instructions clearer should action need to be taken.

 

If a reported error is not covered by this bulletin, or multiple types of errors occur, or you are at all unsure of any of the steps in this bulletin, you should send the output from ccmdb check along with any tmp output files named in the ccmdb check output to RCS.

To create and update your PMRs, please use the Service Request (SR) web application.

See the following on how to send attachments to RCS: Exchanging information with IBM Technical Support

Answer

DBACCESS: During problem analysis and resolution you will have to run SQL commands using the Informix utility dbaccess .

See the following TechNotes for details on how to run dbaccess for your specific environment:

How to run Informix SQL commands for a Rational Synergy database in a Microsoft Windows environment


How to run Informix SQL commands for your Rational Synergy database on a UNIX/Linux environment

ONCHECK: This deals with the Informix oncheck utility, which is used to get more details about a particular database error. See Section 1 below for details.

 

DATABASE CHECK ERROR MESSAGES: This covers some sample database error messages, and contains information on how to resolve these specific types of error. You should try to match the ccmdb check error message reported at your site to one of the headings in section 2.

 

Even where it appears clear that the instructions match an error that you can see affects your database, do NOT attempt to run the Informix utilities on your own. Please first check with RCS before attempting to modify the database at the Informix level.

In all situations requiring repair of a database, you should first confirm you have a valid backup of the databases on a database server. A valid backup is one made via ccmdb pack or ccmdb backup (per database), or ccmdb archive (per database server).

 

Here is a list of the database check error messages we discuss:

 

  • a) cannot obtain lock ccm_root.attrib

    b) WARNING:No syscolauth records found

     

    c) ERROR: Bad Page bd00, page not found

     

    d) Multiple bindings found for bsite 2/csrc/alpha.c (33268) in assembly 1/project/projname/projver (11033)

     

    e) bad value for is_attr_of (15252) in record:
    attrib.id = 45922, name = comment
    Owner CV = Non-existent (is_attr_of = 15252)

    f) bad value for to_cv (58781) in record:
    name = set_memberH
    from_cv = DCM/tset/51/1 (cvid = 59879)
    to_cv = Non-existent (cvid = 58781)

     

    g) unreachable bindings found in assembly 1/project/projname/projver (46637)

     

    h) Multiple heads found in the bsite linked list

     

    i) bad value for has_next_bs (38681) in record

     

    j) Parent CV 1/dir/xxx/1 (25418) not bound in assembly 1/project/yyy/1 (55819)

     

    k) textval larger than 2000000 bytes in record

     

    l) bad value for has_asm (123456) in record

     

    m) bad value for from_cv (139739) in record:
    name = associated_cv
    from_cv = Non-existent (cvid = 139739)
    to_cv = 1/project/projname/projver (cvid = 139743)



Your error messages may vary somewhat from those shown above. Object names or page numbers, some of the attribute names, and the 5 or 6-digit CVID values may be different. If you are unsure whether your error matches the error messages shown above, you should contact RCS.

1. How to run oncheck

If the ccmdb check output contains errors and a reference to a tmp output file, such as:

 

  • WARNING: summary of INFORMIX oncheck output has been saved in '/tmp/ccmdb check_inf_10876'


and at end of that referenced file a message such as the one shown below appears:

  • INFORMIX-OnLine must be in Quiescent mode to fix it.
    Please put INFORMIX-OnLine in Quiescent and rerun oncheck.


Then the Informix oncheck program needs to be run. You should not attempt to run oncheck on your own, but should e-mail or fax the output from ccmdb check along with any tmp output files named in the ccmdb check output to RCS.

RCS will provide you with specific instructions about how you should run oncheck.

 

The information in this section of the bulletin will make it simpler for you to run oncheck under the direction of RCS by spelling out many of the basic steps.

 

Before attempting these procedures, you should first confirm that you have a valid backup of the databases on a database server.

Prior to running oncheck, you will need to log onto the database server host as user ccm_root.

 

By default, an Informix database server, <servername>, has the same name as the host machine. Confirm the Informix server's name before proceeding with the following steps. You can confirm the Informix server name by looking at the file <full pathname to the database>/db/informixdb. This file contains a line of this format: //<servername>/<database "leaf" name>.
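The lookup described above can be sketched in sh. This is only an illustration, not a Rational Synergy utility: the function name informix_server_name is hypothetical, and the sketch assumes the informixdb file holds a line of exactly the //<servername>/<database "leaf" name> form. It reads the file contents on standard input and prints the first server name it finds.

```shell
# Hypothetical helper (not part of Rational Synergy or Informix).
# Reads the contents of <full pathname to the database>/db/informixdb on
# standard input and prints the <servername> part of the first line that
# looks like //<servername>/<database "leaf" name>.
informix_server_name() {
    sed -n 's|^//\([^/][^/]*\)/.*$|\1|p' | head -n 1
}

# Example with an assumed server "dbhost1" and database leaf name "mydb":
printf '//dbhost1/mydb\n' | informix_server_name
```

In practice the function would be fed the real file, for example informix_server_name < <full pathname to the database>/db/informixdb, and the printed name compared with the one you expect before INFORMIXSERVER is set.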

 

Set the environment variables: INFORMIXDIR, INFORMIXSERVER, ONCONFIG and PATH:

 

using csh:


  • % setenv INFORMIXDIR $CCM_HOME/informix

    % setenv INFORMIXSERVER <servername>

    % setenv ONCONFIG <servername>

    % setenv PATH $INFORMIXDIR/bin:$PATH


using sh or ksh:

  • % INFORMIXDIR=$CCM_HOME/informix; export INFORMIXDIR

    % INFORMIXSERVER=<servername>; export INFORMIXSERVER

    % ONCONFIG=<servername>; export ONCONFIG

    % PATH=$INFORMIXDIR/bin:$PATH; export PATH



On Windows, you only need to run the following batch file at the Command Prompt to set up the Informix environment variables:

  • %CCM_HOME%\informix\setenv.cmd


Some oncheck command options need to be run with the Informix server in quiescent mode. If you are directed to put the Informix server in quiescent mode, you should run:

  • % ccmsrv quiescent [servername]


If you are directed to put the Informix server back into online mode, you can either go directly to online, or you may prefer to reset the server by first shutting it down, then bringing it online.

  • % ccmsrv offline


will shut down the server. The next command will bring it online.

  • % ccmsrv online [servername]


When you are directed to run oncheck, some oncheck command options require no further arguments, some require the database leaf name only, and some require the specification of both database and table. Make sure the arguments match the option: oncheck will abort, and the server may go offline, if the wrong arguments are provided to certain options.

RCS will direct you to run oncheck or put a server in quiescent, offline, or online mode. Use the information in this section, as needed.

 

Here are some examples of oncheck syntax. This oncheck command checks each of the root dbspace reserved pages. It requires no further arguments than the -cr option:

 

  • % oncheck -cr


This oncheck command checks each of the system catalog tables or, in the second example, each of the catalog tables for a specified database with the leaf name 'stores5'.

  • % oncheck -cc
    % oncheck -cc stores5


This oncheck command checks all non-blob pages from the tblspace for the specified table, including any dbspace blob pages if they exist. Both the database name and table must be specified.

  • % oncheck -cD stores5:catalog
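Because the wrong argument shape can make oncheck abort, a small pre-check can be run before typing the real command. The following is a minimal sh sketch under stated assumptions: the function name oncheck_args_ok is hypothetical, it covers only the three options shown above (-cr, -cc and -cD), and it never runs oncheck itself. Any other option is rejected so that it gets checked with RCS instead.

```shell
# Hypothetical guard (not an Informix or Rational Synergy tool).
# Usage: oncheck_args_ok <option> [target]
# Returns 0 when the target has the shape this bulletin describes for the
# option, and non-zero otherwise. It does NOT execute oncheck.
oncheck_args_ok() {
    opt=$1
    target=${2-}
    case "$opt" in
        -cr)
            # Reserved-page check: no further arguments.
            [ -z "$target" ]
            ;;
        -cc)
            # Catalog check: no argument, or a database leaf name only.
            case "$target" in
                *:*) return 1 ;;
                *)   return 0 ;;
            esac
            ;;
        -cD)
            # Data-page check: both database and table, as database:table.
            case "$target" in
                ?*:?*) return 0 ;;
                *)     return 1 ;;
            esac
            ;;
        *)
            # Options not covered by this sketch are treated as unchecked.
            return 1
            ;;
    esac
}

if oncheck_args_ok -cD stores5:catalog; then
    echo "arguments look consistent with the option"
fi
```

The guard only checks the shape of the arguments. It cannot tell whether the database or table actually exists, so the instructions from RCS still take precedence.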


When an Informix server has been brought back online, you should run ccmdb check to ensure problems have been corrected.

2. DATABASE CHECK ERROR MESSAGES

a) cannot obtain lock ccm_root.attrib

 

  • The cannot obtain lock message may occur for multiple tables. For example, cannot obtain lock ccm_root.attrib
    cannot obtain lock for compver

    This message means that the check program failed to lock a table. This may happen if another process was active in a database at the time the check program attempted to acquire a lock.

    You should ensure there are no active processes in the database when the ccmdb check program is run. Either ask your users to cease activity or use the ccmdb shutdown command to end all sessions in the database. When there are no active processes in the database, rerun ccmdb check and confirm that no errors are reported.



 

b) WARNING:No syscolauth records found

 

  • This warning usually occurs in a group with three other warnings. For example,

    WARNING:No syscolauth records found.
    WARNING:No sysdepend records found.
    WARNING:No syssyntable records found.
    WARNING:No sysviews records found.

     

    These warnings can be ignored.

 

c) ERROR: Bad Page bd00, page not found

 

  • If the specified Bad Page is the last page in the table, then this error has resulted from a bug in the Informix check program. In order to confirm that the specified Bad Page is the last page, as user informix check the $CCM_HOME/informix/log/<servername>.log file for an entry (with a corresponding time stamp) such as the one shown here:

    22:18:54 ptmap: bad pagenum = 48384 -- only 48384 pages
    22:18:55 ptmap failure: userp = 1202780, pid = 28837
    22:18:55 partp = 136ec20, partnum = 0x3000037

     

    Note that the specified Bad Page bd00 is a hexadecimal value equal to decimal 48384, and that bad pagenum 48384 is shown to be the last of 'only' 48384 pages. You must confirm that the "bad pagenum" and the "only XXXXX pages" values match. If those values match, and they are the decimal equivalent of the hexadecimal number specified in the "Bad Page XXXX, page not found" error message, then this error can be ignored. It will go away on its own as normal database activity causes the table to grow past the point that triggers the bug.
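    The comparison described above can be sketched in sh. This is an illustrative helper only: the name bad_page_is_last is hypothetical, and the three values must be copied by hand from the ccmdb check error message and the <servername>.log entry.

```shell
# Hypothetical helper: succeeds only when the hexadecimal Bad Page value,
# the "bad pagenum" value and the "only N pages" value all agree.
# Usage: bad_page_is_last <hex bad page> <bad pagenum> <only N pages>
bad_page_is_last() {
    hex_page=$1
    bad_pagenum=$2
    total_pages=$3
    # Reject anything that is not a plain hexadecimal number.
    case "$hex_page" in
        ''|*[!0-9a-fA-F]*) return 2 ;;
    esac
    # Convert the hexadecimal page number to decimal.
    dec_page=$((0x$hex_page))
    [ "$dec_page" -eq "$bad_pagenum" ] && [ "$bad_pagenum" -eq "$total_pages" ]
}

# Values from the example above: bd00 hexadecimal is 48384 decimal.
if bad_page_is_last bd00 48384 48384; then
    echo "Bad Page is the last page"
fi
```

    If the function fails for your values, the error does not match this case and should be reported to RCS.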

 

d) Multiple bindings found for bsite 2/csrc/alpha.c (33268) in assembly 1/project/projname/projver (11033)

 

  • For example,

    checking all assemblies.....

    Multiple bindings found for bsite 2/csrc/alpha.c (33268) in assembly 1/project/projname/projver (11033):

    has_parent = 2/dir/mydir/8 (20082)
    has_child = 2/csrc/alpha.c/5 (19041)
    has_parent = 2/dir/mydir/8 (20082)
    has_child = 2/csrc/alpha.c/9 (20423)

    Multiple bindings found for bsite 2/csrc/beta.c (33269) in assembly 1/project/projname/projver (11033):

    has_parent = 2/dir/mydir/8 (20082)
    has_child = 2/csrc/beta.c/6 (19042)
    has_parent = 2/dir/mydir/8 (20082)
    has_child = 2/csrc/beta.c/10 (20399)

     

    This error shows a project, projname-projver, in which two versions of the same object (alpha.c and beta.c in this example) are bound at a single binding site. Only one version should be bound.

    When fixing this type of error, it's best to begin by rebuilding the indexes for the bind table. Once the indexes have been rebuilt, run dbaccess to clean up the binding sites (directory entries).

     

    First confirm that you have a valid backup of the databases on a database server before attempting these procedures.

    To rebuild the indexes, log onto the database server host machine as ccm_root. For performance reasons, ensure no users are active in the database specified by <full path to the database>.

     

    % ccmdb repair <full path to the database> -repair_index bind -y

     

    Refer to the DBACCESS section of this bulletin for the relevant TechNote for information on running the dbaccess command to clean up the bind sites. This also has information on running dbaccess in batch mode.

     

    % dbaccess <database "leaf" name> << eoc
    delete from bind where has_asm=11033 and has_bound_bs=33268;
    delete from bind where has_asm=11033 and has_bound_bs=33269;
    eoc

     

    Note that has_asm= is set to the 5-digit CVID of the project, which the error messages show in parentheses after "in assembly".

     

    The has_bound_bs= values are set to the 5-digit CVID of the actual object, which the error messages show in parentheses after "Multiple bindings found for bsite" and the object name.

     

    Every object version in the database has a unique CVID, so this number will naturally increase over time as new object versions are created.

    After running both ccmdb repair and dbaccess, you should now reconfigure the project projname-projver, then rerun ccmdb check to validate that all is well.



 

e) bad value for is_attr_of (15252) in record: attrib.id = 45922, name = comment Owner CV = Non-existent (is_attr_of = 15252)

 

  • For example,

    checking attrib table...
    .........................................................................
    bad value for is_attr_of (15252) in record:
    attrib.id = 45922, name = comment
    Owner CV = Non-existent (is_attr_of = 15252)
    ..............
    bad value for is_attr_of (15343) in record:
    attrib.id = 46709, name = comment
    Owner CV = Non-existent (is_attr_of = 15343)
    ........................................................................

     

    This error shows that the attrib table contains entries that are not associated with any actual object version. These entries are therefore invalid, and should be deleted.

     

    First confirm that you have a valid backup of the databases on a database server before attempting these procedures.

    Refer to the DBACCESS section of this bulletin for the relevant TechNote for information on running the dbaccess command to clean up the invalid attrib entries. This also has information on running dbaccess in batch mode.

     

    % dbaccess <database "leaf" name> << eoc
    delete from attrib where is_attr_of = 15252;
    delete from attrib where is_attr_of = 15343;
    eoc



 

f) bad value for to_cv (58781) in record: name = set_memberH from_cv = DCM/tset/51/1 (cvid = 59879) to_cv = Non-existent (cvid = 58781)

 

  • This error is similar to the previous one, except that it concerns a relationship. A relationship in Rational Synergy has two object versions and a direction, so it goes from one object version to another. Normally, when either object version is deleted, the relationship is deleted too. If this does not happen, the database check flags the leftover relationship as an inconsistency.

    In this example there was a relationship between two object versions (a DCM transfer set and another object version that was an indirect member of that transfer set). One of the objects is no longer in the database, but the relationship is still there. The name of the relationship in this case is set_memberH. Other relationships that might be reported are successor, task_in_rp, associated_cv, baseline_project, fix, etc.

    These errors are very rare, and in each case are quite easy to fix. You need to delete the relate row that has the bad value, matching on from_cv (if that is the bad value) or on to_cv, as in the example above:

    % dbaccess <database_name> << eoc
    delete from relate where to_cv=58781;
    eoc

    You should get output like this:
    Database selected.


    1 row(s) deleted.
    Database closed.








g) unreachable bindings found in assembly 1/project/projname/projver (46637)

  • This error represents a project in which expected object versions are not bound, though the bindings for those versions exist. This error is usually removed by performing a reconfigure for all projects mentioned in error messages such as these. Rerun ccmdb check to confirm the reconfigures have cleared these errors.


 

h) Multiple heads found in the bsite linked list:

 

  • For example,

    checking bsite linked list..........
    Multiple heads found in the bsite linked list:
    CV: 2/dir/acsls/1(id = 11459)
    id has_next_bs
    -------------------------
    11625 -374551176
    11626 -190278782

    Multiple heads found in the bsite linked list:
    CV: 4/dir/graphics/1(id = 11462)
    id has_next_bs
    -------------------------
    11628 11629
    11629 11630
    11630 11631
    11631 11632
    11632 11633
    11633 11634
    11634 11635
    11635 11636
    11636 11637
    11637 11638
    11638 11639
    11639 -1131339541
    11640 11641
    11641 11642
    11642 11643
    11643 11644
    11644 -1296915946

    2 total errors

     

    This error shows that two directories have corrupted bsite linked lists. The bsite linked list is a singly linked list in which each entry should point to the next entry, and the final entry should hold a negative number. Both of the bsite linked lists shown above (for the "acsls" and "graphics" directory versions) are corrupted and need to be fixed.

     

    First confirm that you have a valid backup of the databases on a database server before attempting these procedures.

    Then rebuild the bsite indexes. Log onto the database server host machine as ccm_root. For performance reasons, ensure no users are active in the database specified by <full path to the database>.

     

    % ccmdb repair <full path to the database> -repair_index bsite -y

     

    Refer to the DBACCESS section of this bulletin for the relevant TechNote on running the dbaccess command, including in batch mode, then run dbaccess to clean up the bsite linked lists.

     

    % dbaccess <database "leaf" name> << eoc
    update bsite set has_next_bs = 11626 where id = 11625;
    update bsite set has_next_bs = 11640 where id = 11639;
    eoc

     

    The preceding procedure corrected two bsite linked list entries previously set incorrectly to negative numbers by setting the linked list entries to point to the next entry in the linked list. Note that the format of the command is:

     

    update bsite set has_next_bs = <where the entry points to> where id = <the entry that is updated>;

     

    In the example, the entry for 11625 is corrected to point to 11626 (for the acsls dir object), and the entry for 11639 is corrected to point to 11640 (for the graphics dir object).

    Re-run ccmdb check to ensure the database has been repaired.
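    The way the update statements above were derived can be sketched in sh with awk. This is an illustration only, resting on assumptions taken from the example: the rows of one "id has_next_bs" listing are in list order, and a negative has_next_bs that is followed by another row in the same listing should point at that next row's id. The function name suggest_bsite_fixes is hypothetical. Feed it the rows for one CV at a time and review every suggested statement with RCS before running it.

```shell
# Hypothetical helper: reads the "id has_next_bs" rows of ONE bsite linked
# list on standard input and prints a suggested update for every entry
# whose negative has_next_bs is followed by another row.
suggest_bsite_fixes() {
    awk '
        # Skip anything that is not an "id has_next_bs" data row.
        $1 !~ /^[0-9]+$/ || $2 !~ /^-?[0-9]+$/ { next }
        # A negative link on the previous row, followed by another row,
        # is treated as a broken link that should point at this row.
        seen && prev_next < 0 {
            printf "update bsite set has_next_bs = %s where id = %s;\n", $1, prev_id
        }
        { prev_id = $1; prev_next = $2 + 0; seen = 1 }
    '
}

# Rows for the acsls directory version from the example above.
suggest_bsite_fixes << 'eoc'
id has_next_bs
-------------------------
11625 -374551176
11626 -190278782
eoc
# prints: update bsite set has_next_bs = 11626 where id = 11625;
```

    The last row of a listing is left alone, because a negative value there is the expected list terminator.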




 

i) bad value for has_next_bs (38681) in record

 

  • For example,

    checking bsite table...........
    bad value for has_next_bs (38681) in record:

    bsite.id = 14014, info = 1/project/XY_Compile
    Owner CV = ccm_wa/misc/ccm_user/1 (is_bsite_of = 16331)

    ......................................................................
    checking bsite linked list...
    .............................................................
    Failed to find a tail for the bsite linked list:
    CV: ccm_wa/misc/ccm_user/1(id = 16331)

    id has_next_bs
    -------------------------
    10997 10998
    10998 10999
    10999 13948
    13948 14013
    14013 14014
    14014 38681

     

    This error shows a bsite linked list that is incorrectly terminated. The bsite linked list is a singly linked list in which each entry should point to the next entry, and the final entry should hold a negative number. The list shown above is corrupted and needs to be fixed.

     

    First confirm that you have a valid backup of the databases on a database server before attempting these procedures.

    Then rebuild the bsite indexes. Log onto the database server host machine as ccm_root. For performance reasons, ensure no users are active in the database specified by <full path to the database>.

     

    % ccmdb repair <full path to the database> -repair_index bsite -y

     

    Refer to the DBACCESS section of this bulletin for the relevant TechNote on running the dbaccess command, including in batch mode, then run dbaccess to clean up the bsite linked list.

     

    % dbaccess <database "leaf" name> << eoc
    update bsite set has_next_bs = -64872 where id=14014;
    eoc

     

    This sets the end of the linked list to a negative value, which is what is expected to terminate the linked list.

    Re-run ccmdb check to ensure the database has been repaired.



 

j) Parent CV 1/dir/something/1 (25418) not bound in assembly 1/project/something/1 (55819)


  • These warnings are repeated for each row in the database where they were found, so you only need to deal with each unique warning, no matter how often it appears in the list.
     

    The procedure for fixing this is as follows :

    For each unique message :

    Parent CV 1/dir/something/1 (25418) not bound in assembly 1/project/something/1 (55819)

     

    Repeat the following until all checks run cleanly:

    delete from bind where has_parent=<First CVID> and has_asm=<Second CVID>;

     

    For example:

    delete from bind where has_parent=25418 and has_asm=55819;

     

    You can script all these together into a SQL script file as follows :

    delete from bind where has_parent=25418 and has_asm=55819;
    delete from bind where has_parent=25419 and has_asm=55819;
    delete from bind where has_parent=25664 and has_asm=55819;
    delete from bind where has_parent=28510 and has_asm=55819;

     

    and run the script (e.g. script.sql) as follows (as user ccm_root or informix) after setting up the Informix environment variables as described in the DBACCESS section.

     

    dbaccess <database leaf name> script.sql

     

    Up to and including version 5.1, ccmdb check does not recursively check the database for this particular error. Therefore, depending on the size of the hierarchy affected, you may need to run ccmdb check again after deleting the affected rows to identify further similar errors. You may need to repeat this process several times before all the affected rows are deleted. If in any doubt, please contact RCS.
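    Building the SQL script from the unique warnings can be sketched in sh. This is a minimal illustration, assuming the warnings appear in saved ccmdb check output exactly in the "Parent CV ... (<First CVID>) not bound in assembly ... (<Second CVID>)" form shown above. The function name parent_cv_deletes is hypothetical. It only prints statements and never touches the database, so the result can be reviewed with RCS before it is run.

```shell
# Hypothetical helper: reads saved ccmdb check output on standard input and
# prints one delete statement per unique "Parent CV ... not bound in
# assembly ..." warning. All other lines are ignored.
parent_cv_deletes() {
    sed -n 's/^ *Parent CV .* (\([0-9][0-9]*\)) not bound in assembly .* (\([0-9][0-9]*\)) *$/delete from bind where has_parent=\1 and has_asm=\2;/p' | sort -u
}

# The repeated warning from the example above collapses to one statement.
parent_cv_deletes << 'eoc'
Parent CV 1/dir/something/1 (25418) not bound in assembly 1/project/something/1 (55819)
Parent CV 1/dir/something/1 (25418) not bound in assembly 1/project/something/1 (55819)
eoc
# prints: delete from bind where has_parent=25418 and has_asm=55819;
```

    The output could be redirected to a file such as script.sql and, once reviewed, run with dbaccess as shown above.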



 

k) textval larger than 2000000 bytes in record

 

  • For example:

    textval larger than 2000000 bytes in record:

    attrib.id = 1234566, name= bom
    Owner CV = 1/executable/asdf/as113 (is_attr_of=123456)..

     

    This message is displayed for every text attribute which is larger than 2 MB. It is not serious and can safely be ignored.


l) bad value for has_asm (123456) in record

  • For example:
     bad value for has_asm (123456) in record:                      

       has_asm      = Non-existent (cvid = 123456)  

       has_bound_bs = 1/project/my_project (bsid = 54321)    

       has_child    = 1/project/my_project/1 (cvid = 54322)

       has_parent   = 1/dir/my_project/1 (cvid = 54323)

     

    Refer to the DBACCESS section of this bulletin for the relevant TechNote for information on running the dbaccess command to clean up the bind sites. This also has information on running dbaccess in batch mode.


    1) First, run the select command to see how many entries are affected:
    •  dbaccess <database "leaf" name> << eoc

      select * from bind where has_asm=123456;

      eoc



    2) If the select only returns one row then you may run the following:
    •  dbaccess <database "leaf" name> << eoc

      delete from bind where has_asm=123456;

      eoc


    3) If you get more than one row returned then please contact RCS.



m) bad value for from_cv (139739) in record: name = associated_cv from_cv = Non-existent (cvid = 139739) to_cv = 1/project/projname/projver (cvid = 139743)

  • This error is very similar to error (f) above. These errors are very rare, and in each case are quite easy to fix. You need to delete the relate row that has the bad from_cv value:


    % dbaccess <database_name> << eoc


    delete from relate where from_cv=139739;
    eoc



    You should get output like this:
    Database selected.
    1 row(s) deleted.
    Database closed.
 

As in all these cases, if you are in any doubt, please contact Rational Client Support.



Summary

These steps should only be used under the direction of Rational Client Support. In all cases, you should have made a backup of your database before running any commands at the Informix database level.

After correcting the database, you may want to use the command ccm prop @=<cvid> to identify any objects reported in the error messages. With that information you can run ccm finduse to get more details on the object in question, which may help you see whether there are any other issues with the objects. The prop command may not return anything if the particular object was deleted.


Document Information

Modified date:
22 December 2020

UID

swg21325223