Technical Blog Post
Abstract
DB2 pureScale: How to manually clean up dangling mount records in the HA registry and the corresponding mount resources
Body
A couple of months ago we walked through a practical example of adding a storage group in pureScale and what to expect at the DB2, TSA, and GPFS levels:
/support/pages/node/1139998
That was the ideal scenario. In practice, a user might see dangling mount records in the HA registry, and dangling mount resources, after a "drop stogroup" command.
This article shows how to clean up such leftovers. The problem was reproduced with the following sequence:
1. Create the GPFS filesystems
$>db2cluster -cfs -create -filesystem gpfs_temp11 -disk /dev/hdisk113,/dev/hdisk114,/dev/hdisk115,/dev/hdisk116,/dev/hdisk117,/dev/hdisk118,/dev/hdisk119,/dev/hdisk120,/dev/hdisk121,/dev/hdisk122 -mount /db2/DUIT/temp11
$>db2cluster -cfs -create -filesystem gpfs_temp12 -disk /dev/hdisk123,/dev/hdisk124,/dev/hdisk125,/dev/hdisk126,/dev/hdisk127,/dev/hdisk128,/dev/hdisk129,/dev/hdisk130,/dev/hdisk131,/dev/hdisk132 -mount /db2/DUIT/temp12
2. Create the storage group as the DB2 instance user
$>db2 "create stogroup <group name> on '/db2/DUIT/temp11', '/db2/DUIT/temp12'"
3. Create a system temporary tablespace
$>db2 "create system temporary tablespace <tbs name> managed by automatic storage using stogroup <storage group name>"
4. Drop the system temporary tablespace
$>db2 "drop tablespace <tbs name>"
4.5 Remove the files and delete the GPFS filesystem
$>rm -rf /db2/DUIT/temp11/*
$>rm -rf /db2/DUIT/temp12/*
$>db2cluster -cfs -delete -filesystem gpfs_temp11
=> Note that the user had not yet dropped the storage group at this point.
5. Drop the storage group
$>db2 "drop stogroup <storage group name>"
At this stage we see dangling mount records in the HA registry (db2hareg -dump) and dangling mount resources (mmlsnsd).
Checking the storage groups with db2pd shows that the corresponding entries appear to have been deleted from each member (member 0 and member 1), as expected.
$>db2pd -d DUIT -storagepaths
Database Member 0 -- Database DUIT -- Active -- Up 4 days 20:06:04 -- Date 2016-12-14-17.39.16.519793
Storage Group Configuration:
Address SGID Default DataTag Name
0x0A00030026227B80 0 Yes 0 IBMSTOGROUP
0x0A00030026227E00 1 No 0 SGDUITDB
0x0A00030026275460 2 No 0 SGDUITHS
0x0A00030026299460 3 No 0 SGDUITTP
0x0A000300262C1460 4 No 0 SGDUITTL
Storage Group Statistics:
Address SGID State Numpaths NumDropPen
0x0A00030026227B80 0 0x00000000 1 0
0x0A00030026227E00 1 0x00000000 4 0
0x0A00030026275460 2 0x00000000 2 0
0x0A00030026299460 3 0x00000000 4 0
0x0A000300262C1460 4 0x00000000 1 0
Storage Group Paths:
Address SGID PathID PathState PathName
0x0A0003002624D000 0 0 InUse /db2/DUIT/datas
0x0A0003002626F000 1 1024 InUse /db2/DUIT/data1
0x0A00030026271000 1 1025 InUse /db2/DUIT/data2
0x0A00030026273000 1 1026 InUse /db2/DUIT/data3
0x0A00030026275000 1 1027 InUse /db2/DUIT/data4
0x0A00030026297000 2 2048 InUse /db2/DUIT/datahst1
0x0A00030026299000 2 2049 InUse /db2/DUIT/datahst2
0x0A000300262BB000 3 3072 InUse /db2/DUIT/temp1
0x0A000300262BD000 3 3073 InUse /db2/DUIT/temp2
0x0A000300262BF000 3 3074 InUse /db2/DUIT/temp3
0x0A000300262C1000 3 3075 InUse /db2/DUIT/temp4
0x0A000300262E3000 4 4096 InUse /db2/DUIT/tool
Database Member 1 -- Database DUIT -- Active -- Up 4 days 20:33:37 -- Date 2016-12-14-17.39.20.571062
Storage Group Configuration:
Address SGID Default DataTag Name
0x0A00030026227B80 0 Yes 0 IBMSTOGROUP
0x0A00030026227E00 1 No 0 SGDUITDB
0x0A00030026275460 2 No 0 SGDUITHS
0x0A00030026299460 3 No 0 SGDUITTP
0x0A000300262C1460 4 No 0 SGDUITTL
Storage Group Statistics:
Address SGID State Numpaths NumDropPen
0x0A00030026227B80 0 0x00000000 1 0
0x0A00030026227E00 1 0x00000000 4 0
0x0A00030026275460 2 0x00000000 2 0
0x0A00030026299460 3 0x00000000 4 0
0x0A000300262C1460 4 0x00000000 1 0
Storage Group Paths:
Address SGID PathID PathState PathName
0x0A000307DD2B3000 0 0 InUse /db2/DUIT/datas
0x0A000307DD239000 1 1024 InUse /db2/DUIT/data1
0x0A000307DD6F5000 1 1025 InUse /db2/DUIT/data2
0x0A000307DD42E000 1 1026 InUse /db2/DUIT/data3
0x0A000307DD6F8000 1 1027 InUse /db2/DUIT/data4
0x0A00030026297000 2 2048 InUse /db2/DUIT/datahst1
0x0A00030026299000 2 2049 InUse /db2/DUIT/datahst2
0x0A000307DCACA000 3 3072 InUse /db2/DUIT/temp1
0x0A000307DC9B6000 3 3073 InUse /db2/DUIT/temp2
0x0A000307DC006000 3 3074 InUse /db2/DUIT/temp3
0x0A000307DE0C7000 3 3075 InUse /db2/DUIT/temp4
0x0A000307DD37E000 4 4096 InUse /db2/DUIT/tool
However, db2hareg -dump still shows those entries (/db2/DUIT/temp11 and /db2/DUIT/temp12):
<db2hareg -dump>
B01000000000000,IN,100,2,1,1
B01000000000000,DN,ilancer21,,ilancer21-en1,ilancer21-en2
B01000000000000,NL,128,ilancer21,0,ilancer21-en1,ilancer21-en2,-,CF
B01000000000000,NL,0,ilancer21,0,ilancer21-en1,ilancer21-en2,-,MEMBER
B01000000000000,MO,/db2/DUIT/inst, ,0,4,0
B01000000000000,RU,32872,32736
B01000000000000,NL,129,ilancer22,0,ilancer22-en1,ilancer22-en2,-,CF
B01000000000000,NL,1,ilancer22,0,ilancer22-en1,ilancer22-en2,-,MEMBER
B01000000000000,DN,ilancer22,,ilancer22-en1,ilancer22-en2
B01000000000000,MO,/db2/DUIT/diag, ,0,1,0
B01000000000000,DB,DUIT,1
B01000000000000,MO,/db2/DUIT/temp12,DUIT,0,1,0
B01000000000000,MO,/db2/DUIT/temp11,DUIT,0,1,0
B01000000000000,MO,/db2/DUIT/tool,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/temp4,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/temp3,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/temp2,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/temp1,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/datahst2,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/datahst1,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/data4,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/data3,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/data2,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/data1,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/datas,DUIT,0,10,0
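To spot leftover records quickly, the MO lines can be filtered out of a saved dump. The sketch below is a hypothetical helper, not part of DB2. It assumes that the third comma-separated field is the mount path and the sixth is the usecount; that reading matches the dump above but is an assumption, since the record layout is not documented here.

```shell
# Hypothetical helper: print "path usecount" for every MO (mount) record
# found in db2hareg -dump output read from stdin.
# Assumption: field 3 is the mount path and field 6 is the usecount.
list_mount_records() {
    awk -F',' '$2 == "MO" { print $3, $6 }'
}

# Example with three lines copied from the dump above.
sample='B01000000000000,DB,DUIT,1
B01000000000000,MO,/db2/DUIT/temp12,DUIT,0,1,0
B01000000000000,MO,/db2/DUIT/temp11,DUIT,0,1,0'

printf '%s\n' "$sample" | list_mount_records
```

On a live system the input would come from the real command, for example `db2hareg -dump | list_mount_records`.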
They also still appear in the mmlsnsd output:
$>mmlsnsd
FILE SYSTEM NAME MOUNT POINT
--------------------------------- -------------------------
db2fs1 /db2/DUIT/inst
gpfs_arch /db2/DUIT/arch
gpfs_data1 /db2/DUIT/data1
gpfs_data2 /db2/DUIT/data2
gpfs_data3 /db2/DUIT/data3
gpfs_data4 /db2/DUIT/data4
gpfs_datahst1 /db2/DUIT/datahst1
gpfs_datahst2 /db2/DUIT/datahst2
gpfs_datas /db2/DUIT/datas
gpfs_diag /db2/DUIT/diag
gpfs_log_act /db2/DUIT/log_act
gpfs_log_mir /db2/DUIT/log_mir
gpfs_temp1 /db2/DUIT/temp1
gpfs_temp11 /db2/DUIT/temp11
gpfs_temp12 /db2/DUIT/temp12
gpfs_temp2 /db2/DUIT/temp2
gpfs_temp3 /db2/DUIT/temp3
gpfs_temp4 /db2/DUIT/temp4
gpfs_tool /db2/DUIT/tool
gpfs_work /db2/DUIT/work
We then tried to take the TSA resource group for temp11 offline:
#>chrg -o offline db2mnt-db2_DUIT_temp11-rg
root@ilancer22:/tmp # lssam | grep -i temp11
Pending offline IBM.ResourceGroup:db2mnt-db2_DUIT_temp11-rg Nominal=Offline
        '- Online IBM.Application:db2mnt-db2_DUIT_temp11-rs
                |- Online IBM.Application:db2mnt-db2_DUIT_temp11-rs:ilancer21
                '- Online IBM.Application:db2mnt-db2_DUIT_temp11-rs:ilancer22
Online IBM.Equivalency:db2mnt-db2_DUIT_temp11-rg_group-equ
=> The resource group stays in "Pending offline".
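A resource group that is stuck in this state can be picked out of the lssam output mechanically. The following is a minimal sketch with a hypothetical helper name. It assumes uncolored output (as produced by lssam -nocolor, which is used later in this post) and that the state text precedes the resource class exactly as in the output above.

```shell
# Hypothetical helper: print the names of resource groups that lssam
# reports as "Pending offline". Reads lssam output from stdin.
pending_offline_groups() {
    awk '/^Pending offline IBM\.ResourceGroup:/ {
        sub(/^Pending offline IBM\.ResourceGroup:/, "")
        print $1
    }'
}

# Example with two lines taken from the lssam output above.
sample='Pending offline IBM.ResourceGroup:db2mnt-db2_DUIT_temp11-rg Nominal=Offline
Online IBM.Equivalency:db2mnt-db2_DUIT_temp11-rg_group-equ'

printf '%s\n' "$sample" | pending_offline_groups
```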
By design, when the "drop stogroup" command is issued, DB2 attempts to delete the mount resources as follows:
1. First, decrement the usecount of the mount record.
2. If the new usecount is 0, delete the mount resource from the TSA resource model and remove the mount record from the HA registry.
In this case, the usecount of both mount records was 2 before the drop (found from the DB2 traces).
As a result, the code decremented the usecount to 1 and exited without taking any further action.
It was later found that, because of the user's incorrect usage of DB2 commands, the resources may not have been cleaned up properly by the first invocation of the "drop stogroup" command.
Subsequent attempts to create the storage group then incremented the usecount of the existing mount records to 2.
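The effect of the usecount on "drop stogroup" can be illustrated with a toy model. This is only a sketch of the behavior described above, not DB2 code, and the function name is hypothetical.

```shell
# Toy model of the usecount handling described above; not DB2 code.
# Prints the new usecount and what would happen to the mount record.
drop_mount_reference() {
    usecount=$1
    new=$((usecount - 1))
    if [ "$new" -eq 0 ]; then
        echo "usecount=$new: delete TSA mount resource and HA registry record"
    else
        echo "usecount=$new: keep mount resource and record"
    fi
}

drop_mount_reference 1   # expected case: the record is cleaned up
drop_mount_reference 2   # case in this article: the record is left behind
```

With a starting usecount of 2, a single drop only brings the count down to 1, which is why the records for temp11 and temp12 stayed in the HA registry.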
The next task is to clean this up.
How do we delete the mount records from the HA registry and delete the mount resources?
1. Delete the mount records from the HA registry
[db2inst1@ilancer21:/unify/IBM/db2inst1] db2hareg -del Mount path=/db2/DUIT/temp11,databasename=DUIT
[db2inst1@ilancer21:/unify/IBM/db2inst1] db2hareg -del Mount path=/db2/DUIT/temp12,databasename=DUIT
2. Check the HA registry
[db2inst1@ilancer21:/unify/IBM/db2inst1] db2hareg -dump; date
B01000000000000,IN,100,2,1,1
B01000000000000,DN,ilancer21,,ilancer21-en1,ilancer21-en2
B01000000000000,NL,128,ilancer21,0,ilancer21-en1,ilancer21-en2,-,CF
B01000000000000,NL,0,ilancer21,0,ilancer21-en1,ilancer21-en2,-,MEMBER
B01000000000000,MO,/db2/DUIT/inst, ,0,4,0
B01000000000000,RU,32872,32736
B01000000000000,NL,129,ilancer22,0,ilancer22-en1,ilancer22-en2,-,CF
B01000000000000,NL,1,ilancer22,0,ilancer22-en1,ilancer22-en2,-,MEMBER
B01000000000000,DN,ilancer22,,ilancer22-en1,ilancer22-en2
B01000000000000,MO,/db2/DUIT/diag, ,0,1,0
B01000000000000,DB,DUIT,1
B01000000000000,MO,/db2/DUIT/tool,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/temp4,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/temp3,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/temp2,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/temp1,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/datahst2,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/datahst1,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/data4,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/data3,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/data2,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/data1,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/datas,DUIT,0,10,0
=> The mount records for both /db2/DUIT/temp11 and /db2/DUIT/temp12 have been deleted
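When several paths are affected, the delete commands can be generated instead of typed by hand. The sketch below is a hypothetical dry-run helper: it only prints commands in the same syntax as the db2hareg -del invocations shown in step 1 and executes nothing, so the output can be reviewed before it is run as the instance owner.

```shell
# Hypothetical dry-run helper: print one "db2hareg -del" command per mount
# path, using the same syntax as the commands in step 1. Nothing is executed.
print_mount_delete_commands() {
    dbname=$1
    shift
    for mnt in "$@"; do
        echo "db2hareg -del Mount path=$mnt,databasename=$dbname"
    done
}

print_mount_delete_commands DUIT /db2/DUIT/temp11 /db2/DUIT/temp12
```

After running the reviewed commands, db2hareg -dump should be checked again to confirm that the records are gone.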
3. Set the management scope
root@ilancer21:/usr/lpp/mmfs/bin # export CT_MANAGEMENT_SCOPE=2
4. Lock the resource group
root@ilancer21:/usr/lpp/mmfs/bin # rgreq -o lock db2mnt-db2_DUIT_temp11-rg
Completed applying request to resource group "db2mnt-db2_DUIT_temp11-rg".
5. Check the relationships
root@ilancer21:/usr/lpp/mmfs/bin # lsrel | grep -i temp11
db2_db2inst1_1-rs_DependsOn_db2mnt-db2_DUIT_temp11-rs-rel IBM.Application:db2_db2inst1_1-rs db2_db2inst1_1-rg
db2_db2inst1_0-rs_DependsOn_db2mnt-db2_DUIT_temp11-rs-rel IBM.Application:db2_db2inst1_0-rs db2_db2inst1_0-rg
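The relationship names that reference a mount resource can be extracted from this output for later inspection. This is a minimal sketch with a hypothetical helper name. It assumes that the relationship name is the first whitespace-separated column, as in the lsrel output above.

```shell
# Hypothetical helper: print the names of relationships whose name mentions
# the given mount resource. Reads lsrel output from stdin.
# Assumption: the relationship name is the first whitespace-separated column.
relations_for_mount() {
    awk -v pat="$1" 'index($1, pat) > 0 { print $1 }'
}

# Example with the two lines from the lsrel output above.
sample='db2_db2inst1_1-rs_DependsOn_db2mnt-db2_DUIT_temp11-rs-rel IBM.Application:db2_db2inst1_1-rs db2_db2inst1_1-rg
db2_db2inst1_0-rs_DependsOn_db2mnt-db2_DUIT_temp11-rs-rel IBM.Application:db2_db2inst1_0-rs db2_db2inst1_0-rg'

printf '%s\n' "$sample" | relations_for_mount db2mnt-db2_DUIT_temp11-rs
```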
6. Remove the relationships (the problem occurred here)
t1n1[root]:/>rmrel db2_db2inst1_0-rs_DependsOn_db2mnt-db2_DUIT_temp11-rs-rel
(rmrsrc-api) 2621-014 Command not allowed - one or more related resource groups are online.
rmrel: 2622-009 An unexpected RMC error occurred.The RMC return code was 1.
rmrel: 2622-229 None of the specified Relationships were found or could not be removed.
=> db2mnt-db2_DUIT_temp11-rs is a dependency of db2_db2inst1_0-rs, and db2_db2inst1_0-rs is the DB2 process (db2sysc).
Deleting db2_db2inst1_0-rs_DependsOn_db2mnt-db2_DUIT_temp11-rs-rel therefore requires DB2 to be down.
In a test, the relationship could be removed after db2stop, but we needed to keep working online, so we had to look for another way.
So we tried using DB2 commands to create the storage group again and then drop it. It worked this time because the mount records had already been deleted from the HA registry.
Creating and dropping the storage group made new entries in the HA registry with the proper usecount values and yielded the expected outcome.
1. Create the storage group again after deleting the HA registry entries
[db2inst1@ilancer21:/unify/IBM/db2inst1] db2 "create stogroup test_group on '/db2/DUIT/temp11','/db2/DUIT/temp12'"
DB20000I The SQL command completed successfully.
=> The storage group must use both "/db2/DUIT/temp11" and "/db2/DUIT/temp12" so that both filesystems can be deleted later.
2. Drop the storage group
[db2inst1@ilancer21:/unify/IBM/db2inst1] db2 "drop stogroup test_group"
DB20000I The SQL command completed successfully.
3. Check DB2, TSA, and the HA registry
<DB2 Check>
[db2inst1@ilancer21:/unify/IBM/db2inst1] db2pd -d DUIT -storagepaths
Storage Group Configuration:
Address SGID Default DataTag Name
0x0A0003002622DBA0 0 Yes 0 IBMSTOGROUP
0x0A0003002622DE20 1 No 0 SGDUITDB
0x0A0003002627B460 2 No 0 SGDUITHS
0x0A0003002629F460 3 No 0 SGDUITTP
0x0A000300262C7460 4 No 0 SGDUITTL
Storage Group Statistics:
Address SGID State Numpaths NumDropPen
0x0A0003002622DBA0 0 0x00000000 1 0
0x0A0003002622DE20 1 0x00000000 4 0
0x0A0003002627B460 2 0x00000000 2 0
0x0A0003002629F460 3 0x00000000 4 0
0x0A000300262C7460 4 0x00000000 1 0
0x0A000307E281A000 5 0x00000000 2 0
Storage Group Paths:
Address SGID PathID PathState PathName
0x0A00030026253000 0 0 InUse /db2/DUIT/datas
0x0A00030026275000 1 1024 InUse /db2/DUIT/data1
0x0A00030026277000 1 1025 InUse /db2/DUIT/data2
0x0A00030026279000 1 1026 InUse /db2/DUIT/data3
0x0A0003002627B000 1 1027 InUse /db2/DUIT/data4
0x0A0003002629D000 2 2048 InUse /db2/DUIT/datahst1
0x0A0003002629F000 2 2049 InUse /db2/DUIT/datahst2
0x0A000300262C1000 3 3072 InUse /db2/DUIT/temp1
0x0A000300262C3000 3 3073 InUse /db2/DUIT/temp2
0x0A000300262C5000 3 3074 InUse /db2/DUIT/temp3
0x0A000300262C7000 3 3075 InUse /db2/DUIT/temp4
0x0A000300262E9000 4 4096 InUse /db2/DUIT/tool
ilancer21: db2pd -d DUIT -storagepaths ... completed ok
<TSA Check>
[db2inst1@ilancer21:/unify/IBM/db2inst1] lssam -nocolor | grep -i temp11
[db2inst1@ilancer21:/unify/IBM/db2inst1] lssam -nocolor | grep -i temp12
=> TSA no longer shows temp11 and temp12
<HA registry>
[db2inst1@ilancer21:/unify/IBM/db2inst1] db2hareg -dump
B01000000000000,IN,100,2,1,1
B01000000000000,DN,ilancer21,,ilancer21-en1,ilancer21-en2
B01000000000000,NL,128,ilancer21,0,ilancer21-en1,ilancer21-en2,-,CF
B01000000000000,NL,0,ilancer21,0,ilancer21-en1,ilancer21-en2,-,MEMBER
B01000000000000,MO,/db2/DUIT/inst, ,0,4,0
B01000000000000,RU,32872,32736
B01000000000000,NL,129,ilancer22,0,ilancer22-en1,ilancer22-en2,-,CF
B01000000000000,NL,1,ilancer22,0,ilancer22-en1,ilancer22-en2,-,MEMBER
B01000000000000,DN,ilancer22,,ilancer22-en1,ilancer22-en2
B01000000000000,MO,/db2/DUIT/diag, ,0,1,0
B01000000000000,DB,DUIT,1
B01000000000000,MO,/db2/DUIT/tool,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/temp4,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/temp3,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/temp2,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/temp1,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/datahst2,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/datahst1,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/data4,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/data3,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/data2,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/data1,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/datas,DUIT,0,10,0
=> The HA registry no longer shows temp11 and temp12
4. Delete files from both /db2/DUIT/temp11 and /db2/DUIT/temp12
root@ilancer21:/db2/DUIT/temp11 # rm -fr /db2/DUIT/temp11/.*
root@ilancer21:/db2/DUIT/temp11 # rm -fr /db2/DUIT/temp11/*
root@ilancer21:/db2/DUIT/temp12 # rm -fr /db2/DUIT/temp12/.*
root@ilancer21:/db2/DUIT/temp12 # rm -fr /db2/DUIT/temp12/*
=> All the files have been deleted. (The .snapshots file was not deleted)
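Note that the `.*` pattern can also match `.` and `..`, so the commands above rely on rm refusing those operands. A more explicit alternative is to enumerate the entries first. The sketch below is a hypothetical dry-run helper that lists what would be removed and skips .snapshots, which was not deleted in the run above. It only prints names and removes nothing from the target directory; the scratch directory is created just for the demonstration.

```shell
# Hypothetical dry-run helper: list every entry directly under a directory,
# including hidden ones, but skipping ".", ".." and ".snapshots".
# It only prints the names; nothing is removed.
list_entries_to_remove() {
    dir=$1
    for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
        # Unmatched globs stay literal, so skip anything that does not exist.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        case ${entry##*/} in
            .snapshots) continue ;;
        esac
        echo "$entry"
    done
}

# Demonstration on a scratch directory (not a real GPFS mount point).
scratch=$(mktemp -d)
touch "$scratch/file1" "$scratch/.hidden"
mkdir "$scratch/.snapshots"
list_entries_to_remove "$scratch"
rm -rf "$scratch"
```

Once the printed list looks right for a path such as /db2/DUIT/temp11, each listed entry can be passed to rm -fr.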
5. Delete the GPFS filesystems (as root, from the DB2 <install path>/bin directory)
root@ilancer21:/unify/IBM/db2/V11.1_SB_36064/bin # ./db2cluster -cfs -delete -filesystem gpfs_temp11
File system 'gpfs_temp11' has been successfully deleted.
root@ilancer21:/unify/IBM/db2/V11.1_SB_36064/bin # ./db2cluster -cfs -delete -filesystem gpfs_temp12
File system 'gpfs_temp12' has been successfully deleted.
=> Both of them have been successfully deleted
Thanks,
Shashank Kharche
IBM DB2 LUW Lab