Recovering from disk failure
You can recover from a disk hardware failure that results in the loss of an entire unit.
Symptoms
No I/O activity occurs for the affected disk address. Databases and tables that reside on the affected unit are unavailable.
Resolving the problem
Operator response:
- Ensure that no incomplete I/O requests exist for the failing device.
One way to do this is to force the volume offline by issuing the following z/OS® command, where xxx is
the unit address:
VARY xxx,OFFLINE,FORCE
To check disk status, issue the following command:
D U,DASD,ONLINE
The following console message is displayed after you force a volume offline:
UNIT TYPE STATUS VOLSER VOLSTATE
4B1  3390 O-BOX  XTRA02 PRIV/RSDNT
The disk unit is now available for service.
If you previously set the I/O timing interval for the device class, the I/O timing facility terminates all requests that are incomplete at the end of the specified time interval, and you can proceed to the next step without varying the volume offline. You can set the I/O timing interval either through the IECIOSxx z/OS parameter library member or by issuing the following z/OS command:
SETIOS MIH,DEV=devnum,IOTIMING=mm:ss
- Issue (or request that an authorized operator issue) the following Db2 command to stop all databases and table spaces that reside on the
affected volume:
-STOP DATABASE(database-name) SPACENAM(space-name)
If the disk unit must be disconnected for repair, stop all databases and table spaces on all volumes in the disk unit.
- Select a spare disk pack, and use ICKDSF to initialize from scratch a disk unit that has a different
unit address (yyy) but the same volume serial
number (VOLSER).
//         Job
//ICKDSF   EXEC PGM=ICKDSF
//SYSPRINT DD   SYSOUT=*
//SYSIN    DD   *
  REVAL UNITADDRESS(yyy) VERIFY(volser)
If you initialize a 3380 or 3390 volume, use REVAL with the VERIFY parameter to ensure that you initialize the intended volume, or to revalidate the home address of the volume and record 0. Alternatively, use ISMF to initialize the disk unit.
- Issue the following z/OS console
command, where yyy is the new unit address:
VARY yyy,ONLINE
- To check disk status, issue the following command:
D U,DASD,ONLINE
The following console message is displayed:
UNIT TYPE STATUS VOLSER VOLSTATE
7D4  3390 O      XTRA02 PRIV/RSDNT
- Issue the following Db2 command to start all the
appropriate databases and table spaces that were previously stopped:
-START DATABASE(database-name) SPACENAM(space-name)
- Delete all table spaces (VSAM linear data sets) from the ICF catalog
by issuing the following access method services command for each one
of them, where y is either I or J:
DELETE catnam.DSNDBC.dbname.tsname.y0001.A00x CLUSTER NOSCRATCH
- For user-managed table spaces, define the VSAM cluster and data components for the new volume by issuing the access method services DEFINE CLUSTER command with the same data set name as in the previous step, in the following format: catnam.DSNDBx.dbname.tsname.y0001.A00n. The y is I or J, the x is C (for VSAM clusters) or D (for VSAM data components), and the n is the data set number.
- For a user-defined table space, define the new data set before you attempt to recover it. You can recover table spaces that are defined in storage groups without defining them first.
- Recover the table spaces by using the Db2 RECOVER utility.
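
The last three steps can be combined into a single batch job. The following JCL is a minimal sketch for one user-managed table space, not a definitive job. The JOB statement, the step names, the space allocation CYLINDERS(100 10), the SHAREOPTIONS value, the instance qualifier I0001, the data set number A001, the subsystem name DSN, and the utility ID DSKRECOV are all assumptions for illustration. The volume serial XTRA02 is taken from the console output shown earlier. Replace catnam, dbname, and tsname with your own values before you submit the job. For table spaces that are defined in storage groups, the DEFINE CLUSTER command is not needed.

```jcl
//DSKRECOV JOB (ACCT),'DISK RECOVERY',CLASS=A,MSGCLASS=X
//*
//* Step 1: Remove the old catalog entry without scratching the data
//*         set, then define the cluster and data components on the
//*         replacement volume. All space values are placeholders.
//*
//DELDEF   EXEC PGM=IDCAMS
//SYSPRINT DD   SYSOUT=*
//SYSIN    DD   *
  DELETE catnam.DSNDBC.dbname.tsname.I0001.A001 -
         CLUSTER NOSCRATCH
  DEFINE CLUSTER -
         (NAME(catnam.DSNDBC.dbname.tsname.I0001.A001) -
          LINEAR -
          REUSE -
          VOLUMES(XTRA02) -
          CYLINDERS(100 10) -
          SHAREOPTIONS(3 3)) -
         DATA -
         (NAME(catnam.DSNDBD.dbname.tsname.I0001.A001))
/*
//*
//* Step 2: Recover the table space with the Db2 RECOVER utility,
//*         invoked here through the DSNUPROC procedure.
//*
//RECOVER  EXEC DSNUPROC,SYSTEM=DSN,UID='DSKRECOV',UTPROC=''
//DSNUPROC.SYSIN DD *
  RECOVER TABLESPACE dbname.tsname
/*
```

Repeat the DELETE and DEFINE CLUSTER commands for each data set of each affected table space, and add one RECOVER statement for each table space.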