IBM Support

IT26182: MOVE CONTAINER CAN LEAVE THE CONTAINER IN A READONLY STATE

APAR status

  • Closed as program error.

Error description

  • Error Description:
    
    When a MOVE CONTAINER command is issued, either manually or by
    the internal defragmentation processing on a directory-container
    storage pool, the container can be left in a readonly state.
    There might be no obvious error or message explaining why the
    container is now readonly, which can lead to out-of-space issues
    within the container pool. Be aware that there are valid reasons
    for a move container to leave a container in a readonly state;
    these should be visible in the activity log. Also note that a
    MOVE CONTAINER first changes the container state to readonly.
    Once the move completes, the state should change appropriately.
    
    Currently, there are two known scenarios in which this happens:
    
    1) When a MOVE CONTAINER is run on a container at the same time
    that chunks in the container are expiring, the location update
    can fail when it finds that a chunk entry no longer exists. This
    causes the MOVE CONTAINER process to end with a warning and
    leaves the container in a readonly state, preventing further
    writes to it. The following error is logged in the activity log:
    
     ANR0103E sddefrag.c(4485): Error 1114 updating row in table
    "SD.Chunk.Locations"
    
    
    2) Extents that cannot be moved during the move container
    process leave the container in a readonly state. No errors are
    logged in the activity log or in dsmffdc.log.
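
Scenario 1) can be recognised from exported activity log text. The following is a minimal Python sketch, assuming the log has already been saved as plain text. The function name `has_scenario1_error` is hypothetical, and the pattern deliberately ignores the source line number in `sddefrag.c`, which is assumed to vary between server levels.

```python
import re

# Matches the scenario 1 message quoted above. The source line number in
# parentheses is matched loosely because it is assumed to differ by level.
SCENARIO1_RE = re.compile(
    r'ANR0103E\s+sddefrag\.c\(\d+\):\s+Error\s+1114\s+updating\s+row\s+'
    r'in\s+table\s+"SD\.Chunk\.Locations"'
)


def has_scenario1_error(activity_log_text):
    """Return True if the text contains the scenario 1 ANR0103E message."""
    # Log output is often wrapped, so collapse all whitespace runs first.
    flat = " ".join(activity_log_text.split())
    return SCENARIO1_RE.search(flat) is not None
```

A result of True suggests scenario 1). A result of False does not rule out scenario 2), which logs nothing in the activity log.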
    
    
    Customer/L2 Diagnostics (If Applicable)
    
    For scenario 2), an SDCNTR trace reveals that the movement of
    extents failed. For example:
    
    23:07:47.546 [254][sdcntr.c][4645][SdAcquireAnyContainer]:Using
    strategy AllocNewCntr with size 294912
    23:07:47.546 [254][sdcntr.c][4781][SdAcquireContainer]:Enter:
    directory E:\TSM\stgdir(1), type 1, size 294912.
    23:07:47.546 [254][sdcntr.c][9375][AllocNewContainer]:Enter:
    size 294912, type 1, getSufficientRange False
    23:07:47.546 [254][sdcntr.c][9434][AllocNewContainer]:Directory
    is full.  Requested 294912, minSize 1073741824
    23:07:47.546 [254][sdcntr.c][9517][AllocNewContainer]:Couldn't
    allocate a new container for directory E:\TSM\stgdir(1) with
    rc=1001
    23:07:47.546 [254][sdcntr.c][9627][AllocNewContainer]:Exit: rc
    1001, cntrId 0, offset -1
    23:07:47.546 [254][sdcntr.c][5233][SdAcquireContainer]:Exit: rc
    1001, cntrId 0
    
    23:07:47.572 [253][sddefrag.c][2934][SdMoveContainer]:sdRtrv
    failed with rc=1001
    23:07:47.603 [253][sddefrag.c][3456][SdMoveContainer]:Number of
    chunks: 17428, moved: 17427, failed: 1
    
    The key error in this example is return code 1001, which is
    defined as:
    
    #define GRC_NO_SPACE                            1001
    
    In this instance, the move process was running at the same time
    as client backups were sending data into the container pool, so
    multiple containers were open with space reserved in them. The
    move cannot be performed because it hits an out-of-space error,
    but this error is not externalised to the activity log or to
    dsmffdc.log. The process simply ends, and the container remains
    in a readonly state. This can have a knock-on effect: many other
    containers that need to be defragmented can also change to
    readonly with the same error. As each container remains in a
    readonly state, less space is available in the storage pool, and
    client sessions eventually start to fail with an out-of-space
    error.
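
Because scenario 2) is visible only in the trace, it can help to scan a collected SDCNTR trace for the two indicators shown in the example: return code 1001 and a non-zero failed count in the SdMoveContainer summary. The following Python sketch assumes the trace text has the same shape as the excerpt above. The function name `summarize_move_trace` and the returned dictionary keys are hypothetical.

```python
import re

GRC_NO_SPACE = 1001  # value taken from the #define shown in the trace notes

# "Number of chunks: 17428, moved: 17427, failed: 1"
SUMMARY_RE = re.compile(
    r"Number of\s+chunks:\s*(\d+),\s*moved:\s*(\d+),\s*failed:\s*(\d+)"
)
# "rc=1001" as well as "rc 1001" (both forms appear in the excerpt)
RC_RE = re.compile(r"\brc[=\s]+(\d+)")


def summarize_move_trace(trace_text):
    """Report out-of-space return codes and chunk move counts in a trace."""
    # Trace lines may be wrapped mid-message, so collapse whitespace first.
    flat = " ".join(trace_text.split())
    return_codes = {int(rc) for rc in RC_RE.findall(flat)}
    summaries = [
        tuple(int(group) for group in match.groups())
        for match in SUMMARY_RE.finditer(flat)
    ]
    return {
        "no_space_seen": GRC_NO_SPACE in return_codes,
        "summaries": summaries,  # (chunks, moved, failed) per move
        "failed_chunks": sum(failed for _, _, failed in summaries),
    }
```

If `no_space_seen` is true and `failed_chunks` is greater than zero, the trace is consistent with the out-of-space case described above. This is a diagnostic aid only and does not replace review of the full trace.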
    
    
    
    IBM Spectrum Protect Versions Affected:
    Spectrum Protect Server versions 8.1.4 and 8.1.5 on all
    platforms.
    
    Initial Impact: Medium
    
    Additional Keywords:
    tsm, ANR0522W, OOS,  TS001182092, TS000843636, TS001166483,
    defrag
    

Local fix

  • For scenario 1), run the move on the container again.
    
    For scenario 2) :
    
    Find the containers that are in a readonly state by issuing:
    
    Q CONTAINER STGPOOL=<POOL> STATE=READONLY
    
    Check the activity log for any errors that could validly have
    placed the container into a readonly state, such as an I/O
    error. If you find a valid error, do not attempt to place that
    container back into an AVAILABLE state. If nothing is found,
    you can change the state of the containers back to AVAILABLE by
    issuing the following commands from a DB2 command prompt.
    
    To update ALL readonly containers, issue :
    
    db2 connect to tsmdb1
    db2 set schema tsmdb1
    db2 "update sd_containers set state=0 where state=2"
    
    For individual readonly containers, issue :
    
    db2 connect to tsmdb1
    db2 set schema tsmdb1
    db2 "update sd_containers set state=0 where
    cntrname='container-name'"
    
    Replace 'container-name' with the actual container name.
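
When several individual containers need to be reset, the per-container commands above can be generated rather than typed by hand. The following Python sketch only builds the command strings and does not run them. The function name `build_reset_commands` is hypothetical, and the doubling of single quotes is standard SQL literal escaping rather than something stated in this APAR.

```python
def build_reset_commands(container_names):
    """Build the db2 commands that set named containers back to state 0."""
    commands = ["db2 connect to tsmdb1", "db2 set schema tsmdb1"]
    for name in container_names:
        if '"' in name:
            # A double quote would end the shell-quoted SQL string early.
            raise ValueError("unsupported character in name: %r" % name)
        # Double any single quote so it stays inside the SQL string literal.
        escaped = name.replace("'", "''")
        statement = (
            "update sd_containers set state=0 "
            "where cntrname='{}'".format(escaped)
        )
        commands.append('db2 "{}"'.format(statement))
    return commands
```

Review the generated commands, and confirm from the activity log that each container has no valid reason to be readonly, before issuing them.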
    
    Once the containers are available, they can be written to again
    for data ingest.
    
    Also consider spreading out the client data ingest workload and
    reducing the number of containers that sessions are allowed to
    open by lowering the SDMaxSessionContainers value from the
    default of 50. Be aware that changing this value can affect the
    performance of client backups.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All IBM Spectrum Protect server users.                       *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See error description.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in level 8.1.6. Note that this is      *
    * subject to change at the discretion of IBM.                  *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms for reported release:  AIX, Linux, and
    Windows.
    Platforms fixed:  AIX, Linux, and Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT26182

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    81W

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-09-05

  • Closed date

    2018-09-17

  • Last modified date

    2018-09-17

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels


Document Information

Modified date:
17 September 2018