IBM Support

QMGTOOLS: Work Management Menu

Troubleshooting


Problem

This document describes the functions provided by the work management menu in QMGTOOLS.

Resolving The Problem

The Work Management menu currently provides the data collections listed below. This document will be updated as more are added.
NOTE:  If root cause analysis is needed, collect SERVICEDOCS while the job is still hung. SERVICEDOCS does not always contain enough information to determine root cause, and other data may be needed. A main store dump is the best data to collect for job and system hangs.

NOTE:  Before running the Mutex trap, Hung Job data collection, Hung subsystem data collection, or Job status trap options, make sure the following PTF for APAR MA49489 is applied to the system:

V7R4:  MF69518
V7R3:  MF69517
V7R2:  MF69516

NOTE:  Before running the Mutex trap, Hung Job data collection, Hung subsystem data collection, or Job status trap options, make sure the following PTF for APAR MA50379 is applied to the system:

V7R5:  MF71435
V7R4:  MF71434
V7R3:  MF71436
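
The two PTF notes above can be combined into a single lookup by release. The following is a minimal sketch, not part of QMGTOOLS: the table name `REQUIRED_PTFS` and the helper `ptfs_for_release` are hypothetical, and the code only restates the PTF numbers listed above. It does not check whether a PTF is actually applied to a system.

```python
# PTF numbers copied from the two notes above, keyed by APAR and then by release.
REQUIRED_PTFS = {
    "MA49489": {"V7R4": "MF69518", "V7R3": "MF69517", "V7R2": "MF69516"},
    "MA50379": {"V7R5": "MF71435", "V7R4": "MF71434", "V7R3": "MF71436"},
}


def ptfs_for_release(release):
    """Return {APAR: PTF} for every APAR above that lists a PTF for this release.

    Hypothetical helper for illustration only; it is a table lookup and does
    not query the system.
    """
    release = release.upper()
    return {
        apar: ptfs[release]
        for apar, ptfs in REQUIRED_PTFS.items()
        if release in ptfs
    }
```

For a release that appears under both APARs, the result contains both PTF numbers. A release that neither note lists returns an empty result.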


Installing QMGTOOLS

http://www-01.ibm.com/support/docview.wss?uid=nas8N1011297

To access the Work Management menu, go to the MG menu with the GO QMGTOOLS/MG command. From there, choose the menu option for the Work Management menu.




Mutex Trap

http://www-01.ibm.com/support/docview.wss?uid=nas8N1019977


IPL Data Collection

    Collect IPL messages and flight recorder
    SST dump SID 228
    System snapshot http://www-01.ibm.com/support/docview.wss?uid=nas8N1010353
    DSPLOG OUTPUT(*PRTSECLVL) MSGID(CPF0918 CPF0901 CPF0903 CPF0905 CPF0923 CPF095A CPF0954 CPF0955 CPF0965 CPF0968 CPF0969 CPF0930 CPF0931 CPF0933 CPF0934 CPF0990 CPF0992 CPF0993 CPF0997 CPF0998 CPF1821 CPI0919 CPI091D CPI091E CPI091F CPI0980 CPI0990 CPI0995 CPI099A CPI099D CPI099F CPI0C04 CPI1183 CPI1184 CPA3702 CPC1292 CPP0E9F)


    1) The parameters are as follows.

    Output - *IFS (store data in the IFS) or *PF (store data in a physical file)
    Data library - if storing data in a physical file, the library that holds the file
    IFS directory to store data - if storing data in the IFS, the directory that holds the data
    Previous days to collect - how many days back to collect system snapshot data


    [Screen image: IPL data collection command]

    2) The next parameters are as follows.

    Signon user ID - the user ID used to sign on to the iSeries
    Signon user pass/verify user pass - the password for the iSeries signon
    SST user ID - the user ID for the SST signon
    SST user password/verify password - the password for the SST signon
    Virtual device - *AUTO (automatically select a virtual device when signing on) or a specific virtual device to use to sign on to the local System i

    [Screen image: Signon and SST credentials]

    3) After the credentials are entered, the program attempts to sign on to the local System i and then to SST to dump the data.

    4) When the collection is done, a status message shows where the data is stored. If there is no message, check the job log.

    If *PF was specified, a save file is created in the data library.
    If *IFS was specified, a zip file is placed in the /tmp directory.



Hung Job Data Collection

    System snapshot http://www-01.ibm.com/support/docview.wss?uid=nas8N1010353
    SERVICEDOCS
    TASKINFO
    DSPJOBLOG OUTPUT(*PRINT) of hung job
    DSPJOB OUTPUT(*PRINT) OPTION(*ALL) of hung job
    CALL PGM(QWTDMPFR)


    1) The parameters are as follows.

    Output - *IFS (store data in the IFS) or *PF (store data in a physical file)
    Data library - if storing data in a physical file, the library that holds the file
    IFS directory to store data - if storing data in the IFS, the directory that holds the data
    Collect VLOGs - Y or N to collect VLOGs
    Collect PAL entries - Y or N to collect PAL entries
    Hung job info - information about the hung job



    2) When the collection is done, a status message shows where the data is stored. If there is no message, check the job log.

    If *PF was specified, a save file is created in the data library.
    If *IFS was specified, a zip file is placed in the /tmp directory.



Hung Subsystem Data Collection

    System snapshot http://www-01.ibm.com/support/docview.wss?uid=nas8N1010353
    SERVICEDOCS
    CALL PGM(QWTDMPFR)


    1) The parameters are as follows.

    Output - *IFS (store data in the IFS) or *PF (store data in a physical file)
    Data library - if storing data in a physical file, the library that holds the file
    IFS directory to store data - if storing data in the IFS, the directory that holds the data
    Collect VLOGs - Y or N to collect VLOGs
    Collect PAL entries - Y or N to collect PAL entries




    2) When the collection is done, a status message shows where the data is stored. If there is no message, check the job log.

    If *PF was specified, a save file is created in the data library.
    If *IFS was specified, a zip file is placed in the /tmp directory.



Advanced Job Scheduler Data Collection

    DSPJOBLOG OUTPUT(*PRINT) for job QIJSSCD
    DSPJOBJS OUTPUT(*PRINT) DETAIL(*FULL) AREA(*ALL)
    DSPLOGJS OUTPUT(*PRINT)
    DSPHSTJS OUTPUT(*PRINT)

    1) The parameters are as follows.

    Output - *IFS (store data in the IFS) or *PF (store data in a physical file)
    Data library - if storing data in a physical file, the library that holds the file
    IFS directory to store data - if storing data in the IFS, the directory that holds the data

    2) When the collection is done, a status message shows where the data is stored. If there is no message, check the job log.

    If *PF was specified, a save file is created in the data library.
    If *IFS was specified, a zip file is placed in the /tmp directory.



Job Status Trap

    This trap captures data on jobs that remain in a specific status for a period of time. For example, if a job intermittently stays in ICFW or MTXW status, the trap helps detect that and collects data while the job is in that status.



    The parameters are as follows:

    Function - *Start or *Stop
    Job status - the status of the job to monitor, for example ICFW or LCKW
    Job name - the job to monitor: *ALL, generic*, or a specific job name
    Delay between checks - time in seconds between checks. The default is 60 seconds.
    Maximum seconds in state - how long a job must remain in the specified status before the trap triggers
    End trap when triggered - when a job has been in the specified status for the maximum seconds, whether the trap ends or keeps running to monitor other jobs
    End job in detected state - whether to end the job found in the specified status
    Delete spool files from job - when the job is ended, whether to delete the spooled files associated with that job
    Dump data - whether to dump data for the job in that status. The data dumped is:
        1) call stack
        2) job log
        3) WRKACTJOB *ALL
        4) SERVICEDOCS
    Keep status check joblog - for internal debugging use; creates job logs for the status check job
    Data area prefix to store data - prefix of the data area that stores job information
    Data library - the library to store the data. The data is stored in that library in a file named JOBSTSxxxx, where xxxx is a number.
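
The way "Delay between checks" and "Maximum seconds in state" interact can be illustrated with a small simulation. This is a sketch under stated assumptions, not the QMGTOOLS implementation: the function name `find_triggered_jobs` and the data shapes are hypothetical, job statuses are supplied as pre-recorded samples rather than read from the system, and the first check in which a job is seen in the status is assumed to count as 0 seconds.

```python
def find_triggered_jobs(samples, target_status, delay_seconds, max_seconds):
    """Simulate a status trap over a sequence of checks.

    samples: list of {job_name: status} dicts, one per check, assumed to be
             taken delay_seconds apart (hypothetical input format).
    Returns a list of (check_index, job_name), one entry each time a job's
    continuous time in target_status first reaches max_seconds.
    """
    seconds_in_state = {}  # job name -> seconds observed in target_status
    triggered = []
    for check_index, snapshot in enumerate(samples):
        # Forget jobs that left the target status (or are no longer listed).
        for job in list(seconds_in_state):
            if snapshot.get(job) != target_status:
                del seconds_in_state[job]
        for job, status in snapshot.items():
            if status != target_status:
                continue
            previous = seconds_in_state.get(job)
            # First sighting counts as 0 seconds; each later consecutive
            # sighting adds one delay interval.
            current = 0 if previous is None else previous + delay_seconds
            seconds_in_state[job] = current
            first_crossing = previous is None or previous < max_seconds
            if current >= max_seconds and first_crossing:
                triggered.append((check_index, job))
    return triggered
```

In this model, a job that leaves the status between checks starts again from zero, so only a job seen in the status on consecutive checks can reach the maximum.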


Scan QHST file

    This function scans a physical file member containing a history log for specific message IDs and writes the matches to a spooled file.

    There is a predefined set of message IDs to collect. Here is the list:
    - UPS messages (CPF1816 CPF1817 CPF1818 CPF1819 CPF181A CPF1821 CPI0961 CPI0962 CPI0963 CPI0964 CPI0965 CPI0966 CPI0973 CPI0974 CPI0975 CPI0976 CPI0978 CPI0979 CPI0981 CPI0994 CPPA73A CPPA73B CPPA73D CPPA740 CPPA743 CPPA744 CPPA745 CPPA891 CPPA892 CPPA893 CPPA894 CPPA895 CPPA896 CPPA897 CPPA898 CPPA899 CPP073A CPP073D CPP0730 CPP0733 CPP0734 CPP0735 CPP0891 CPP0892 CPP0893 CPP0894 CPP0895 CPP0896 CPP0897 CPP0898 CPP0899 CPP3502 CPP3504 CPP3505)

    - Storage threshold messages (CPF0907 CPF090A CPF3793 CPI099C CPI0999 CPI0953 CPI0954 CPI0955 CPI099D CPI112E CPI11AB CPI147B MCH2803)

    - IPL messages (CPF0918 CPF0901 CPF0903 CPF0905 CPF0923 CPF095A CPF0954 CPF0955 CPF0965 CPF0968 CPF0969 CPF0930 CPF0931 CPF0933 CPF0934 CPF0990 CPF0992 CPF0993 CPF0997 CPF0998 CPF1821 CPI0919 CPI091D CPI091E CPI091F CPI0980 CPI0990 CPI0995 CPI099A CPI099D CPI099F CPI0C04 CPI1183 CPI1184 CPA3702 CPC1292 CPP0E9F)

    - System value change (CPC1139 CPD1686 CPD1687 CPF1030 CPF1076 CPF1078 CPF1805 CPF1806 CPF1807 CPF1808 CPF1809 CPF1810 CPF1811 CPF1812 CPF1813 CPF1814 CPF1815 CPF18C0 CPF18C1 CPF18C4 CPF180A CPF1823 CPF1845 CPF1852 CPF1853 CPI1469)

    - ASP balance and disk reorganization messages (CPC18A1 CPC18A2 CPC18A3 CPC18A4 CPC18A5 CPC18A6 CPC18A7 CPD18AA CPI1470 CPI1472 CPI1473 CPI1474 CPI1475 CPI1476 CPI1477 CPI1478 CPI18A1 CPI18A2 CPI18A3 CPI18A4 CPI18A5 CPI18A6 CPI18A7 CPI18A8 CPF1888 CPF1889 CPF1891 CPF18AA CPF18AB CPF18AC CPF18AD CPF18AE CPF18AF CPF18B1 CPF18B2 CPF18B3)

    - DB cross reference messages (CPI0C04 CPF3954 CPF32A0 CPF32A1 CPF32A2 CPF32A3 CPF32A4 CPF32D1 CPF32D4 CPF327E CPI0990 CPI091D CPF32A6)

    - Damage messages (MCH1602 MCH604 CPF8100 CPF3113 CPF3171 CPF3173 CPF3174 CPF3176 CPF3245 CPF3272 CPF3763 CPF3949 CPF7034 CPF9804 CPDA186 CPDA429 CPD0CE2 CPA2201 AMQ7472 CPF5030)

    - Remote journal messages (CPF70D3 CPF70D4 CPF70D5 CPF70DB CPF70DC CPF70D7 CPC6983 CPC6984 CPF70C4 CPF70C5 CPI7012 CPI7016)

    - Access path rebuild messages (CPF3112 CPF3123 CPF3145)

    - Verify save messages (CPF0994 CPF0968 CPC3702 CPF3772 CPC3707 CPC9410 CPC370C CPC2356 CPF3763 CPF3770 CPF3771 CPF3777 CPF9410 CPF902E CPF3837 CPF2361)

    You can also specify up to 10 additional message IDs to scan for.




     
    The parameters are as follows:

    File/Library/Member - the location of the history file
    Predefined message IDs - the predefined set of message IDs to scan for:
        - *NONE (default)
        - UPS_MSG (UPS messages)
        - STG_MSG (Storage threshold messages)
        - IPL_MSG (IPL messages)
        - SYSVAL_MSG (System value change messages)
        - ASP_MSG (ASP balance and disk reorganization messages)
        - DBXREF_MSG (DB cross reference messages)
        - DAMAGE_MSG (Damage messages)
        - RMTJRN_MSG (Remote journal messages)
        - ACSPTH_MSG (Access path rebuild messages)
        - VFYSAV_MSG (Verify save messages)
    MSGID1 through MSGID10 - additional message IDs to scan for
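
The selection logic described above, one predefined set plus up to 10 additional message IDs, can be sketched as follows. This is an illustration only, not the QMGTOOLS code: the names `PREDEFINED` and `scan_history` are hypothetical, only the ACSPTH_MSG set is copied in to keep the example short, and each history record is assumed to be already rendered as a text line in which the message ID appears as a whitespace-separated token. That is a simplification of the real history file layout.

```python
# Only one predefined set is reproduced here; the others listed above
# would be added to this table in the same way.
PREDEFINED = {
    "ACSPTH_MSG": {"CPF3112", "CPF3123", "CPF3145"},
}


def scan_history(lines, predefined="*NONE", extra_ids=()):
    """Return the lines that mention any selected message ID.

    lines:      history records as plain text (assumed format, see above)
    predefined: "*NONE" or a key of PREDEFINED
    extra_ids:  up to 10 additional message IDs (the MSGID1 to MSGID10 slots)
    """
    if len(extra_ids) > 10:
        raise ValueError("at most 10 additional message IDs are allowed")
    wanted = {msg_id.upper() for msg_id in extra_ids}
    if predefined != "*NONE":
        wanted |= PREDEFINED[predefined]
    return [
        line
        for line in lines
        if any(token.upper() in wanted for token in line.split())
    ]
```

Comparing in upper case means an ID typed in lower case still matches. With *NONE and no additional IDs, nothing is selected.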


Sending diagnostic information to IBM Support

Document Information

Modified date:
14 November 2023

UID

nas8N1020063