IBM Support

IBM Tivoli Storage Manager for Space Management (HSM) V7.1 known problems

Question & Answer


Question

This document contains known problems and limitations for Tivoli Storage Manager for Space Management (HSM) V7.1.

Answer

Tivoli Storage Manager for Space Management (HSM) known problems and limitations






Warnings

HSM warnings
  • Never kill an HSM process or daemon with signal 9 (SIGKILL). If, for any reason, you must stop a running HSM process, use the command:

    kill -SIGTERM <pid>

    Killing a running migration or recall with SIGKILL can leave the files being processed at that time in an inconsistent state. Restarting the recall daemons can fix the recall problem; restarting Spectrum Scale can fix the migration problem. The restarts correct the states of the affected files.
  • When node replication is used on the Spectrum Protect server, do not restore stub files and do not invoke dsmmigundelete while the secondary Spectrum Protect server is in use. See APAR IC94316 for details.
  • Do not copy premigrated or stub files to a new or different file system by using block level commands (for example, dd, cpio, or mksysb). Instead, use the commands provided by Spectrum Protect. Block level commands can change file attributes, which causes reconciliation failures.
  • Do not use the NFSTIMEOUT option with HSM. Using the NFSTIMEOUT option on a system with HSM can lead to unpredictable behavior of the HSM applications, which includes applications stopping unexpectedly. The NFSTIMEOUT option is used for backing up NFS file systems, which cannot be managed by HSM. If this option is required for the backup-archive client, then specify different server stanzas for the backup-archive client and HSM.
  • The HSM recall daemon (dsmrecalld) uses the registered RPC program number 300781 (decimal). Do not use this RPC program number for other applications.
  • The HSM boot time script sets the file size resource limit to unlimited before starting HSM daemons.
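
The SIGTERM rule from the first warning can be wrapped in a small helper. The following is a minimal sketch, not part of the product: the function name stop_gracefully and the default 30 second timeout are assumptions made for illustration. It sends SIGTERM, waits for the process to exit, and never escalates to SIGKILL.

```shell
#!/bin/sh
# Hypothetical helper: stop a process with SIGTERM only.
# Usage: stop_gracefully <pid> [timeout_seconds]
stop_gracefully() {
    pid=$1
    timeout=${2:-30}   # assumed default, not taken from the HSM documentation

    # Nothing to do if the process is already gone.
    kill -0 "$pid" 2>/dev/null || return 0

    # POSIX spelling of "kill -SIGTERM <pid>".
    kill -s TERM "$pid" || return 1

    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            # Report and give up; SIGKILL is deliberately never sent.
            echo "PID $pid still running after ${timeout}s; not sending SIGKILL" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

If the function returns a nonzero status, the process is still running and needs to be investigated manually rather than killed with signal 9.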





HSM GPFS warnings
  • HSM is not supported on a Spectrum Scale file system that is automatically mounted when accessed. In other words, do not specify automount for the "-A" parameter of the Spectrum Scale mmcrfs or mmchfs commands. Specify either yes or no for this parameter. Check the file system configuration with the Spectrum Scale command mmlsfs <device name> -A.
  • Before HSM is uninstalled on a node managing a file system, transfer the ownership for the file system to another HSM node in the same cluster. Use the dsmmigfs takeover command to achieve this.
  • On a Spectrum Scale (GPFS) cluster with AIX, Linux x86 or Linux zSeries nodes, use HSM only on one of these platforms to manage a shared file system. Do not use HSM on different platforms to manage a file system shared among these platforms. All HSM nodes in a cluster must have the same OS level and HSM level installed.
  • HSM adds an entry for /etc/rc.gpfshsm to the /var/mmfs/etc/gpfsready script. This entry can be overwritten by a Spectrum Scale (GPFS) update or upgrade. If necessary, add the /etc/rc.gpfshsm entry again after an update or upgrade. For more information, refer to the GPFS documentation.
  • If failover is enabled on the local node, failover is triggered in the following cases:
    • GPFS shutdown or failure
    • reboot
    Whether failover succeeds depends on whether an eligible node can take over the file systems from the failing node. Another node in the same cluster must have:
    • failover enabled
    • synchronized time
    • GPFS running
    • the file systems mounted that the failing node managed before.
    The same HSM level must be installed on both nodes: the failing node and the node that takes over.
  • A migrated file that does not belong to a fileset is recalled when it is moved using the mv command into a fileset. The file state will change from "migrated" to "resident".
  • A migrated file that belongs to a fileset is recalled when it is moved out of this fileset. The file state will change from "migrated" to "resident".
  • HSM managed file systems belong to the local (home) Spectrum Scale cluster only. File systems belonging to remote clusters are not HSM managed.
  • If CSM/CFM is used for maintaining /etc/inittab, be sure to add the dsmwatchd's entry also to the file in the file server repository. Refer to the CSM/CFM documentation for more information.
  • To handle ENOSPC events without problems, use the following Spectrum Scale configuration:
    • Configure at least one disk for metadataOnly
    • Configure all remaining disks for dataAndMetadata or dataOnly
  • To enable recall request distribution the following requirements must be met on each participating cluster node:
    • the recall daemons must run
    • the node needs access to the corresponding filespace on the Spectrum Protect server. Use the ASNODENAME option to achieve this.
    • the Spectrum Protect option files have identical HSM configurations on the nodes except for the NODENAME option.
  • HSM does not support stub sizes which are multiples of the file system fragment size. Specify a value of zero or a multiple of the file system's block size.
  • For high workloads, for example when many recall requests must be processed at the same time, configure sufficient Spectrum Scale dmapiWorkerThreads and worker1Threads. See the Spectrum Scale documentation for details.
  • Do not unmount HSM managed file systems if they are busy.
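
The stub size rule above (zero, or a multiple of the file system block size) is simple arithmetic and can be checked before a value is configured. This is a minimal sketch: the function name is_valid_stub_size is hypothetical, and both values are assumed to be given in bytes.

```shell
#!/bin/sh
# Hypothetical check for the stub size rule.
# Usage: is_valid_stub_size <stub_size_bytes> <block_size_bytes>
# Returns 0 if the stub size is 0 or a multiple of the block size,
# 1 if it is not, and 2 if the block size is not a positive number.
is_valid_stub_size() {
    stub_size=$1
    block_size=$2

    [ "$block_size" -gt 0 ] || return 2
    [ "$stub_size" -eq 0 ] && return 0
    [ $((stub_size % block_size)) -eq 0 ]
}
```

For example, with a block size of 262144 bytes, a stub size of 524288 passes the check, while 8192 (which may match a fragment size but not the block size) does not.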


HSM AIX warnings

  • If a volume group containing HSM file systems is imported to an AIX system, make sure that the major number of the device does NOT change. Otherwise the handles of the files in the HSM file systems will change, leading to the inability to expire obsolete copies.



HSM Linux warnings
  • Do not update HSM using "rpm -U" or "rpm -F". Follow the update procedure documented in the HSM user's guide instead.
  • Setting the DSM_LOG environment variable does not work with /etc/environment on SLES and RHEL. Instead use other appropriate configuration files.


Known Problems and Limitations


Common HSM known problems and limitations
  • When a tape optimized recall process is terminated prematurely, a file that is being recalled can become inaccessible. In this case, the recall process has not released its exclusive right on the file. Restart GPFS on all nodes to solve this issue. If JFS2 is HSM managed, reboot AIX.
  • If space management version 7.1 has been added to a file system, it must be removed again before moving to a lower HSM version.
  • If ACL data of a premigrated file are modified, changes are not written to the Spectrum Protect server if the file is migrated after this change. To avoid losing modified ACL data, make sure the MIGREQUIRESBACKUP parameter of the management class used is set to YES (the default). This setting prevents the migration of a file where ACL data have been modified. The migration can take place after a backup of the changed file.
  • When a file name length plus path length exceeds 1024 bytes the file is not migrated. HSM's maximum length for a file name plus path is 1024 bytes.
  • The reconciliation process, including orphan check, is a long running task. It requires a considerable amount of main memory for a file system containing several millions of files in combination with several million objects migrated to the Spectrum Protect server.
  • Files with Access Control Lists (ACLs) can be migrated, but the ACLs will not be written back during recall. This might result in the loss of ACL data if a stub file has been deleted and later recreated using the dsmmigundelete command. If a backup copy of the stub file exists on the TSM server, use the dsmc restore command. This will copy the file content and ACL back to the local file system.
  • HSM daemons write messages to the dsmerror.log file. This file can become very large. The default file system is root "/". It can run out of space. To prevent this configure a different file system with more capacity for the dsmerror.log. Specify this file system with the DSM_LOG environment variable.
  • Automigration does not migrate hidden files or directories. Hidden files or directories are files or directories that begin with a ".".
  • Files created 20 seconds or less before an out-of-space condition are not candidates for demand automigration.
  • The CFI (Complete File Index) file of the dsmscoutd cannot be created if the ulimit setting prevents the creation of large files.
  • In rare cases, the output of the dsmscoutd scanplan shows negative values. To solve this problem, restart the scout daemon with the command 'dsmscoutd restart'.
  • When the scout daemon is scanning the file system, information about files found is stored in the CFI file. The more files have been scanned, the more information is stored. Therefore the time required to add information about new files increases over time. As a result, the scan rate might decrease during the scan.
  • The "Space Management Agent" (hsmagent) does not start if DSM_DIR is set and no link pointing to the hsmagent.opt configuration file was set.
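
Files affected by the 1024 byte limit on file name plus path can be located ahead of a migration run. The following is a minimal sketch under the assumption that the limit applies to the full path as reported by find; the function name find_long_paths is hypothetical. Lengths are counted in bytes, not characters, and file names containing newlines are not handled.

```shell
#!/bin/sh
# Hypothetical helper: list regular files whose path exceeds a byte limit.
# Usage: find_long_paths <directory> [max_bytes]
find_long_paths() {
    dir=$1
    max=${2:-1024}   # limit stated in this document

    find "$dir" -type f | while IFS= read -r path; do
        # wc -c counts bytes; the arithmetic expansion strips any padding.
        len=$(( $(printf '%s' "$path" | wc -c) ))
        if [ "$len" -gt "$max" ]; then
            printf '%s\n' "$path"
        fi
    done
}
```

Run it against the mount point of a managed file system; any path it prints would be skipped by migration under the assumption above.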





HSM GPFS known problems and limitations
  • A file cannot be recalled to GPFS with the same performance as a restore by the backup-archive client. The reason is that the recall uses DMAPI to acquire and release exclusive rights on a file several times during processing, whereas the restore does not. This leads to notable overhead and lower performance for the recall compared to the restore.
  • If an application reads a file that is currently being migrated, the application might only read zeros instead of the real data. This happens rarely when the reading of the file occurs exactly at the point when the migration process starts stubbing the file. In this situation, the file is not corrupted nor is the data lost. It is a synchronization problem where a read does not trigger a recall of the migrated file. A second read of the file recalls it and the correct data is presented to the application. The problem will be fixed by GPFS in a future release.
  • After HSM is uninstalled and installed again, problems mounting the HSM file systems can occur. If this happens, restart Spectrum Scale on all cluster nodes.
  • Before initiating a takeover, make sure that the running jobs for the involved file system have been stopped.
  • Migration might fail for newly created files on large file systems with the following warning message:

    ANS9288W File: <filename> of size 0 is too small to qualify for migration.

    The reason is that GPFS has not yet committed the file changes to disk at that moment. Retry the migration a few seconds later.
  • When the HSMDISABLEAUTOMIGDAEMONS option is changed, restart the dsmwatchd daemon. To do this, invoke:

    kill -SIGTERM <dsmwatchd-pid>
  • If the primary or secondary GPFS cluster data repository server fails, HSM cannot properly update its failover information. The 'dsmmigfs q -f' command might show incorrect information after an HSM node failure.
  • If an HSM node is deleted from a GPFS cluster and then added again, the GPFS node number might change (use 'mmlscluster' to check). In this case, adjust the HSM node numbering with the following procedure:

      dsmmigfs stop
    Wait until all HSM daemons, except the dsmwatchd, have stopped before proceeding with the next step.
      rm /etc/adsm/SpaceMan/config/instance
      kill -SIGTERM <DSMWATCHD PID>
    Wait until the dsmwatchd has been restarted by the system before proceeding to the next step.
      dsmmigfs start
    If there are file systems that have been managed by this node previously, issue the following command for each of these file systems:
      dsmmigfs takeover </fs>
    Afterwards, the 'dsmmigfs q -f' command displays two entries for this node: one with the old node number and failover deactivated, and one with the new node number. If you observe problems with mounting HSM managed file systems afterwards, you must restart all GPFS cluster nodes.
  • Running dsmls or dsmdu on file systems belonging to remote clusters does not show correct values.
  • Streaming recall is not possible when the value "2" is specified for the options MAXRECALLDAEMONS and MINRECALLDAEMONS. Instead a normal mode recall is performed.
  • If migrated files having streaming mode or partial file recall activated are deleted and restored afterwards (using the commands "dsmc rest", "dsmc retr" or "dsmmigundelete"), the recall mode will be set to normal.
  • When a file system is full, demand migration occurs. Valid files in the file system are migrated and free space is available. However, a GPFS limitation results in an error message like 'No space left on device'.
  • If HSM is globally deactivated on the owner node (GPFS HSM environment with at least 2 nodes), an attempt to recall a migrated file from another node produces the following message:

    "ANS4007E Error processing 'xxx': access to the object is denied"

    To recall the migrated files correctly, HSM must first be globally reactivated on the owner node. Note that the same ANS4007E message is also displayed in other cases, when the HSM environment is not set up correctly or a recall fails for other reasons.
  • The restore of stub files does not work with multiple storage pools, or with files that have ACLs.
  • The ctime option of GPFS should be set to no (default) to prevent unwanted backups of files with the backup-archive client after GPFS file migration from pool to pool. 
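
The node renumbering procedure in this list has two manual wait points, so it may help to print it as a checklist for a specific node before running anything. The sketch below only prints the commands named in this document in their documented order; it executes nothing. The function name print_renumber_plan is hypothetical, and the dsmwatchd PID and the file system names are supplied by the operator.

```shell
#!/bin/sh
# Hypothetical helper: print the renumbering steps for review.
# Usage: print_renumber_plan <dsmwatchd-pid> [file-system ...]
print_renumber_plan() {
    watchd_pid=$1
    shift

    echo "dsmmigfs stop"
    echo "# wait until all HSM daemons except dsmwatchd have stopped"
    echo "rm /etc/adsm/SpaceMan/config/instance"
    echo "kill -SIGTERM $watchd_pid"
    echo "# wait until dsmwatchd has been restarted by the system"
    echo "dsmmigfs start"

    # One takeover per file system previously managed by this node.
    for fs in "$@"; do
        echo "dsmmigfs takeover $fs"
    done
}
```

Lines starting with "#" in the output mark the points where the procedure requires waiting before the next command is issued.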


HSM AIX GPFS known problems and limitations
  • For files with a size between 4 and 8 KB, dsmmigrate fails with ANS9523E, ANS9999E. Files greater than 8 KB can be migrated.
  • The initial backup of a migrated file with ACLs will trigger a recall. The file is in the premigrated state afterwards.





HSM AIX JFS2 known problems and limitations
  • HSM does not support local automounts for directories contained in HSM managed file systems by using the "autofs" file system type. The workaround is to set the environment variable COMPAT_AUTOMOUNT to 1 and restart the automounter.
  • Threshold migration might migrate slightly below the low threshold limit.
  • Threshold values of Ht=Lt=100% and Ht=Lt=0% should not be defined since threshold migration will not be reliable with these settings.
  • If stub files are larger than the original files and these files are manually recalled using the dsmrecall command, they will be recalled to resident state. Note: A stub file can only be larger than the original file if the option MINMIGFILESIZE is set to a value smaller than the stub size.
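
The threshold restriction above can be checked before values are applied. This is a minimal sketch covering only the two combinations named in this list (Ht=Lt=100% and Ht=Lt=0%); the function name thresholds_are_reliable is hypothetical, and the arguments are whole-number percentages.

```shell
#!/bin/sh
# Hypothetical check for the threshold restriction.
# Usage: thresholds_are_reliable <high_threshold> <low_threshold>
# Returns 1 for Ht=Lt=100 and Ht=Lt=0, and 0 for every other pair.
thresholds_are_reliable() {
    ht=$1
    lt=$2

    if [ "$ht" -eq "$lt" ] && { [ "$ht" -eq 100 ] || [ "$ht" -eq 0 ]; }; then
        return 1
    fi
    return 0
}
```

The check does not validate anything else about the pair, such as whether the low threshold is below the high threshold.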





HSM Linux problems and limitations
  • When SELinux is used in enforcing or permissive mode, the restore of stub files does not work. These files will be restored to resident state; in other words, they will be restored completely.
  • When GPFS 3.4 is used, HSM cannot be run with SELinux enabled in enforcing mode. In enforcing mode, HSM commands will fail with the error message: "Cannot restore segment prot after reloc: Permission denied". This error is generated by the Linux dynamic linker code, because GPFS 3.4 does not support SELinux enabled in enforcing mode. HSM on GPFS 3.5 is not affected by this restriction.
  • When GPFS is shut down on a cluster node where HSM daemons are active, then GPFS will not be able to unload all kernel modules on this node. This leads to DMAPI session problems with HSM daemons and it can cause possible issues if the GPFS level is upgraded on that node. The only way to avoid this behavior is to stop all running HSM daemons before shutting down GPFS on this node. The recommendation is to uninstall the HSM on the GPFS node where the upgrade shall take place. If no GPFS node upgrade is planned, then this issue does not cause problems affecting HSM operations.
  • In case of a GPFS restart, sometimes GPFS kernel modules are not unloaded. This can lead to DMAPI session problems with HSM daemons.
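
Whether the stub file restore limitation applies on a given node depends on the current SELinux mode, which the standard getenforce command reports as Enforcing, Permissive, or Disabled. The following is a minimal sketch; the function name stub_restore_affected is hypothetical, and it only classifies a mode string according to the first item in this list.

```shell
#!/bin/sh
# Hypothetical helper: classify an SELinux mode string.
# Usage: stub_restore_affected <mode>
# Returns 0 if stub files would be restored as resident files
# (enforcing or permissive mode), 1 otherwise.
stub_restore_affected() {
    mode=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $mode in
        enforcing|permissive) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a node with the SELinux utilities installed, it can be called as stub_restore_affected "$(getenforce)".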





HSM Java GUI known problems and limitations
  • Some JRE versions have problems transferring the focus to the GUI components depending on the version of the operating system you are using. For example, accessing shortcuts on a menu (for example, by pressing ALT-F to open the file menu) and combo-boxes in a modal dialog by clicking the mouse button might not get the focus correctly. To solve this problem, you need to transfer the focus manually to that component by pressing the TAB key (or CTRL-TAB) several times.
  • If you have set the DSM_DIR environment variable, you need to create a link in DSM_DIR pointing to the hsmagent.opt XML configuration file, otherwise the "Space Management Agent" will not start correctly. For example:

    ln -s /opt/tivoli/tsm/client/hsm/bin/hsmagent.opt  \
            $DSM_DIR/hsmagent.opt
 
  • Changes to the hsmagent.opt file will be effective only after the hsmagent is restarted.
  • "IBM Tivoli Enterprise Space Management Console" needs to be disconnected from all the TSM Client Nodes if you change dsm.opt or dsm.sys configuration files, otherwise the previous configuration will be used.
  • Adding Space Management to a file system might result in showing a yellow warning icon in the "state" column of the File System table regardless of whether the Master scout daemon is running. The "Space Management Agent" requires some time to update this information.
  • The "Stub File Size" option is disabled with this version. Please use the command line as workaround to change the size of the stub file that replaces a migrated file in a HSM managed file system.
  • The "Import Setting" menu item is disabled with this version. Customized settings like the list of client nodes, and other table view customization cannot be imported from other machines.
  • Secure Sockets Layer (SSL) is not yet supported on this version.
  • Status and warning messages for Monitor, Recall, Watch and Root daemons are not yet implemented.
  • The functionality to change the TSM password is not yet implemented.
  • TSM Client Node properties cannot yet be modified from the GUI.
  • It is not yet possible to register a new node on the TSM server.
  • It is not yet possible to start "IBM Tivoli Enterprise Space Management Console" remotely from a browser by using the "Space Management Agent".
  • To display the "IBM Tivoli Enterprise Space Management Console" in the system default language (if it is not English), install the related language pack as mentioned in the "Installing the product" section.
  • The non-root user should set the TSM system environment variable DSM_LOG to a directory with write permission (e.g. export DSM_LOG=/home/) and verify that the non-root user has write permission to the configuration and log files (dsmsm.cfg and dsmsm.log).
  • Russian, Czech, Hungarian and Polish are not localized in the JRE by Sun. For this reason, some system messages such as "Ok" and "Cancel" might be displayed in English regardless of whether the language pack is installed.
  • Some NLS messages (uil_nls.jar) are not provided for Russian, Czech, Hungarian and Polish languages. Messages that are affected are located in the status bar and by right-clicking (filter and sorter) the header columns of the Client Nodes table and File System table.
  • Czech, Hungarian and Polish languages require the following font: -dt-*-medium-r-normal-*-15-100-100-100-m-70-iso8859-2
  • Russian language requires the following font: -dt-*-medium-r-normal-*-15-100-100-100-m-70-iso8859-5
  • The Java GUI supports Japanese, Korean, Simplified and Traditional Chinese. However, Java supports these encodings only on some Linux platforms. For more information see http://www.oracle.com/technetwork/java/javase/config-417990.html
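
For the non-root DSM_LOG requirement in this list, a quick check of the chosen directory can catch permission problems before the console is started. This is a minimal sketch; the function name check_dsm_log_dir is hypothetical, and it only tests the directory itself, not the dsmsm.cfg and dsmsm.log files.

```shell
#!/bin/sh
# Hypothetical helper: verify that a DSM_LOG directory is usable
# by the current user.
# Usage: check_dsm_log_dir <directory>
check_dsm_log_dir() {
    dir=$1

    if [ -z "$dir" ]; then
        echo "DSM_LOG is not set" >&2
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "$dir is not a directory" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "$dir is not writable by $(id -un)" >&2
        return 1
    fi
    return 0
}
```

Run it as the non-root user that starts the console, for example check_dsm_log_dir "$DSM_LOG".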


Document Information

Modified date:
17 June 2018

UID

swg21646985