IBM Support

IBM Spectrum Protect for Virtual Environments: Data Protection for VMware – Overview of Change Block Tracking

Question & Answer


Question

How does the VE data mover work with VMware's Change Block Tracking (CBT)?

Answer


Overview

The Data Protection for VMware data mover uses the VMware vStorage APIs for Data Protection (VADP) to back up and restore virtual machines in a vCenter environment. VADP is made up of two major components: the vSphere Management SDK and the Virtual Disk Development Kit (VDDK).

For more information about VADP, see the VMware FAQ at http://kb.vmware.com/kb/1021175.

Change Block Tracking (CBT) is a VMware feature that assists the data mover in performing incremental backups of virtual machines. The vSphere Management SDK is responsible for returning the CBT extent data (changed blocks) to the data mover. CBT identifies the initially allocated blocks and tracks the blocks that have changed since the last backup.

Limitations for CBT:

  • The host must be ESX/ESXi 4.0 or later.
  • The virtual machine must be hardware version 7 or later.
  • I/O operations must go through the ESX/ESXi storage stack. All VMFS datastores are supported, whether backed by SAN, iSCSI, or local disk. NFS datastores are also supported, except for reporting the initially allocated blocks, and so are virtual RDMs.
  • Physical RDMs and disks that are accessed directly from the guest OS (iSCSI or NFS) are not supported. Also, CBT is not supported if the VMDK is attached to a shared virtual SCSI bus.
  • The virtual machine's VMDK must not be an independent disk, that is, a disk that is unaffected by snapshots.
  • The virtual machine cannot be a template virtual machine.
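The limitations in this list can be expressed as a simple pre-check. The following Python sketch is illustrative only: the DiskInfo and VmInfo classes and the cbt_blockers function are hypothetical names invented for this example and are not part of the data mover or of any VMware SDK. Disks that are attached from inside the guest OS are not visible as virtual disks, so this sketch does not model them.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DiskInfo:
    """Hypothetical summary of one virtual disk (not a VMware or IBM type)."""
    label: str
    physical_rdm: bool = False
    shared_scsi_bus: bool = False
    independent: bool = False


@dataclass
class VmInfo:
    """Hypothetical summary of one virtual machine."""
    esx_version: Tuple[int, int]   # for example (5, 5)
    hardware_version: int          # for example 8 for vmx-08
    is_template: bool = False
    disks: List[DiskInfo] = field(default_factory=list)


def cbt_blockers(vm: VmInfo) -> List[str]:
    """Return the reasons CBT cannot be used; an empty list means eligible."""
    reasons = []
    if vm.esx_version < (4, 0):
        reasons.append("host is older than ESX/ESXi 4.0")
    if vm.hardware_version < 7:
        reasons.append("hardware version is below 7")
    if vm.is_template:
        reasons.append("virtual machine is a template")
    for disk in vm.disks:
        if disk.physical_rdm:
            reasons.append(f"{disk.label}: physical RDM")
        if disk.shared_scsi_bus:
            reasons.append(f"{disk.label}: shared virtual SCSI bus")
        if disk.independent:
            reasons.append(f"{disk.label}: independent disk")
    return reasons
```

An empty result means that none of the listed limitations applies; each entry otherwise names one limitation that prevents the use of CBT.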
On the first backup, if all of the aforementioned requirements are met, the data mover enables CBT and requests the allocated blocks on each VMDK. For this first backup the data mover uses a special CBT change ID of “*”. In addition, the following conditions must be met:
  • The VMDKs must be on a VMFS volume backed by SAN, iSCSI, vSAN, or local disks.
  • The virtual machine must not have pre-existing snapshots.
For non-VMFS volumes, including NFS volumes, CBT returns either an error or a single extent covering the entire VMDK. If an error is returned, the data mover is forced to create a synthetic extent covering the entire disk. This means that a thin-provisioned VMDK is effectively converted into a thick-provisioned VMDK, which is always the case for thin-provisioned VMDKs on NFS datastores. A thick-provisioned, eager-zeroed VMDK also returns a single extent covering the entire disk because all of its blocks are allocated and zeroed. After the first full backup, the data mover saves the current CBT change ID and uses it to request the incremental data during the next backup.

Table 1: Change Block Tracking Flow

Backup type    New change ID    Old change ID    ID for CBT query    Result
full           changeID 0       none             *                   All used blocks
incremental    changeID 1       changeID 0       changeID 0          All blocks since changeID 0
incremental    changeID 2       changeID 1       changeID 1          All blocks since changeID 1
….             ….               ….               ….                  ….
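The change ID bookkeeping in Table 1 can be illustrated with a short simulation. This Python sketch is not the data mover's implementation; the function name run_backups and the tuple layout are assumptions made for the example. It only shows which ID is passed to the CBT query on each backup and which ID is saved afterward.

```python
def run_backups(new_change_ids):
    """Simulate successive backups of one VMDK.

    new_change_ids: the change ID reported for each backup, in order.
    Returns one row per backup:
    (backup type, new change ID, old change ID, ID used for the CBT query).
    """
    saved_id = None          # nothing is saved before the first backup
    rows = []
    for new_id in new_change_ids:
        if saved_id is None:
            # First backup: the special ID "*" requests all allocated blocks.
            rows.append(("full", new_id, None, "*"))
        else:
            # Later backups: query the blocks changed since the saved ID.
            rows.append(("incremental", new_id, saved_id, saved_id))
        saved_id = new_id    # save the current ID for the next backup
    return rows


for row in run_backups(["changeID 0", "changeID 1", "changeID 2"]):
    print(row)
```

If the saved change ID becomes invalid, the simulation corresponds to starting again with no saved ID, which results in a new full backup.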

VMware provides the following KB article that describes best practices to follow when using advanced transports and CBT: http://kb.vmware.com/kb/1035096. The following VMware KB article provides some help in dealing with CBT issues and possible failures: http://kb.vmware.com/kb/1020128.

The data mover automatically enables CBT when the aforementioned CBT requirements are met. You can confirm that CBT is enabled in one of two ways. From the data mover, issue the “show vm all” command to see a detailed list of all the virtual machines in the inventory, including the CBT attribute “changeTracking”. This attribute has a value of “On” or “Off”.

Example:


44.vmName: xp-32(0)
hostAddress: na-3912a122d1e6
tsmNodeName: na-3912a122d1e6
displayName: xp-32(0)
ipAddress: 192.168.0.160
datacenter: DC Lab
hostSystem: oneshot.home.lan
guestFolder:
guestFullName: Microsoft Windows XP Professional (32-bit)
altGuestName:
guestId: winXPProGuest
uuid: 422b19f7-0322-8c04-727f-68897688087a, moref: vm-1406
instance uuid: 502ba39f-df31-022e-e006-96d3c918e2bb
guestState: notRunning connectionState: disconnected
changeTracking: On vmHWversion: vmx-08
toolsRunningStatus: guestToolsNotRunning
toolsVersion: 9349 toolsVersionStatus: guestToolsSupportedNew
consolidationNeeded: No
vmFaultTolerant: No
domainKeyword:
domainSelected: No
cluster: Clouds.SRQ.VM
vApp: Clouds vApp
resourcePool:
VMDK[1]Label: 'Hard disk 1' (Hard Disk 1)
VMDK[1]Name: '[datastore1-4] xp-32(0)/xp-32(0).vmdk'
VMDK[1]Status: Included
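When the inventory is large, the changeTracking attribute can be extracted from saved “show vm all” output with a small script. The following Python sketch assumes that the output has the same layout as the example above, with a vmName line starting each entry and changeTracking sharing a line with vmHWversion. The function name cbt_status is hypothetical.

```python
import re

VM_NAME = re.compile(r"vmName:\s*(.+?)\s*$")
CHANGE_TRACKING = re.compile(r"changeTracking:\s*(On|Off)\b")


def cbt_status(output: str) -> dict:
    """Map each vmName in the output to True (CBT On) or False (CBT Off)."""
    status = {}
    current_vm = None
    for line in output.splitlines():
        name = VM_NAME.search(line)
        if name:
            # A vmName line starts the entry for the next virtual machine.
            current_vm = name.group(1)
            continue
        tracking = CHANGE_TRACKING.search(line)
        if tracking and current_vm is not None:
            status[current_vm] = tracking.group(1) == "On"
    return status
```

Virtual machines that map to False are candidates for checking against the CBT limitations listed earlier.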



Another method can be found in the following IBM Technote. The method described uses the vSphere Client: http://www.ibm.com/support/docview.wss?uid=swg21516726.

Finally, a powered-on virtual machine must go through a stun-unstun cycle (power on, resume after suspend, migrate, or snapshot create/delete/revert) before CBT is actually enabled or disabled. The data mover therefore uses snapshot creates and deletes to accomplish this stun-unstun cycle.

Common CBT issues

One frequently reported issue is that the first full backup returns the entire VMDK. As discussed above, if the datastore is NFS backed, CBT reports that the entire VMDK is allocated. This is a limitation of NFS datastores, where the allocated blocks cannot be obtained from the NFS hardware. More information can be found in the VMware KB: http://kb.vmware.com/kb/2077787.

A second issue is that an incremental CBT request fails because the CBT change ID is invalid, and the data mover is forced to take a new full backup. This issue can occur if CBT has been reset due to power failures, hard shutdowns, cold migration, or Storage vMotion. For more information, see the following VMware KBs: http://kb.vmware.com/kb/1020128 and http://kb.vmware.com/kb/2048201. VMware reports that the Storage vMotion issue is fixed in ESXi 5.5 Update 2.

Another potential issue is that CBT overstates changes, so the data mover backs up too much data. There are several possible explanations. The first, and often overlooked, explanation is that some in-guest applications, such as anti-virus software, run daily and modify the guest hard disk. There are also known defects in certain VMware ESXi versions and in the VDDK. For more information, see the following IBM Technotes: http://www.ibm.com/support/docview.wss?uid=swg21635006 and http://www.ibm.com/support/docview.wss?uid=swg21628701.

When CBT is already enabled and the virtual disk is extended across a 128 GB boundary, CBT can also return an incorrect size. VMware reports that resetting CBT corrects this problem. See VMware KB: http://kb.vmware.com/kb/2090639.

For example:
- disk grows from 20GB to 100GB: no impact
- disk grows from 20GB to >128GB: impacted
- disk grows from 140GB to 200GB: no impact
- disk grows from 140GB to >256GB: impacted
- disk grows from 400GB to 500GB: no impact
- disk grows from 400GB to >512GB: impacted
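The pattern in these examples can be captured in a small check. The following Python sketch is a heuristic inferred from the examples only and is not VMware's definition of the defect. The function name is hypothetical, and the treatment of a disk whose original size sits exactly on a multiple of 128 GB is an assumption.

```python
BOUNDARY_GB = 128


def crosses_128gb_boundary(old_gb: int, new_gb: int) -> bool:
    """Return True if growing the disk passes the next multiple of 128 GB.

    Assumption: the relevant boundary is the first multiple of 128 GB that
    is strictly above the old size, and the disk is impacted only when the
    new size is strictly greater than that boundary.
    """
    if new_gb <= old_gb:
        return False  # the disk was not extended
    next_boundary = (old_gb // BOUNDARY_GB + 1) * BOUNDARY_GB
    return new_gb > next_boundary
```

For example, a disk that grows from 140 GB to 200 GB stays below the 256 GB boundary and is not flagged, while a disk that grows from 140 GB to 300 GB is flagged.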

Lastly, we have seen errors from the CBT function after improper ESXi reboots or shutdowns. In these cases the saved CBT change ID has become invalid and CBT must be reset. An error message similar to the following example is written to the dsmerror.log. The data mover can perform the reset automatically, but the process requires CBT to be disabled and then enabled again, which puts the virtual machine through two snapshot stun-unstun cycles.

Example:

ANS9365E VMware vStorage API error.
TSM function name : QueryChangedDiskAreas
TSM file : vmvisdk.cpp (3592)
API return code : 12
API error message : SOAP 1.1 fault: "":ServerFaultCode[no subcode]"A specified parameter was not correct."
ANS9365E VMware vStorage API error.
TSM function name : visdkPrintSOAPError
TSM file : vmvisdk.cpp (885)
API return code : 12
API error message : SOAP 1.1 fault: "":ServerFaultCode[no subcode]"Error caused by file /vmfs/volumes/4ade85fd-81f49624-57f5-000e0cdd0d21/winxp-32/winxp-32.vmdk"
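To find affected disks in a large dsmerror.log, the error pattern above can be searched for with a script. The following Python sketch assumes that the log entries follow the layout of this example, where a QueryChangedDiskAreas failure is followed by a message that names the VMDK file. That pairing is an assumption based on this single example, and the function name is hypothetical.

```python
import re

FUNCTION_NAME = re.compile(r"TSM function name\s*:\s*(\S+)")
CAUSED_BY = re.compile(r'Error caused by file (.+?)"')


def cbt_query_failures(log_text: str) -> list:
    """Return the VMDK paths reported after a QueryChangedDiskAreas error."""
    paths = []
    pending = False  # True after a QueryChangedDiskAreas error was seen
    for line in log_text.splitlines():
        function = FUNCTION_NAME.search(line)
        if function and function.group(1) == "QueryChangedDiskAreas":
            pending = True
            continue
        caused = CAUSED_BY.search(line)
        if caused and pending:
            paths.append(caused.group(1))
            pending = False
    return paths
```

Each returned path identifies a VMDK whose saved change ID was probably rejected, which makes the owning virtual machine a candidate for a CBT reset.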


Forcing a CBT reset

For invalid CBT change IDs a CBT reset is necessary. Run a single TSM backup with the testflag vmbackup_cbt_reset.

Example:

dsmc backup vm 'myvm' -testflag=vmbackup_cbt_reset
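When several virtual machines need a reset, the command can be generated per virtual machine. The following Python sketch only builds the argument list for the command shown in the example and does not run it. The function name cbt_reset_command is hypothetical, and the sketch assumes that dsmc is on the PATH of the data mover.

```python
def cbt_reset_command(vm_name: str) -> list:
    """Build the argument list for a single backup that resets CBT."""
    if not vm_name:
        raise ValueError("vm_name must not be empty")
    # Passing the name as its own list element avoids shell quoting issues.
    return ["dsmc", "backup", "vm", vm_name, "-testflag=vmbackup_cbt_reset"]
```

The resulting list can be passed to subprocess.run on the data mover. Because the testflag is intended for a single backup, it should not be added to a scheduled backup definition.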


Diagnosing problems

To collect a data mover trace, include the following lines in the dsm.opt file:


traceflag vm
tracefile vmbackup.trc


In addition, edit the dsmvddk.opt file and change the following lines to “6” to enable trivia tracing in the VDDK API:


# 0-quiet, 1-panic, 2-error, 3-warning, 4-info, 5-verbose, 6-trivia

vixDiskLib.transport.LogLevel = "6"
vixDiskLib.nfc.LogLevel = "6"
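The edit to dsmvddk.opt can also be scripted. The following Python sketch rewrites the two log level lines in the text of the file and leaves all other lines, including comments, unchanged. The function name is hypothetical, and the sketch assumes that each setting is written as key = "value" on its own line, as in the example above.

```python
LOG_LEVEL_KEYS = (
    "vixDiskLib.transport.LogLevel",
    "vixDiskLib.nfc.LogLevel",
)


def set_vddk_log_level(opt_text: str, level: int = 6) -> str:
    """Return the option file text with both VDDK log levels set to level."""
    if not 0 <= level <= 6:
        raise ValueError("level must be between 0 (quiet) and 6 (trivia)")
    lines = []
    for line in opt_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in LOG_LEVEL_KEYS:
            line = f'{key} = "{level}"'
        lines.append(line)
    return "\n".join(lines) + "\n"
```

Calling the function again with a lower level restores quieter logging after the trace has been collected.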


Document Information

Modified date:
17 June 2018

UID

swg21681916