How to determine how many data movers are appropriate to protect a vSphere environment

Question & Answer


Question

When deploying IBM Spectrum Protect™ for Virtual Environments: Data Protection for VMware, how do you determine how many data movers are appropriate to protect a vSphere environment?

Answer

The goal of this article is to provide a quick estimate, based on the size of the protected data, of the number of data movers needed to protect a vSphere environment for a steady-state backup workload (that is, a workload based on incremental forever backup technology).  Over time, the number of data movers can be adjusted based on observed behavior.

Obtaining an exact data mover sizing can be difficult because many factors must be taken into consideration: the compute and network structure (traditionally called "feeds and speeds"), the use cases (steady-state backup as opposed to initial full backups or various recovery scenarios), and the nature of the data (for example, average daily change rate).  Because many of these factors can only be determined by observation, it is much more practical to determine the initial number of data movers from the estimated size of the protected environment and then adjust accordingly.

 The general, simple rule is to use one data mover for every 100 TB of vSphere data.

 For example, if the total virtual machine size of your vSphere environment is 150 TB, it is recommended to start with two data movers to protect this environment.  Note that the total virtual machine size of your vSphere environment can simply be the total used size reported by the vSphere Web Client or by VMware PowerCLI (for example, get-vm | Select Name, UsedSpaceGB; refer to the VMware documentation for what this value represents in your environment).
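As a rough illustration of totaling the used size, the following Python sketch sums the UsedSpaceGB column of a CSV file. It assumes the PowerCLI output above was exported to CSV with a header row (for example, with Export-Csv -NoTypeInformation); the function name, the sample rows, and the 1024 GB-per-TB conversion factor are assumptions made for this example and are not part of the product.

```python
import csv
import io


def total_used_tb(csv_text, gb_per_tb=1024.0):
    """Sum the UsedSpaceGB column and convert the total to TB.

    gb_per_tb is an assumption; pass 1000.0 if you prefer decimal units.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    total_gb = sum(float(row["UsedSpaceGB"]) for row in reader)
    return total_gb / gb_per_tb


# Hypothetical sample data standing in for an exported PowerCLI report.
sample = """Name,UsedSpaceGB
vm-app01,51200
vm-db01,102400
"""

print(total_used_tb(sample))  # 150.0
```

The resulting figure is only a starting estimate of the protected data size.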

 The general rule of 100 TB of vSphere data per data mover was obtained using the following assumptions:

  • 10 GbE (or HotAdd / SAN equivalent) available on all data paths in the environment, specifically the path from the datastore to the data mover and then to the IBM Spectrum Protect server
  • 5% average daily change rate
  • 8 hour backup window
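The rule and its assumptions can be sketched in a few lines of Python. This is a minimal illustration, not an IBM sizing tool: the function names are hypothetical, and the throughput figure assumes decimal units (1 TB = 1,000,000 MB) and that the daily change is spread evenly across the backup window.

```python
import math

TB_PER_DATA_MOVER = 100  # general rule from this article


def estimate_data_movers(total_tb):
    """Initial data mover count: one per 100 TB, rounded up, at least one."""
    return max(1, math.ceil(total_tb / TB_PER_DATA_MOVER))


def implied_throughput_mb_per_s(protected_tb=100, daily_change=0.05, window_hours=8):
    """Average rate one data mover must sustain under the stated assumptions."""
    changed_mb = protected_tb * daily_change * 1_000_000  # decimal TB to MB
    return changed_mb / (window_hours * 3600)


print(estimate_data_movers(150))                 # 2
print(round(implied_throughput_mb_per_s(), 1))   # 173.6
```

In other words, 100 TB at a 5% daily change rate is about 5 TB per day, which over an 8 hour window averages roughly 174 MB/s per data mover. If your change rate or backup window differs, the same arithmetic shows how far your environment is from the assumptions behind the rule.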
 

Physical or Virtual Data Movers

Another common question is whether the data movers should be physical servers or virtual machines.  Again, many factors need to be considered in choosing the appropriate solution, but two simple rules can be applied:

  1. If you plan on using LAN-free technology for moving data from the data mover into the IBM Spectrum Protect server, you will need to use a physical data mover.  If this requirement does not exist in your environment, consider using a virtual data mover.
  2. Regardless of the type of data mover, the data mover must have appropriate, dedicated resources available (see "Data mover hardware specification" below).


Data mover locality

One other important consideration in determining the initial number of data movers is data mover locality, that is, whether there are any requirements on where data movers are placed relative to the protected vSphere data.  Consider the following:

  • Each pairing of a vCenter with an IBM Spectrum Protect server must have its own dedicated data movers.  For example, if you have two vCenters that are being protected, you will need at least two data movers, regardless of the aggregate size of the virtual machines in the two vCenters.
  • If you are using virtual data movers and plan to use the HotAdd transport, each host cluster must have a dedicated data mover.  For example, if you have an environment with two host clusters, it would require two data movers regardless of the aggregate size of the virtual machines in the two host clusters.
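One way to combine the locality rules with the 100 TB rule is sketched below in Python. The function names, the environment layout, and the choice to size each host cluster independently when HotAdd is used are assumptions made for this example; treat the result as a starting point only.

```python
import math


def movers_for_vcenter(cluster_sizes_tb, virtual_hotadd, tb_per_mover=100):
    """Data movers for one vCenter, given the size in TB of each host cluster."""
    if virtual_hotadd:
        # Virtual data movers with HotAdd: at least one per host cluster.
        return sum(max(1, math.ceil(tb / tb_per_mover)) for tb in cluster_sizes_tb)
    # Otherwise size by the aggregate, with at least one per vCenter.
    return max(1, math.ceil(sum(cluster_sizes_tb) / tb_per_mover))


def total_movers(vcenters, virtual_hotadd):
    """Sum the per-vCenter counts; data movers are not shared across vCenters."""
    return sum(movers_for_vcenter(sizes, virtual_hotadd) for sizes in vcenters.values())


# Hypothetical environment: two vCenters, three host clusters, 90 TB in total.
env = {"vcenter-a": [30, 20], "vcenter-b": [40]}

print(total_movers(env, virtual_hotadd=False))  # 2
print(total_movers(env, virtual_hotadd=True))   # 3
```

Although 90 TB alone would suggest a single data mover, the two vCenters require at least two, and HotAdd across three host clusters requires three.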

Data mover hardware specification

The following hardware specification is suggested for each data mover:

  • CPU
      ◦ 16 cores (2.8 GHz) if using client-side deduplication
      ◦ 8 cores (2.8 GHz) if using server-side deduplication
  • 8 GB RAM
  • 10 GbE network
 

Recommended option file (dsm.opt) settings

The following options are recommended in the data mover option file (dsm.opt / dsm.sys).

servername <server_stanza_name> ** Linux only

nodename <node_name>

passwordaccess generate

tcpserveraddress <spectrum_protect_server.company.com>

tcpport 1500

httpport 1585 ** Must be unique for each node

commmethod tcpip

errorlogname dsmerror.spve.log

schedlogname dsmsched.spve.log

managedservices schedule webclient

** vm processing options

vmtagdefaultdatamover <datamover_name>

vmchost <vcenter.company.com>

vmbackuptype fullvm

vmfulltype vstor

vmvstortransport NBD

vmskipmaxvirtualdisks yes

vmprocessvmwithprdm yes

vmtagdatamover yes

vmmaxvirtualdisks 8

** vm backup / restore tuning options
vmmaxparallel 8

vmlimitperhost 8

vmlimitperdatastore 4

vmmaxbackupsession 8

vmmaxrestoresession 6

Document Information

Modified date:
02 November 2020

UID

swg22007197