A system is fault-tolerant if it can continue to operate
when some of its components fail. Fault tolerance makes your remote-boot
infrastructure more robust.
In the case of OS deployment servers,
the whole system is fault-tolerant if the OS deployment servers back
each other up. When a server fails, the other servers handle the requests
that the failed server would have received.
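The failover behavior described above can be illustrated with a minimal sketch. It is a generic illustration only, not the mechanism that the product uses: the server names, the `handle_with_failover` function, and the `fake_send` stand-in for a network call are all hypothetical, and the sketch assumes that an unreachable server surfaces as a `ConnectionError`.

```python
from typing import Callable, Sequence


class AllServersDown(Exception):
    """Raised when no server in the pool could handle the request."""


def handle_with_failover(
    servers: Sequence[str],
    send: Callable[[str, str], str],
    request: str,
) -> str:
    """Try each server in order and return the first successful reply."""
    for server in servers:
        try:
            return send(server, request)
        except ConnectionError:
            # This server is treated as down; fall through to the next one.
            continue
    raise AllServersDown(f"no server could handle {request!r}")


# Hypothetical stand-in for the network call; "server-a" simulates an outage.
def fake_send(server: str, request: str) -> str:
    if server == "server-a":
        raise ConnectionError(f"{server} is unreachable")
    return f"{server} handled {request}"


print(handle_with_failover(["server-a", "server-b"], fake_send, "boot-request"))
```

In this sketch the request is answered by `server-b` because `server-a` is simulated as down. If every server in the list fails, the function raises `AllServersDown`, which corresponds to the case where no backup server remains.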
Implementing fault tolerance at the Tivoli® Provisioning Manager for Images level
does not mean that your whole network infrastructure is fault-tolerant.
You can implement fault tolerance at all levels:
- At the physical level, by having redundant power sources (if all
OS deployment servers lose power at the same time, fault tolerance
at the product level is useless)
- At the network level, by having backup network links and backup
active elements (the backup server must be able to reach remote-boot targets)
- At the network operating system level, by having multiple network
domains, or by running OS deployment servers outside
of your domain architecture (OS deployment servers should
not all be linked to the same NT PDC, or the same NFS server)
- At the DHCP level, by having multiple DHCP servers on the same
subnet
- At the Tivoli Provisioning Manager for Images level,
by implementing the fault-tolerance instructions
- At the operating system level (if Tivoli Provisioning Manager for Images
is able to survive a severe problem, but the operating system then
cannot find its network server, fault tolerance is useless)
The following sections describe how to implement
fault tolerance at the DHCP and Tivoli Provisioning Manager for Images levels.
Other levels are beyond the scope of this document.