Overview of the planning process

You can plan for high availability within a data center with PowerHA® SystemMirror® Standard Edition for AIX® and for multisite high availability and disaster recovery with PowerHA SystemMirror Enterprise Edition for AIX.

Your major goal throughout the planning process is to eliminate single points of failure. A single point of failure exists when a critical cluster function is provided by a single component. If that component fails, the cluster has no other way of providing that function, and the application or service dependent on that component becomes unavailable.

For example, if all the data for a critical application resides on a single disk, that disk is a single point of failure for the entire cluster. If the disk fails, clients cannot access the application until the data on the disk is restored. Likewise, if dynamic application data is stored on internal disks rather than on external disks, it is not possible to recover an application by having another cluster node take over the disks. Therefore, identifying the logical components that an application requires, such as file systems and directories (which could contain application data and configuration variables), is an important prerequisite for planning a successful cluster.
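
The definition above can be turned into a simple planning check. The following Python sketch is illustrative only and is not part of PowerHA SystemMirror. The function name `find_single_points_of_failure`, the inventory layout, and the component names (`disk0`, `en0`, `nodeA`, and so on) are all hypothetical. It assumes you have listed, for each critical cluster function, the components that can provide it, and it flags every function that is backed by only one component.

```python
from typing import Dict, List


def find_single_points_of_failure(providers: Dict[str, List[str]]) -> Dict[str, str]:
    """Return {function: component} for each function backed by exactly one component."""
    spofs = {}
    for function, components in providers.items():
        unique = set(components)  # ignore accidental duplicate entries
        if len(unique) == 1:
            spofs[function] = next(iter(unique))
    return spofs


# Hypothetical planning inventory: critical function -> components that can provide it.
inventory = {
    "application data": ["disk0"],
    "network access": ["en0", "en1"],
    "application server": ["nodeA", "nodeB"],
}

print(find_single_points_of_failure(inventory))
# prints {'application data': 'disk0'}
```

In this hypothetical inventory, only the application data depends on a single component, which matches the single-disk example above.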

Realize that, while your goal is to eliminate all single points of failure, you might have to make some compromises. Eliminating a single point of failure usually has a cost. For example, purchasing an additional hardware device to serve as a backup for the primary device increases cost. Compare the cost of eliminating a single point of failure against the cost of losing services if that component fails. Again, the purpose of PowerHA SystemMirror is to provide a cost-effective, highly available computing platform that can meet future processing demands.
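
One way to frame this comparison is as a rough expected-loss estimate. The following Python sketch is illustrative only. The function names, the formula (failures per year multiplied by outage hours and by cost per hour), and every figure in the example are assumptions, not values or methods from PowerHA SystemMirror. Substitute estimates from your own environment.

```python
def expected_annual_loss(failures_per_year: float,
                         outage_hours: float,
                         cost_per_hour: float) -> float:
    """Estimate the yearly cost of lost service for one unprotected component."""
    return failures_per_year * outage_hours * cost_per_hour


def redundancy_is_worthwhile(annual_redundancy_cost: float,
                             failures_per_year: float,
                             outage_hours: float,
                             cost_per_hour: float) -> bool:
    """True when the estimated yearly loss exceeds the yearly cost of a backup."""
    loss = expected_annual_loss(failures_per_year, outage_hours, cost_per_hour)
    return loss > annual_redundancy_cost


# Hypothetical figures: one failure every four years, an 8-hour outage per
# failure, 5000 per hour of lost service, and 3000 per year for a backup device.
loss = expected_annual_loss(0.25, 8, 5000)
print(loss)                                              # prints 10000.0
print(redundancy_is_worthwhile(3000, 0.25, 8, 5000))     # prints True
```

An estimate like this covers only direct outage cost. Factors such as contractual obligations or data loss might justify redundancy even when the simple comparison does not.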

Note: It is important that failures of cluster components be remedied as soon as possible. Depending on your configuration, PowerHA SystemMirror might not be able to handle a second failure, due to lack of resources.