Planning array configurations

When you plan your network, consider the RAID configuration that you intend to use. The system supports distributed RAID 1, 5, and 6 array configurations.

Distributed array

The concept of a distributed RAID array is to distribute an array of width W across a larger set of X drives. For example, you might have a 9+P+Q+R RAID 6 array that is distributed across a set of 40 drives. The array type and width define the level of redundancy; in this example, a 25% capacity overhead exists for parity and rebuild areas.
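The overhead figure follows directly from the geometry. A minimal sketch of the arithmetic for the 9+P+Q+R example above (the split into parity and rebuild-area strips is an interpretation of the notation, not a product API):

```python
# Illustrative calculation: redundancy overhead for the 9+P+Q+R RAID 6
# example, where 9 strips per stride hold data and the rest hold
# redundancy (P and Q parity, plus R interpreted as a rebuild-area strip).
data_strips = 9
parity_strips = 2      # P and Q (RAID 6)
rebuild_strips = 1     # R
stripe_width = data_strips + parity_strips + rebuild_strips  # 12

overhead = (parity_strips + rebuild_strips) / stripe_width
print(f"Capacity overhead: {overhead:.0%}")  # → Capacity overhead: 25%
```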

If an array stride needs to be rebuilt, two component strips must be read to rebuild the data for the third component. The set size defines how many drives are used by the distributed array; performance and usable capacity scale with the number of drives in the set. The other key feature of a distributed array is that, instead of having a hot-spare drive, the set includes rebuild-area strips that are also distributed across the set of drives. The data and distributed rebuild areas are arranged such that, if one drive in the set fails, redundancy can be restored by rebuilding data onto the spare rebuild areas at a rate much greater than the write rate of a single component drive.
Note: When you determine the array configuration for your system, plan to create arrays of 256 KiB strip size only.
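The rebuild-rate benefit can be illustrated with a simplified throughput model. The drive capacity and per-drive write rate below are assumed example numbers, not product measurements; the point is only that rebuild time shrinks as more drives share the writes:

```python
# Simplified, illustrative model (assumed figures, not product data):
# rebuild time scales inversely with the number of drives that can
# accept rebuild writes in parallel.
def rebuild_hours(capacity_gib, per_drive_mib_s, participating_drives):
    """Hours to restore redundancy for one failed drive's worth of data."""
    total_mib = capacity_gib * 1024
    return total_mib / (per_drive_mib_s * participating_drives) / 3600

# Traditional array: every rebuild write targets one hot-spare drive.
hot_spare = rebuild_hours(8000, 50, 1)
# Distributed array: rebuild areas on the 39 surviving drives share the load.
distributed = rebuild_hours(8000, 50, 39)
print(f"hot spare: {hot_spare:.1f} h, distributed: {distributed:.1f} h")
```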

Distributed arrays of NVMe drives are used to create large-scale internal managed disks. As a result, rebuild times are dramatically reduced, which decreases the volumes' exposure to the extra load of recovering redundancy. Because the capacity of these managed disks is potentially so great, when they are configured in the system, the overall limits change to allow them to be virtualized. For every distributed array, the space for 16 MDisk extent allocations is reserved; therefore, 15 other MDisk identities are removed from the overall pool of 4096. Distributed arrays also aim to provide a uniform performance level. To achieve this level, a distributed array can contain multiple drive classes if the drives are similar (for example, drives that have the same attributes but larger capacities). All the drives in a distributed array must come from the same I/O group to maintain a simple configuration model.
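The MDisk identity accounting above can be sketched as a short calculation; the helper function name is illustrative, not a product API:

```python
# Illustrative accounting based on the limits described above: each
# distributed array reserves space for 16 MDisk extent allocations,
# consuming 16 identities from the overall pool of 4096.
MDISK_POOL = 4096
IDENTITIES_PER_DISTRIBUTED_ARRAY = 16

def remaining_mdisk_identities(distributed_arrays, other_mdisks=0):
    """MDisk identities left after reserving for distributed arrays."""
    used = distributed_arrays * IDENTITIES_PER_DISTRIBUTED_ARRAY + other_mdisks
    return MDISK_POOL - used

print(remaining_mdisk_identities(4))  # 4096 - 4*16 = 4032
```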

A distributed array has the following key benefits:
  • Quicker rebuild times with less impact to host I/O.
  • More user flexibility in defining how many drives are used by an array (for example, a user can create 9+P arrays with 24 drives without having four drives left unused).
  • Distributed rebuild areas mean that no idle drives exist in the system, which slightly improves performance.
  • Degraded distributed RAID arrays can use the rebuild-in-place process (if supported) to restore redundancy, copying or reconstructing the data directly back into the replaced member drive.

One disadvantage of a distributed array is that its redundancy covers a greater number of components, which reduces the mean time between failures (MTBF). Quicker rebuild times improve MTBF; however, there are still limits to how widely an array can be distributed before the MTBF becomes unacceptable.

Distributed array expansion

Distributed array expansion allows the conversion of a small, minimally distributed array into a larger distributed array while preserving the volume configuration and restriping data for optimal performance. Expansion offers a way to get better rebuild performance from an existing configuration without migration steps that might require excess capacity. Expanding an existing distributed array is preferable to creating a new small array.

Expansion can increase the capacity of an array, but it cannot change the basic parameter of stripe width. When you plan a distributed array configuration, plan for future array requirements: a distributed array that fits within the extent limit (16 * 128 Ki extents) at a particular extent size might not fit if you expand it over time. Planning your extent size for the future is also important. The minimum (and recommended) storage pool extent size is 4096 MiB.
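The interaction between extent size and the extent limit can be made concrete with a small sizing check. This is a sketch of the arithmetic implied by the limits above (16 reserved MDisk identities, 128 Ki extents each), not a product sizing tool:

```python
# Illustrative sizing check: a distributed array is limited to
# 16 * 128 Ki extent allocations, so its maximum capacity depends on
# the storage pool extent size.
EXTENTS_PER_MDISK = 128 * 1024   # 128 Ki extents per MDisk identity
RESERVED_MDISKS = 16             # identities reserved per distributed array

def max_array_capacity_tib(extent_size_mib):
    """Largest distributed-array capacity (TiB) at a given extent size."""
    total_extents = RESERVED_MDISKS * EXTENTS_PER_MDISK
    return total_extents * extent_size_mib / (1024 * 1024)  # MiB -> TiB

# With the recommended 4096 MiB extent size:
print(max_array_capacity_tib(4096))  # 8192.0 TiB (8 PiB)
# A smaller extent size leaves far less headroom for future expansion:
print(max_array_capacity_tib(1024))  # 2048.0 TiB
```

A planned expansion that would push the array's capacity past this ceiling cannot be accommodated without re-creating the pool at a larger extent size, which is why extent size should be chosen for the array's eventual size, not its initial one.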

Expansion also benefits NVMe arrays for the same reasons. However, thin provisioned (compressing) NVMe drives add an extra layer of complexity when you calculate the available capacity in the array during an expansion. When you plan for the possible expansion of thin provisioned NVMe arrays, the drives must be the same physical and logical size. When you expand a thin provisioned NVMe array, the usable capacity is not immediately available, and the availability of new usable capacity does not track with logical expansion progress. The expansion process monitors usable capacity usage and analyzes the changes that are caused by the actions it takes during data restriping. This information is used to release the correct amount of usable capacity as it becomes available.