Subscription-based application failover between managed clusters

Use this application-based failover method when a managed cluster becomes unavailable for any reason.

Before you begin

  • When the primary cluster is in a state other than Ready, check the actual status of the cluster, because the displayed status might take some time to update.
    1. Navigate to RHACM console > Infrastructure > Clusters > Cluster list tab.
    2. Check the status of both the managed clusters individually before performing a failover operation.

    However, you can still run the failover operation when the cluster you are failing over to is in a Ready state.
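
The same check can be approximated from a terminal logged in to the Hub cluster. The following is a minimal sketch, not part of the documented procedure: the helper names are hypothetical, and it assumes that the Available condition on the ManagedCluster resource is a reasonable stand-in for the Ready status shown in the console, which is derived from more than one condition.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are assumptions; run while logged in to the Hub cluster.

# Print the Available condition (True, False, or Unknown) that the Hub
# records for the named managed cluster.
managed_cluster_available() {
  oc get managedcluster "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="ManagedClusterConditionAvailable")].status}'
}

# Succeed only when the failover target is reported as available.
target_ready() {
  [ "$(managed_cluster_available "$1")" = "True" ]
}
```

For example, `target_ready <target_cluster_name> && echo "target is available"` prints the message only when the Hub reports the target cluster as available. Treat the console as the authoritative view if the two disagree.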

Procedure

  1. Enable fencing on the Hub cluster.
    1. Open a CLI terminal and edit the DRCluster resource, where <drcluster_name> is your unique name.
      CAUTION:
      After the managed cluster is fenced, all communication from applications to the Fusion Data Foundation external storage cluster fails, and some pods on the fenced cluster enter an unhealthy state (for example, CreateContainerError or CrashLoopBackOff).
      oc edit drcluster <drcluster_name>
      apiVersion: ramendr.openshift.io/v1alpha1
      kind: DRCluster
      metadata:
      [...]
      spec:
        ## Add this line
        clusterFence: Fenced
        cidrs:
        [...]
      [...]
      Example output:
      drcluster.ramendr.openshift.io/ocp4perf1 edited
    2. Verify the fencing status on the Hub cluster for the Primary-managed cluster, replacing <drcluster_name> with your unique name.
      oc get drcluster.ramendr.openshift.io <drcluster_name> -o jsonpath='{.status.phase}{"\n"}'
      Example output: Fenced
    3. Verify that the IPs that belong to the OpenShift Container Platform cluster nodes are now in the blocklist.
      ceph osd blocklist ls

      Example output:

      cidr:10.1.161.1:0/32 2028-10-30T22:30:03.585634+0000
      cidr:10.1.161.14:0/32 2028-10-30T22:30:02.483561+0000
      cidr:10.1.161.51:0/32 2028-10-30T22:30:01.272267+0000
      cidr:10.1.161.63:0/32 2028-10-30T22:30:05.099655+0000
      cidr:10.1.161.129:0/32 2028-10-30T22:29:58.335390+0000
      cidr:10.1.161.130:0/32 2028-10-30T22:29:59.861518+0000
  2. On the Hub cluster, navigate to Applications.
  3. Click the Actions menu at the end of the application row to view the list of available actions.
  4. Click Failover application.
  5. After the Failover application popup is shown, select the Policy and the Target cluster to which the associated application fails over in a disaster.
  6. Click the Select subscription group dropdown to verify the default selection or modify this setting.
    By default, the subscription group that replicates for the application resources is selected.
  7. Check the status of the Failover readiness.
    • If the status is Ready with a green tick, it indicates that the target cluster is ready for failover to start. Proceed to step 8.
    • If the status is Unknown or Not ready, then wait until the status changes to Ready.
  8. Click Initiate.
    All the system workloads and their available resources are now transferred to the target cluster.
  9. Close the modal window and track the status using the Data policy column on the Applications page.
  10. Verify that the activity status shows as FailedOver for the application.
    1. Go to Applications > Overview.
    2. In the Data policy column, click the policy link for the application you applied the policy to.
    3. On the Data Policy popover page, click the View more details link.
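
The fencing checks in step 1 can be scripted instead of run by hand. The following is a minimal sketch under stated assumptions: the helper names, the default timeout, and the polling interval are hypothetical, and only the `oc get drcluster.ramendr.openshift.io` command and the `ceph osd blocklist ls` output format come from the procedure above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, timeout, and interval are assumptions.

# Print the current phase of a DRCluster (same query as in step 1).
drcluster_phase() {
  oc get drcluster.ramendr.openshift.io "$1" -o jsonpath='{.status.phase}'
}

# Poll until the DRCluster reports Fenced, or give up after a timeout.
# Usage: wait_for_fenced <drcluster_name> [timeout_seconds] [interval_seconds]
wait_for_fenced() {
  local name="$1" timeout="${2:-300}" interval="${3:-10}" waited=0 phase
  while [ "$waited" -lt "$timeout" ]; do
    phase="$(drcluster_phase "$name")"
    if [ "$phase" = "Fenced" ]; then
      echo "Fenced"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "Timed out; last phase: ${phase:-unknown}" >&2
  return 1
}

# Read `ceph osd blocklist ls` output on stdin and print every node IP
# (given as arguments) that does not appear in it.
missing_from_blocklist() {
  local listing ip
  listing="$(cat)"
  for ip in "$@"; do
    # Entries look like "cidr:10.1.161.1:0/32 <expiry>"; the trailing
    # colon keeps 10.1.161.1 from matching 10.1.161.14.
    if ! printf '%s\n' "$listing" | grep -qF "cidr:${ip}:"; then
      echo "$ip"
    fi
  done
}
```

For example, `wait_for_fenced <drcluster_name>` blocks until the phase is Fenced, and `ceph osd blocklist ls | missing_from_blocklist 10.1.161.1 10.1.161.14` prints nothing when both node IPs are already in the blocklist. Empty output from the second helper means that every listed IP was found.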