IBM Support

How to perform the 'fstrim' operation on RBD PVs

How To


Summary

How to manually run fstrim on an RBD-backed PVC by identifying the correct PVC and its backing RBD image

Environment

IBM Fusion Data Foundation (FDF) 4.x
Red Hat OpenShift Data Foundation (ODF) 4.x

Steps

  • Fusion Data Foundation provides the "ReclaimSpace" feature, which enables automatic reclaiming of freed-up space from RBD PersistentVolumes. Refer to the FDF and ODF documentation for details; a sample ReclaimSpaceJob and PVC annotation are sketched at the end of these steps.

  • In a traditional file system, deleting a file marks the corresponding inode pointers in the parent directory as unused, but does not erase the data in the underlying data blocks. Ceph behaves the same way: when a file or data is deleted from the PV, the backing object is not erased and remains on the RBD device.

  • Because of this, 'ceph df' reports less available space than is really free (the deleted data is still counted as used). The same values are reflected in the OpenShift UI, which causes confusion.

  • To check the actual current utilization, run the "df -h" command on the mounted path, or use the "discard" mount option. The discard mount option behaves similarly to fstrim: it cleans up the backend objects as files are deleted, using the TRIM support of the underlying disks.

  • Using the discard option can cause performance degradation, because TRIM is issued for every block as it is discarded; hence it is not enabled by default and is NOT recommended.

  • If the capacity needs to be reclaimed, one can perform the fstrim operation on the mounted path of the PV.

  • The following steps are needed to reclaim the space from one of the PVs:

1) Find the PVC from which data was deleted:

$ oc get pvc -A | grep pvc-179feda8-8a94-419e-b309-6cf6b1a22d7d
openshift-logging          elasticsearch-elasticsearch-cdm-ag8jhaoy-1     Bound    pvc-179feda8-8a94-419e-b309-6cf6b1a22d7d   187Gi      RWO            ocs-storagecluster-ceph-rbd   5h43m
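
If you also need the backing RBD image for this PVC (as referenced in the Summary), the pool and image name are stored in the PV's CSI volume attributes. A minimal sketch, assuming an RBD CSI volume; the attribute names can vary slightly between CSI driver versions, and the pool/image values shown are illustrative:

$ oc get pv pvc-179feda8-8a94-419e-b309-6cf6b1a22d7d -o jsonpath='{.spec.csi.volumeAttributes.pool}{"\n"}{.spec.csi.volumeAttributes.imageName}{"\n"}'
ocs-storagecluster-cephblockpool
csi-vol-5f6e0a3a-0c1d-11ec-b1a2-0a580a810210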

2) Start a debug session on the node that hosts this pod:

$ oc get po -n openshift-logging -o wide | grep ag8jhaoy-1
elasticsearch-cdm-ag8jhaoy-1-6fcd7cf5fb-n7jxl   2/2     Running     0       4m   1.1.1.1   dell-r740xd-1.gsslab.pnq2.redhat.com
$ oc debug no/dell-r740xd-1.gsslab.pnq2.redhat.com
$ chroot /host
$ sudo -i
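
Alternatively, the hosting node can be read directly from the pod spec instead of grepping the wide output; a small sketch using the pod and namespace from the example above:

$ oc get pod elasticsearch-cdm-ag8jhaoy-1-6fcd7cf5fb-n7jxl -n openshift-logging -o jsonpath='{.spec.nodeName}{"\n"}'
dell-r740xd-1.gsslab.pnq2.redhat.com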

3) Find the mount path using the PVC name:

# df -kh | grep pvc-179feda8-8a94-419e-b309-6cf6b1a22d7d
/dev/rbd2          184G  1.5G  182G   1% /var/lib/kubelet/pods/229fcec9-b166-4eed-9604-dd84603129a7/volumes/kubernetes.io~csi/pvc-179feda8-8a94-419e-b309-6cf6b1a22d7d/mount
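
To double-check that /dev/rbd2 maps to the expected RBD image, the kernel exposes the mapping under sysfs when the kernel RBD (krbd) mounter is used. A sketch, assuming device index 2 (matching /dev/rbd2) and the illustrative pool/image names from step 1:

# cat /sys/bus/rbd/devices/2/pool
ocs-storagecluster-cephblockpool
# cat /sys/bus/rbd/devices/2/name
csi-vol-5f6e0a3a-0c1d-11ec-b1a2-0a580a810210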

4) Run 'fstrim' on the path:

# fstrim -v /var/lib/kubelet/pods/229fcec9-b166-4eed-9604-dd84603129a7/volumes/kubernetes.io~csi/pvc-179feda8-8a94-419e-b309-6cf6b1a22d7d/mount
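
To confirm that the space was actually released on the Ceph side, compare the image usage before and after the trim from the rook-ceph tools pod, if it is deployed. A sketch, assuming the openshift-storage namespace and the illustrative pool/image names from step 1; the reclaimed capacity can take a short while to be reflected in 'ceph df':

$ oc -n openshift-storage rsh deploy/rook-ceph-tools
sh-4.4$ rbd du ocs-storagecluster-cephblockpool/csi-vol-5f6e0a3a-0c1d-11ec-b1a2-0a580a810210
sh-4.4$ ceph df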
  • An internal bug, BZ 1783780, tracks this issue.
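
  • As mentioned at the beginning of this section, the manual fstrim can be automated with the ReclaimSpace feature provided by the CSI add-ons operator. A minimal sketch of a one-off ReclaimSpaceJob and of the PVC annotation for scheduled reclaims, using the PVC from this example; check the FDF/ODF documentation for your release for the exact API version and supported options:

apiVersion: csiaddons.openshift.io/v1alpha1
kind: ReclaimSpaceJob
metadata:
  name: reclaim-elasticsearch-pvc
  namespace: openshift-logging
spec:
  target:
    persistentVolumeClaim: elasticsearch-elasticsearch-cdm-ag8jhaoy-1

$ oc -n openshift-logging annotate pvc elasticsearch-elasticsearch-cdm-ag8jhaoy-1 reclaimspace.csiaddons.openshift.io/schedule="@weekly"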

Root Cause

  • Like a traditional file system, Ceph does not delete the backing object when a file is deleted, so the object remains on the RBD device.
    New writes either overwrite these objects or create new ones, as required.
    As a result, the objects are still present in the pool, and 'ceph df' shows the pool as occupied by them even though they are no longer in use.

Document Location

Worldwide

[{"Type":"MASTER","Line of Business":{"code":"LOB66","label":"Technology Lifecycle Services"},"Business Unit":{"code":"BU070","label":"IBM Infrastructure"},"Product":{"code":"SSSEWFV","label":"Storage Fusion Data Foundation"},"ARM Category":[{"code":"a8m3p000000UoIPAA0","label":"Support Reference Guide"}],"ARM Case Number":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Versions"}]

Document Information

Modified date:
20 March 2025

UID

ibm17172214