Cassandra Disk and Memory Usage - Adding Memory Resources

Cassandra kernel file system cache
Cassandra JVMs perform a large amount of disk I/O, for example, writing new data, compacting existing SSTables, and reading data for queries. To optimize reads, Cassandra relies on the kernel file system cache, which keeps recently and frequently used files in memory. At scale, it would be impractical to keep all of the hundreds of GBs of metric data that Cassandra stores in memory, so disk reads can never be completely eliminated. By default, Cassandra containers are given 16 GB of RAM, set in their Kubernetes resource requests and limits. After the JVM, approximately 6 GB remains for file caching. This file cache size satisfies many of the most common reads from memory, for example, recent metric data or frequently accessed tables such as topology or events. Operations like SSTable compaction, which merges the many immutable files that Cassandra stores data in into a smaller number of larger files, can also generally be completed from memory.
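The memory split above can be sketched with simple arithmetic. The heap size below comes from the default size1 chart values shown later in this topic; the off-heap allowance is an assumption, not a measured number:

```shell
# Rough memory budget for a 16 GiB Cassandra container (all values in GiB).
total=16
heap=8      # cassandraHeapSize in the default size1 chart values
offheap=2   # assumed JVM off-heap and process overhead
cache=$((total - heap - offheap))
echo "approximately ${cache} GiB left for the kernel file system cache"
```
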
Metric summarization and baselines
Metric summarization and baselines in particular can be heavy on disk reads. For more information on metric summarization, see Configuring summarization.
While metric summarization generally requests recent data, other operations, such as SSTable compaction, can claim the file cache. Given the scale of the metric summarization requests, the resulting hit on disk reads is larger than that of normal UI query loads.
Technology preview: The baseline feature detects anomalies in the behavior of data. Baselines are trained on the metrics, which requires queries against the entire set of raw data for the metrics that are being baselined. Not all of the data that is required for baselining will be in memory. For more information about baselines and enabling the baseline feature, see Managing baselines.
Cassandra I/O
Cassandra I/O is unique compared to many traditional databases. SSTable compaction generates large amounts of reads and writes, but they are generally sequential I/O because compaction reads and writes straight through files. This makes it well suited to traditional hard disk drive arrays, which perform well on sequential operations. Queries, however, can be much more random in their I/O pattern, for example, searching the headers of SSTables and pulling specific rows of data from the tables. For this reason, the Cassandra community recommends tuning down the read ahead setting, which reduces the I/O wasted on reading data that is never used in queries. For more information, see Optimizing disk performance for Cassandra. As a result, the disks that support Cassandra must be able to sustain both large amounts of reads and writes and random I/O. This is why network-based storage solutions are not recommended: their performance and ability to handle the bandwidth is generally insufficient.
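On Linux, the read ahead value can be inspected and lowered with blockdev. In this sketch, /dev/sda is a placeholder device name, and the target of 8 sectors (4 KB) reflects a common Cassandra community recommendation; check your environment and the linked tuning guidance before applying it:

```shell
# Show the current read ahead for the data disk, in 512-byte sectors.
# /dev/sda is a placeholder; substitute your actual Cassandra data device.
blockdev --getra /dev/sda

# Lower read ahead to 8 sectors (4 KB) so random query reads waste less I/O.
blockdev --setra 8 /dev/sda
```

Note that blockdev requires root privileges and that the setting does not persist across reboots unless applied by a udev rule or startup script.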
Increase the Cassandra memory request and limit
In container runtimes such as Docker and Kubernetes, the kernel file system cache is limited to the memory available within the container. For example, if you run Cassandra on a system with 64 GB of RAM, but the Cassandra container memory limit is 16 GB, the additional 48 GB of RAM is unavailable to Cassandra for optimizing disk I/O. To take advantage of the additional RAM on the system, the Cassandra memory request and limit must be increased.
Because this kernel file system cache is included in the container resource usage, the Kubernetes community recommends setting the memory request and limit equal for containers that do disk I/O and therefore use the kernel file system cache. The reason lies in the nature of the cache: unused RAM is wasted RAM, so the kernel uses as much memory as it can to optimize disk I/O by keeping information in memory. The container will therefore almost always eventually approach its limit as it fills the cache, and requesting less than the limit misrepresents the amount of memory the container will use on a system. Note that the kernel file system cache memory (buffers/cache) can still be freed and given to things that need it, such as processes.
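To see how much of a running container's memory footprint is actually reclaimable file cache, you can inspect the cgroup statistics from inside the container. The path below assumes cgroup v1, which many Kubernetes nodes still use; on cgroup v2 nodes the layout differs, so verify the path in your environment:

```shell
# File cache charged to this container's memory cgroup, in bytes (cgroup v1 path).
grep -w total_cache /sys/fs/cgroup/memory/memory.stat
```

A large total_cache value relative to the container's total usage indicates memory that the kernel can reclaim under pressure rather than memory the process itself holds.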
Increasing the memory of your Cassandra containers, and therefore the kernel file system cache space available to Cassandra, reduces disk I/O because more (but not all) operations are satisfied from memory. To make this change:
  1. Run the following command:

     kubectl edit statefulset my_release_name-cassandra

  2. Modify both the resource requests and limits. The existing size1 settings are:
     resources:
       limits:
         cpu: "6"
         memory: 16Gi
       requests:
         cpu: "4"
         memory: 16Gi
To double the total memory available and add approximately 16 GB to the kernel file system cache, modify the values as in the following example:
     resources:
       limits:
         cpu: "6"
         memory: 32Gi
       requests:
         cpu: "4"
         memory: 32Gi
Future upgrades or helm updates can reset these values if the YAML in the charts is not also modified. The resource definitions are contained in the _resources.tpl file of the Cloud App Management charts, located at ibm-cloud-appmgmt-prod/charts/cassandra/templates/_resources.tpl. Modify the values in the size definition that corresponds to your deployment. For example:
size1:
  replicas: 3
  cassandraHeapSize: "8G"
  cassandraHeapNewSize: "2G"
  cassandraConcurrentCompactors: 4
  cassandraMemtableFlushWriters: 2
  resources:
    requests:
      memory: "16Gi"
      cpu: "4"
    limits:
      memory: "16Gi"
      cpu: "6"
To double the total memory available and add approximately 16 GB to the kernel file system cache:
size1:
  replicas: 3
  cassandraHeapSize: "8G"
  cassandraHeapNewSize: "2G"
  cassandraConcurrentCompactors: 4
  cassandraMemtableFlushWriters: 2
  resources:
    requests:
      memory: "32Gi"
      cpu: "4"
    limits:
      memory: "32Gi"
      cpu: "6"
Note: This custom modification must be repeated in your charts for all subsequent releases. If the resources are not modified in the charts, the values are reset on upgrade and you need to rerun kubectl edit statefulset my_release_name-cassandra. You can also modify the values during the helm upgrade instead of using kubectl edit.
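If you apply the change at upgrade time, a values override is one option. The command below is illustrative only: the --set value paths are hypothetical and must match the keys actually defined in your chart's values, so verify them against your chart before use:

```shell
# Illustrative only: override Cassandra resources during the upgrade
# instead of editing the statefulset afterwards. The value paths are
# hypothetical; check your chart's values.yaml for the real keys.
helm upgrade my_release_name ibm-cloud-appmgmt-prod \
  --set cassandra.resources.requests.memory=32Gi \
  --set cassandra.resources.limits.memory=32Gi
```

Because the chart templates read these values on every upgrade, an override supplied this way must also be repeated (or kept in a values file) for each subsequent release.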