Cold storage of events

May 15, 2024

ID 264690

In KUMA, you can configure the migration of legacy data from a ClickHouse cluster to cold storage. Cold storage can be implemented on local disks mounted in the operating system or on the Hadoop Distributed File System (HDFS). Cold storage is enabled as soon as at least one cold storage disk is specified.

Running out of disk space stops the KUMA storage service in two cases: when no cold storage disk is configured and the server runs out of disk space, and when both hot storage and cold storage are configured and the cold storage disk runs out of space. We recommend avoiding such situations.
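
Because the storage service stops when a disk fills up, it can help to watch free space on the storage mount points. The following is a minimal sketch and not part of KUMA: the function names, the 10% threshold, and the example path are assumptions chosen for illustration. It relies only on Python's standard shutil.disk_usage call and therefore covers local disks only; HDFS capacity would have to be checked with Hadoop's own tools.

```python
import shutil


def free_fraction(path: str) -> float:
    """Return the share of free space (0.0 to 1.0) on the file system holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def needs_attention(path: str, min_free: float = 0.10) -> bool:
    """True when free space is below `min_free` (the 10% default is a hypothetical value)."""
    return free_fraction(path) < min_free


if __name__ == "__main__":
    # "/" is a placeholder; substitute the mount points of your hot and cold storage disks.
    for mount in ["/"]:
        state = "LOW" if needs_attention(mount) else "ok"
        print(f"{mount}: {free_fraction(mount):.1%} free ({state})")
```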

Cold storage disks can be added or removed.

After you change the cold storage settings, the storage service must be restarted. If the service does not start, the reason is recorded in the storage log.

If a cold storage disk specified in the storage settings becomes unavailable (for example, because it has failed), the storage service may operate with errors. In this case, recreate a disk with the same path (for local disks) or the same address (for HDFS disks), and then delete it from the storage settings.

Rules for moving the data to the cold storage disks

When cold storage is enabled, KUMA checks the storage terms of the spaces once an hour:

  • If the storage term for a space on a ClickHouse cluster expires, the data is moved to the cold storage disks. If a cold storage disk is misconfigured, the data is deleted.
  • If the storage term for a space on a cold storage disk expires, the data is deleted.
  • If the ClickHouse cluster disks are 95% full, the biggest partitions are automatically moved to the cold storage disks. This can happen more often than once per hour.
  • Audit events are generated when data transfer starts and ends.
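
The rules above can be read as a small decision procedure. The sketch below is only an illustration of that reading, not KUMA's implementation: the names Action, hourly_action, and partitions_to_offload are hypothetical, and the stop condition for offloading (move the biggest partitions until usage falls below 95%) is an assumption, since the rules only state that the biggest partitions are moved.

```python
from enum import Enum
from typing import Dict, List

HOT_DISK_FULL_THRESHOLD = 0.95  # "95% full", taken from the rules above


class Action(Enum):
    KEEP = "keep"
    MOVE_TO_COLD = "move to cold storage"
    DELETE = "delete"


def hourly_action(on_cold_disk: bool, term_expired: bool, cold_disk_ok: bool = True) -> Action:
    """Outcome of the hourly storage-term check for the data of one space."""
    if not term_expired:
        return Action.KEEP
    if on_cold_disk:
        # The storage term on the cold storage disk has expired.
        return Action.DELETE
    # The storage term on the ClickHouse cluster has expired:
    # move the data, or delete it if the cold storage disk is misconfigured.
    return Action.MOVE_TO_COLD if cold_disk_ok else Action.DELETE


def partitions_to_offload(sizes: Dict[str, int], used: int, total: int) -> List[str]:
    """Pick partitions biggest-first until hot disk usage drops below the threshold.

    The stop condition is an assumption made for this sketch.
    """
    chosen: List[str] = []
    for name, size in sorted(sizes.items(), key=lambda item: item[1], reverse=True):
        if used / total < HOT_DISK_FULL_THRESHOLD:
            break
        chosen.append(name)
        used -= size
    return chosen
```

For example, hourly_action(False, True, cold_disk_ok=False) yields Action.DELETE, which corresponds to the case where the storage term on the cluster expires while the cold storage disk is misconfigured.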

During data transfer, the storage service remains operational, and its status stays green in the Resources → Active services section of the KUMA web interface. When you hover the mouse pointer over the status icon, a message about the data transfer is displayed. When a cold storage disk is removed, the storage service has the yellow status.

Special considerations for storing and accessing events

  • When using HDFS disks for cold storage, protect your data in one of the following ways:
    • Configure a separate physical interface in the VLAN, where only HDFS disks and the ClickHouse cluster are located.
    • Configure network segmentation and traffic filtering rules that exclude direct access to the HDFS disk or interception of traffic to the disk from ClickHouse.
  • Events located in the ClickHouse cluster and on the cold storage disks are equally available in the KUMA web interface, for example, when you search for events or view events related to alerts.
  • Storing events or audit events on cold storage disks is not mandatory; to disable this functionality, specify 0 (days) in the Cold retention period or Audit cold retention period field in the storage settings.

Special considerations for using HDFS disks

  • Before connecting HDFS disks, create a directory on them for each node of the ClickHouse cluster, in the following format: <HDFS disk host>/<shard ID>/<replica ID>. For example, if a cluster consists of two nodes containing two replicas of the same shard, the following directories must be created:
    • hdfs://hdfs-example-1:9000/clickhouse/1/1/
    • hdfs://hdfs-example-1:9000/clickhouse/1/2/

    Events from the ClickHouse cluster nodes are migrated to the directories with names containing the IDs of their shard and replica. If you change these node settings without creating a corresponding directory on the HDFS disk, events may be lost during migration.

  • HDFS disks added to storage operate in JBOD mode. This means that if one of the disks fails, access to the storage is lost. When using HDFS, plan for high availability: configure RAID and store data from different replicas on different devices.
  • The speed of event recording to HDFS is usually lower than the speed of event recording to local disks. The speed of accessing events in HDFS, as a rule, is significantly lower than the speed of accessing events on local disks. When using local disks and HDFS disks at the same time, the data is written to them in turn.
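
To avoid missing a directory when the cluster layout changes, the list of required HDFS paths can be derived from the shard and replica IDs of the nodes. The following is a minimal sketch: the function name cold_storage_dirs is hypothetical, and the base address is the example address used above.

```python
from typing import Iterable, List, Tuple


def cold_storage_dirs(base: str, nodes: Iterable[Tuple[int, int]]) -> List[str]:
    """Build one directory path per ClickHouse node.

    `nodes` holds (shard ID, replica ID) pairs as set in the node settings.
    Duplicate pairs are collapsed, and the result is sorted for stable output.
    """
    root = base.rstrip("/")
    return [f"{root}/{shard}/{replica}/" for shard, replica in sorted(set(nodes))]


if __name__ == "__main__":
    # Two nodes holding two replicas of the same shard, as in the example above.
    for path in cold_storage_dirs("hdfs://hdfs-example-1:9000/clickhouse", [(1, 1), (1, 2)]):
        print(path)
```

Each resulting path can then be created on the HDFS side, for example with the standard Hadoop command hdfs dfs -mkdir -p followed by the path.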

In this section

Removing cold storage disks

Disconnecting, archiving, and connecting partitions
