Kaspersky Next XDR Expert
Cold storage of events

In KUMA, you can configure the migration of legacy data from a ClickHouse cluster to cold storage. Cold storage can be implemented using local disks mounted in the operating system or the Hadoop Distributed File System (HDFS). Cold storage is enabled when at least one cold storage disk is specified. If you use several storages, on each node with data, mount a cold storage disk or HDFS disk in the directory that you specified in the storage configuration settings.

If a cold storage disk is not configured and the server runs out of disk space in hot storage, the storage service is stopped. If both hot storage and cold storage are configured, and space runs out on the cold storage disk, the KUMA storage service is also stopped. We recommend avoiding such situations by adding custom event storage conditions for hot storage.
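
As an illustration of the mount requirement above, the following minimal Python sketch checks that every configured cold storage directory is actually a mount point (the directory names are hypothetical examples, not KUMA defaults):

  import os

  # Hypothetical directories specified in the storage configuration settings.
  COLD_STORAGE_MOUNTS = ["/mnt/kuma-cold-1", "/mnt/kuma-cold-2"]

  for path in COLD_STORAGE_MOUNTS:
      # Each node with data must have a cold storage disk mounted at the
      # directory specified in the storage settings.
      if not os.path.ismount(path):
          print(f"warning: {path} is not a mount point")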

Cold storage disks can be added or removed. If you have added multiple cold storage disks, data is written to them in a round-robin manner (see the sketch below). If the data to be written would take up more space than is available on a disk, that data and all subsequent data are written to the next cold storage disk in the rotation. For example, if you added only two cold storage disks and one of them is full, all data is written to the disk that still has free space.
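
The round-robin selection with spill-over can be modeled with a short Python sketch (mount points, sizes, and the selection API are illustrative assumptions, not KUMA internals):

  import shutil

  class ColdDiskRotation:
      """Round-robin across cold storage disks: if the next disk in the
      rotation cannot fit the data, spill over to the following disk."""

      def __init__(self, mount_points):
          self.mount_points = list(mount_points)
          self.next_index = 0

      def pick(self, payload_size):
          for offset in range(len(self.mount_points)):
              i = (self.next_index + offset) % len(self.mount_points)
              if shutil.disk_usage(self.mount_points[i]).free >= payload_size:
                  self.next_index = (i + 1) % len(self.mount_points)  # advance rotation
                  return self.mount_points[i]
          return None  # every cold storage disk is full

  rotation = ColdDiskRotation(["/mnt/kuma-cold-1", "/mnt/kuma-cold-2"])
  target = rotation.pick(payload_size=512 * 1024 * 1024)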

After changing the cold storage settings, the storage service must be restarted. If the service does not start, the reason is specified in the storage log.

If the cold storage disk specified in the storage settings becomes unavailable (for example, fails), this may lead to errors in the operation of the storage service. In this case, recreate a disk with the same path (for local disks) or the same address (for HDFS disks), and then delete it from the storage settings.

Rules for moving the data to the cold storage disks

You can configure the storage conditions for events in the hot storage of the ClickHouse cluster by setting a limit based on retention time or on maximum storage size. When cold storage is used, every 15 minutes and after each Core restart, KUMA checks whether the specified storage conditions are satisfied (see the sketch after this list):

  1. KUMA gets the partitions for the storage being checked and groups the partitions by cold storage disks and spaces.
  2. For each space, KUMA checks whether the specified storage condition is satisfied.
  3. If the condition is satisfied (for example, if the space contains events that exceed their retention time, or if the size of the storage has reached or exceeded the limit specified in the condition), KUMA transfers all partitions with the oldest date to cold storage disks, or deletes these partitions if no cold storage disk is configured or if it is configured incorrectly. This action is repeated while the configured storage condition remains satisfied in the space, for example, if after deleting the partitions for a date, the storage size still exceeds the maximum size specified in the condition.

    KUMA generates audit events when data transfer starts and ends, or when data is removed.

  4. If a retention time is configured in KUMA, then whenever partitions are transferred to cold storage disks, KUMA checks whether the configured conditions are satisfied on the disk. If events are found on the disk that have been stored for longer than the Event retention time, which is counted from the moment the events were received in KUMA, the solution deletes these events or all partitions for the oldest date.

    KUMA generates audit events when it deletes data.
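
The following Python sketch models steps 1 through 4 (the data structures, names, and printed actions are illustrative assumptions; the real checks run inside KUMA):

  from dataclasses import dataclass
  from datetime import date, timedelta

  @dataclass
  class Partition:
      day: date
      size: int  # bytes

  def condition_satisfied(partitions, retention_days, max_size_bytes):
      # Step 3: the condition holds if the space contains events older than
      # the retention time, or the space size has reached the configured limit.
      if not partitions:
          return False
      too_old = min(p.day for p in partitions) < date.today() - timedelta(days=retention_days)
      too_big = sum(p.size for p in partitions) >= max_size_bytes
      return too_old or too_big

  def apply_storage_conditions(partitions, cold_disk_configured, retention_days, max_size_bytes):
      # While the condition remains satisfied, move (or, without a cold
      # storage disk, delete) all partitions for the oldest date, then re-check.
      while condition_satisfied(partitions, retention_days, max_size_bytes):
          oldest = min(p.day for p in partitions)
          action = "transfer to cold storage" if cold_disk_configured else "delete"
          print(f"{action}: partitions dated {oldest}")  # audit events are generated here
          partitions = [p for p in partitions if p.day != oldest]
      return partitions

  parts = [Partition(date(2024, 1, d), 10 * 2**30) for d in range(1, 8)]
  apply_storage_conditions(parts, cold_disk_configured=True, retention_days=3, max_size_bytes=50 * 2**30)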

If the ClickHouse cluster disks are 95% full, the biggest partitions are automatically moved to the cold storage disks. This can happen more often than once per hour.
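
A hedged illustration of this emergency threshold (the mount path is an assumption; 95% is the figure from the text):

  import shutil

  def over_emergency_threshold(mount_point, threshold=0.95):
      # True when the ClickHouse disk is at least 95% full, the point at
      # which the biggest partitions are moved to cold storage.
      usage = shutil.disk_usage(mount_point)
      return usage.used / usage.total >= threshold

  if over_emergency_threshold("/var/lib/clickhouse"):
      print("disk is 95% full: biggest partitions will move to cold storage")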

During data transfer, the storage service remains operational, and its status stays green in the Resources → Active services section of the KUMA web console. When you hover over the status icon, a message about the data transfer is displayed. When a cold storage disk is removed, the storage service has the yellow status.

Special considerations for storing and accessing events

  • When using HDFS disks for cold storage, protect your data in one of the following ways:
    • Configure a separate physical interface in the VLAN, where only HDFS disks and the ClickHouse cluster are located.
    • Configure network segmentation and traffic filtering rules that exclude direct access to the HDFS disk or interception of traffic to the disk from ClickHouse.
  • Events located in the ClickHouse cluster and on the cold storage disks are equally available in the KUMA web console, for example, when you search for events or view events related to alerts.
  • You can disable the storage of events or audit events on cold storage disks. To do so, specify the following in the storage settings (the sketch after this list summarizes the combinations):
    • If you do not want to store events on cold storage disks, do one of the following:
      • If a gigabyte-based or percentage-based storage condition is selected in the Storage condition options field, specify 0 in the Event retention time field.
      • If a storage condition in days is selected in the Storage condition options field, specify the same number of days in the Event retention time field as in the Storage condition options field.
    • If you do not want to store audit events on cold storage disks, in the Cold storage period for audit events field, specify 0 (days).
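
The interaction of these fields can be expressed as a small decision helper (the field names mirror the settings above; the function itself is illustrative, not a KUMA API):

  def cold_storage_disabled(condition_type, condition_value, retention_days):
      # True if the settings combination means events never reach cold storage.
      if condition_type in ("gigabytes", "percent"):
          return retention_days == 0
      if condition_type == "days":
          return retention_days == condition_value
      raise ValueError(f"unknown condition type: {condition_type}")

  assert cold_storage_disabled("gigabytes", 500, 0)  # size-based condition, retention 0
  assert cold_storage_disabled("days", 30, 30)       # equal day counts
  assert not cold_storage_disabled("days", 30, 90)   # 60 extra days on cold storage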

Special considerations for using HDFS disks

  • Before connecting HDFS disks, create a directory on them for each node of the ClickHouse cluster, in the following format: <HDFS disk host>/<shard ID>/<replica ID>. For example, if a cluster consists of two nodes containing two replicas of the same shard, the following directories must be created (see the sketch after this list):
    • hdfs://hdfs-example-1:9000/clickhouse/1/1/
    • hdfs://hdfs-example-1:9000/clickhouse/1/2/

    Events from the ClickHouse cluster nodes are migrated to the directories with names containing the IDs of their shard and replica. If you change these node settings without creating a corresponding directory on the HDFS disk, events may be lost during migration.

  • HDFS disks added to storage operate in the JBOD mode. This means that if one of the disks fails, access to the storage will be lost. When using HDFS, take high availability into account and configure RAID, as well as storage of data from different replicas on different devices.
  • The speed of writing events to HDFS is usually lower than the speed of writing events to local disks, and accessing events in HDFS is typically significantly slower than accessing events on local disks. When local disks and HDFS disks are used at the same time, data is written to them in turn.
  • HDFS is used only as distributed file storage for ClickHouse. ClickHouse's own compression mechanisms, not those of HDFS, are used to compress data.
  • The ClickHouse server must have write access to the corresponding HDFS storage.
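
For the directory layout in the first bullet, a helper like the following could create the per-node paths (it shells out to the standard hdfs dfs CLI; the host, port, and clickhouse prefix are taken from the example above and must match your environment):

  import subprocess

  HDFS_BASE = "hdfs://hdfs-example-1:9000/clickhouse"

  def create_replica_dirs(shards, replicas):
      # One directory per (shard ID, replica ID) pair, created before
      # connecting the HDFS disks to the storage.
      for shard in range(1, shards + 1):
          for replica in range(1, replicas + 1):
              path = f"{HDFS_BASE}/{shard}/{replica}/"
              # -p also creates any missing parent directories.
              subprocess.run(["hdfs", "dfs", "-mkdir", "-p", path], check=True)

  create_replica_dirs(shards=1, replicas=2)  # matches the two-replica example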

In this section

Removing cold storage disks

Detaching, archiving, and attaching partitions