Storage
A KUMA storage holds normalized events so that KUMA can access them quickly and continuously to extract analytical data. ClickHouse technology provides this access speed and continuity, which means that a storage is a ClickHouse cluster bound to a KUMA storage service.
Storage components: clusters, shards, replicas, and keepers.
A cluster is a logical group of machines that together hold all accumulated normalized KUMA events. It consists of one or more logical shards.
A shard is a logical group of machines that possess a specific portion of all normalized events accumulated in the cluster. It consists of one or more replicas. Increasing the number of shards lets you do the following:
- Accumulate more events by increasing the total number of servers and disk space.
- Absorb a larger stream of events by distributing the load associated with an influx of new events.
- Reduce the time taken to search for events by distributing search areas among multiple machines.
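To illustrate how adding shards spreads events across machines, here is a minimal sketch that assigns events to shards by hashing an event ID. It is not how KUMA or ClickHouse routes events internally. The function names `pick_shard` and `distribute`, the `event-N` identifiers, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def pick_shard(event_id: str, shard_count: int) -> int:
    """Map an event to a shard index using a stable hash of its ID (hypothetical scheme)."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce modulo the shard count.
    return int.from_bytes(digest[:8], "big") % shard_count


def distribute(event_ids, shard_count):
    """Count how many events land on each shard."""
    return Counter(pick_shard(e, shard_count) for e in event_ids)


# Hypothetical event IDs, used only to show how the load splits.
events = [f"event-{i}" for i in range(10_000)]
for shards in (1, 2, 4):
    load = distribute(events, shards)
    print(shards, "shard(s):", sorted(load.items()))
```

With more shards, each machine stores and searches a smaller portion of the same event set, which matches the three benefits listed above.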
A replica is a machine that is a member of the logical shard and possesses a copy of the data of this shard. If there are multiple replicas, there are multiple copies (data is replicated). Increasing the number of replicas lets you do the following:
- Improve fault tolerance.
- Distribute the total load related to data searches among multiple machines (although it's best to increase the number of shards for this purpose).
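The difference between adding shards and adding replicas can be shown with simple capacity arithmetic. The sketch below assumes that every replica is a separate server with the same disk size, and it ignores compression and file system overhead. The function name `storage_capacity` and its parameters are hypothetical.

```python
def storage_capacity(shards: int, replicas_per_shard: int, disk_per_server_tb: float) -> dict:
    """Estimate cluster capacity under the simplifying assumptions stated above."""
    if shards < 1 or replicas_per_shard < 1:
        raise ValueError("shards and replicas_per_shard must be at least 1")
    servers = shards * replicas_per_shard
    return {
        "servers": servers,
        # Total disk space across all machines.
        "raw_tb": servers * disk_per_server_tb,
        # Each shard holds a distinct portion of the events, so only shards add usable space.
        "usable_tb": shards * disk_per_server_tb,
        # Each replica holds a full copy of its shard's data.
        "copies_of_each_event": replicas_per_shard,
    }


print(storage_capacity(shards=2, replicas_per_shard=1, disk_per_server_tb=4.0))
print(storage_capacity(shards=2, replicas_per_shard=3, disk_per_server_tb=4.0))
```

In this model, going from one to three replicas per shard triples the number of servers and copies but leaves the usable space unchanged, while adding a shard increases the usable space.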
A keeper is an optional replica role in which the replica takes part in coordinating data replication across the entire cluster. The cluster must have at least one replica with this role, and three keeper replicas are recommended. The number of keeper replicas must be odd.
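The odd-number rule is easier to see with a small calculation. The sketch below assumes that keepers reach decisions by simple majority, which is typical for coordination services of this kind. The helper names `keeper_quorum` and `check_keeper_count` are hypothetical and are not part of KUMA.

```python
from typing import List, Tuple


def keeper_quorum(keepers: int) -> Tuple[int, int]:
    """Return (majority size, number of keeper failures tolerated), assuming majority voting."""
    if keepers < 1:
        raise ValueError("at least one keeper is required")
    majority = keepers // 2 + 1
    return majority, keepers - majority


def check_keeper_count(keepers: int) -> List[str]:
    """Check a planned keeper count against the rules stated in the text."""
    problems = []
    if keepers < 1:
        problems.append("at least one keeper replica is required")
    elif keepers % 2 == 0:
        problems.append("the number of keeper replicas must be odd")
    return problems


for n in range(1, 6):
    majority, tolerated = keeper_quorum(n)
    print(f"{n} keeper(s): majority={majority}, tolerated failures={tolerated}, "
          f"problems={check_keeper_count(n)}")
```

Under this assumption, three keepers tolerate the loss of one, and a fourth keeper adds a machine without tolerating any additional failure. Five keepers are needed to tolerate the loss of two.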
When choosing a ClickHouse cluster configuration, consider the specific event storage requirements of your organization. For more information, please refer to the ClickHouse documentation.
In storages, you can create spaces. Spaces let you build a data structure in the cluster and, for example, store events of a certain type together.