Restoring the Raft cluster from the current state of a node

To restore the Raft cluster from the current state of a selected node:

  1. Prepare the cluster for recovery:
    1. Choose the node that will become the initial node of the new cluster. On this node, the KUMA Core service must be stopped, but its working directory and systemd file must be kept.
    2. Stop the KUMA Core service on all nodes in the cluster:

      sudo systemctl stop kuma-core-<KUMA_Core_service_ID>.service

    3. Delete the working directories on all nodes in the cluster except the chosen initial node:

      sudo rm -r <directory_name>

    4. On all nodes in the cluster except the initial node, remove the systemd files of the KUMA Core service.

    At this point, all KUMA Core services are stopped and the servers are ready for cluster recovery.

  2. Restore the Raft cluster with a single KUMA Core:
    1. On the initial node of the new Raft cluster, create the following file:

      sudo touch /opt/kaspersky/kuma/core/<KUMA_Core_service_ID>/raft/.reset

    2. On the initial node of the Raft cluster, remove the --raft.join parameter from the systemd file, and then apply the change by running the following command:

      sudo systemctl daemon-reload

    3. On the initial node of the Raft cluster, start the KUMA Core:

      sudo systemctl start kuma-core-<KUMA_Core_service_ID>.service

    The Raft cluster is restored with a single KUMA Core. On the rest of the nodes, the KUMA Core services are stopped and the nodes are removed from the cluster.

  3. If you need a high-availability Raft cluster, perform the restoration procedure on the rest of the cluster nodes.

The cluster is restored. Certificates of services, such as collectors, correlators, and storages, do not need to be reset.
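
The service ID appears in both the unit name and the path of the .reset file, so a mistyped ID is an easy way to break step 2. The following is a minimal sketch of a dry-run helper that prints the step 2 commands for the initial node without running them. The function names are hypothetical and are not part of KUMA. The unit name pattern and the .reset path are copied from the procedure above. Removing the --raft.join parameter is left as a manual edit because the layout of the systemd file is not described here.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the step 2 commands, runs nothing.
# Function names are illustrative; the unit name pattern and the .reset
# path are taken from the procedure above.

# Build the systemd unit name from the KUMA Core service ID.
kuma_core_unit() {
    printf 'kuma-core-%s.service\n' "$1"
}

# Build the path of the .reset marker file from the service ID.
kuma_reset_marker() {
    printf '/opt/kaspersky/kuma/core/%s/raft/.reset\n' "$1"
}

# Print the commands to run on the initial node, in order.
print_initial_node_commands() {
    svc_id="$1"
    if [ -z "$svc_id" ]; then
        echo "usage: print_initial_node_commands <service ID>" >&2
        return 1
    fi
    echo "sudo touch $(kuma_reset_marker "$svc_id")"
    echo "# manual step: remove the --raft.join parameter from the systemd file"
    echo "sudo systemctl daemon-reload"
    echo "sudo systemctl start $(kuma_core_unit "$svc_id")"
}
```

After sourcing the file, call print_initial_node_commands with the Core service ID of the initial node, review the printed lines, and run them manually once they look correct.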
