You can use the following functionality to monitor the status of all services except cold storage and agent:
Viewing Victoria Metrics alerts
Users with the General administrator role can configure thresholds for KUMA services, and if a specified threshold is exceeded, the following changes take place:
Green means the service is running and accessible from the Core server.
Red means the service is not running or is not accessible from the Core server.
Yellow applies to all services except the agent. It means that the service is running, but there are errors in the service log or there are alerts for the service from Victoria Metrics. You can view the error message by hovering the mouse cursor over the status of the service in the Active services section.
Purple applies to running services whose configuration file in the database has changed, but that have no other errors. If a service has an incorrect configuration file and also has errors, for example, from Victoria Metrics, the status of the service is yellow.
Gray means that the service belonged to a tenant that has been deleted, and the service is still running. Such services are displayed with a gray status on the Active services page. Services with the gray status are kept when you delete the tenant so that you can copy their IDs and remove the services on your servers. Only the General administrator can delete services with the gray status. When a tenant is deleted, the services of that tenant are assigned to the Main tenant.
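The status rules above can be summarized as a small decision function. This is only an illustrative sketch, not KUMA code: the ServiceState fields and the derive_status function are hypothetical names, and the order in which red and gray are checked is an assumption, because the text only states that yellow takes precedence over purple.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    """Hypothetical snapshot of one service; field names are illustrative only."""
    kind: str                      # for example "collector" or "agent"
    running: bool
    reachable_from_core: bool
    has_log_errors: bool = False
    has_vm_alerts: bool = False    # alerts from Victoria Metrics
    config_changed: bool = False   # configuration file in the database has changed
    tenant_deleted: bool = False


def derive_status(s: ServiceState) -> str:
    """Return the color described in the list above for the given state."""
    # Red: the service is not running or is not accessible from the Core server.
    if not s.running or not s.reachable_from_core:
        return "red"
    # Gray: the tenant was deleted but the service is still running.
    # Checking this before yellow and purple is an assumption of this sketch.
    if s.tenant_deleted:
        return "gray"
    # Yellow: log errors or Victoria Metrics alerts; does not apply to the agent.
    if s.kind != "agent" and (s.has_log_errors or s.has_vm_alerts):
        return "yellow"
    # Purple: configuration changed and there are no other errors.
    if s.config_changed:
        return "purple"
    # Green: running and accessible from the Core server.
    return "green"


if __name__ == "__main__":
    collector = ServiceState("collector", True, True,
                             has_vm_alerts=True, config_changed=True)
    print(derive_status(collector))
```

In this sketch, a collector with both a changed configuration and a Victoria Metrics alert is reported as yellow, which matches the rule stated for the purple status.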
The following examples show how to monitor service status.
If the collector service has a yellow status in the Active services section and you see the Enrichment errors increasing message, you can:
Go to the Metrics → <service type> → <service name> → Enrichment → Errors section of KUMA for the service with the yellow status, find out which enrichment is causing errors, and view the chart to find out when the problem started and how it evolved.
Enrichment errors are likely caused by DNS server unavailability or CyberTrace enrichment errors, so you can check your DNS or CyberTrace connection settings.
If the collector service has a yellow status in the Active services section and you see the Output Event Loss increasing message, you can:
Go to the Metrics → <service type> → <service name> → IO → Output Event Loss section of KUMA for the service with the yellow status and view the chart to find out when the problem started and how it evolved.
Output event loss is likely caused by a buffer overflow or an unavailable destination, so you can check the availability and the connection of the destination or find out why the buffer capacity is exceeded.
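The two examples above can be kept as a simple lookup that an operator script might use to turn a yellow-status message into a place to look and a list of things to check. This is a hypothetical sketch: the TRIAGE table and the triage function are not part of KUMA, and the table contains only what the two examples state.

```python
from typing import Optional

# Hypothetical lookup table built only from the two examples in this section.
TRIAGE = {
    "enrichment errors increasing": {
        "metrics_path": "Metrics → <service type> → <service name> → Enrichment → Errors",
        "likely_causes": ["DNS server unavailability",
                          "CyberTrace enrichment errors"],
        "checks": ["DNS connection settings",
                   "CyberTrace connection settings"],
    },
    "output event loss increasing": {
        "metrics_path": "Metrics → <service type> → <service name> → IO → Output Event Loss",
        "likely_causes": ["buffer overflow",
                          "unavailability of the destination"],
        "checks": ["availability and connection of the destination",
                   "why the buffer capacity is exceeded"],
    },
}


def triage(message: str) -> Optional[dict]:
    """Return advice for a known status message, or None if it is not listed."""
    text = message.lower()
    for key, advice in TRIAGE.items():
        if key in text:
            return advice
    return None


if __name__ == "__main__":
    advice = triage("Output Event Loss increasing")
    if advice is not None:
        print("Open:", advice["metrics_path"])
        for item in advice["checks"]:
            print("Check:", item)
```

Messages that are not in the table return None, so the script does not guess at causes that this section does not describe.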
Configuring service monitoring
To configure service monitoring:
In the KUMA web console, go to the Settings → Service monitoring section.
KUMA monitors the status of services in accordance with the specified parameters.
In the Active services section, you can filter services by status or enter a word from the error text, for example, "QPS" or "buffer", in the search field and press ENTER. This results in a list of services with errors. The special characters ", {, and } are not allowed in the search string and will produce irrelevant results.
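If you work with an exported list of services outside the web console, the same kind of filtering can be sketched as follows. This is a hypothetical illustration, not the search logic KUMA uses: the dictionary keys name, status, and error, the function names, and the case-insensitive matching are all assumptions. The sketch rejects the three special characters up front instead of returning irrelevant results.

```python
from typing import Iterable, List, Optional, Set

# Characters that the text above says are not allowed in the search string.
FORBIDDEN_CHARS = set('"{}')


def validate_search_term(term: str) -> str:
    """Strip the term and raise ValueError if it contains a forbidden character."""
    term = term.strip()
    bad = sorted(FORBIDDEN_CHARS & set(term))
    if bad:
        raise ValueError("search term contains forbidden characters: " + " ".join(bad))
    return term


def filter_services(services: Iterable[dict], term: str,
                    statuses: Optional[Set[str]] = None) -> List[dict]:
    """Return services whose error text contains the term, optionally by status.

    Each service is a hypothetical dict with "name", "status", and "error" keys.
    Matching is case-insensitive, which is an assumption of this sketch.
    """
    needle = validate_search_term(term).lower()
    return [
        s for s in services
        if (statuses is None or s["status"] in statuses)
        and needle in s.get("error", "").lower()
    ]


if __name__ == "__main__":
    sample = [
        {"name": "collector-1", "status": "yellow", "error": "Buffer is full"},
        {"name": "correlator-1", "status": "green", "error": ""},
    ]
    print(filter_services(sample, "buffer", {"yellow"}))
```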
Disabling service monitoring
To disable service monitoring:
In the KUMA web console, go to the Settings → Service monitoring section.
If you want to disable service monitoring only for collectors, in the Service monitoring. Thresholds setting window, under Collectors, select the Disable connector errors check box.
This disables only the analysis of the Connector errors metric for collectors.
If you want to disable monitoring for all services, in the Service monitoring. Thresholds setting window, select the Disable check box.
KUMA service monitoring is disabled, and services do not get the yellow status.