Event parsing settings

You can configure the rules for converting incoming events to the KUMA format when creating event parsing rules in the normalizer settings window, on the Normalization scheme tab. Available event parsing settings are listed in the table below.

Available event parsing settings

Setting

Description

Name

Name of the parsing rule. Maximum length of the name: 128 Unicode characters. The name of the main parsing rule is used as the name of the normalizer.

Required setting.

Tenant

The name of the tenant that owns the resource.

This setting is not available for extra parsing rules.

Parsing method

The type of incoming events. Depending on the selected parsing method, you can use the predefined event field matching rules or define your own rules. Some parsing methods make additional settings available that you must specify. Available parsing methods:

json
This parsing method is used to process JSON data where each object, including its nested objects, occupies a single line in a file.

When processing files with hierarchically structured data, you can reference the fields of nested objects using the dot notation. For example, the username parameter from the string "user": {"username": "system: node: example-01"} can be accessed by using the user.username query.

Files are processed line by line. Multi-line objects with nested structures may be normalized incorrectly.
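
The line-by-line processing and dot notation described above can be sketched in Python. This is only an illustration of the idea, not KUMA code; the helper names get_by_path and parse_lines are hypothetical.

```python
import json


def get_by_path(obj, path):
    """Resolve a dot-notation path such as 'user.username' in a parsed JSON object."""
    current = obj
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def parse_lines(text):
    """Parse one JSON object per line; both \\n and \\r\\n line endings are accepted."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Two single-line events, one ending with \n and one with \r\n.
raw = (
    '{"user": {"username": "system: node: example-01"}}\n'
    '{"user": {"username": "admin"}}\r\n'
)
events = parse_lines(raw)
usernames = [get_by_path(event, "user.username") for event in events]
```

An object that spans several lines would fail in json.loads here, which mirrors the warning that multi-line objects may be normalized incorrectly.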

In complex normalization schemes that use extra normalizers, all nested objects are processed at the first normalization level. The exception is when no extra normalization conditions are specified, in which case the event being processed is passed to the extra normalizer in its entirety.

You can use \n and \r\n as newline characters. Strings must be UTF-8 encoded.

If you want to send the raw event for advanced normalization, at each nesting level in the Advanced event parsing window, select Yes in the Keep raw event drop-down list.
cef
This parsing method is used to process CEF data.

If you select this parsing method, you can use the predefined rules for converting events to the KUMA format by clicking Apply default mapping.
regexp
This parsing method is used to create custom rules that process data in an arbitrary format by using regular expressions.

You must add a regular expression (RE2 syntax) with named capturing groups to the field under Normalization. The name of the capturing group and its value are considered the field and value of the raw event that can be converted to an event field in KUMA format.

To add event handling rules:
1. If necessary, copy an example of the data you want to process to the Event examples field. We recommend completing this step.
2. In the field under Normalization, add an RE2 regular expression with named capturing groups, for example, "(?P<name>regexp)". The regular expression must match the event exactly. When designing the regular expression, we recommend anchoring it to the start and end of the text with the special characters ^ and $.
  You can add multiple regular expressions or remove regular expressions. To add a regular expression, click Add regular expression. To remove a regular expression, click the delete icon next to it.
3. Click the Copy field names to the mapping table button.
  Capture group names are displayed in the KUMA field column of the Mapping table. You can select the corresponding KUMA field in the column opposite each capturing group. If you followed the CEF format when naming the capturing groups, you can use automatic CEF mapping by selecting the Use CEF syntax for normalization check box.
Event handling rules are added.
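
As a rough illustration of named capturing groups, the sketch below uses Python's re module, which is not RE2 but shares the (?P<name>...) syntax used here. The event layout and the group names src, user, and action are assumptions made up for this example.

```python
import re

# Anchored with ^ and $ so that the expression must match the whole event.
PATTERN = re.compile(
    r"^(?P<src>\d{1,3}(?:\.\d{1,3}){3}) (?P<user>\S+) (?P<action>\w+)$"
)


def extract_fields(event):
    """Return a dict of capturing group names and values, or None if the event does not match."""
    match = PATTERN.match(event)
    return match.groupdict() if match else None


fields = extract_fields("192.168.0.1 alice login")
rejected = extract_fields("192.168.0.1 alice login trailing-data")
```

Each key of the resulting dictionary corresponds to a capturing group name, which is what you would then map to a KUMA event field in the Mapping table.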
syslog
This parsing method is used to process data in syslog format.

If you select this parsing method, you can use the predefined rules for converting events to the KUMA format by clicking Apply default mapping.

To parse events in rfc5424 format with a structured-data section, in the Keep extra fields drop-down list, select Yes. This makes the values from the structured-data section available in the Extra fields.
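
For reference, the structured-data section of an rfc5424 message is the bracketed part that follows the message ID. The sketch below extracts its parameters with Python regular expressions. The message is modeled on the example in RFC 5424; the key layout of the resulting dictionary is an assumption and does not describe how KUMA names the Extra fields.

```python
import re

# One SD-ELEMENT: [sd-id name="value" name="value" ...]
SD_ELEMENT = re.compile(
    r'\[(?P<sd_id>[^\s=\]"]+)(?P<params>(?:\s+[^\s=\]"]+="(?:[^"\\\]]|\\.)*")*)\]'
)
# One SD-PARAM inside an element: name="value"
SD_PARAM = re.compile(r'(?P<name>[^\s=\]"]+)="(?P<value>(?:[^"\\\]]|\\.)*)"')


def structured_data(message):
    """Collect structured-data parameters as {'<sd-id>.<name>': value} (hypothetical layout)."""
    fields = {}
    for element in SD_ELEMENT.finditer(message):
        for param in SD_PARAM.finditer(element.group("params")):
            key = f'{element.group("sd_id")}.{param.group("name")}'
            fields[key] = param.group("value")  # escaped characters are kept as is
    return fields


message = (
    "<165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 "
    '[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] '
    "An application event log entry"
)
extra = structured_data(message)
```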
csv
This parsing method is used to create custom rules for processing CSV data.

When choosing this parsing method, you must specify the separator of values in the string in the Delimiter field. Any single-byte ASCII character can be used as a delimiter for values in a string.
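
The effect of the Delimiter setting can be sketched with Python's csv module. The function name parse_csv_line and the sample line are hypothetical; the check only mirrors the single-byte ASCII restriction stated above.

```python
import csv


def parse_csv_line(line, delimiter):
    """Split one CSV line using a single-byte ASCII delimiter."""
    if len(delimiter) != 1 or not delimiter.isascii():
        raise ValueError("the delimiter must be a single ASCII character")
    return next(csv.reader([line], delimiter=delimiter))


values = parse_csv_line("2000-01-01|host01|login", "|")
```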

xml
This parsing method is used to process XML data in which each object, including nested objects, occupies a single line in a file. Files are processed line by line.

If you want to send the raw event for advanced normalization, at each nesting level in the Advanced event parsing window, select Yes in the Keep raw event drop-down list.

If you select this parsing method, under XML attributes, you can specify the key XML attributes to be extracted from tags. If an XML structure has multiple XML attributes with different values in the same tag, you can identify the necessary value by specifying the key of the value in the Source column of the Mapping table.

To add key XML attributes:
1. Click + Add field.
2. In the window that opens, specify the path to the XML attribute.
You can add multiple XML attributes or remove XML attributes. To remove an individual XML attribute, click the delete icon next to it. To remove all XML attributes, click Reset.

If XML key attributes are not specified, then in the course of field mapping the unique path to the XML value will be represented by a sequence of tags.

Tag numbering

Starting with KUMA 2.1.3, you can use automatic tag numbering in XML events. This lets you parse an event with identical tags or unnamed tags, such as <Data>.

As an example, we will number the tags of the EventData attribute of the Microsoft Windows PowerShell event ID 800.

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-ActiveDirectory_DomainService" Guid="{0e8478c5-3605-4e8c-8497-1e730c959516}" EventSourceName="NTDS" />
    <EventID Qualifiers="0000">0000</EventID>
    <Version>@</Version>
    <Level>4</Level>
    <Task>15</Task>
    <Opcode>0</Opcode>
    <Keywords>0x8080000000000000</Keywords>
    <TimeCreated SystemTime="2000-01-01T00:00:00.659495900Z" />
    <EventRecordID>55647</EventRecordID>
    <Correlation />
    <Execution ProcessID="1" ThreadID="1" />
    <Channel>service</Channel>
    <Computer>computer</Computer>
    <Security UserID="0000" />
  </System>
  <EventData>
    <Data>583</Data>
    <Data>36</Data>
    <Data>192.168.0.1:5084</Data>
    <Data>level</Data>
    <Data>name, lDAPDisplayName</Data>
    <Data />
    <Data>5545</Data>
    <Data>3</Data>
    <Data>0</Data>
    <Data>0</Data>
    <Data>0</Data>
    <Data>15</Data>
    <Data>none</Data>
  </EventData>
</Event>

To parse events with identical tags or unnamed tags, you need to configure tag numbering and then map the numbered tags to KUMA event fields.

KUMA 3.0.x supports using XML attributes and tag numbering at the same time in the same extra normalizer. If an XML attribute contains unnamed tags or identical tags, we recommend using tag numbering. If the XML attribute contains only named tags, we recommend using XML attributes.

To use XML attributes and tag numbering in extra normalizers, you must sequentially enable the Keep raw event setting in each extra normalizer along the path that the event follows to the target extra normalizer, and in the target extra normalizer itself.

For an example of how tag numbering works, you can refer to the MicrosoftProducts normalizer. The Keep raw event setting is enabled sequentially in both the AD FS and 424 extra normalizers.

To set up the parsing of events with unnamed or identical tags:
1. Open an existing normalizer or create a new normalizer.
2. In the Basic event parsing window of the normalizer, in the Parsing method drop-down list, select xml.
3. In the Tag numbering field, click + Add field.
4. In the displayed field, enter the full path to the tag to whose elements you want to assign a number, for example, Event.EventData.Data. The first tag gets number 0. If the tag is empty, for example, <Data />, it is also assigned a number.
5. To configure data mapping, under Mapping, click + Add row and do the following:
  1. In the displayed row, in the Source field, enter the full path to the tag and the index of the tag. For example, for the Microsoft Windows PowerShell event ID 800 from the example above, the full paths to tags and tag indices are as follows:
    - Event.EventData.Data.0
    - Event.EventData.Data.1
    - Event.EventData.Data.2 and so on.
  2. In the KUMA field drop-down list, select the field in the KUMA event that will receive the value from the numbered tag after parsing.
6. Save changes in one of the following ways:
  - If you created a new normalizer, click Save.
  - If you edited an existing normalizer, in the collector to which the normalizer is linked, click Update configuration.
Parsing is configured.
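
The numbering rule from the procedure above (the first tag gets number 0, and empty tags are numbered too) can be sketched in Python. This is an illustration under assumptions, not the KUMA implementation: the function names are hypothetical, and the sample is a shortened single-line version of the event shown earlier.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def number_tags(xml_line, tag_path):
    """Return {'<tag_path>.<index>': text} for every tag found at tag_path."""
    parts = tag_path.split(".")
    root = ET.fromstring(xml_line)
    if local_name(root.tag) != parts[0]:
        return {}
    parents = [root]
    for name in parts[1:-1]:
        parents = [c for p in parents for c in p if local_name(c.tag) == name]
    numbered = {}
    for parent in parents:
        for child in parent:
            if local_name(child.tag) == parts[-1]:
                # Empty tags such as <Data /> still receive a number.
                numbered[f"{tag_path}.{len(numbered)}"] = child.text or ""
    return numbered


SAMPLE = (
    '<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">'
    "<EventData><Data>583</Data><Data>36</Data>"
    "<Data>192.168.0.1:5084</Data><Data /></EventData></Event>"
)
sources = number_tags(SAMPLE, "Event.EventData.Data")
```

The keys of the resulting dictionary have the same form as the values you enter in the Source field, for example Event.EventData.Data.0.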
netflow5
This parsing method is used to process data in the NetFlow v5 format.

If you select this parsing method, you can use the predefined rules for converting events to the KUMA format by clicking Apply default mapping. If the netflow5 parsing method is selected for the main parsing, extra normalization is not available.

The default mapping rules for the netflow5 parsing method do not specify the protocol type in KUMA event fields. When parsing data in NetFlow format, on the Enrichment tab of the normalizer, you must create a constant data enrichment rule that adds the netflow value to the DeviceProduct target field.
netflow9
This parsing method is used to process data in the NetFlow v9 format.

If you select this parsing method, you can use the predefined rules for converting events to the KUMA format by clicking Apply default mapping. If the netflow9 parsing method is selected for the main parsing, extra normalization is not available.

The default mapping rules for the netflow9 parsing method do not specify the protocol type in KUMA event fields. When parsing data in NetFlow format, on the Enrichment tab of the normalizer, you must create a constant data enrichment rule that adds the netflow value to the DeviceProduct target field.
sflow5
This parsing method is used to process data in sflow5 format.

If you select this parsing method, you can use the predefined rules for converting events to the KUMA format by clicking Apply default mapping. If the sflow5 parsing method is selected for the main parsing, extra normalization is not available.
ipfix
This parsing method is used to process IPFIX data.

If you select this parsing method, you can use the predefined rules for converting events to the KUMA format by clicking Apply default mapping. If the ipfix parsing method is selected for the main parsing, extra normalization is not available.

The default mapping rules for the ipfix parsing method do not specify the protocol type in KUMA event fields. When parsing data in NetFlow format, on the Enrichment tab of the normalizer, you must create a constant data enrichment rule that adds the netflow value to the DeviceProduct target field.
sql
The normalizer uses this parsing method to process data obtained by querying a database.

Required setting.

Keep raw event

Whether to keep the raw event in the newly created normalized event. Available values:

- Don't save: do not save the raw event. This is the default setting.
- Only errors: save the raw event in the Raw field of the normalized event if errors occurred when parsing it. This value is useful for debugging because a non-empty Raw field indicates a problem.
  If fields whose names match *Address or *Date* do not comply with normalization rules, these fields are ignored. No normalization error occurs in this case, and the values of the fields are not displayed in the Raw field of the normalized event even if Only errors is selected for Keep raw event.
- Always: always save the raw event in the Raw field of the normalized event.

Required setting. This setting is not available for extra parsing rules.
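
The three values of the Keep raw event setting amount to a simple decision, sketched below in Python. The function raw_field_value is hypothetical and only restates the behavior described above.

```python
def raw_field_value(mode, raw_event, parse_failed):
    """Return what ends up in the Raw field for a given Keep raw event value."""
    if mode == "Always":
        return raw_event
    if mode == "Only errors" and parse_failed:
        return raw_event
    # "Don't save", or "Only errors" when parsing succeeded.
    return ""


kept_on_error = raw_field_value("Only errors", "<raw event text>", parse_failed=True)
dropped_on_success = raw_field_value("Only errors", "<raw event text>", parse_failed=False)
```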

Keep extra fields

Keep fields and values for which no mapping rules are configured. This data is saved as an array in the Extra event field. Normalized events can be searched and filtered based on the data stored in the Extra field.


By default, no extra fields are saved.

Required setting.

Description

Description of the resource. Maximum length of the description: 4000 Unicode characters.

This setting is not available for extra parsing rules.

Event examples

Example of data that you want to process.

This setting is not available for the following parsing methods: netflow5, netflow9, sflow5, ipfix, and sql.

If the event was parsed successfully and the type of the data obtained from the raw event matches the type of the KUMA field, the Event examples field is filled with data obtained from the raw event. For example, the value "192.168.0.1" enclosed in quotation marks is not displayed for the SourceAddress field, whereas the value 192.168.0.1 without quotation marks is displayed in the Event examples field.

Mapping

Settings for configuring the mapping of source event fields to fields of the event in the KUMA format:

Source lists the names of the raw event fields that you want to convert into KUMA event fields.
Clicking the icon next to a field name in the Source column opens the Conversion window, in which you can click Add conversion to create rules for modifying the source data before it is written to the KUMA event fields. You can reorder and delete the rules that you created by using the corresponding icons next to each rule.

Available conversions
Conversions are modifications that are applied to a value before it is written to the event field. You can select one of the following conversion types from the drop-down list:
- entropy calculates the information entropy of the source field value and places the result, a number, in a target field of the float type. Calculating the information entropy helps detect DNS tunnels or compromised passwords, for example, when a user enters the password instead of the login and the password gets logged in plain text.
- lower converts all characters of the value to lowercase.
- upper converts all characters of the value to uppercase.
- regexp converts a value using a specified RE2 regular expression. When you select this type of conversion, a field is displayed in which you must specify the RE2 regular expression.
- substring extracts characters in a specified range of positions. When you select this type of conversion, the Start and End fields are displayed, in which you must specify the range of positions.
- replace replaces a specified character sequence with another character sequence. When you select this type of conversion, the following fields are displayed:
  - Replace chars specifies the character sequence to be replaced.
  - With chars specifies the character sequence to use as the replacement.
- trim removes the specified characters from the beginning and from the end of the event field value. When you select this type of conversion, the Chars field is displayed in which you must specify the characters. For example, if a trim conversion with the Micromon value is applied to Microsoft-Windows-Sysmon, the new value is soft-Windows-Sys.
- append appends the specified characters to the end of the event field value. When you select this type of conversion, the Constant field is displayed in which you must specify the characters.
- prepend prepends the specified characters to the beginning of the event field value. When you select this type of conversion, the Constant field is displayed in which you must specify the characters.
- replace with regexp replaces the matches of an RE2 regular expression with the specified character sequence. When you select this type of conversion, the following fields are displayed:
  - Expression specifies the RE2 regular expression whose matches you want to replace.
  - With chars specifies the character sequence to use as the replacement.
- Converting encoded strings to text:
  - decodeHexString converts a HEX string to text.
  - decodeBase64String converts a Base64 string to text.
  - decodeBase64URLString converts a Base64url string to text.
  If the string is corrupted or conversion errors occur, corrupted data may be written to the event field.

  During event enrichment, if the length of the encoded string exceeds the size of the field of the normalized event, the string is truncated and is not decoded.

  If the length of the decoded string exceeds the size of the event field into which the decoded value is to be written, the string is truncated to fit the size of the event field.
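
To make the conversions more concrete, the sketch below shows approximate Python equivalents. These are analogies, not KUMA code. The entropy function assumes Shannon entropy over characters, which the text above does not specify. The constants "/Operational" and "src:" are made-up sample values.

```python
import base64
import math
import re
from collections import Counter


def entropy(value):
    """Shannon entropy in bits per character (assumed formula)."""
    if not value:
        return 0.0
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in Counter(value).values())


value = "Microsoft-Windows-Sysmon"
converted = {
    "lower": value.lower(),
    "upper": value.upper(),
    # replace: Replace chars = "-", With chars = "_"
    "replace": value.replace("-", "_"),
    # trim: every character listed in Chars is stripped from both ends
    "trim": value.strip("Micromon"),
    "append": value + "/Operational",
    "prepend": "src:" + value,
    # replace with regexp: Expression = "[A-Z]", With chars = "#"
    "replace with regexp": re.sub(r"[A-Z]", "#", value),
    "decodeHexString": bytes.fromhex("48656c6c6f").decode("utf-8"),
    "decodeBase64String": base64.b64decode("SGVsbG8=").decode("utf-8"),
    "decodeBase64URLString": base64.urlsafe_b64decode("SGVsbG8_").decode("utf-8"),
}
```

Note that trim treats the Chars value as a set of characters rather than as a prefix or suffix, which is why Micromon turns Microsoft-Windows-Sysmon into soft-Windows-Sys.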
Conversions when using the extended event schema

Whether or not a conversion can be used depends on the type of extended event schema field being used:
- For an additional field of the "String" type, all types of conversions are available.
- For fields of the "Number" and "Float" types, the following types of conversions are available: regexp, substring, replace, trim, append, prepend, replaceWithRegexp, decodeHexString, decodeBase64String, and decodeBase64URLString.
- For fields of "Array of strings", "Array of numbers", and "Array of floats" types, the following types of conversions are available: append and prepend.
KUMA field lists fields of KUMA events. You can search for fields by entering their names.
Label is a unique custom label for event fields that begin with DeviceCustom* and Flex*.

You can add new table rows or delete table rows. To add a new table row, click Add row. To delete a single row in the table, click the cross icon next to it. To delete all table rows, click Clear all.

If you have loaded data into the Event examples field, the table will have an Examples column containing examples of values carried over from the raw event field to the KUMA event field.

If the size of the KUMA event field is less than the length of the value placed in it, the value is truncated to the size of the event field.

Extended event schema

When normalizing events, extended event schema fields can be used in addition to standard KUMA event schema fields. When extended event schema fields are used, the overall limit on the size of an event that the collector can process stays the same: 4 MB. Information about the types of extended event schema fields is shown in the table below.

Using many unique fields of the extended event schema can reduce the performance of the system, increase the amount of disk space required for storing events, and make the information difficult to understand.

We recommend deliberately choosing a minimal set of extended event schema fields to use in normalizers and correlation.

To use the fields of the extended event schema:
1. Open an existing normalizer or create a new normalizer.
2. Specify the basic settings of the normalizer.
3. Click Add row.
4. For the Source setting, enter the name of the source field in the raw event.
5. For the KUMA field setting, specify the name of the extended event schema field to be created. The available field types are listed in the table below.
6. Click OK, then click Save to save the event normalizer.
The normalizer is saved, and the additional field is created. After saving the normalizer, the additional field can be used in other normalizers and KUMA resources.

Fields of the extended data model of normalized events

Field name (specified in the KUMA field setting)	Data type	Availability in the normalizer	Description
`S.<field name>`	String	All types	Field of the "String" type.
`N.<field name>`	Number	All types	Field of the "Number" type.
`F.<field name>`	Float	All types	Field of the "Float" type.
`SA.<field name>`	Array of strings	KV, JSON	Field of the "Array of strings" type. The order of the array elements is the same as the order of the elements of the raw event.
`NA.<field name>`	Array of integers	KV, JSON	Field of the "Array of integers" type. The order of the array elements is the same as the order of the elements of the raw event.
`FA.<field name>`	Array of floats	KV, JSON	Field of the "Array of floats" type. The order of the array elements is the same as the order of the elements of the raw event.

If the data in a raw event field does not match the type of the KUMA field and type conversion cannot be performed, the value is not saved during event normalization. For example, the string test cannot be written to the DeviceCustomNumber1 KUMA field of the Number type.
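
The relationship between field name prefixes and data types, together with the rule that a value is dropped when it cannot be converted, can be sketched in Python. This is a rough model only: the actual conversion rules in KUMA may differ, and the field names such as N.bytes_sent are hypothetical.

```python
# Prefix-to-type mapping taken from the table of extended event schema fields.
SCALAR_CASTS = {"S": str, "N": int, "F": float}
ARRAY_CASTS = {"SA": str, "NA": int, "FA": float}


def coerce(kuma_field, raw_value):
    """Convert raw_value to the type implied by the field prefix, or return None to drop it."""
    prefix = kuma_field.split(".", 1)[0]
    if prefix not in SCALAR_CASTS and prefix not in ARRAY_CASTS:
        raise ValueError(f"unknown extended schema prefix in {kuma_field!r}")
    try:
        if prefix in SCALAR_CASTS:
            return SCALAR_CASTS[prefix](raw_value)
        # Array fields keep the element order of the raw event.
        return [ARRAY_CASTS[prefix](item) for item in raw_value]
    except (TypeError, ValueError):
        return None  # type conversion failed, so the value is not saved


saved = coerce("N.bytes_sent", "1024")
dropped = coerce("N.bytes_sent", "test")
tags = coerce("SA.tags", ["vpn", "admin"])
```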

If you want to minimize the load on the storage server when searching events, preparing reports, and performing other operations on events in storage, use KUMA event schema fields as your first preference, extended event schema fields as your second preference, and the Extra fields as your last resort.
