Filtering rules are the criteria that Feed Utility uses to filter the original feed files.
Filtering rules are specified for each feed in a Filters element. Each filtering rule is set in a Field element: the field name is specified in the name attribute and the filtering criteria are specified in the value attribute. A field can have only one filtering rule associated with it; you cannot have two Field elements for one field.
Feeds of the email type are additionally filtered by the subject and the sender of an email message. See the section "Filtering rules for feeds of email type" below.
The following is an example of filtering rules for a feed. These rules specify that the output feed must include only records that have the popularity field equal to 4 or 5 and the mask field containing .ru or .com.
<Feed>
    ...
    <Filters>
        <Field name="popularity" value="4;5"/>
        <Field name="mask" value=".ru;.com"/>
    </Filters>
    ...
</Feed>
Feed Utility ignores leading and trailing spaces and tabs in the value of the value attribute.
Only those records that match all the specified criteria are included in the output file. If a filtering criterion is specified for a field, and the field is missing from a record, Feed Utility will not include this record in the output file.
Defining filtering criteria for numeric values
Numeric values are integers. Decimal values are not supported.
You can define filtering criteria for numeric fields in the following ways:
value="*"
A field can have any value. For example, <Field name="type" value="*"/> means that the type field can have any value.
value="%value%"
Exact numeric value. A field must be equal to %value%. For example, <Field name="popularity" value="1"/> means that the popularity field must be equal to 1.
value="%value1%;%value2%"
One of several numeric values. A field can have one of the specified numeric values (%value1% or %value2%). You can specify additional values using ";" as a delimiter. For example, <Field name="popularity" value="1;3"/> means that the popularity field must be equal to 1 or 3, but not 2.
value="[%value1%;%value2%]"
Range of numeric values. A field can have one of the values in the specified range between %value1% and %value2%. For example, <Field name="popularity" value="[1;3]"/> means that the popularity field must have a value from 1 to 3, including 2.
value="[%value1%;*]" or value="[*;%value1%]"
Open range of numeric values. Same as range, but an asterisk (*) specifies infinity. For example, <Field name="popularity" value="[2;*]"/> means that the value of the popularity field must be greater than or equal to 2.
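The numeric forms above can be combined in one Filters element, one rule per field. The following is a minimal sketch rather than a tested configuration: it reuses the popularity and type field names from the examples above and assumes that both fields are numeric in the feed being filtered. The meaning of the [*;3] rule is inferred by analogy with the [2;*] example. The XML comments are for explanation only.

```xml
<Filters>
    <!-- Open range with no lower bound: popularity must be 3 or less -->
    <Field name="popularity" value="[*;3]"/>
    <!-- One of several values: type must be 1 or 3 -->
    <Field name="type" value="1;3"/>
</Filters>
```

Because a record must match all specified criteria, a record passes these rules only if both conditions hold.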
Defining filtering criteria for strings
You can define filtering criteria for string fields in the following ways:
value="*"
A field can have any value. For example, <Field name="mask" value="*"/> means that the mask field can have any value.
value="%string%"
A field must contain the specified string. For example, <Field name="geo" value="ru"/> means that the value of the geo field must contain "ru".
value="%string1%;%string2%"
Contains one or more of the specified strings. For example, <Field name="geo" value="ru;us"/> means that the value of the geo field must contain "ru" or "us", or both "ru" and "us".
Defining filtering criteria for dates
Date values in feeds are formatted in one of the following patterns: "DD.MM.YYYY" (for example, "26.04.2014"), "YYYY-MM-DD" (for example, "2014-04-26"), or "MM/DD/YYYY" (for example, "04/26/2014").
You can define filtering criteria for fields with dates in the following ways:
value="*"
A field can have any value. For example, <Field name="last_seen" value="*"/> means that the last_seen field can have any value.
value="%date%"
A field must contain the specified date. For example, <Field name="first_seen" value="14.10.2015"/> means that the first_seen field value must be 14 October 2015.
value="[%date1%;%date2%]"
A field must contain a date in the specified range. For example, <Field name="first_seen" value="[01.02.2013;01.02.2015]"/> means that the first_seen field value must be from 1 February 2013 to 1 February 2015.
value="[%date1%;*]" or value="[*;%date1%]"
Open range of dates. Same as a range of dates (value="[%date1%;%date2%]"), but an asterisk (*) specifies infinity. For example, <Field name="first_seen" value="[*;10.12.2015]"/> means that the first_seen field value must be on or before 10 December 2015.
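As an illustration, date rules for two fields can be placed in the same Filters element. This is a sketch under assumptions, not a tested configuration: the dates are arbitrary, the feed is assumed to contain both the first_seen and last_seen fields, and the dates are written in the DD.MM.YYYY pattern used in the examples above. The meaning of the [01.01.2015;*] rule is inferred by analogy with the [*;10.12.2015] example.

```xml
<Filters>
    <!-- Open range with no upper bound: first_seen on or after 1 January 2015 -->
    <Field name="first_seen" value="[01.01.2015;*]"/>
    <!-- Closed range: last_seen from 1 February 2015 to 14 October 2015 -->
    <Field name="last_seen" value="[01.02.2015;14.10.2015]"/>
</Filters>
```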
Excluding records with missing fields
In the original feed files, some records can have extra fields or can lack some fields. For records with extra fields, Feed Utility includes only those fields that are specified in the RequiredFields element of feed rules for a specified feed. For records that lack some fields, Feed Utility includes such records in the output if they contain at least one of the fields specified in the RequiredFields element. If some fields specified in the RequiredFields element are missing from a record in the original feed, the record in the processed feed will not contain them.
If you want to exclude records with missing fields from the output, you must create filtering rules for all required fields.
In the following example, Feed Utility will include records that have the popularity field, the mask field, or both.
<RequiredFields>popularity;mask</RequiredFields>
If you want Feed Utility to include only those records that have both popularity and mask, create a filtering rule for both fields. You can specify criteria for field values, or use an asterisk (*) to specify any value.
In the following example, only records that have both fields (mask and popularity) are included in the resulting feed.
<Filters>
    <Field name="popularity" value="*"/>
    <Field name="mask" value="*"/>
</Filters>
<RequiredFields>popularity;mask</RequiredFields>
You can specify exact criteria in the same manner. The following example instructs Feed Utility to include only records that have the popularity field with a value of 5 and the mask field with any value.
<Filters>
    <Field name="popularity" value="5"/>
    <Field name="mask" value="*"/>
</Filters>
<RequiredFields>popularity;mask</RequiredFields>
Filtering rules for feeds of email type
In the kl_feed_util configuration file, specify one or more filtering rules in the MailboxConnection/Filters element. This element is optional.
If you have added at least one filtering rule, specify the following in each MailboxConnection/Filters/Filter element:
In the field attribute, specify subject (the subject of the email message) or from (the sender of the email message).
The mail server can store the From field in two variants: sender@mail.ru or sender<sender@mail.ru>. If the condition attribute of the Filter element is set to match, the value is compared with sender@mail.ru (if the From field contains sender@mail.ru) or with the value in angle brackets (if the From field contains sender<sender@mail.ru>).
In the condition attribute, set one of the following filter condition values:
contains (the value from the email message must contain the value from this field).
not_contains (the value from the email message must not contain the value from this field).
The not_contains filter has priority over the contains filter.
Values to be compared are case-insensitive.
match (the value from the email message must be equal to the value from this field).
not_match (the value from the email message must not be equal to the value from this field).
The following is an example of filtering rules for a feed of the email type:
<Filters>
    <Filter field="from" condition="not_match">techsupport@ya.ru</Filter>
    <Filter field="subject" condition="contains">Best IoCs ever</Filter>
</Filters>
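The remaining two conditions, match and not_contains, can be sketched in the same way. The sender address and subject text below are hypothetical and are used for illustration only. This section does not describe how several Filter elements combine, so the sketch shows only the syntax of each rule.

```xml
<Filters>
    <!-- Hypothetical address; with match, it is compared with the address part of the From field -->
    <Filter field="from" condition="match">sender@example.com</Filter>
    <!-- Hypothetical subject text; the subject must not contain it -->
    <Filter field="subject" condition="not_contains">Test message</Filter>
</Filters>
```

If the mail server stores the From field as sender<sender@example.com>, the first rule is compared with the address in angle brackets, as described above.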