Malicious software often tries to hide its activity by using URL obfuscation techniques (internationalized domain names, including single-character ones, IP addresses written in octal notation, repeated slashes, and so on). In addition, the same content can often be reached through technically different addresses (for example, addresses that differ in scheme, port, or character case).
As a result, matching a URL in its original form against lists of indicators of compromise (IoCs) can miss threats, because the URL as written does not match any IoC.
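For example, github.com@520966948 is an obfuscated form of the IP address 31.13.83.36, which actually belongs to facebook.com: the part before the @ sign is treated as user information, and 520966948 is the same IP address written as a single decimal number.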
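The decoding in this example can be reproduced with a minimal Python sketch. The function name decode_obfuscated_host is hypothetical and used only for illustration; it is not part of CyberTrace. It assumes the host is either a regular name or a single decimal integer that fits into an IPv4 address.

```python
import ipaddress
from urllib.parse import urlsplit


def decode_obfuscated_host(url: str) -> str:
    """Return the real host of a URL, decoding a decimal-integer IPv4 host.

    Hypothetical helper for illustration only.
    """
    # urlsplit recognizes the authority part only when "//" is present.
    parts = urlsplit(url if "//" in url else "//" + url)
    # Everything before "@" is user information, not the host.
    host = parts.hostname or ""
    if host.isascii() and host.isdigit():
        # A bare decimal number is the 32-bit value of an IPv4 address.
        return str(ipaddress.IPv4Address(int(host)))
    return host


print(decode_obfuscated_host("github.com@520966948"))  # 31.13.83.36
```

A number that does not fit into 32 bits makes ipaddress.IPv4Address raise an error, so a caller would need to handle that case.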
CyberTrace offers two advantageous features:
The Kaspersky data feeds cannot include every variant of a URL that the 13 normalization rules account for, because this would lead to an unreasonable increase in the feed size. However, if a user submits a known URL in a specific format, it can be normalized, matched against the feeds, and detected.
Currently, 13 URL normalization rules are used. The following examples show how these rules are applied:
Dot segments ("." and "..") are removed according to the algorithm described in RFC 3986, section 5.2.4 Remove Dot Segments (https://www.ietf.org/rfc/rfc3986.txt): http://www.example.com/../a/b/../c/./d.html => http://www.example.com/a/c/d.html
http://example.com => example.com
тест.рф => xn--e1aybc.xn--p1ai
Removing the www prefix: www.example.com => example.com
example.com//dir/test.html => example.com/dir/test.html
example.com/ => example.com
login:password@example.com => example.com
example.com:80/index => example.com/index
Removing the #fragment reference: example.com#fragment => example.com
example.com./index.html => example.com/index.html
EXAMPLE.COM => example.com
0112.0175.0117.0150 => 74.125.79.104
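The examples above can be combined into a minimal Python sketch. It is an illustration under stated assumptions, not the CyberTrace implementation: the function names are hypothetical, the order in which the rules are applied is our choice, any numeric port is removed, only the host is lowercased, and a trailing slash is removed from every path. The path handling is a simplification of RFC 3986, section 5.2.4, merged with the slash rules.

```python
import re


def normalize_path(path: str) -> str:
    """Drop "." and ".." segments, repeated slashes, and a trailing slash."""
    kept = []
    for segment in path.split("/"):
        if segment in ("", "."):  # empty segments come from extra slashes
            continue
        if segment == "..":
            if kept:
                kept.pop()  # step back one level, never above the root
            continue
        kept.append(segment)
    return "/" + "/".join(kept) if kept else ""


def decode_octal_ipv4(host: str) -> str:
    """Convert a host made of four octal octets to dotted decimal."""
    parts = host.split(".")
    if len(parts) == 4 and all(re.fullmatch(r"0[0-7]{1,3}", p) for p in parts):
        octets = [int(p, 8) for p in parts]
        if max(octets) <= 255:
            return ".".join(str(o) for o in octets)
    return host


def normalize_url(url: str) -> str:
    """Hypothetical normalizer covering the examples in this section."""
    url = re.sub(r"^[A-Za-z][A-Za-z0-9+.-]*://", "", url.strip())  # scheme
    url = url.split("#", 1)[0]                                      # fragment
    match = re.match(r"[^/?]*", url)             # authority ends at "/" or "?"
    authority, rest = match.group(0), url[match.end():]
    path, sep, query = rest.partition("?")
    host = authority.rpartition("@")[2]          # drop login:password@
    host = re.sub(r":\d*$", "", host)            # drop the port (assumption: any port)
    host = host.lower().rstrip(".")              # case and trailing dot
    if not host.isascii():
        host = host.encode("idna").decode("ascii")  # IDN to Punycode
    if host.startswith("www."):
        host = host[4:]
    host = decode_octal_ipv4(host)
    return host + normalize_path(path) + sep + query


print(normalize_url("http://www.example.com/../a/b/../c/./d.html"))
print(normalize_url("0112.0175.0117.0150"))
```

Because all rules are applied together here, the first call prints example.com/a/c/d.html rather than the scheme-preserving form shown in the dot segment example. A normalized value could then be looked up in a set of feed entries with an ordinary membership test.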
To cover groups of malicious URLs, the feeds use eight types of entries, which are divided into masked and unmasked entries.
A normalized URL should be matched against the URL entries in the databases according to the purpose of each entry type. Using URL normalization and masks increases the feeds' detection rate while minimizing the volume of supplied data and reducing false positives.
Detailed information is provided in the Kaspersky Threat Intelligence Data Feeds Implementation Guide.