[IntelMQ-dev] Shadowserver parser: Bad mapping for malware events

Thomas Hungenberg th at cert-bund.de
Fri Jan 26 11:01:30 CET 2024


Hi Kamil,

I have thought about this again in more detail.
The classification attributes should describe the incident, becoming more specific from taxonomy to identifier.
So for feeds like Open-SNMP, it makes sense to set classification.identifier to the feed's name, like this:

         'classification.taxonomy': 'vulnerable',
         'classification.type': 'vulnerable-system',
         'classification.identifier': 'open-snmp',

However, for malware events my earlier proposal to set classification.identifier to the feed's name
does not make sense, because a feed name like "event4-microsoft-sinkhole" does not describe
the incident itself but rather the type of source the information came from.

So I think it is best to keep writing the malware name ("infection" or "tag") to classification.identifier,
as this is a specific description of the individual incident.
However, the malware name ("infection" or "tag") also needs to be stored in malware.name for the malware name mapping to work.
"family" should be stored in extra instead.

So the necessary changes for event_sinkhole and event_sinkhole_dns look like this:

-        ('malware.name', 'family', validate_to_none),
+        ('malware.name', 'infection', validate_to_none),
-        ('extra.', 'infection', validate_to_none),
+        ('extra.', 'family', validate_to_none),
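
The effect of this change on event_sinkhole can be sketched roughly as follows. This is
only an illustration, not the parser's actual code: to_none is a simplified stand-in for
validate_to_none, apply_rules and map_malware_name are hypothetical helpers, the
handling of the 'extra.' prefix is an assumption, and the sample row and the
harmonisation table are made up.

```python
from typing import Callable, Dict, List, Optional, Tuple

Converter = Callable[[Optional[str]], Optional[str]]
Rule = Tuple[str, str, Converter]


def to_none(value: Optional[str]) -> Optional[str]:
    """Hypothetical stand-in for validate_to_none: empty values become None."""
    return value if value else None


# Relevant rules of the mapping before the change.
OLD_RULES: List[Rule] = [
    ("classification.identifier", "infection", to_none),
    ("malware.name", "family", to_none),
    ("extra.", "infection", to_none),
]

# Relevant rules of the mapping after the change.
NEW_RULES: List[Rule] = [
    ("classification.identifier", "infection", to_none),
    ("malware.name", "infection", to_none),
    ("extra.", "family", to_none),
]


def apply_rules(row: Dict[str, str], rules: List[Rule]) -> Dict[str, str]:
    """Build event fields from a report row (assumption: a target ending
    in a dot gets the column name appended, e.g. 'extra.family')."""
    event: Dict[str, str] = {}
    for target, column, convert in rules:
        value = convert(row.get(column))
        if value is not None:
            key = target + column if target.endswith(".") else target
            event[key] = value
    return event


# Made-up harmonisation table, keyed on the original malware name.
HARMONIZED_NAMES = {"avalanche-andromeda": "andromeda"}


def map_malware_name(event: Dict[str, str]) -> Dict[str, str]:
    """Write the harmonised name to classification.identifier
    if malware.name is present and known."""
    mapped = dict(event)
    name = mapped.get("malware.name")
    if name in HARMONIZED_NAMES:
        mapped["classification.identifier"] = HARMONIZED_NAMES[name]
    return mapped


# Made-up sample row with an empty "family" column.
row = {"infection": "avalanche-andromeda", "family": ""}
old_event = map_malware_name(apply_rules(row, OLD_RULES))
new_event = map_malware_name(apply_rules(row, NEW_RULES))
```

With the old rules, malware.name is never set for this row, so the mapping has nothing
to match on. With the new rules, malware.name keeps the original name and
classification.identifier receives the harmonised one.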

For event_sinkhole_http:

-        ('classification.identifier', 'tag'),
-        ('malware.name', 'family', validate_to_none),
+        ('classification.identifier', 'infection', validate_to_none),
+        ('malware.name', 'infection', validate_to_none),
          ('extra.', 'tag', validate_to_none),
-        ('extra.', 'infection', validate_to_none),
+        ('extra.', 'family', validate_to_none),

For event_sinkhole_http_referer:

      'optional_fields':
-        ('malware.name', 'family', validate_to_none),
+        ('classification.identifier', 'infection', validate_to_none),
+        ('malware.name', 'infection', validate_to_none),
-        ('extra.', 'infection', validate_to_none),
+        ('extra.', 'family', validate_to_none),

      'constant_fields': {
-        'classification.taxonomy': 'other',
-        'classification.type': 'other',
-        'classification.identifier': 'sinkhole-http-referer',
+        'classification.taxonomy': 'malicious-code',
+        'classification.type': 'infected-system',
+        'protocol.application': 'http',


For some other feeds like "malware_url", I have also added the missing
"validate_to_none" flag to make this consistent across all feeds.

Please find attached the corrected patch for the _config.py included with IntelMQ 3.2.1,
along with the complete file.
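
As an aside, rules that lack a conversion function, like the missing "validate_to_none"
flags mentioned above, can be spotted with a small check along these lines. It is just
a sketch: missing_converter is a hypothetical helper, and the converters are written as
plain strings so that it runs without IntelMQ installed. The sample rules are the old
event_sinkhole_http entries from the diff further up.

```python
from typing import List, Set, Tuple

# Old event_sinkhole_http rules as quoted in this thread; converter
# names are given as strings so the sketch has no IntelMQ dependency.
OLD_SINKHOLE_HTTP: List[Tuple[str, ...]] = [
    ("classification.identifier", "tag"),
    ("malware.name", "family", "validate_to_none"),
    ("extra.", "tag", "validate_to_none"),
    ("extra.", "infection", "validate_to_none"),
]


def missing_converter(optional_fields: List[Tuple[str, ...]],
                      targets: Set[str]) -> List[Tuple[str, ...]]:
    """Return the rules for the given target keys that have no third
    element, i.e. no conversion function."""
    return [rule for rule in optional_fields
            if rule[0] in targets and len(rule) < 3]


flagged = missing_converter(
    OLD_SINKHOLE_HTTP,
    {"classification.identifier", "malware.name"},
)
```

Here flagged contains only the ('classification.identifier', 'tag') rule, which is the
one replaced in the patch.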

I will now have a look at the JSON schema.


Kind regards
Thomas


On 25.01.24 08:36, Kamil Mankowski wrote:
> Hi Thomas,
> 
> I've got an answer from ShadowServer with the proposed mapping changes. Could you check whether this diff solves the issue?
> 
> In my eyes it's still mixing the value of "malware.name" - once it's 'family', once 'infection' - but it may also be a difference in the data available in the reports.
> 
> event4_microsoft_sinkhole:
> ***************
> *** 1,5 ****
> --- 1,6 ----
>    {
>       "constant_fields" : {
> +       "classification.identifier" : "event4-microsoft-sinkhole",
>          "classification.taxonomy" : "malicious-code",
>          "classification.type" : "infected-system"
>       },
> ***************
> *** 7,17 ****
>       "file_name" : "event4_microsoft_sinkhole",
>       "optional_fields" : [
>          [
> -          "classification.identifier",
> -          "infection",
> -          "validate_to_none"
> -       ],
> -       [
>             "malware.name",
>             "family",
>             "validate_to_none"
> --- 8,13 ----
> 
> event4_microsoft_sinkhole_http:
> ***************
> *** 1,5 ****
> --- 1,6 ----
>    {
>       "constant_fields" : {
> +       "classification.identifier" : "event4-microsoft-sinkhole-http",
>          "classification.taxonomy" : "malicious-code",
>          "classification.type" : "infected-system",
>          "protocol.application" : "http"
> ***************
> *** 8,17 ****
>       "file_name" : "event4_microsoft_sinkhole_http",
>       "optional_fields" : [
>          [
> -          "classification.identifier",
> -          "tag"
> -       ],
> -       [
>             "malware.name",
>             "family",
>             "validate_to_none"
> --- 9,14 ----
> 
> event6_sinkhole:
> ***************
> *** 1,5 ****
> --- 1,6 ----
>    {
>       "constant_fields" : {
> +       "classification.identifier" : "event6-sinkhole",
>          "classification.taxonomy" : "malicious-code",
>          "classification.type" : "infected-system"
>       },
> ***************
> *** 7,17 ****
>       "file_name" : "event6_sinkhole",
>       "optional_fields" : [
>          [
> -          "classification.identifier",
> -          "infection",
> -          "validate_to_none"
> -       ],
> -       [
>             "malware.name",
>             "family",
>             "validate_to_none"
> --- 8,13 ----
> 
> event6_sinkhole_http:
> ***************
> *** 1,5 ****
> --- 1,6 ----
>    {
>       "constant_fields" : {
> +       "classification.identifier" : "event6-sinkhole-http",
>          "classification.taxonomy" : "malicious-code",
>          "classification.type" : "infected-system",
>          "protocol.application" : "http"
> ***************
> *** 8,17 ****
>       "file_name" : "event6_sinkhole_http",
>       "optional_fields" : [
>          [
> -          "classification.identifier",
> -          "tag"
> -       ],
> -       [
>             "malware.name",
>             "family",
>             "validate_to_none"
> --- 9,14 ----
> 
> event6_sinkhole_http_referer:
> ***************
> *** 1,8 ****
>    {
>       "constant_fields" : {
>          "classification.identifier" : "event6-sinkhole-http-referer",
> !       "classification.taxonomy" : "other",
> !       "classification.type" : "other"
>       },
>       "feed_name" : "Sinkhole-Events-HTTP-Referer IPv6",
>       "file_name" : "event6_sinkhole_http_referer",
> --- 1,8 ----
>    {
>       "constant_fields" : {
>          "classification.identifier" : "event6-sinkhole-http-referer",
> !       "classification.taxonomy" : "malicious-code",
> !       "classification.type" : "infected-system"
>       },
>       "feed_name" : "Sinkhole-Events-HTTP-Referer IPv6",
>       "file_name" : "event6_sinkhole_http_referer",
> 
> event_honeypot_brute_force:
> ***************
> *** 1,5 ****
> --- 1,6 ----
>    {
>       "constant_fields" : {
> +       "classification.identifier" : "honeypot-brute-force",
>          "classification.taxonomy" : "intrusion-attempts",
>          "classification.type" : "brute-force"
>       },
> ***************
> *** 7,16 ****
>       "file_name" : "event4_honeypot_brute_force",
>       "optional_fields" : [
>          [
> -          "classification.identifier",
> -          "application"
> -       ],
> -       [
>             "destination.account",
>             "username",
>             "validate_to_none"
> --- 8,13 ----
> 
> event_honeypot_darknet:
> ***************
> *** 1,5 ****
> --- 1,6 ----
>    {
>       "constant_fields" : {
> +       "classification.identifier" : "honeypot-darknet",
>          "classification.taxonomy" : "other",
>          "classification.type" : "other"
>       },
> ***************
> *** 7,17 ****
>       "file_name" : "event4_honeypot_darknet",
>       "optional_fields" : [
>          [
> -          "classification.identifier",
> -          "tag",
> -          "validate_to_none"
> -       ],
> -       [
>             "malware.name",
>             "infection",
>             "validate_to_none"
> --- 8,13 ----
> 
> event_sinkhole:
> ***************
> *** 1,5 ****
> --- 1,6 ----
>    {
>       "constant_fields" : {
> +       "classification.identifier" : "sinkhole",
>          "classification.taxonomy" : "malicious-code",
>          "classification.type" : "infected-system"
>       },
> ***************
> *** 7,17 ****
>       "file_name" : "event4_sinkhole",
>       "optional_fields" : [
>          [
> -          "classification.identifier",
> -          "infection",
> -          "validate_to_none"
> -       ],
> -       [
>             "malware.name",
>             "family",
>             "validate_to_none"
> --- 8,13 ----
> 
> event_sinkhole_http:
> ***************
> *** 1,5 ****
> --- 1,6 ----
>    {
>       "constant_fields" : {
> +       "classification.identifier" : "sinkhole-http",
>          "classification.taxonomy" : "malicious-code",
>          "classification.type" : "infected-system",
>          "protocol.application" : "http"
> ***************
> *** 8,17 ****
>       "file_name" : "event4_sinkhole_http",
>       "optional_fields" : [
>          [
> -          "classification.identifier",
> -          "tag"
> -       ],
> -       [
>             "malware.name",
>             "family",
>             "validate_to_none"
> --- 9,14 ----
> 
> event_sinkhole_http_referer:
> ***************
> *** 1,8 ****
>    {
>       "constant_fields" : {
>          "classification.identifier" : "sinkhole-http-referer",
> !       "classification.taxonomy" : "other",
> !       "classification.type" : "other"
>       },
>       "feed_name" : "Sinkhole-Events-HTTP-Referer IPv4",
>       "file_name" : "event4_sinkhole_http_referer",
> --- 1,8 ----
>    {
>       "constant_fields" : {
>          "classification.identifier" : "sinkhole-http-referer",
> !       "classification.taxonomy" : "malicious-code",
> !       "classification.type" : "infected-system"
>       },
>       "feed_name" : "Sinkhole-Events-HTTP-Referer IPv4",
>       "file_name" : "event4_sinkhole_http_referer",
> 
> Best regards
> 
> // Kamil Mańkowski <mankowski at cert.at> - T: +43 676 898 298 7204
> // CERT Austria - https://www.cert.at/
> // CERT.at GmbH, FB-Nr. 561772k, HG Wien
> 
> On 1/24/24 13:02, Kamil Mankowski wrote:
>> Thanks, I'm forwarding it to the ShadowServer for the corrections
>>
>> Best regards
>>
>> // Kamil Mańkowski <mankowski at cert.at> - T: +43 676 898 298 7204
>> // CERT Austria - https://www.cert.at/
>> // CERT.at GmbH, FB-Nr. 561772k, HG Wien
>>
>> On 1/24/24 12:15, Thomas Hungenberg wrote:
>>> Hi Kamil,
>>>
>>> I had a quick look at the mapping. Unfortunately, it is not correct.
>>>
>>> The following changes should be applied to the mapping for ALL sinkhole related feeds:
>>>
>>> ==================================================
>>>         "constant_fields" : {
>>>            "classification.taxonomy" : "malicious-code",
>>>            "classification.type" : "infected-system"
>>> +         "classification.identifier" : "example: event4-sinkhole", # set CI to feed name (with dashes) like with other feeds
>>>         },
>>>
>>>         "optional_fields" : [
>>>            [
>>> -            "classification.identifier",
>>> +            "malware.name",
>>>               "infection",
>>>               "validate_to_none"
>>>            ],
>>>
>>>            [
>>> -            "malware.name",
>>> +            "extra.",
>>>               "family",
>>>               "validate_to_none"
>>>            ],
>>>
>>> -         [
>>> -            "extra.",
>>> -            "infection",
>>> -            "validate_to_none"
>>> -         ],
>>> ==================================================
>>>
>>>
>>> I also noticed that classification.taxonomy and classification.type are set to "other"
>>> for some sinkhole feeds like this:
>>>
>>>     "event_sinkhole_http_referer" : {
>>>        "constant_fields" : {
>>>           "classification.identifier" : "event-sinkhole-http-referer",
>>>           "classification.taxonomy" : "other",
>>>           "classification.type" : "other"
>>>
>>>
>>> This should be changed to:
>>>
>>>           "classification.taxonomy" : "malicious-code",
>>>           "classification.type" : "infected-system",
>>>
>>>
>>> Kind regards
>>> Thomas
>>>
>>>
>>> On 24.01.24 11:23, Kamil Mankowski via IntelMQ-dev wrote:
>>>> Hi,
>>>>
>>>> thanks for the patch, could you please have a look if it is correct in the incoming ShadowServer parser mapping? 
>>>> https://github.com/The-Shadowserver-Foundation/report_schema/blob/main/intelmq.json
>>>>
>>>> I'm pretty sure I was working with them to clean up such discrepancies, but we may have missed something. I don't want the next release to revert 
>>>> your changes unintentionally.
>>>>
>>>> Best regards
>>>>
>>>> // Kamil Mańkowski <mankowski at cert.at> - T: +43 676 898 298 7204
>>>> // CERT Austria - https://www.cert.at/
>>>> // CERT.at GmbH, FB-Nr. 561772k, HG Wien
>>>>
>>>> On 1/24/24 10:28, Thomas Hungenberg via IntelMQ-dev wrote:
>>>>> Hi all,
>>>>>
>>>>> the parsers for malware events provided by different sources usually store
>>>>> the malware name in malware.name and classification.identifier is left blank
>>>>> (or set to the feed's name).
>>>>> When using the malware name mapping, a harmonized malware name is subsequently
>>>>> written to classification.identifier. So finally you have the original name
>>>>> in malware.name and the harmonized name in classification.identifier.
>>>>>
>>>>> Formerly (in the version we initially provided), the Shadowserver parser
>>>>> also stored the malware name in malware.name, see e.g.
>>>>> <https://github.com/certtools/intelmq/blob/c61ff2fd4232d6937f3815377b75f682a6fcf790/intelmq/bots/parsers/shadowserver/_config.py>
>>>>> line 387
>>>>>
>>>>> However, for some time now the Shadowserver parser has been writing the malware name
>>>>> ("infection") to classification.identifier and "family" to malware.name instead.
>>>>> This is bad for several reasons:
>>>>> - it is not consistent with parsers for other malware feeds
>>>>> - it breaks deduplicators matching on malware.name
>>>>> - the malware name mapping overwrites classification.identifier with the
>>>>>    value of "family" (which often is empty)
>>>>>
>>>>>
>>>>> Here is a patch (for the version included with IntelMQ 3.2.1) to fix this problem
>>>>> and make malware events parsed by the Shadowserver parser consistent with other
>>>>> parsers again:
>>>>>
>>>>> ===============================================
>>>>> diff --git a/_config.py.orig b/_config.py
>>>>> index bea3d0c..431bcb9 100644
>>>>> --- a/_config.py.orig
>>>>> +++ b/_config.py
>>>>> @@ -867,10 +867,9 @@ event_sinkhole = {
>>>>>           ('source.port', 'src_port', convert_int),
>>>>>       ],
>>>>>       'optional_fields': [
>>>>> -        ('classification.identifier', 'infection', validate_to_none),
>>>>> -        ('malware.name', 'family', validate_to_none),
>>>>> +        ('malware.name', 'infection', validate_to_none),
>>>>> +        ('extra.', 'family', validate_to_none),
>>>>>           ('extra.', 'tag', validate_to_none),
>>>>> -        ('extra.', 'infection', validate_to_none),
>>>>>           ('protocol.transport', 'protocol'),
>>>>>           ('source.asn', 'src_asn', invalidate_zero),
>>>>>           ('source.geolocation.cc', 'src_geo'),
>>>>> @@ -899,6 +898,7 @@ event_sinkhole = {
>>>>>       'constant_fields': {
>>>>>           'classification.taxonomy': 'malicious-code',
>>>>>           'classification.type': 'infected-system',
>>>>> +        'classification.identifier': 'sinkhole-events',
>>>>>       },
>>>>>   }
>>>>>
>>>>> @@ -944,10 +944,9 @@ event_sinkhole_http = {
>>>>>           ('source.port', 'src_port', convert_int),
>>>>>       ],
>>>>>       'optional_fields': [
>>>>> -        ('classification.identifier', 'tag'),
>>>>> -        ('malware.name', 'family', validate_to_none),
>>>>> +        ('malware.name', 'infection', validate_to_none),
>>>>> +        ('extra.', 'family', validate_to_none),
>>>>>           ('extra.', 'tag', validate_to_none),
>>>>> -        ('extra.', 'infection', validate_to_none),
>>>>>           ('protocol.transport', 'protocol'),
>>>>>           ('source.asn', 'src_asn', invalidate_zero),
>>>>>           ('source.geolocation.cc', 'src_geo'),
>>>>> @@ -982,6 +981,7 @@ event_sinkhole_http = {
>>>>>       'constant_fields': {
>>>>>           'classification.taxonomy': 'malicious-code',
>>>>>           'classification.type': 'infected-system',
>>>>> +        'classification.identifier': 'sinkhole-http-events',
>>>>>           'protocol.application': 'http',
>>>>>       },
>>>>>   }
>>>>> @@ -992,9 +992,9 @@ event_sinkhole_http_referer = {
>>>>>           ('time.source', 'timestamp', add_UTC_to_timestamp),
>>>>>       ],
>>>>>       'optional_fields': [
>>>>> -        ('malware.name', 'family', validate_to_none),
>>>>> +        ('malware.name', 'infection', validate_to_none),
>>>>> +        ('extra.', 'family', validate_to_none),
>>>>>           ('extra.', 'tag', validate_to_none),
>>>>> -        ('extra.', 'infection', validate_to_none),
>>>>>           ('protocol.transport', 'protocol'),
>>>>>           ('extra.', 'http_referer_ip', validate_ip),
>>>>>           ('extra.', 'http_referer_port', convert_int),
>>>>> ===============================================
>>>>>
>>>>>
>>>>> Kind regards
>>>>> Thomas
>>>>>
>>>>> _______________________________________________
>>>>> IntelMQ-dev mailing list
>>>>> https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev
>>>>> https://intelmq.readthedocs.io/
>>>>
>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: _config.py
Type: text/x-python
Size: 193804 bytes
Desc: not available
URL: <http://lists.cert.at/pipermail/intelmq-dev/attachments/20240126/22130463/attachment-0001.py>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: _config.py.diff
Type: text/x-patch
Size: 9003 bytes
Desc: not available
URL: <http://lists.cert.at/pipermail/intelmq-dev/attachments/20240126/22130463/attachment-0001.bin>

