[IntelMQ-dev] Shadowserver parser: Bad mapping for malware events

Kamil Mankowski mankowski at cert.at
Wed Jan 24 13:02:20 CET 2024


Thanks, I'm forwarding this to Shadowserver so they can correct the mapping.

Best regards

// Kamil Mańkowski <mankowski at cert.at> - T: +43 676 898 298 7204
// CERT Austria - https://www.cert.at/
// CERT.at GmbH, FB-Nr. 561772k, HG Wien

On 1/24/24 12:15, Thomas Hungenberg wrote:
> Hi Kamil,
> 
> I had a quick look at the mapping. Unfortunately, it is not correct.
> 
> The following changes should be applied to the mapping for ALL
> sinkhole-related feeds:
> 
> ==================================================
>         "constant_fields" : {
>            "classification.taxonomy" : "malicious-code",
>            "classification.type" : "infected-system"
> +         "classification.identifier" : "example: event4-sinkhole",    # set CI to feed name (with dashes) like with other feeds
>         },
> 
>         "optional_fields" : [
>            [
> -            "classification.identifier",
> +            "malware.name",
>               "infection",
>               "validate_to_none"
>            ],
> 
>            [
> -            "malware.name",
> +            "extra.",
>               "family",
>               "validate_to_none"
>            ],
> 
> -         [
> -            "extra.",
> -            "infection",
> -            "validate_to_none"
> -         ],
> ==================================================
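
A minimal sketch of how the change above could be applied to one feed entry after loading intelmq.json into a Python dict. The function name fix_sinkhole_feed is hypothetical, and deriving the identifier by replacing underscores with dashes is an assumption based on the "event-sinkhole-http-referer" example in this thread. Dropping any optional classification.identifier entry follows the Python patch quoted further down, not anything confirmed by Shadowserver.

```python
import copy


def fix_sinkhole_feed(feed_name: str, feed: dict) -> dict:
    """Return a corrected copy of one sinkhole feed mapping (illustrative only)."""
    fixed = copy.deepcopy(feed)

    # Assumption: the identifier is the feed name with dashes instead of underscores.
    constants = fixed.setdefault("constant_fields", {})
    constants["classification.identifier"] = feed_name.replace("_", "-")

    # The "infection" column goes to malware.name, exactly once.
    optional = [["malware.name", "infection", "validate_to_none"]]
    for entry in fixed.get("optional_fields", []):
        key, column = entry[0], entry[1]
        if column == "infection":
            continue  # replaced by the single malware.name entry above
        if key == "classification.identifier":
            continue  # the identifier is now a constant field
        if column == "family":
            entry = ["extra.", "family", *entry[2:]]
        optional.append(entry)

    fixed["optional_fields"] = optional
    return fixed
```

The input dict is left untouched, so the result can be compared against the original before anything is written back.
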
> 
> 
> I also noticed that classification.taxonomy and classification.type are
> set to "other" for some sinkhole feeds like this:
> 
>     "event_sinkhole_http_referer" : {
>        "constant_fields" : {
>           "classification.identifier" : "event-sinkhole-http-referer",
>           "classification.taxonomy" : "other",
>           "classification.type" : "other"
> 
> 
> This should be changed to:
> 
>           "classification.taxonomy" : "malicious-code",
>           "classification.type" : "infected-system",
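
A rough sketch of how the affected feeds could be found and corrected in a loaded schema dict. The function name reclassify_sinkhole_feeds is hypothetical, and selecting feeds by the substring "sinkhole" in the feed name is an assumption; the real schema may need a different selection rule.

```python
def reclassify_sinkhole_feeds(schema: dict) -> list:
    """Set taxonomy/type on sinkhole feeds still classified as "other".

    Modifies schema in place and returns the names of the changed feeds.
    """
    changed = []
    for name, feed in schema.items():
        # Assumption: sinkhole feeds can be recognised by their name.
        if "sinkhole" not in name:
            continue
        constants = feed.get("constant_fields", {})
        if constants.get("classification.type") == "other":
            constants["classification.taxonomy"] = "malicious-code"
            constants["classification.type"] = "infected-system"
            changed.append(name)
    return changed
```
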
> 
> 
> Kind regards
> Thomas
> 
> 
> On 24.01.24 11:23, Kamil Mankowski via IntelMQ-dev wrote:
>> Hi,
>>
>> thanks for the patch. Could you please check whether the incoming
>> Shadowserver parser mapping is correct?
>> https://github.com/The-Shadowserver-Foundation/report_schema/blob/main/intelmq.json
>>
>> I'm pretty sure I was working with them to clean up such 
>> discrepancies, but we may have missed something. I don't want the next 
>> release to revert your changes unintentionally.
>>
>> Best regards
>>
>> // Kamil Mańkowski <mankowski at cert.at> - T: +43 676 898 298 7204
>> // CERT Austria - https://www.cert.at/
>> // CERT.at GmbH, FB-Nr. 561772k, HG Wien
>>
>> On 1/24/24 10:28, Thomas Hungenberg via IntelMQ-dev wrote:
>>> Hi all,
>>>
>>> the parsers for malware events provided by different sources usually store
>>> the malware name in malware.name and classification.identifier is left blank
>>> (or set to the feed's name).
>>> When using the malware name mapping, a harmonized malware name is subsequently
>>> written to classification.identifier. So finally you have the original name
>>> in malware.name and the harmonized name in classification.identifier.
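
The flow described above can be sketched roughly as follows. This is not the actual IntelMQ malware name mapping; the rules and the function name harmonize are made up for illustration, and only the field names malware.name and classification.identifier come from this thread.

```python
import re

# Hypothetical harmonization rules: (pattern on malware.name, harmonized name).
RULES = [
    (re.compile(r"^(avalanche-)?andromeda$", re.IGNORECASE), "andromeda"),
    (re.compile(r"^conficker.*$", re.IGNORECASE), "conficker"),
]


def harmonize(event: dict) -> dict:
    """Return a copy of event with classification.identifier derived from malware.name."""
    result = dict(event)
    name = result.get("malware.name")
    if not name:
        # Nothing to map: the identifier stays as the parser set it.
        return result
    for pattern, harmonized in RULES:
        if pattern.match(name):
            result["classification.identifier"] = harmonized
            break
    return result
```

The original name stays in malware.name, and only classification.identifier is overwritten. That is why the parser has to put the malware name, not the family, into malware.name.
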
>>>
>>> Formerly (in the version we initially provided), the Shadowserver parser
>>> also stored the malware name in malware.name, see e.g.
>>> <https://github.com/certtools/intelmq/blob/c61ff2fd4232d6937f3815377b75f682a6fcf790/intelmq/bots/parsers/shadowserver/_config.py>
>>> line 387
>>>
>>> However, for some time now the Shadowserver parser has been writing the
>>> malware name ("infection") to classification.identifier and "family" to
>>> malware.name instead.
>>> This is bad for several reasons:
>>> - it is not consistent with parsers for other malware feeds
>>> - it breaks deduplicators matching on malware.name
>>> - the malware name mapping overwrites classification.identifier with the
>>>    value of "family" (which often is empty)
>>>
>>>
>>> Here is a patch (for the version included with IntelMQ 3.2.1) to fix this
>>> problem and make malware events parsed by the Shadowserver parser consistent
>>> with other parsers again:
>>>
>>> ===============================================
>>> diff --git a/_config.py.orig b/_config.py
>>> index bea3d0c..431bcb9 100644
>>> --- a/_config.py.orig
>>> +++ b/_config.py
>>> @@ -867,10 +867,9 @@ event_sinkhole = {
>>>           ('source.port', 'src_port', convert_int),
>>>       ],
>>>       'optional_fields': [
>>> -        ('classification.identifier', 'infection', validate_to_none),
>>> -        ('malware.name', 'family', validate_to_none),
>>> +        ('malware.name', 'infection', validate_to_none),
>>> +        ('extra.', 'family', validate_to_none),
>>>           ('extra.', 'tag', validate_to_none),
>>> -        ('extra.', 'infection', validate_to_none),
>>>           ('protocol.transport', 'protocol'),
>>>           ('source.asn', 'src_asn', invalidate_zero),
>>>           ('source.geolocation.cc', 'src_geo'),
>>> @@ -899,6 +898,7 @@ event_sinkhole = {
>>>       'constant_fields': {
>>>           'classification.taxonomy': 'malicious-code',
>>>           'classification.type': 'infected-system',
>>> +        'classification.identifier': 'sinkhole-events',
>>>       },
>>>   }
>>>
>>> @@ -944,10 +944,9 @@ event_sinkhole_http = {
>>>           ('source.port', 'src_port', convert_int),
>>>       ],
>>>       'optional_fields': [
>>> -        ('classification.identifier', 'tag'),
>>> -        ('malware.name', 'family', validate_to_none),
>>> +        ('malware.name', 'infection', validate_to_none),
>>> +        ('extra.', 'family', validate_to_none),
>>>           ('extra.', 'tag', validate_to_none),
>>> -        ('extra.', 'infection', validate_to_none),
>>>           ('protocol.transport', 'protocol'),
>>>           ('source.asn', 'src_asn', invalidate_zero),
>>>           ('source.geolocation.cc', 'src_geo'),
>>> @@ -982,6 +981,7 @@ event_sinkhole_http = {
>>>       'constant_fields': {
>>>           'classification.taxonomy': 'malicious-code',
>>>           'classification.type': 'infected-system',
>>> +        'classification.identifier': 'sinkhole-http-events',
>>>           'protocol.application': 'http',
>>>       },
>>>   }
>>> @@ -992,9 +992,9 @@ event_sinkhole_http_referer = {
>>>           ('time.source', 'timestamp', add_UTC_to_timestamp),
>>>       ],
>>>       'optional_fields': [
>>> -        ('malware.name', 'family', validate_to_none),
>>> +        ('malware.name', 'infection', validate_to_none),
>>> +        ('extra.', 'family', validate_to_none),
>>>           ('extra.', 'tag', validate_to_none),
>>> -        ('extra.', 'infection', validate_to_none),
>>>           ('protocol.transport', 'protocol'),
>>>           ('extra.', 'http_referer_ip', validate_ip),
>>>           ('extra.', 'http_referer_port', convert_int),
>>> ===============================================
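
To illustrate the effect of the patch on a single report row, here is a simplified sketch. The apply_mapping helper is hypothetical and only models two things: empty values are skipped, and keys ending in a dot get the column name appended. It is not the parser's real code, and the sample row is invented.

```python
# Old and patched optional_fields of event_sinkhole, reduced to (key, column).
OLD_FIELDS = [
    ("classification.identifier", "infection"),
    ("malware.name", "family"),
    ("extra.", "tag"),
    ("extra.", "infection"),
]
NEW_FIELDS = [
    ("malware.name", "infection"),
    ("extra.", "family"),
    ("extra.", "tag"),
]
NEW_CONSTANTS = {"classification.identifier": "sinkhole-events"}


def apply_mapping(fields, row, constants=None):
    """Build an event dict from a report row (simplified model of the parser)."""
    event = dict(constants or {})
    for key, column in fields:
        value = row.get(column)
        if not value:
            continue  # empty values are dropped
        target = key + column if key.endswith(".") else key
        event[target] = value
    return event


# Invented sample row with an empty "family" column.
row = {"infection": "andromeda", "family": "", "tag": "sinkhole"}
old_event = apply_mapping(OLD_FIELDS, row)
new_event = apply_mapping(NEW_FIELDS, row, NEW_CONSTANTS)
```

With the old mapping the sample row produces no malware.name at all, while the patched mapping stores "andromeda" there and keeps the feed name in classification.identifier.
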
>>>
>>>
>>> Kind regards
>>> Thomas
>>>
>>> _______________________________________________
>>> IntelMQ-dev mailing list
>>> https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev
>>> https://intelmq.readthedocs.io/
>>
> 