[IntelMQ-dev] Shadowserver parser: Bad mapping for malware events

Kamil Mankowski mankowski at cert.at
Wed Jan 24 11:23:08 CET 2024


Hi,

thanks for the patch. Could you please check whether the incoming 
Shadowserver parser mapping is correct as well? 
https://github.com/The-Shadowserver-Foundation/report_schema/blob/main/intelmq.json

I'm pretty sure I was working with them to clean up such discrepancies, 
but we may have missed something. I don't want the next release to 
revert your changes unintentionally.
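
One way to guard against such a silent revert is a small check that runs
against whatever mapping ends up in a release. The following is only a
minimal sketch: it assumes the mapping keeps the 'optional_fields' list of
tuples shown in the quoted patch, and find_targets and
check_malware_mapping are hypothetical helper names, not part of IntelMQ.
The layout of intelmq.json is not assumed here, so that file would first
have to be loaded into this shape.

```python
def find_targets(mapping, source_field):
    """Return the IntelMQ keys that a Shadowserver column is mapped to."""
    return [entry[0] for entry in mapping["optional_fields"] if entry[1] == source_field]


def check_malware_mapping(mapping):
    """Return a list of problems; an empty list means the mapping matches the patch."""
    problems = []
    if find_targets(mapping, "infection") != ["malware.name"]:
        problems.append("'infection' should map only to malware.name")
    if "malware.name" in find_targets(mapping, "family"):
        problems.append("'family' must not map to malware.name")
    return problems


# Trimmed, hand-written excerpts in the style of _config.py; converters omitted.
patched = {
    "optional_fields": [
        ("malware.name", "infection"),
        ("extra.", "family"),
        ("extra.", "tag"),
    ],
}
reverted = {
    "optional_fields": [
        ("classification.identifier", "infection"),
        ("malware.name", "family"),
        ("extra.", "infection"),
    ],
}

for name, mapping in (("patched", patched), ("reverted", reverted)):
    print(name, check_malware_mapping(mapping))
```

The check only looks at the first two tuple elements, so it does not care
which converter function is attached to a field.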

Best regards

// Kamil Mańkowski <mankowski at cert.at> - T: +43 676 898 298 7204
// CERT Austria - https://www.cert.at/
// CERT.at GmbH, FB-Nr. 561772k, HG Wien

On 1/24/24 10:28, Thomas Hungenberg via IntelMQ-dev wrote:
> Hi all,
> 
> the parsers for malware events provided by different sources usually store
> the malware name in malware.name and leave classification.identifier blank
> (or set it to the feed's name).
> When using the malware name mapping, a harmonized malware name is
> subsequently written to classification.identifier. So in the end you have
> the original name in malware.name and the harmonized name in
> classification.identifier.
> 
> Formerly (in the version we initially provided), the Shadowserver parser
> also stored the malware name in malware.name, see e.g.
> <https://github.com/certtools/intelmq/blob/c61ff2fd4232d6937f3815377b75f682a6fcf790/intelmq/bots/parsers/shadowserver/_config.py>
> line 387
> 
> However, for some time now the Shadowserver parser has been writing the
> malware name ("infection") to classification.identifier and "family" to
> malware.name instead.
> This is bad for several reasons:
> - it is not consistent with parsers for other malware feeds
> - it breaks deduplicators matching on malware.name
> - the malware name mapping overwrites classification.identifier with the
>    value of "family" (which often is empty)
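
To make the first two points concrete, here is a minimal sketch in plain
Python that does not use IntelMQ itself. The sample rows, NAME_MAP,
none_if_empty, parse_current, parse_patched and harmonize are all
hypothetical stand-ins: none_if_empty only approximates validate_to_none,
and harmonize is a simplified model of a name mapping that derives
classification.identifier from malware.name.

```python
def none_if_empty(value):
    """Stand-in for the parser's validate_to_none: empty values become None."""
    return value or None


def parse_current(row):
    """Current mapping: infection -> classification.identifier, family -> malware.name."""
    return {
        "classification.identifier": none_if_empty(row.get("infection")),
        "malware.name": none_if_empty(row.get("family")),
    }


def parse_patched(row):
    """Patched mapping: infection -> malware.name, constant feed identifier."""
    return {
        "classification.identifier": "sinkhole-events",
        "malware.name": none_if_empty(row.get("infection")),
        "extra.family": none_if_empty(row.get("family")),
    }


# Hypothetical harmonization table, not the real mapping file.
NAME_MAP = {"avalanche-andromeda": "andromeda"}


def harmonize(event):
    """Derive classification.identifier from malware.name, if one is set."""
    result = dict(event)
    name = result.get("malware.name")
    if name is not None:
        result["classification.identifier"] = NAME_MAP.get(name, name)
    return result


# Two made-up sinkhole rows with different infections and an empty family.
rows = [
    {"infection": "avalanche-andromeda", "family": ""},
    {"infection": "conficker", "family": ""},
]

current = [harmonize(parse_current(r)) for r in rows]
patched = [harmonize(parse_patched(r)) for r in rows]

# A deduplicator keyed on malware.name cannot tell the current events apart.
print({e["malware.name"] for e in current})
print(sorted(e["malware.name"] for e in patched))
print([e["classification.identifier"] for e in patched])
```

With the current mapping both events end up with an empty malware.name,
while the patched mapping keeps the original name there and leaves
classification.identifier free for the harmonized one.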
> 
> 
> Here is a patch (for the version included with IntelMQ 3.2.1) to fix this
> problem and make malware events parsed by the Shadowserver parser consistent
> with other parsers again:
> 
> ===============================================
> diff --git a/_config.py.orig b/_config.py
> index bea3d0c..431bcb9 100644
> --- a/_config.py.orig
> +++ b/_config.py
> @@ -867,10 +867,9 @@ event_sinkhole = {
>           ('source.port', 'src_port', convert_int),
>       ],
>       'optional_fields': [
> -        ('classification.identifier', 'infection', validate_to_none),
> -        ('malware.name', 'family', validate_to_none),
> +        ('malware.name', 'infection', validate_to_none),
> +        ('extra.', 'family', validate_to_none),
>           ('extra.', 'tag', validate_to_none),
> -        ('extra.', 'infection', validate_to_none),
>           ('protocol.transport', 'protocol'),
>           ('source.asn', 'src_asn', invalidate_zero),
>           ('source.geolocation.cc', 'src_geo'),
> @@ -899,6 +898,7 @@ event_sinkhole = {
>       'constant_fields': {
>           'classification.taxonomy': 'malicious-code',
>           'classification.type': 'infected-system',
> +        'classification.identifier': 'sinkhole-events',
>       },
>   }
> 
> @@ -944,10 +944,9 @@ event_sinkhole_http = {
>           ('source.port', 'src_port', convert_int),
>       ],
>       'optional_fields': [
> -        ('classification.identifier', 'tag'),
> -        ('malware.name', 'family', validate_to_none),
> +        ('malware.name', 'infection', validate_to_none),
> +        ('extra.', 'family', validate_to_none),
>           ('extra.', 'tag', validate_to_none),
> -        ('extra.', 'infection', validate_to_none),
>           ('protocol.transport', 'protocol'),
>           ('source.asn', 'src_asn', invalidate_zero),
>           ('source.geolocation.cc', 'src_geo'),
> @@ -982,6 +981,7 @@ event_sinkhole_http = {
>       'constant_fields': {
>           'classification.taxonomy': 'malicious-code',
>           'classification.type': 'infected-system',
> +        'classification.identifier': 'sinkhole-http-events',
>           'protocol.application': 'http',
>       },
>   }
> @@ -992,9 +992,9 @@ event_sinkhole_http_referer = {
>           ('time.source', 'timestamp', add_UTC_to_timestamp),
>       ],
>       'optional_fields': [
> -        ('malware.name', 'family', validate_to_none),
> +        ('malware.name', 'infection', validate_to_none),
> +        ('extra.', 'family', validate_to_none),
>           ('extra.', 'tag', validate_to_none),
> -        ('extra.', 'infection', validate_to_none),
>           ('protocol.transport', 'protocol'),
>           ('extra.', 'http_referer_ip', validate_ip),
>           ('extra.', 'http_referer_port', convert_int),
> ===============================================
> 
> 
> Kind regards
> Thomas
> 