[IntelMQ-dev] Shadowserver parser: Bad mapping for malware events
Thomas Hungenberg
th at cert-bund.de
Wed Jan 24 12:15:30 CET 2024
Hi Kamil,
I had a quick look at the mapping. Unfortunately, it is not correct.
The following changes should be applied to the mapping for ALL sinkhole-related feeds:
==================================================
"constant_fields" : {
"classification.taxonomy" : "malicious-code",
"classification.type" : "infected-system"
+ "classification.identifier" : "event4-sinkhole", # example; set classification.identifier to the feed name (with dashes), as with other feeds
},
"optional_fields" : [
[
- "classification.identifier",
+ "malware.name",
"infection",
"validate_to_none"
],
[
- "malware.name",
+ "extra.",
"family",
"validate_to_none"
],
- [
- "extra.",
- "infection",
- "validate_to_none"
- ],
==================================================
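For illustration, the change above could be applied to a feed definition roughly as follows. This is only a sketch under assumptions: the helper name fix_sinkhole_feed is hypothetical, each feed is assumed to be a dict with "constant_fields" and with "optional_fields" entries of the form [target, column, converter] as in the excerpt, and it has not been run against the real intelmq.json.

```python
import copy


def fix_sinkhole_feed(feed_name, feed):
    """Return a corrected copy of one sinkhole feed definition.

    Hypothetical helper; the feed layout is assumed from the excerpt above.
    """
    fixed = copy.deepcopy(feed)

    # classification.identifier becomes a constant: the feed name with dashes.
    constants = fixed.setdefault("constant_fields", {})
    constants["classification.identifier"] = feed_name.replace("_", "-")

    kept = []
    has_infection = False
    for entry in fixed.get("optional_fields", []):
        target, column = entry[0], entry[1]
        if column == "infection":
            # Drop every existing "infection" mapping; it is re-added once below.
            has_infection = True
            continue
        if target == "classification.identifier":
            # The identifier is now a constant, so no column may set it.
            continue
        if target == "malware.name" and column == "family":
            # "family" moves to extra., keeping any converter that was listed.
            entry = ["extra.", "family"] + list(entry[2:])
        kept.append(entry)

    if has_infection:
        # The malware name ("infection") goes to malware.name.
        kept.insert(0, ["malware.name", "infection", "validate_to_none"])

    fixed["optional_fields"] = kept
    return fixed
```

Calling fix_sinkhole_feed("event4_sinkhole", feed) leaves the input untouched and returns a copy in which "infection" is mapped only to malware.name and "family" only to extra.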
I also noticed that classification.taxonomy and classification.type are set to "other"
for some sinkhole feeds like this:
"event_sinkhole_http_referer" : {
"constant_fields" : {
"classification.identifier" : "event-sinkhole-http-referer",
"classification.taxonomy" : "other",
"classification.type" : "other"
This should be changed to:
"classification.taxonomy" : "malicious-code",
"classification.type" : "infected-system",
Kind regards
Thomas
On 24.01.24 11:23, Kamil Mankowski via IntelMQ-dev wrote:
> Hi,
>
> thanks for the patch. Could you please check whether it is also correct in the incoming Shadowserver parser mapping?
> https://github.com/The-Shadowserver-Foundation/report_schema/blob/main/intelmq.json
>
> I'm pretty sure I was working with them to clean up such discrepancies, but we may have missed something. I don't want the next release to revert your
> changes unintentionally.
>
> Best regards
>
> // Kamil Mańkowski <mankowski at cert.at> - T: +43 676 898 298 7204
> // CERT Austria - https://www.cert.at/
> // CERT.at GmbH, FB-Nr. 561772k, HG Wien
>
> On 1/24/24 10:28, Thomas Hungenberg via IntelMQ-dev wrote:
>> Hi all,
>>
>> the parsers for malware events provided by different sources usually store
>> the malware name in malware.name and classification.identifier is left blank
>> (or set to the feed's name).
>> When using the malware name mapping, a harmonized malware name is subsequently
>> written to classification.identifier. So finally you have the original name
>> in malware.name and the harmonized name in classification.identifier.
>>
>> Formerly (in the version we initially provided), the Shadowserver parser
>> also stored the malware name in malware.name, see e.g.
>> <https://github.com/certtools/intelmq/blob/c61ff2fd4232d6937f3815377b75f682a6fcf790/intelmq/bots/parsers/shadowserver/_config.py>
>> line 387
>>
>> However, for some time now the Shadowserver parser has been writing the malware name
>> ("infection") to classification.identifier and "family" to malware.name instead.
>> This is bad for several reasons:
>> - it is not consistent with parsers for other malware feeds
>> - it breaks deduplicators matching on malware.name
>> - the malware name mapping overwrites classification.identifier with the
>> value of "family" (which often is empty)
>>
>>
>> Here is a patch (for the version included with IntelMQ 3.2.1) to fix this problem
>> and make malware events parsed by the Shadowserver parser consistent with other
>> parsers again:
>>
>> ===============================================
>> diff --git a/_config.py.orig b/_config.py
>> index bea3d0c..431bcb9 100644
>> --- a/_config.py.orig
>> +++ b/_config.py
>> @@ -867,10 +867,9 @@ event_sinkhole = {
>> ('source.port', 'src_port', convert_int),
>> ],
>> 'optional_fields': [
>> - ('classification.identifier', 'infection', validate_to_none),
>> - ('malware.name', 'family', validate_to_none),
>> + ('malware.name', 'infection', validate_to_none),
>> + ('extra.', 'family', validate_to_none),
>> ('extra.', 'tag', validate_to_none),
>> - ('extra.', 'infection', validate_to_none),
>> ('protocol.transport', 'protocol'),
>> ('source.asn', 'src_asn', invalidate_zero),
>> ('source.geolocation.cc', 'src_geo'),
>> @@ -899,6 +898,7 @@ event_sinkhole = {
>> 'constant_fields': {
>> 'classification.taxonomy': 'malicious-code',
>> 'classification.type': 'infected-system',
>> + 'classification.identifier': 'sinkhole-events',
>> },
>> }
>>
>> @@ -944,10 +944,9 @@ event_sinkhole_http = {
>> ('source.port', 'src_port', convert_int),
>> ],
>> 'optional_fields': [
>> - ('classification.identifier', 'tag'),
>> - ('malware.name', 'family', validate_to_none),
>> + ('malware.name', 'infection', validate_to_none),
>> + ('extra.', 'family', validate_to_none),
>> ('extra.', 'tag', validate_to_none),
>> - ('extra.', 'infection', validate_to_none),
>> ('protocol.transport', 'protocol'),
>> ('source.asn', 'src_asn', invalidate_zero),
>> ('source.geolocation.cc', 'src_geo'),
>> @@ -982,6 +981,7 @@ event_sinkhole_http = {
>> 'constant_fields': {
>> 'classification.taxonomy': 'malicious-code',
>> 'classification.type': 'infected-system',
>> + 'classification.identifier': 'sinkhole-http-events',
>> 'protocol.application': 'http',
>> },
>> }
>> @@ -992,9 +992,9 @@ event_sinkhole_http_referer = {
>> ('time.source', 'timestamp', add_UTC_to_timestamp),
>> ],
>> 'optional_fields': [
>> - ('malware.name', 'family', validate_to_none),
>> + ('malware.name', 'infection', validate_to_none),
>> + ('extra.', 'family', validate_to_none),
>> ('extra.', 'tag', validate_to_none),
>> - ('extra.', 'infection', validate_to_none),
>> ('protocol.transport', 'protocol'),
>> ('extra.', 'http_referer_ip', validate_ip),
>> ('extra.', 'http_referer_port', convert_int),
>> ===============================================
>>
>>
>> Kind regards
>> Thomas
>>
>> _______________________________________________
>> IntelMQ-dev mailing list
>> https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev
>> https://intelmq.readthedocs.io/