[IntelMQ-dev] Shadowserver parser: Bad mapping for malware events
Kamil Mankowski
mankowski at cert.at
Wed Jan 24 13:02:20 CET 2024
Thanks, I'm forwarding it to Shadowserver so they can correct the mapping.
Best regards
// Kamil Mańkowski <mankowski at cert.at> - T: +43 676 898 298 7204
// CERT Austria - https://www.cert.at/
// CERT.at GmbH, FB-Nr. 561772k, HG Wien
On 1/24/24 12:15, Thomas Hungenberg wrote:
> Hi Kamil,
>
> I had a quick look at the mapping. Unfortunately, it is not correct.
>
> The following changes should be applied to the mapping for ALL sinkhole
> related feeds:
>
> ==================================================
> "constant_fields" : {
> "classification.taxonomy" : "malicious-code",
> "classification.type" : "infected-system"
> + "classification.identifier" : "example: event4-sinkhole", # set CI to feed name (with dashes) like with other feeds
> },
>
> "optional_fields" : [
> [
> - "classification.identifier",
> + "malware.name",
> "infection",
> "validate_to_none"
> ],
>
> [
> - "malware.name",
> + "extra.",
> "family",
> "validate_to_none"
> ],
>
> - [
> - "extra.",
> - "infection",
> - "validate_to_none"
> - ],
> ==================================================
>
>
> I also noticed that classification.taxonomy and classification.type are
> set to "other" for some sinkhole feeds like this:
>
> "event_sinkhole_http_referer" : {
> "constant_fields" : {
> "classification.identifier" : "event-sinkhole-http-referer",
> "classification.taxonomy" : "other",
> "classification.type" : "other"
>
>
> This should be changed to:
>
> "classification.taxonomy" : "malicious-code",
> "classification.type" : "infected-system",
>
>
> Kind regards
> Thomas
>
>
> On 24.01.24 11:23, Kamil Mankowski via IntelMQ-dev wrote:
>> Hi,
>>
>> thanks for the patch, could you please have a look if it is correct in
>> the incoming ShadowServer parser mapping?
>> https://github.com/The-Shadowserver-Foundation/report_schema/blob/main/intelmq.json
>>
>> I'm pretty sure I was working with them to clean up such
>> discrepancies, but we may have missed something. I don't want the next
>> release to revert your changes unintentionally.
>>
>> Best regards
>>
>> // Kamil Mańkowski <mankowski at cert.at> - T: +43 676 898 298 7204
>> // CERT Austria - https://www.cert.at/
>> // CERT.at GmbH, FB-Nr. 561772k, HG Wien
>>
>> On 1/24/24 10:28, Thomas Hungenberg via IntelMQ-dev wrote:
>>> Hi all,
>>>
>>> the parsers for malware events provided by different sources usually store
>>> the malware name in malware.name and classification.identifier is left blank
>>> (or set to the feed's name).
>>> When using the malware name mapping, a harmonized malware name is subsequently
>>> written to classification.identifier. So finally you have the original name
>>> in malware.name and the harmonized name in classification.identifier.
>>>
>>> Formerly (in the version we initially provided), the Shadowserver parser
>>> also stored the malware name in malware.name, see e.g.
>>> <https://github.com/certtools/intelmq/blob/c61ff2fd4232d6937f3815377b75f682a6fcf790/intelmq/bots/parsers/shadowserver/_config.py>
>>> line 387
>>>
>>> However, for some time now the Shadowserver parser has been writing the malware
>>> name ("infection") to classification.identifier and "family" to malware.name instead.
>>> This is bad for several reasons:
>>> - it is not consistent with parsers for other malware feeds
>>> - it breaks deduplicators matching on malware.name
>>> - the malware name mapping overwrites classification.identifier with the
>>> value of "family" (which often is empty)
>>>
>>>
>>> Here is a patch (for the version included with IntelMQ 3.2.1) to fix this problem
>>> and make malware events parsed by the Shadowserver parser consistent with other
>>> parsers again:
>>>
>>> ===============================================
>>> diff --git a/_config.py.orig b/_config.py
>>> index bea3d0c..431bcb9 100644
>>> --- a/_config.py.orig
>>> +++ b/_config.py
>>> @@ -867,10 +867,9 @@ event_sinkhole = {
>>> ('source.port', 'src_port', convert_int),
>>> ],
>>> 'optional_fields': [
>>> - ('classification.identifier', 'infection', validate_to_none),
>>> - ('malware.name', 'family', validate_to_none),
>>> + ('malware.name', 'infection', validate_to_none),
>>> + ('extra.', 'family', validate_to_none),
>>> ('extra.', 'tag', validate_to_none),
>>> - ('extra.', 'infection', validate_to_none),
>>> ('protocol.transport', 'protocol'),
>>> ('source.asn', 'src_asn', invalidate_zero),
>>> ('source.geolocation.cc', 'src_geo'),
>>> @@ -899,6 +898,7 @@ event_sinkhole = {
>>> 'constant_fields': {
>>> 'classification.taxonomy': 'malicious-code',
>>> 'classification.type': 'infected-system',
>>> + 'classification.identifier': 'sinkhole-events',
>>> },
>>> }
>>>
>>> @@ -944,10 +944,9 @@ event_sinkhole_http = {
>>> ('source.port', 'src_port', convert_int),
>>> ],
>>> 'optional_fields': [
>>> - ('classification.identifier', 'tag'),
>>> - ('malware.name', 'family', validate_to_none),
>>> + ('malware.name', 'infection', validate_to_none),
>>> + ('extra.', 'family', validate_to_none),
>>> ('extra.', 'tag', validate_to_none),
>>> - ('extra.', 'infection', validate_to_none),
>>> ('protocol.transport', 'protocol'),
>>> ('source.asn', 'src_asn', invalidate_zero),
>>> ('source.geolocation.cc', 'src_geo'),
>>> @@ -982,6 +981,7 @@ event_sinkhole_http = {
>>> 'constant_fields': {
>>> 'classification.taxonomy': 'malicious-code',
>>> 'classification.type': 'infected-system',
>>> + 'classification.identifier': 'sinkhole-http-events',
>>> 'protocol.application': 'http',
>>> },
>>> }
>>> @@ -992,9 +992,9 @@ event_sinkhole_http_referer = {
>>> ('time.source', 'timestamp', add_UTC_to_timestamp),
>>> ],
>>> 'optional_fields': [
>>> - ('malware.name', 'family', validate_to_none),
>>> + ('malware.name', 'infection', validate_to_none),
>>> + ('extra.', 'family', validate_to_none),
>>> ('extra.', 'tag', validate_to_none),
>>> - ('extra.', 'infection', validate_to_none),
>>> ('protocol.transport', 'protocol'),
>>> ('extra.', 'http_referer_ip', validate_ip),
>>> ('extra.', 'http_referer_port', convert_int),
>>> ===============================================
>>>
>>>
>>> Kind regards
>>> Thomas
>>>
>>> _______________________________________________
>>> IntelMQ-dev mailing list
>>> https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev
>>> https://intelmq.readthedocs.io/
>
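For anyone following the thread, here is a rough sketch of what the proposed change does to a parsed event. It is not the parser's actual code: `apply_mapping` is a hypothetical helper, `validate_to_none` is a simplified stand-in for the converter of the same name, the sample row is made up, and the sketch assumes that a key ending in "." gets the column name appended.

```python
def validate_to_none(value):
    """Simplified stand-in (assumption): treat empty values as missing."""
    return value if value else None


# Mapping entries as (event key, report column, optional converter),
# copied from the event_sinkhole hunk of the patch.
OLD_OPTIONAL = [
    ("classification.identifier", "infection", validate_to_none),
    ("malware.name", "family", validate_to_none),
    ("extra.", "infection", validate_to_none),
]
NEW_OPTIONAL = [
    ("malware.name", "infection", validate_to_none),
    ("extra.", "family", validate_to_none),
]
OLD_CONSTANT = {
    "classification.taxonomy": "malicious-code",
    "classification.type": "infected-system",
}
NEW_CONSTANT = dict(OLD_CONSTANT, **{"classification.identifier": "sinkhole-events"})


def apply_mapping(row, optional_fields, constant_fields):
    """Hypothetical helper: build an event dict from one report row."""
    event = dict(constant_fields)
    for entry in optional_fields:
        key, column = entry[0], entry[1]
        convert = entry[2] if len(entry) > 2 else None
        value = row.get(column)
        if convert is not None:
            value = convert(value)
        if value is None:
            continue  # optional field is absent or empty
        if key.endswith("."):
            key += column  # assumption: "extra." becomes "extra.<column>"
        event[key] = value
    return event


# Made-up row: malware name present, family empty (as it often is).
row = {"infection": "hypothetical-bot", "family": ""}
old_event = apply_mapping(row, OLD_OPTIONAL, OLD_CONSTANT)
new_event = apply_mapping(row, NEW_OPTIONAL, NEW_CONSTANT)
```

With the old mapping the malware name ends up in classification.identifier and malware.name stays unset. With the patched mapping malware.name carries the malware name and classification.identifier carries the feed name, which is what the malware name mapping and deduplicators matching on malware.name expect.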