Hi all,
Thanks for the comments. I've forwarded the thread to ShadowServer, and they have also just joined the list (represented by @elsif, who works on the IntelMQ integration), so we can discuss the feedback directly.
@Thomas - to answer the question about completed schema changes: I spoke with elsif about that a few weeks ago, and the schema changelog is available at https://github.com/The-Shadowserver-Foundation/report_schema/blob/main/compl...
Best regards
// Kamil Mańkowski mankowski@cert.at - T: +43 676 898 298 7204 // CERT Austria - https://www.cert.at/ // CERT.at GmbH, FB-Nr. 561772k, HG Wien
On 1/29/24 09:49, Thomas Hungenberg wrote:
Hi all,
On 26.01.24 15:30, Sebix wrote:
Originally, the intended use of classification.identifier and malware.name was:
- malware.name contained the original (and unprocessed) malware name.
It was as specific as possible and could include the malware variant, for example "b157-rL".
- The classification.* fields should be usable for aggregation,
de-duplication, statistics etc.
- For malware events, the parsers could write the malware family (e.g.
"zeus") or the malware name to the identifier.
- The family took precedence, but if not known, the more specific
malware.name could be used instead.
- It was always up to the user to replace the identifier with a more
generic malware family, e.g. using the public malware name mapping and malpedia.
At least until 2022, IntelMQ and all its parsers fit this concept. It may still be the case, given the recent significant changes.
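The precedence rule described above can be sketched in a few lines of Python. This is only an illustration of the concept, not code from IntelMQ: the function name choose_identifier and its arguments are hypothetical.

```python
def choose_identifier(malware_name, family=None):
    """Pick a value for classification.identifier (hypothetical helper).

    The malware family takes precedence; if it is unknown, fall back to
    the more specific, unprocessed malware name.
    """
    if family:
        return family
    if malware_name:
        return malware_name
    return None


# Family known: the identifier is the family, malware.name keeps the variant.
print(choose_identifier("b157-rL", family="zeus"))
# Family unknown: the identifier falls back to the specific malware name.
print(choose_identifier("b157-rL"))
```

A user could still replace the returned identifier afterwards with a more generic family name, as the last bullet describes.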
@Sebastian: Thanks for summarizing this well-proven concept!
The changes in the Shadowserver parser config must have happened sometime between January and August 2022, most likely with the adaptation to the changes in the Shadowserver feeds, such as the move from "botnet drone" to "sinkhole events"?
In January 2022, the original (unprocessed) malware name ("infection" or "type") was still written to malware.name and "family" to extra. classification.identifier was left blank and could be set, e.g., with a malware name mapping modify expert:
==============================
drone = {
    'optional_fields': [
        ('malware.name', 'infection'),
        ('extra.', 'family', validate_to_none),
    ],
    'constant_fields': {
        # classification.identifier will be set to (harmonized) malware name by modify expert
    },
==============================
This fits the concept mentioned above.
However, in August 2022, "infection" was no longer stored in malware.name but used as classification.identifier, and malware.name was set to "family":
==============================
event_sinkhole = {
    'optional_fields': [
        ('classification.identifier', 'infection', validate_to_none),
        ('malware.name', 'family', validate_to_none),
==============================
Unfortunately, this is the opposite of the well-proven concept.
With the changes I proposed last week (2024-01-26), we return to the former well-proven concept of storing "infection" (or "type") in malware.name and "family" in "extra.family", as was done until 2022. This makes the Shadowserver parser consistent again with other parsers for malware events (like ctip or anubis).
Additionally, we store "infection" (or "type") in classification.identifier as well to make sure every event processed by the parser has a classification.identifier. However, the classification.identifier can later be replaced e.g. with a harmonized malware name using the malware name mapping.
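To make the proposed field mapping concrete, here is a minimal standalone sketch. It does not use the IntelMQ parser API: the function names map_sinkhole_row and harmonize_identifier, the rule list, and the sample row are all hypothetical and only illustrate the behaviour described above.

```python
import re

# Hypothetical harmonization rules: (pattern on the identifier, harmonized name).
# A real setup would take these from the malware name mapping instead.
HYPOTHETICAL_RULES = [
    (re.compile(r"^zeus", re.IGNORECASE), "zeus"),
]


def map_sinkhole_row(row):
    """Map a feed row (dict) to event fields as proposed (illustrative only)."""
    event = {}
    infection = row.get("infection") or row.get("type")
    family = row.get("family")
    if infection:
        # Unprocessed name goes to malware.name ...
        event["malware.name"] = infection
        # ... and also to classification.identifier, so it is never empty.
        event["classification.identifier"] = infection
    if family:
        event["extra.family"] = family
    return event


def harmonize_identifier(event, rules=HYPOTHETICAL_RULES):
    """Later step: replace the identifier with a harmonized malware name."""
    identifier = event.get("classification.identifier", "")
    for pattern, harmonized in rules:
        if pattern.search(identifier):
            event["classification.identifier"] = harmonized
            break
    return event


# Made-up sample row, for illustration only.
event = map_sinkhole_row({"infection": "zeus-variant-x", "family": "zeus"})
event = harmonize_identifier(event)
print(event)
```

In this sketch malware.name keeps the original value, while only classification.identifier is overwritten by the later harmonization step.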
Kind regards Thomas