Dear Marius,

On 2/22/21 9:46 PM, Marius Urkis wrote:

Could you give some real-life examples of incorrect usage? I think quite a lot of cases can be covered by infected-system when the hash is taken from an infected device, or by malware-distribution otherwise.

In IntelMQ most usages of "malware" were actually "malware-distribution", e.g. websites spreading malware files. Some other usages were infected-system.
The full diff at https://github.com/certtools/intelmq/compare/650a450...2588a284e31582e6cbfa72aac483dd17198527ce should give a good, but very long, overview.

On the other hand, RSIT stands exclusively for incident taxonomy, and its classification attributes help only with incident classification. However, incident management is not the only service/activity of CSIRTs, so if malicious code analysis is not related to an incident, the RSIT taxonomy would not help here, and probably such a case should not be handled as an incident?

Yes. The question is how much we want to support such use-cases in IntelMQ. AFAIK the primary use-case of IntelMQ is incident handling, not processing malware files/hashes. The RSIT does not give us the means to classify malware, independent of which tool is used. But this does not mean that processing malware files/hashes is not possible or allowed with IntelMQ. IMO it's totally legit to process that data with IntelMQ, but it's not the tool's primary intent. A quick look over IntelMQ's bots and feeds shows that only the two cases I mentioned in my first mail are not related to incidents (the GitHub feed and FireEye).

best regards
Sebastian

Best regards

On 22/02/2021 12:24, Sebastian Wagner wrote:
Dear IntelMQ community,

sorry for cross-posting, but I think this topic should be discussed in a wider group.

IntelMQ always followed the Reference Security Incident Taxonomy (short: RSIT)[0] and its predecessor for its 'classification.taxonomy/type' fields. The Classification column in the RSIT corresponds to our "classification.taxonomy" field, and the RSIT's second column (currently called Incident examples) corresponds to our "classification.type" field. "classification.identifier" is an optional third level free-text field to give more specific context.[1]
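As a rough illustration of this three-level mapping, here is a minimal sketch that uses a plain Python dict in place of a real IntelMQ event. The helper name `rsit_levels` and the identifier value are hypothetical, and the exact spelling of the taxonomy/type values is an assumption:

```python
# Illustrative only: a plain dict standing in for an IntelMQ event.
# The field names come from the text above; the values are assumed examples.
event = {
    "classification.taxonomy": "malicious code",    # RSIT column "Classification"
    "classification.type": "malware-distribution",  # RSIT column "Incident examples"
    "classification.identifier": "example-family",  # optional free-text third level
}


def rsit_levels(event):
    """Return (taxonomy, type, identifier); identifier is None if unset."""
    return (
        event["classification.taxonomy"],
        event["classification.type"],
        event.get("classification.identifier"),
    )


print(rsit_levels(event))
```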

Due to historical reasons and changes on both sides (IntelMQ as well as the RSIT), IntelMQ's classification scheme has deviated a bit from the RSIT over time. I'm working on aligning them again for 3.0, which is straightforward in most cases. But for one case, I need your input.

The predecessor of the RSIT (the eCSIRT.net taxonomy)[2] used the malicious code taxonomy differently: to classify malware itself into categories like virus, worm, trojan, etc. The RSIT never did that, as classifying malware is never unambiguous and there are plenty of existing classification schemes out there which do this already. Also, the focus of the RSIT is different, as it classifies incidents/events, not malware samples.

For this reason, IntelMQ had the classification.type "malware" (in versions before 3.0.0). Most of its usages were wrong and should have been infected-device, malware-distribution or something else. There is only one usage in IntelMQ which cannot be changed: the one that is really about malware itself (or: the hashes of samples), as used in the GitHub Feed parser[3] and the FireEye parser[4]. But the issue is more generic, as we need to decide how we want to deal with such malware IoCs anyway.

A malware (hash) does not fit into the RSIT. It's neither an Infected System, a C2 Server, Malware Distribution nor Malware Configuration. It's just a malware (hash). I see four options:

1) Deviate from the RSIT and just use 'classification.taxonomy' = 'Malicious Code' and 'classification.type' = 'malware'
2) Deviate slightly less from the RSIT and use 'classification.taxonomy' = 'other' and 'classification.type' = 'malware'
3) Adhere strictly to the RSIT and use 'classification.taxonomy' = 'other' and 'classification.type' = 'other' and "classification.identifier" = 'malware'
4) IntelMQ does not support this use case

In cases 1) and 2) "classification.identifier" could be used to specify what the event is about, e.g. "hash", or the malware family.
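To make the difference between the options concrete, here is a minimal sketch of the fields each one would produce for a malware-hash event. It uses plain dicts, not the IntelMQ API; the names `OPTIONS` and `classify_malware_hash` are hypothetical, and the values are copied from the list above:

```python
# Illustrative only: the classification fields each option would set.
OPTIONS = {
    1: {"classification.taxonomy": "Malicious Code",
        "classification.type": "malware"},
    2: {"classification.taxonomy": "other",
        "classification.type": "malware"},
    3: {"classification.taxonomy": "other",
        "classification.type": "other",
        "classification.identifier": "malware"},
    4: None,  # use case not supported, so no event is produced
}


def classify_malware_hash(option, identifier=None):
    """Return the classification fields for an option, or None for option 4.

    Only options 1 and 2 leave classification.identifier free for extra
    context (e.g. "hash" or a malware family); option 3 already uses it.
    """
    fields = OPTIONS[option]
    if fields is None:
        return None
    result = dict(fields)  # copy, so the table above is never mutated
    if identifier is not None and option in (1, 2):
        result["classification.identifier"] = identifier
    return result


print(classify_malware_hash(2, identifier="hash"))
```

With option 3 the identifier is already occupied by 'malware', so there is no free field left for the hash/family distinction.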

I'm currently in favor of option 2), as we can keep the meaning of "Malicious Code" in sync with the RSIT and still support the use-case sufficiently. But my opinion could change during the discussion :)

Do you see any more options than I listed above? What do you favor?

best regards
Sebastian

[0]: https://github.com/enisaeu/Reference-Security-Incident-Taxonomy-Task-Force/blob/5479e71/working_copy/humanv1.md
[1]: https://intelmq.readthedocs.io/en/latest/dev/data-harmonization.html#classification
[2]: https://www.trusted-introducer.org/Incident-Classification-Taxonomy.pdf
[3]: https://github.com/certtools/intelmq/blob/f7507ca2643fe8ddb3817c9be1209504ef8cc1f9/intelmq/bots/parsers/github_feed/parser.py
[4]: https://github.com/certtools/intelmq/pull/1745
-- 
// Sebastian Wagner <wagner@cert.at> - T: +43 1 5056416 7201
// CERT Austria - https://www.cert.at/
// Eine Initiative der nic.at GmbH - https://www.nic.at/
// Firmenbuchnummer 172568b, LG Salzburg
