[Intelmq-dev] Data Harmonization - Fields with multiple values

L. Aaron Kaplan kaplan at cert.at
Thu Nov 9 15:51:10 CET 2017


> On 09 Nov 2017, at 13:37, Sebastian Wagner <wagner at cert.at> wrote:
> 
> Hi,
> 
> On 11/09/2017 05:05 AM, Knight, Alexander wrote:
>> > And: what use cases do we have?
>> My particular use case at the moment is to have lists of IP addresses, IP networks and possibly FQDNs.
> As far as I know, IntelMQ was not intended to be used like that (a design decision), but to have one single source and destination per event. I hope Aaron can give more details on this.

Yes, indeed, Sebastian is correct here.
To give some historical context to the discussion: when we (Tomas, Mauro, me, ...) started IntelMQ some years ago, we explicitly wanted to keep the format very, very simple.
In addition, we intentionally wanted to be as compatible as possible with the Abusehelper format. In fact, back then we documented the Abusehelper format, named it (for some weird non-native-English-speaker reason) the "Data harmonisation Ontology" (DHO), and published the first documentation on the Abusehelper wiki [1]. By now the Abusehelper DHO differs from IntelMQ's DHO :(

Part of that was the KISS (keep it simple, stupid) principle of having *one event per IP address or FQDN*. Two different IPs? --> Please make two events,
even though these events might be tied together via the same FQDN.

The mapping and matching (and the relationships between events) are outside the scope of the format, since they would bring in complexity.

That was the design decision back then.

The other option would have been to adopt formats such as STIX as internal formats, which seemed overkill back then.


> If there is more than one source, the event can be split, so you have two events, each with one source.
That would be exactly the way to do it in IntelMQ.
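
A minimal sketch of that splitting, using plain Python dicts rather than IntelMQ's own message classes. `split_event` is a hypothetical helper (not part of IntelMQ), and the field values are made up for illustration:

```python
from copy import deepcopy
from typing import Any, Dict, Iterable, List


def split_event(common: Dict[str, Any], source_ips: Iterable[str]) -> List[Dict[str, Any]]:
    """Return one flat event per source IP, each carrying the shared fields.

    Hypothetical helper for illustration only; not an IntelMQ API.
    """
    events = []
    for ip in source_ips:
        event = deepcopy(common)   # shared fields, e.g. the common FQDN
        event["source.ip"] = ip    # exactly one source per event
        events.append(event)
    return events


if __name__ == "__main__":
    # Assumed example data: two IPs tied together by the same FQDN.
    common = {"source.fqdn": "example.com", "feed.name": "example-feed"}
    for event in split_event(common, ["192.0.2.1", "192.0.2.2"]):
        print(event)
```

The two resulting events still share the same source.fqdn, but nothing in the format itself records that they belong together; that matching stays outside the format.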

> Other formats like IDEA from warden do have this possibility.
> 
Indeed.

Alexander, in case you are interested in having other internal formats, that should be possible, but it is not trivial. Basically, we have the event/message classes [2], which could be a starting point for abstracting other internal formats such as IDEA. However, I prefer to write clear translator bots between the different formats. And yes, some of them might be lossy and unable to preserve internal structures.

The benefit you get from KISS is that the tool stays usable for many people. It really was a design decision.

I hope this clarifies things a bit.

Best,
Aaron.


[1] https://github.com/abusesa/abusehelper/wiki/Data-Harmonization-Ontology
[2] https://github.com/certtools/intelmq/blob/develop/intelmq/lib/message.py


--
// L. Aaron Kaplan <kaplan at cert.at> - T: +43 1 5056416 78
// CERT Austria - https://www.cert.at/
// Eine Initiative der nic.at GmbH - http://www.nic.at/
// Firmenbuchnummer 172568b, LG Salzburg





