[Intelmq-dev] Data Harmonization - Fields with multiple values

Knight, Alexander Alexander.Knight at anz.com
Thu Nov 9 05:05:35 CET 2017


Hi,

And: what use cases do we have?
My particular use case at the moment is to have lists of IP addresses, IP networks and possibly FQDNs.

How to define the types of the values inside the list?
The values will be those that conform to IPAddress, IPNetwork and FQDN for their respective type. The list could be represented as a vertical-bar- or comma-separated list within a string, or it could be a proper Python list.
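As an illustration only (this is not existing IntelMQ code; the helper names and the "ip"/"network"/"fqdn" keys are made up for the example), accepting either representation and validating each value with Python's standard ipaddress module could look roughly like this:

```python
import ipaddress
import re

# Hypothetical helpers for illustration; not part of IntelMQ's harmonization.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_ip_address(value):
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def is_ip_network(value):
    try:
        # strict=False also accepts networks written with host bits set.
        ipaddress.ip_network(value, strict=False)
    except ValueError:
        return False
    return True


def is_fqdn(value):
    # Simplified check: at least two valid labels, and not a bare IP address.
    if is_ip_address(value):
        return False
    labels = value.rstrip(".").split(".")
    return len(labels) >= 2 and all(_LABEL.fullmatch(label) for label in labels)


VALIDATORS = {"ip": is_ip_address, "network": is_ip_network, "fqdn": is_fqdn}


def parse_list(raw, kind):
    """Accept a Python list or a '|'- or ','-separated string of values."""
    if isinstance(raw, str):
        items = [part.strip() for part in re.split(r"[|,]", raw) if part.strip()]
    else:
        items = list(raw)
    check = VALIDATORS[kind]
    invalid = [item for item in items if not check(item)]
    if invalid:
        raise ValueError("invalid %s value(s): %r" % (kind, invalid))
    return items
```

The FQDN check here is deliberately simplistic; a real implementation would presumably reuse whatever validation the existing FQDN type already performs.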

What should the "API" look like?
The API should function as a regular Python list. That being said, I don't imagine doing any complex operations with the list - I will have access to all the values within the parser and will be able to add them all to the event at once.
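To make "functions as a regular Python list" a bit more concrete, here is a minimal sketch of a list-like container that validates values whenever they are added or replaced. The class name TypedList and the is_ip validator are hypothetical and only serve the example; nothing here is taken from IntelMQ.

```python
from collections.abc import MutableSequence
import ipaddress


def is_ip(value):
    """Hypothetical validator: True if value parses as an IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


class TypedList(MutableSequence):
    """List-like container that validates every value it stores (sketch only)."""

    def __init__(self, validator, values=()):
        self._validator = validator
        self._items = []
        self.extend(values)  # extend() goes through insert(), so values are checked

    def _check(self, value):
        if not self._validator(value):
            raise ValueError("invalid value: %r" % (value,))
        return value

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        if isinstance(index, slice):
            self._items[index] = [self._check(v) for v in value]
        else:
            self._items[index] = self._check(value)

    def __delitem__(self, index):
        del self._items[index]

    def __len__(self):
        return len(self._items)

    def insert(self, index, value):
        self._items.insert(index, self._check(value))


# In a parser, all values could be added at once:
source_ips = TypedList(is_ip, ["192.0.2.1", "198.51.100.7"])
source_ips.append("203.0.113.9")
```

Deriving from MutableSequence means that append, extend, the "in" operator and iteration come for free and all additions pass through the same validation.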

When should the list be converted to a string (or maybe also a JSON-list)?
My main usage will be outputting the events to Mongo - in that case a JSON-list will work. But overall I am happy to use strings to represent the list for all outputs if it makes it easier. I can simply split the values out after receiving the event on the other end.
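For illustration, the two output forms mentioned here (a JSON list, or a delimited string that the receiving side splits again) could be sketched as follows. The function names are made up for this example and do not correspond to any IntelMQ API.

```python
import json


def to_json_list(values):
    """Serialize the values as a JSON array, e.g. for document stores."""
    return json.dumps(list(values))


def to_delimited(values, sep=","):
    """Join the values into one string; refuse values containing the separator."""
    values = list(values)
    for value in values:
        if sep in value:
            raise ValueError("value %r contains separator %r" % (value, sep))
    return sep.join(values)


def from_delimited(text, sep=","):
    """Split a delimited string back into a list on the receiving side."""
    return [part for part in text.split(sep) if part]


ips = ["192.0.2.1", "198.51.100.7"]
as_json = to_json_list(ips)
as_string = to_delimited(ips)
```

IP addresses, networks and FQDNs cannot contain a comma or a vertical bar, so either separator round-trips safely for these types; free-text fields would need escaping or the JSON form.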


My end use case is marking up the events as indicators in STIX. One of the team's most vital sources will have many source IPs/Networks/FQDNs per indicator, and thus I would like to be able to send a list of these values as one event.

Regards,
Alex

From: Sebastian Wagner [mailto:wagner at cert.at]
Sent: Wednesday, 8 November 2017 10:59 PM
To: Knight, Alexander; intelmq-dev at lists.cert.at
Subject: Re: [Intelmq-dev] Data Harmonization - Fields with multiple values

Hi,

On 11/03/2017 06:26 AM, Knight, Alexander wrote:

At the Deepsec conference Sebastian mentioned updating the harmonization to allow for fields with multiple values. Has this issue been progressed at all?
The use case was the field abuse_contact which could be a list and then be concatenated (if necessary) with commas.
Technically, it is not hard to do. In the develop branch I already have something similar (and more complex): a dictionary type named JSONDict.
So, not directly, but there have been some changes that should make this change easier.

There are some questions popping up that need to be clarified first:
* How to define the types of the values inside the list? E.g. for the abuse_contact it has to be a list of strings/email addresses
* What should the "API" look like, or in other words: what should happen for the in and setitem operations, etc.?
* When should the list be converted to a string (or maybe also a JSON-list)? E.g. for postgres output the abuse_contact could either be a JSON-list or a comma-separated list, depending on the table's definition, but for NoSQL databases and files it can be just the list itself.

And: what use cases do we have? That's good to know before thinking about how we implement all of that:

We will require multiple values for some fields in our events,
What is in these fields? (type and/or example values) Where do you put that and how do you want to work with it (inside intelmq)?

I'd like to hear opinions of other users and developers too!

Sebastian
P.S.: I do have specific ideas, but don't want to bias others ;)


--

// Sebastian Wagner <wagner at cert.at> - T: +43 1 5056416 7201

// CERT Austria - https://www.cert.at/

// Eine Initiative der nic.at GmbH - https://www.nic.at/

// Firmenbuchnummer 172568b, LG Salzburg