<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>Before going too far down this road, I'd be looking at the

      suitability or adaptability of STIX / TAXII 2.1

      (<a class="moz-txt-link-freetext" href="https://oasis-open.github.io/cti-documentation/stix/intro">https://oasis-open.github.io/cti-documentation/stix/intro</a>).<br>

    </p>

    <p>The STIX steering committee has spent years iterating and

      debating the data model for STIX. They've already done a lot of

      the hard work on how entities should reference one another, how

      TLP is implemented, consistent taxonomies, appropriate metadata

      and so on. There are also Python libraries available, so it's more a case

      of working out how to integrate rather than reinvent.<br>

    </p>

    <p>TAXII provides an HTTP-based transport layer for STIX (or other

      data formats), which you can operate via push or pull, or relay

      through a chain of TAXII servers.</p>

    <p>As a bonus, it would give IntelMQ sharing interoperability with

      other platforms that also implement STIX / TAXII. MISP is one

      of those

      (<a class="moz-txt-link-freetext" href="https://www.misp-project.org/2020/06/24/MISP.2.4.128.released.html">https://www.misp-project.org/2020/06/24/MISP.2.4.128.released.html</a>),

      so I feel there would be some good experience to draw on from their

      developers.</p>

    <p>Best regards,</p>

    <p>Chris<br>

    </p>

    <div class="moz-cite-prefix">On 31/03/2021 2:56 am, Sebastian

      Waldbauer wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:692d6097-d968-5942-6231-e17abcccc53e@cert.at">Dear

      IntelMQ Developers and Users,

      <br>

      <br>

      Nowadays, security incidents are more important than they were 10 years ago.

      As IntelMQ can be used as a core element for automated security

      incident handling, we would like to provide a way to share

      information with other IntelMQ instances. This proposal is also an

      alternative to IEP03 insofar as solving the "multiple values" problem is

      possible by using UUIDs to "link" related events in a

      backwards-compatible manner.

      <br>

      <br>

      If you're interested, please let us know, so we could organize a

      hackathon for further discussions about the specification of the

      meta-information.

      <br>

      Previously this idea was discussed in [0] and [1].

      <br>

      <br>

      [0]

<a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/blob/version-3.0-ideas/docs/architecture-3.0.md#user-content-general-requirements">https://github.com/certtools/intelmq/blob/version-3.0-ideas/docs/architecture-3.0.md#user-content-general-requirements</a><br>

      [1] <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/1521">https://github.com/certtools/intelmq/issues/1521</a>

      <br>

      # IEP04: Internal Data Format: Meta Information and Data Exchange

      <br>

      To ease data exchange between two or more IntelMQ instances,

      adding some meta-information to the events can make this sharing

      easier in certain regards.

      <br>

      "Linking" events could work on the same principle that `git` uses:

      parent hashes (we would call them UUIDs).

      <br>

      <br>

      ### TL;DR

      <br>

      Communication between two or more IntelMQ instances and data

      exchange in a backwards-compatible format. Whether to use a P2P or

      centralized architecture is a big topic, which has to be discussed

      after the format is set.

      <br>

      <br>

      ### Why is metadata important?

      <br>

      Short and simple: to avoid race conditions and to be able to

      discard already processed events from other instances.

      <br>

      <br>

      ### Meta information

      <br>

      Metadata is used to transfer general data that is usually not

      related to the event itself. It is more or less just

      information to keep events clear and sortable.

      <br>

      <br>

      A message could look like:

      <br>

      <br>

      {

      <br>

          "meta": {

      <br>

              "version": 1, # protocol version, so we are allowed to

      fallback to old versions too

      <br>

              "uuid": {

      <br>

                 "current": "cert_at:aaaa-bbbb-cccc-dddd", # format to be

      decided

      <br>

                 "parent": "cert_at:xxxx-yyyy-zzzz-ffff" # format to be

      discussed, if not set -> current is the parent uuid

      <br>

              },

      <br>

              "type": "event",

      <br>

              "format": "intelmq", # i. e. this field could contain "n6"

      or "idea", so the receiving component can decode on demand.

      <br>

          },

      <br>

          "payload": { # normal intelmq data

      <br>

              "source.ip": "127.0.0.1",

      <br>

              "source.fqdn": "example.com",

      <br>

              "raw": base64-blob

      <br>

          }

      <br>

      }

      <br>
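      <br>
      A minimal Python sketch of wrapping a payload in such an envelope. The function name `wrap_event`, the `publisher` prefix and the use of a random UUID are assumptions for illustration; the UUID format is still to be decided, and setting "parent" to "current" when no parent exists is only one reading of the comment in the example.
      <br>
<pre>
```python
import json
import uuid


def wrap_event(payload, publisher="cert_at", parent=None):
    """Wrap a flat IntelMQ-style payload in the proposed meta/payload envelope."""
    # Assumption: "publisher:random-uuid" as the identifier format.
    current = f"{publisher}:{uuid.uuid4()}"
    return {
        "meta": {
            "version": 1,
            "uuid": {
                "current": current,
                # Assumption: without a parent, the event is its own parent.
                "parent": parent if parent is not None else current,
            },
            "type": "event",
            "format": "intelmq",
        },
        "payload": payload,
    }


message = wrap_event({"source.ip": "127.0.0.1", "source.fqdn": "example.com"})
print(json.dumps(message, indent=4))
```
</pre>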

      <br>

      Tell us your opinion about adding non-standardized

      meta-information fields (e.g. RTIR ticket number, origin, other

      local contact information, and so on).

      <br>

      <br>

      #### The UUID

      <br>

      For the UUID there are multiple options:

      <br>

      1. Generate a random 128 bit UUID

      <br>

      2. A list of entities, which dealt with this event already. For

      example if an event was passed on from cert-at to cert-ee, the

      field could look like `!cert-at!cert-ee`. A message sending loop

      can be detected if the own name is already in this field upon

      reception.

      <br>

      3. Using CyCat: `publisher-short-name:project-short-name:UUID`.

      For example:

      `cert-at:intelmq:72ddb00c-2d0a-4eea-b7ac-ae122b8e6c3b`, or

      `cert-pl:n6:f60c9fb9-81f9-4e0b-8a44-ea41326a15b3`. Some more

      research and discussion is required before the implementation of

      this option. Have a look at

      <a class="moz-txt-link-freetext" href="https://www.cycat.org/services/concept/">https://www.cycat.org/services/concept/</a> for more details.

      <br>

      4. A hash: A benefit of using a hash is that every IntelMQ

      instance can recalculate it.

      <br>
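      <br>
      Options 2 and 4 can be sketched in a few lines of Python. The helper names are hypothetical, and both SHA-256 and the canonical JSON serialization are assumptions; the proposal does not say which hash function or which fields would be hashed.
      <br>
<pre>
```python
import hashlib
import json


def event_hash(payload):
    """Option 4: derive an ID from the payload so any instance can recompute it."""
    # Sorting keys makes the result independent of field order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_loop(path, own_name):
    """Option 2: True if own_name is already listed in a path like '!cert-at!cert-ee'."""
    return own_name in path.split("!")[1:]


def append_hop(path, own_name):
    """Option 2: record that own_name has handled the event."""
    return f"{path}!{own_name}"


path = append_hop("!cert-at", "cert-ee")
print(path, is_loop(path, "cert-at"))
```
</pre>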

      <br>

      ### Exporting events to other systems

      <br>

      In IntelMQ 2.x, events consist only of the "payload" and carry no

      meta information. For local storage such as file output or

      databases, the meta information may not be relevant in some

      use cases. So it needs to be possible to export events *without*

      meta information, which is also the backwards-compatible

      behaviour.

      <br>

      <br>

      The "type" field exists in the current format as "__type" in the

      flat payload structure. In the output bots there's currently a

      boolean parameter `message_with_type` to include the field

      `__type` in the "export".

      <br>

      For optionally exporting meta-information like uuid or format, a

      similar logic could be used.

      <br>
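      <br>
      As a rough sketch of that idea in Python: a boolean switch decides whether meta fields are merged into the flat export. The parameter name `message_with_meta` and the `meta.` key prefix are hypothetical, chosen only by analogy with `message_with_type`; they are not existing IntelMQ options.
      <br>
<pre>
```python
def export_event(message, message_with_meta=False):
    """Return the flat payload, optionally extended with selected meta fields."""
    exported = dict(message["payload"])  # copy, so the original stays untouched
    if message_with_meta:
        meta = message["meta"]
        # Hypothetical flat key names for the exported meta information.
        exported["meta.uuid"] = meta["uuid"]["current"]
        exported["meta.format"] = meta["format"]
    return exported


message = {
    "meta": {
        "version": 1,
        "uuid": {"current": "cert_at:aaaa-bbbb-cccc-dddd"},
        "type": "event",
        "format": "intelmq",
    },
    "payload": {"source.ip": "127.0.0.1"},
}
print(export_event(message))
print(export_event(message, message_with_meta=True))
```
</pre>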

      <br>

      ### How can data exchange work?

      <br>

      This now depends on how IntelMQ instances communicate: either

      peer-to-peer or via a central data hub. Both have pros

      and cons.

      <br>

      <br>

      #### P2P (peer-to-peer)

      <br>

      Decentralized network

      <br>

      + Less downtime: an outage of one instance does not affect the

      whole network

      <br>

      + Better privacy: data is not shared to an unrelated instance

      <br>

      + More secure: data can optionally be encrypted (key-exchange

      between instances?)

      <br>

      + Decentralized and local maintenance

      <br>

      ~ Network latency depends on server locations

      <br>

      - Networking issues may occur

      <br>

      <br>

      What data exchange between instances could look like:

      <br>

      1) Instance A has events which should be relayed to Instance B

      and Instance C, because it is not sure who the actual receiver

      should be

      <br>

      2) Instance A ensures all messages have a UUID

      <br>

      3) Instance A sends the data to Instance B & Instance C

      <br>

      4) Instance B checks the data, concludes that it is meant for

      Instance C, and forwards it

      <br>

      5) Instance C receives data from Instance A & Instance B

      <br>

      6) Instance C checks the UUIDs, sees that they are the same, and

      drops the copy from Instance B

      <br>
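      <br>
      The six steps can be simulated in Python with an in-memory set of seen UUIDs per instance. The `Instance` class and its method names are hypothetical, and real instances would of course exchange messages over a network rather than by direct method calls.
      <br>
<pre>
```python
class Instance:
    """Toy IntelMQ instance that drops messages whose UUID it has already seen."""

    def __init__(self, name):
        self.name = name
        self.seen = set()     # UUIDs that were already processed
        self.accepted = []    # (sender, message) pairs that were kept

    def receive(self, sender, message):
        uid = message["meta"]["uuid"]["current"]
        if uid in self.seen:
            return False      # duplicate: drop it
        self.seen.add(uid)
        self.accepted.append((sender, message))
        return True


b, c = Instance("B"), Instance("C")
msg = {
    "meta": {"uuid": {"current": "cert_at:aaaa-bbbb-cccc-dddd"}},
    "payload": {"source.ip": "127.0.0.1"},
}
b.receive("A", msg)           # step 3: A sends to B
first = c.receive("A", msg)   # steps 3 and 5: A sends to C
second = c.receive("B", msg)  # steps 4 to 6: B forwards, C drops the copy
print(first, second)          # True False
```
</pre>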

      <br>

      #### (Central) Data hub

      <br>

      + Less maintenance: Is maintained by the hub administrator

      <br>

      + Central data storage (reports can optionally be cached to be

      downloaded later)

      <br>

      ~ Central data analysis (e.g. statistics) is possible

      <br>

      ~ Network latency depends on server locations

      <br>

      - Single point of failure: if network problems occur at the hub,

      no exchange is possible

      <br>

      <br>

      As already seen above, data exchange here would be less

      complicated. The sending may look like:

      <br>

      1) Instance A has events which should be relayed to Instance B

      (e.g. different country)

      <br>

      2) Instance A ensures all messages have a UUID

      <br>

      3) Instance A sends these messages to the data hub

      <br>

      <br>

      The reception side can look like:

      <br>

      1) Instance B connects to central instance

      <br>

      2) Instance B queries and downloads all available messages

      <br>

      3) Upon reception, all messages are de-duplicated based on the

      UUID:

      <br>

        a) If the UUID is already known, discard the message

      <br>

        b) If the UUID has not been seen before, continue with

      processing

      <br>
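      <br>
      A small Python sketch of the hub variant, with the hub reduced to an in-memory list. `DataHub`, `publish`, `fetch_all` and `receive_from_hub` are hypothetical names; how a real hub would store, authenticate and serve messages is left open by this proposal.
      <br>
<pre>
```python
class DataHub:
    """Toy central hub that stores every published message."""

    def __init__(self):
        self.messages = []

    def publish(self, message):
        self.messages.append(message)

    def fetch_all(self):
        return list(self.messages)


def receive_from_hub(hub, seen):
    """Download all messages and keep only those with an unknown UUID."""
    fresh = []
    for message in hub.fetch_all():
        uid = message["meta"]["uuid"]["current"]
        if uid in seen:
            continue          # a) UUID already known: discard
        seen.add(uid)         # b) new UUID: continue with processing
        fresh.append(message)
    return fresh


hub = DataHub()
for uid in ("cert_at:1", "cert_at:2", "cert_at:1"):
    hub.publish({"meta": {"uuid": {"current": uid}}, "payload": {}})

seen_by_b = set()
print(len(receive_from_hub(hub, seen_by_b)))  # 2
print(len(receive_from_hub(hub, seen_by_b)))  # 0
```
</pre>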

      <br>

      To sum up, both exchange variants are useful. More research is

      needed, e.g. into a mixed infrastructure that has centralized parts

      but can also be decentralized. However, this is neither the

      purpose nor the aim of this IEP.

      <br>

      <br>

      <br>


    </blockquote>

  </body>

</html>