<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>Hi all,<br>

    </p>

    <p>Thanks everybody for your valuable feedback on our proposal. To
      sum up, there were no objections to the sections "Storage"
      and "Internal handling". Either these proposals are overwhelmingly
      good or nobody dared to respond :)</p>

    <p>Regarding the format, I see that we have differing opinions,
      especially on the representation of comments in the UI. However,
      the discussion on this topic stalled without a clear conclusion. Let me
      summarize the current situation, as of IntelMQ 2.2.x +
      Manager with JSON as the configuration format:</p>

    <p>- tricky to edit directly because<br>
        - JSON is picky about its syntax. E.g. in a list or
      dictionary there must not be a comma after the last element, which
      is a nuisance when adding, removing or rearranging parameters<br>
        - JSON and its syntax are not meant for configuration, e.g.
      adding the syntax elements []{} by hand is error-prone.<br>
      - currently there is no way to add comments<br>
        - JSON itself has no comments<br>
        - IntelMQ + Manager do not support comments themselves either,
      not even as data within JSON, e.g. by using special parameter names
      like "parameter1-comment" as Bernhard suggested.<br>
    </p>
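    <p>As a minimal sketch of the points above, the snippet below feeds three
      variants of a made-up bot entry to Python's standard json module. The
      bot id and the parameter name are assumptions for illustration only,
      not taken from a real configuration:</p>
    <pre>
```python
import json

# Hypothetical bot entry; the bot id and parameter name are only examples.
valid = '{"example-bot": {"parameters": {"rate_limit": 60}}}'
# The same entry with a comma after the last element.
trailing_comma = '{"example-bot": {"parameters": {"rate_limit": 60,}}}'
# An attempt to annotate the entry with a comment.
with_comment = '{"example-bot": {} // disabled for now\n}'


def is_valid_json(text: str) -> bool:
    """Return True if the standard json module accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


for name, text in [("valid", valid),
                   ("trailing comma", trailing_comma),
                   ("with comment", with_comment)]:
    print(name, "->", "accepted" if is_valid_json(text) else "rejected")
```
    </pre>
    <p>Only the first variant is accepted. The other two are rejected by the
      parser, which is exactly what makes editing by hand tedious.</p>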

    <p>And we have the two use-cases of editing via IntelMQ Manager and

      editing as text directly. Both ways are supported and should be

      possible in a reasonable way. By reducing the downsides of direct

      editing, we could make the life of various IntelMQ users easier.<br>

    </p>

    <p>Both TOML and YAML solve the problem of the tricky-to-edit
      format. For YAML, there are additionally Python libraries
      (ruamel.yaml, as noted in the proposal below) that can
      <i>preserve</i> comments, even when the file is edited by other means
      (intelmqctl as well as IntelMQ Manager).<br>
    </p>

    <p>If we choose TOML, and an IntelMQ user uses comments in the file,
      the comments <em>will be gone</em> as soon as either intelmqctl or
      IntelMQ Manager (or rather the API) changes the file.<br>
      If we choose YAML, and an IntelMQ user uses comments in the file,
      the comments <em>will be kept</em> if intelmqctl changes the
      file. The IntelMQ Manager needs fixes to preserve comments as
      well[0], and showing them in the Manager could be implemented
      too.<br>
    </p>

    <p>Then we have the issue of the complexity of TOML and YAML
      compared to each other. Bernhard noted that YAML is too complex,
      while Aaron and Filip did not share this opinion - please correct me
      if I'm wrong. Staying with JSON means that we have no comments at
      all, but since the user cannot even attempt to add comments, none
      can get lost either. The complexity of parsing and writing for the
      tools is relatively small, as JSON is made for machine-readability.<br>
    </p>

    <p>As far as the discussion has gone so far, we have more "consent"
      for YAML than for TOML or for leaving the format as it is. Please
      speak up if you think that my summary is wrong.<br>
    </p>

    <p>best regards,<br>

      Sebastian<br>

    </p>

    <p>[0] Regarding the changes in the IntelMQ Manager frontend, we
      (CERT.at) would appreciate help from other community members to
      implement these features.<br>
    </p>

    <div class="moz-cite-prefix">On 12/10/20 1:17 PM, Birger Schacht

      wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:0a9ac6ea-15e0-ab39-e943-1751f0e1f6de@cert.at">Dear

      IntelMQ developers and users,

      <br>

      <br>

      below are a couple of ideas on how to (hopefully) make configuration
      of IntelMQ easier. Feel free to give feedback, voice concerns or
      simply ask if something is unclear. We plan to evaluate the
      feedback in two weeks (after the Christmas holidays).

      <br>

      <br>

      <br>

      # IntelMQ Configuration Handling (IntelMQ Enhancement Proposal 01)

      <br>

      <br>

      ## Format

      <br>

      <br>

      ### JSON

      <br>

      <br>

      At the moment, the configuration format of IntelMQ is JSON[^1]. It
      <br>
      is parsed using the Python json library, which is part of the Python
      <br>
      Standard Library. The downside of JSON is that it is hard for humans
      <br>
      to read and write, and it cannot contain comments.

      <br>

      <br>

      [^1]: <a class="moz-txt-link-freetext" href="https://docs.python.org/3/library/index.html">https://docs.python.org/3/library/index.html</a>

      <br>

      <br>

      ### YAML

      <br>

      <br>

      There is a proposal[^2] to use YAML as the default configuration
      <br>
      format. YAML is considerably more readable for humans and
      <br>
      supports single-line comments. There are two Python YAML libraries:
      <br>
      PyYAML[^3] and ruamel.yaml[^4]. The former is a project by the YAML
      <br>
      project itself. The latter is a fork of the former; it has had much
      <br>
      more activity over the years and better support of the standard,
      <br>
      although PyYAML seems to have caught up in the last few years. We do
      <br>
      not need any edge cases, so both libraries would be good enough for
      <br>
      configuration files. According to this issue[^5], PyYAML does not
      <br>
      support “editing YAML whilst maintaining comments”, which might be a
      <br>
      deal breaker. However, the issue is from 2016, so this might have
      <br>
      changed. On the other hand, IntelMQ does not edit configuration at
      <br>
      the moment. PyYAML and ruamel.yaml are available as packages in all
      <br>
      relevant Linux distributions.

      <br>

      <br>

      [^2]:

<a class="moz-txt-link-freetext" href="https://github.com/gethvi/intelmq/blob/ideas/docs/Ideas.md#changing-configuration-format-to-yaml">https://github.com/gethvi/intelmq/blob/ideas/docs/Ideas.md#changing-configuration-format-to-yaml</a><br>

      [^3]: <a class="moz-txt-link-freetext" href="https://pyyaml.org/">https://pyyaml.org/</a>

      <br>

      [^4]: <a class="moz-txt-link-freetext" href="https://yaml.readthedocs.io/en/latest/">https://yaml.readthedocs.io/en/latest/</a>

      <br>

      [^5]: <a class="moz-txt-link-freetext" href="https://github.com/yaml/pyyaml/issues/46">https://github.com/yaml/pyyaml/issues/46</a>

      <br>

      <br>

      ### INI

      <br>

      <br>

      The Python Standard Library also ships configparser[^6], which
      <br>
      implements a “configuration language which provides a structure
      <br>
      similar to what’s found in Microsoft Windows INI files”. The files
      <br>
      can contain comments, there is a [DEFAULT] section which can be used
      <br>
      for default values, and the configuration files can contain
      <br>
      variables. One downside is that all values are strings, which means
      <br>
      we would have to do the parsing ourselves.

      <br>

      <br>

      [^6]: <a class="moz-txt-link-freetext" href="https://docs.python.org/3/library/configparser.html">https://docs.python.org/3/library/configparser.html</a>

      <br>

      <br>

      ### TOML

      <br>

      <br>

      Tom's Obvious, Minimal Language (TOML) is another contender for the
      <br>
      role of IntelMQ's configuration file format. It looks similar to the
      <br>
      INI file format, but comes with various data types. It also allows
      <br>
      comments. There is a Python library[^7] that seems to be very
      <br>
      active. TOML is also used as the format for the proposed
      <br>
      pyproject.toml file and by the Rust community for their package
      <br>
      configuration files. TOML's syntax for dictionaries is hard to read
      <br>
      and write, harder than with JSON.

      <br>

      <br>

      [^7]: <a class="moz-txt-link-freetext" href="https://pypi.org/project/toml/">https://pypi.org/project/toml/</a>

      <br>

      <br>

      ### Further information

      <br>

      <br>

      * The summary of file formats in PEP 518:
      <br>
        <a class="moz-txt-link-freetext" href="https://www.python.org/dev/peps/pep-0518/#other-file-formats">https://www.python.org/dev/peps/pep-0518/#other-file-formats</a>
      <br>
      * At the moment we are leaning towards YAML. Regarding the library,
      <br>
        we would choose ruamel.yaml, because it seems to have a more
      <br>
        active upstream and it can retain comments when it modifies a
      <br>
        YAML file.

      <br>

      <br>

      ## Storage

      <br>

      <br>

      This part is about the question of where we store the configuration.

      <br>

      <br>

      The ideas document[^8] on GitHub already proposes to remove the
      <br>
      pipeline.conf and to specify the destination pipelines in the
      <br>
      individual bot configuration. The declaration of the source
      <br>
      queue can then be dropped as well, as it follows a rule anyway.

      <br>

      <br>

      In addition to that, to make the setup of IntelMQ easier, the
      <br>
      defaults.conf should be dropped. Default values should be set in the
      <br>
      Bot classes or in the IntelMQ process managers; there is no need for
      <br>
      a separate file.

      <br>

      <br>

      Another question is whether every bot should have its own
      <br>
      configuration file. Some users wish to be able to start a bot
      <br>
      without having to rely on IntelMQ, but at the moment the bot gets
      <br>
      its configuration from IntelMQ's runtime.conf. If we want to
      <br>
      support the request to pass individual configurations to bots, we
      <br>
      could allow users to pass a separate configuration file to the bot
      <br>
      (i.e. using `-c /path/to/config.$ext`). If that file is not set or
      <br>
      does not contain the bot's id, it is ignored and IntelMQ's
      <br>
      runtime.conf is used as usual. If it does exist, the global
      <br>
      runtime.conf is still parsed (if it exists; it should also be
      <br>
      possible to run a bot without a runtime.conf), but only the values
      <br>
      that are not set in the individual configuration file are
      <br>
      considered. This individual configuration file would also allow a
      <br>
      bot to be run in a Docker environment without having to set any
      <br>
      environment variables. This would probably make configuration
      <br>
      handling easier, because configuration settings could then be stored
      <br>
      in a file (and managed by a configuration management system) and the
      <br>
      configuration file could contain comments.

      <br>

      <br>

      Proposal:

      <br>

      <br>

      * IntelMQ gets one global configuration file for all the bots and

      <br>

        the pipeline.conf will be removed

      <br>

      * This global configuration file is

      <br>

        `${PREFIX}/etc/intelmq/intelmq.$ext`. If it does not exist or
      <br>
        does not define any bots, IntelMQ should exit gracefully.

      <br>

        The file extension depends on the chosen format.

      <br>

      * The global configuration file contains an array of bot

      <br>

        configurations with bot-ids as keys.

      <br>

      * Every bot reads the global configuration file and extracts its
      <br>
        own settings (as usual).

      <br>

      * Every bot handles 0 to n `-c /path/to/configurationfile.$ext`

      <br>

        flags, which are treated the same way as the global

      configuration

      <br>

        file.

      <br>

        The further ahead a configuration file is on the command line, the
      <br>
        stronger its content (this allows us to have multiple non-global
      <br>
        configuration files, e.g. for multiple groups).

      <br>

        Example:

      <br>

        ```

      <br>

        > botcommand bot-id -c /etc/bots/botname.$ext -c /etc/bots/groups/group_foo.$ext

      <br>

        ```

      <br>

      * Every bot also consults the environment, and the values that are
      <br>
        set there overwrite the values in any configuration file.

      <br>

      <br>

      * There are also configuration files that list settings which are
      <br>
        not bot-specific, i.e. via a reserved key `default` (successor of
      <br>
        the defaults.conf file) or `group:id`. Those are handled like
      <br>
        other configuration files, except that the bot does not compare its
      <br>
        name to the key of the configuration.

      <br>

      <br>

      All the evaluated configuration formats make it possible to
      <br>
      arrange the configuration parameters in hierarchies. To make the
      <br>
      configuration files more readable, IntelMQ should make use of this
      <br>
      hierarchy instead of denoting the different hierarchy levels with
      <br>
      underscores. So instead of writing `http_proxy`, the `http` parameter
      <br>
      would have a child parameter `proxy`. For backwards compatibility and
      <br>
      for cases where the underscore does not imply hierarchy, the
      <br>
      underscore notation will still work. In addition, IntelMQ should also
      <br>
      make use of environment variables. Those are still denoted using an
      <br>
      underscore as delimiter and are prefixed with `INTELMQ`:
      <br>
      `INTELMQ_HTTP_PROXY`.

      <br>

      <br>

      [^8]: <a class="moz-txt-link-freetext" href="https://github.com/gethvi/intelmq/blob/ideas/docs/Ideas.md">https://github.com/gethvi/intelmq/blob/ideas/docs/Ideas.md</a>

      <br>

      <br>
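      <p>The layering described in this section could be sketched roughly as
        follows. This is only an illustration under several assumptions that
        the proposal does not fix: the parsed files are lists of single-key
        mappings as in the examples, "further ahead" is read as "the first
        `-c` file is the strongest", and every underscore in an `INTELMQ_`
        variable is treated as a hierarchy level. All function names are
        hypothetical and not part of IntelMQ.</p>
      <pre>
```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base in which values from override win, recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def settings_from_file(entries: list, bot_id: str) -> dict:
    """Pick the settings of one bot from a parsed configuration file."""
    for entry in entries:
        if bot_id in entry:
            return entry[bot_id]
    return {}


def settings_from_env(environ: dict, prefix: str = "INTELMQ_") -> dict:
    """Map e.g. INTELMQ_HTTP_PROXY to {"http": {"proxy": ...}}.

    Assumption: every underscore denotes a hierarchy level. A real
    implementation has to handle underscores that do not imply hierarchy.
    """
    settings: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        *parents, leaf = name[len(prefix):].lower().split("_")
        node = settings
        for parent in parents:
            node = node.setdefault(parent, {})
        node[leaf] = value
    return settings


def resolve(bot_id: str, global_file: list, extra_files: list, environ: dict) -> dict:
    """Merge all layers, weakest first."""
    layers = [settings_from_file(global_file, bot_id)]
    # Assumption: the first -c file is the strongest, so apply in reverse.
    layers += [settings_from_file(f, bot_id) for f in reversed(extra_files)]
    # Environment values overwrite values from any configuration file.
    layers.append(settings_from_env(environ))
    result: dict = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


global_file = [{"fop1": {"module": "intelmq.bots.outputs.file",
                         "output": {"filename": "/dev/null"}}}]
root_file = [{"fop1": {"module": "intelmq.bots.outputs.file",
                       "output": {"filename": "/var/log/fop1.log"}}}]
environ = {"INTELMQ_HTTP_PROXY": "http://myproxy.tld:80", "HOME": "/root"}

print(resolve("fop1", global_file, [root_file], environ))
```
      </pre>
      <p>Group and `default` keys are left out of this sketch; they would be
        further layers merged in the same way.</p>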

      ### Caveats

      <br>

      <br>

      There are configuration settings that do not really concern the
      <br>
      bot, for example the type of process manager that should be used to
      <br>
      run the bot. In an ideal setup, the bot should be totally
      <br>
      indifferent as to whether it runs in a Docker container, on bare
      <br>
      metal, in a systemd unit or with supervisord. This decision should
      <br>
      only concern the tool managing all the bots (intelmqctl or, in the
      <br>
      future, intelmq-api, which at the moment uses intelmqctl). Another
      <br>
      example is the `enabled` setting. At the moment, these settings are
      <br>
      part of the individual bot configuration, but it might make sense to
      <br>
      move them to a management.conf configuration file which is only for
      <br>
      managing the individual bots, not for configuring their parameters.
      <br>
      This file would then also have a field for every bot that lists the
      <br>
      configuration files the bot should consider when reading its
      <br>
      configuration. On the other hand, this might make the configuration
      <br>
      more complex again, now that we are trying to merge pipeline.conf
      <br>
      and runtime.conf. We could also decide to make those settings
      <br>
      part of the global configuration file, given that the individual
      <br>
      bots should simply ignore settings they do not know how to handle
      <br>
      anyway.

      <br>

      <br>

      ### Overriding by command line parameters

      <br>

      <br>

      If needed, a user can override specific bot settings using the `-p`
      <br>
      switch (e.g. `-p redis_cache=example.com`). This should be easy to
      <br>
      implement; in the best case, it is only one line of additional code
      <br>
      in the Bot class.

      <br>

      <br>
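      <p>A minimal sketch of how the `-c` and `-p` switches could be parsed
        with argparse from the Python Standard Library. The option names
        follow the proposal; the helper `parse_overrides` and the `dest`
        names are hypothetical, and values stay strings here, so type
        conversion would still have to happen elsewhere.</p>
      <pre>
```python
import argparse


def parse_overrides(pairs: list) -> dict:
    """Turn ["key=value", ...] into a dict; later pairs win."""
    overrides = {}
    for pair in pairs:
        key, separator, value = pair.partition("=")
        if not key or not separator:
            raise ValueError(f"expected key=value, got {pair!r}")
        overrides[key] = value
    return overrides


parser = argparse.ArgumentParser(prog="intelmq-bot")
parser.add_argument("bot_id")
# Both switches may be given several times; argparse collects them in order.
parser.add_argument("-c", dest="config_files", action="append", default=[])
parser.add_argument("-p", dest="parameters", action="append", default=[])

args = parser.parse_args(
    ["shodan1", "-c", "/etc/intelmq/intelmq.yml", "-p", "redis_cache=example.com"]
)
overrides = parse_overrides(args.parameters)
print(args.bot_id, args.config_files, overrides)
```
      </pre>
      <p>The resulting dictionary would then be applied after all
        configuration files, presumably as the strongest layer.</p>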

      ### Examples

      <br>

      <br>

      A global configuration file with multiple bots

      <br>

      /etc/intelmq/intelmq.yml

      <br>

      <br>

      ```

      <br>

      - shodan1:

      <br>

          module: intelmq.bots.collectors.shodan.collector

      <br>

      - mylittlebot23:

      <br>

          module: intelmq.bots.expert.asn_lookup.expert

      <br>

          http:

      <br>

            proxy: <a class="moz-txt-link-freetext" href="http://myproxy.tld:80">http://myproxy.tld:80</a>

      <br>

      - fop1:

      <br>

          module: intelmq.bots.outputs.file

      <br>

          output:

      <br>

            filename: /dev/null

      <br>

      ```

      <br>

      <br>

      We can run a bot with `intelmq-bot shodan1`, which is the same as

      <br>

      `intelmq-bot shodan1 -c /etc/intelmq/intelmq.yml`

      <br>

      <br>

      Another configuration file with multiple bots

      <br>

      /root/intelmq-bots-managed-by-root:

      <br>

      <br>

      ```

      <br>

      - shodan2:

      <br>

          module: intelmq.bots.collectors.shodan.collector

      <br>

      - fop1:

      <br>

          module: intelmq.bots.outputs.file

      <br>

          output:

      <br>

            filename: /var/log/fop1.log

      <br>

      ```

      <br>

      <br>

      We can run a bot with

      <br>

      `intelmq-bot shodan2 -c /root/intelmq-bots-managed-by-root`;

      <br>

      We can run a bot using

      <br>

      `intelmq-bot fop1 -c /root/intelmq-bots-managed-by-root`

      <br>

      which would then send output to `/var/log/fop1.log`.

      <br>

      <br>

      A configuration for a group in /etc/intelmq/collector-group.yml

      <br>

      <br>

      ```

      <br>

      - group:collectors:
      <br>
          http:
      <br>
            proxy: <a class="moz-txt-link-freetext" href="http://thirdparty.proxy.tld:9000">http://thirdparty.proxy.tld:9000</a>

      <br>

      ```

      <br>

      <br>

      We can run a bot with
      <br>
      `intelmq-bot mylittlebot23 -c /etc/intelmq/collector-group.yml`

      <br>

      which uses the third-party proxy.

      <br>

      <br>

      ## Internal handling

      <br>

      <br>

      Every bot class defines its own settings as class variables. Every
      <br>
      class variable has to be typed. Every class variable should be set
      <br>
      to a reasonable default, otherwise to None. The init of the (abstract)
      <br>
      Bot class should load all the relevant configuration files and then
      <br>
      overwrite the settings. If a setting is still None and its value is
      <br>
      vital for the functionality of the bot, the bot should stop and emit
      <br>
      a meaningful error message. For the most common types of settings,
      <br>
      there should be Python objects to check the values. Value checking
      <br>
      should only be done after all the configurations are merged.

      <br>

      <br>

      <br>
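      <p>The internal handling described above might look roughly like the
        following sketch. The classes `Bot` and `FileOutputBot`, the
        `required` attribute and the setting names are hypothetical and not
        the actual IntelMQ classes. Only the "still None" check is shown;
        type-specific value checks are omitted.</p>
      <pre>
```python
from typing import Optional, get_type_hints


class Bot:
    """Hypothetical base class; not the actual IntelMQ Bot class."""

    # Names of settings that must not be None after merging.
    required = ()

    def __init__(self, *configurations: dict):
        # Later configurations overwrite earlier ones and the class defaults.
        merged = {}
        for configuration in configurations:
            merged.update(configuration)
        hints = get_type_hints(type(self))
        for name, value in merged.items():
            if name in hints:  # settings the bot does not know are ignored
                setattr(self, name, value)
        # Checks run only after all configurations are merged.
        missing = [name for name in self.required if getattr(self, name) is None]
        if missing:
            raise ValueError(f"missing required settings: {', '.join(missing)}")


class FileOutputBot(Bot):
    filename: Optional[str] = None  # vital, no reasonable default
    rate_limit: int = 0             # reasonable default
    required = ("filename",)


bot = FileOutputBot({"filename": "/dev/null"},
                    {"filename": "/var/log/fop1.log", "unknown": 1})
print(bot.filename, bot.rate_limit)
```
      </pre>
      <p>Creating `FileOutputBot({})` raises a ValueError naming the missing
        `filename` setting, which corresponds to the "stop and emit a
        meaningful error message" step.</p>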


    </blockquote>

    <pre class="moz-signature" cols="72">-- 

// Sebastian Wagner <a class="moz-txt-link-rfc2396E" href="mailto:wagner@cert.at">&lt;wagner@cert.at&gt;</a> - T: +43 1 5056416 7201

// CERT Austria - <a class="moz-txt-link-freetext" href="https://www.cert.at/">https://www.cert.at/</a>

// Eine Initiative der nic.at GmbH - <a class="moz-txt-link-freetext" href="https://www.nic.at/">https://www.nic.at/</a>

// Firmenbuchnummer 172568b, LG Salzburg</pre>

  </body>

</html>