<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>Hi.</p>

    <div class="moz-cite-prefix">On 1/22/21 5:26 PM, Bernhard Reiter

      wrote:

    </div>

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">This part is about the question where do we store the

configuration?.

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

overall I do miss the use cases or problems 

that should be addressed by the proposed changes.

Having a problem description and links to discussion that have already taken 

place, would make it easier to comment on the proposal.

Some relevant places that describe wishes, status and suggestions:

  <a class="moz-txt-link-freetext" href="https://intelmq.readthedocs.io/en/latest/user/bots.html#common-parameters">https://intelmq.readthedocs.io/en/latest/user/bots.html#common-parameters</a>

  <a class="moz-txt-link-freetext" href="https://intelmq.readthedocs.io/en/latest/user/configuration-management.html">https://intelmq.readthedocs.io/en/latest/user/configuration-management.html</a>

  <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/267">https://github.com/certtools/intelmq/issues/267</a>

    (Configurations - Hierarchy configurations) closed

  <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/552">https://github.com/certtools/intelmq/issues/552</a>

    (Enable separate packaging of bots by allowing addition and removals to 

the config)</pre>

    </blockquote>

    Plus<br>

    <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/570">https://github.com/certtools/intelmq/issues/570</a> "configuration

    format"<br>

    <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/121">https://github.com/certtools/intelmq/issues/121</a> "Configuration

    Files" (closed, but not all ideas were implemented)<br>

    <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/1026">https://github.com/certtools/intelmq/issues/1026</a> "Proposal: use

    template library for JSON configs" (not addressed by this proposal)<br>

    <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/1580">https://github.com/certtools/intelmq/issues/1580</a> "Some parameters

    with default values throw AttributeError when not set"<br>

    and related to the BOTS file:<br>

    <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/440">https://github.com/certtools/intelmq/issues/440</a> "Installing custom

    Bots"<br>

    <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/1646">https://github.com/certtools/intelmq/issues/1646</a> "Run custom bot"<br>

    <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/552">https://github.com/certtools/intelmq/issues/552</a> "Enable separate

    packaging of bots by allowing addition and removals to the config."<br>

    <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/757">https://github.com/certtools/intelmq/issues/757</a> "Clearly define all

    parameters used in a bot"<br>

    <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/668">https://github.com/certtools/intelmq/issues/668</a> "Very long BOTS

    file"<br>

    <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/644">https://github.com/certtools/intelmq/issues/644</a> "Errors when already

    configured bots gain additional options through upgrade"<br>

    <a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/issues/908">https://github.com/certtools/intelmq/issues/908</a> "Parameter from BOTS

    does'nt passed to a new bot"

    <p>But none of them directly matches the proposal, and most are
      addressed by the "Internal handling" section of the proposal. Our
      proposal is also based on last year's requirements collection
      and extended to match the behavior of other tools (`-c` parameter)
      or to add handy usability features such as setting parameters with
      `-p` (useful for debugging &amp; testing). So, besides the
      examples given or linked in the proposal itself, there are not
      many more use cases.<br>

    </p>

    <p>Our intention was also to *start* a discussion with the proposal
      in the first place, but so far the discussion has mainly focused on
      one aspect. One lesson learned from this is to split proposals into
      smaller parts instead of grouping too much into one.

    </p>

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">In addition to that, to make the setup of IntelMQ easier, the

defaults.conf should be dropped. Default values should be set in the

Bot classes respectively in the IntelMQ process managers, but there

is no need for a separate file.

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

The default.conf seems to be used to offer a single place to change

options shared by many bots (e.g. http_user_agent) at once.

If options exist where a common value for a single installation

and their bots is useful the functionality has to be kept somewhere

central. 

I understood the new place for this would be in a global configuration file,

which contains what default.conf had. This would just be a renaming if there 

weren't other things in the file.</pre>

    </blockquote>

    It's more than renaming, it's also a cleanup. As the IntelMQ-default

    values go into the code, that file (or section in a file) only needs

    to carry those default values which are set by the administrator and

    differ from IntelMQ's defaults. So the defaults files of most
    installations can either be dropped or will shrink significantly.
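    <p>A minimal sketch of this idea, not the actual implementation: the
      class name SketchBot, the parameter names and the default values
      below are assumptions chosen for illustration. The defaults live in
      the class, and the administrator's file only lists what differs.</p>
    <pre>
```python
# Hypothetical sketch: built-in defaults live in the bot class, the
# administrator's file only carries values that differ from them.


class SketchBot:
    # Defaults shipped with the code (illustrative names and values).
    http_user_agent = "IntelMQ"
    http_timeout_sec = 30
    rate_limit = 0

    def __init__(self, overrides=None):
        # Apply only the values the administrator changed.
        for key, value in (overrides or {}).items():
            if not hasattr(type(self), key):
                raise ValueError("Unknown parameter: " + key)
            setattr(self, key, value)


# The administrator's file would only need this single entry.
admin_overrides = {"http_timeout_sec": 60}
bot = SketchBot(admin_overrides)
print(bot.http_user_agent, bot.http_timeout_sec, bot.rate_limit)
```
</pre>
    <p>With this layout, an installation that is happy with all built-in
      defaults would not need a defaults file at all.</p>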

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">Another question is, if every bot should have their own

configuration file. 

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

What would be the use case for this?

 #552 packaging does not mandate this, if general default

 values are in the source code of bots. (It would mandate it,

 if bots had to come with an example config file to be useful.)</pre>

    </blockquote>

    <p>The question/proposal is based on a use-case identified by the

      requirements collection:<br>

    </p>

    <p><a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/blob/version-3.0-ideas/docs/architecture-3.0.md#user-content-configuration-files">https://github.com/certtools/intelmq/blob/version-3.0-ideas/docs/architecture-3.0.md#user-content-configuration-files</a></p>

    <p>&gt; be on a per-program-basis (one config file per "bot"). The

      config files per program shall reside in $base/etc/config.d/ and

      follow the common linux standards.</p>

    <p>The proposal to use the -c parameter for this covers the

      use-case, but is more generic. For example it can be handy for

      Docker-setups as well, as described in the initial mail.

    </p>
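    <p>How `-c` and `-p` could fit together, as a rough sketch only. The
      long option names, the KEY=VALUE syntax for `-p`, the file layout
      (a mapping keyed by bot ID) and the helper names parse_args and
      load_parameters are assumptions, not part of the proposal. File
      reading is passed in as a function so that the example runs without
      touching the filesystem.</p>
    <pre>
```python
import argparse


def parse_args(argv):
    # Hypothetical command line for starting a single bot.
    parser = argparse.ArgumentParser(description="Hypothetical bot runner")
    parser.add_argument("bot_id")
    parser.add_argument("-c", "--config", action="append", default=[],
                        help="configuration file to load (repeatable)")
    parser.add_argument("-p", "--param", action="append", default=[],
                        metavar="KEY=VALUE",
                        help="set a single parameter (repeatable)")
    return parser.parse_args(argv)


def load_parameters(args, read_file):
    # read_file(path) must return a mapping of bot ID to parameters.
    params = {}
    for path in args.config:
        # Files given later on the command line win over earlier ones.
        params.update(read_file(path).get(args.bot_id, {}))
    for item in args.param:
        # Values given with -p win over all files; they stay strings here.
        key, _, value = item.partition("=")
        params[key] = value
    return params


# Stand-in for real files, to keep the example self-contained.
fake_files = {
    "runtime.yaml": {
        "shodan1": {"rate_limit": 10, "http_proxy": "http://myproxy.tld:80"},
    },
}
args = parse_args(["shodan1", "-c", "runtime.yaml", "-p", "rate_limit=0"])
params = load_parameters(args, fake_files.__getitem__)
print(params)
```
</pre>
    <p>A per-bot file in $base/etc/config.d/ would then simply be one of
      the files passed with `-c`.</p>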

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <pre class="moz-quote-pre" wrap="">Again one aspect to look for can be what we want to do with the

configuration files. One use case is:

We want to check the whole configuration for consistency. 

For this it makes sense that a lot of stuff is known about

configuration parameters and to me the best way to specify this is

as part of the source code of bots using Python code and type information.

This way even more complex requirements for config values can be expressed 

using python functions and dynamic consistency check could use this code.

Thus the code for a bot specific configuration parameters should be 

close to the bot itself.</pre>

    </blockquote>

    Definitely. We thought about using variable typing for this, but
    haven't done any PoCs yet. See the "Internal handling" section of the
    proposal.
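    <p>Since no PoC exists yet, the following is only one way this could
      look. The class TypedBot, its parameters and the method
      check_parameters are made up for this example. The type annotations
      on the class attributes double as the specification that a
      consistency check can use.</p>
    <pre>
```python
from typing import get_type_hints


class TypedBot:
    # Annotated class attributes: type information plus default value.
    rate_limit: int = 0
    http_verify_cert: bool = True
    http_proxy: str = ""

    def check_parameters(self, parameters):
        """Return a list of problems; the list is empty if all is fine."""
        hints = get_type_hints(type(self))
        problems = []
        for name, value in parameters.items():
            if name not in hints:
                problems.append("unknown parameter " + name)
            elif not isinstance(value, hints[name]):
                problems.append(name + " should be " + hints[name].__name__)
        return problems


print(TypedBot().check_parameters({"rate_limit": "fast", "colour": "red"}))
```
</pre>
    <p>Note that this simple isinstance check accepts a bool where an int
      is expected, because bool is a subclass of int in Python. More
      complex requirements would need per-parameter check functions.</p>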

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <pre class="moz-quote-pre" wrap="">(And if there are parameters they share, it can be in the super class or 

abstract class, coming with IntelMQ (core).)</pre>

    </blockquote>

    For the CollectorBot and ParserBot classes, this is already the

    case. There's more potential, e.g. an HTTPBot class.

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">Some users wish to be able to start a bot 

without having to rely on IntelMQ, 

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

Why? How can a bot with access to the IntelMQ queues be useful?

I can imagine some janitor functionality, like refreshing an external

datasource format from time to time and this needs parameters

that the real bot also needs. Anyhow could be seen as not being the bot 

itself, it would just be shared config values.</pre>

    </blockquote>

    I don't have more details on this use-case. But this use-case is

    covered by the more generic idea to have a -c parameter to load

    configuration files.<br>

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">If we want to support the request to be able to pass individual

configurations to bots,

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

Why would I run a bot that affects the IntelMQ network

to be run with different parameters? I have to make sure to stop the bot with 

the real parameters.</pre>

    </blockquote>

    When running bots interactively for testing and debugging, this
    would be very handy. It's the operator's responsibility to stop the
    bot after starting it with deviating parameters.<br>

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">This individual configuration file would also allow a 

bot to be run in a docker environment without having to set any

environment variables. 

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

The bots would still have to access the commonly set parameters.</pre>

    </blockquote>

    Not if the commonly set parameters are included in that file, or if

    IntelMQ's defaults are ok.<br>

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <pre class="moz-quote-pre" wrap="">Interlude:

  <a class="moz-txt-link-freetext" href="https://12factor.net/config">https://12factor.net/config</a>

believes that using ENVIRONMENT variables would be a good pattern

for running application parts ("apps") in different containers.

Wiring that happens outside of course.

The idea is, if you need a different set of configuration,

just fire up a container with it.

(I am not necessarily convinced of this pattern, leading to this comment

<a class="moz-txt-link-freetext" href="https://github.com/Intevation/intelmq-fody-backend/blob/ad7a88022bdeadf3461ab63ba8b6327013ec8772/tickets_api/tickets_api/serve.py#L90">https://github.com/Intevation/intelmq-fody-backend/blob/ad7a88022bdeadf3461ab63ba8b6327013ec8772/tickets_api/tickets_api/serve.py#L90</a>

)</pre>

    </blockquote>

    This is also the best practice for Docker, leading to this part of

    the proposal:

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">* Every bot also consults the environment and the values that are

   set there overwrite the values in any configuration file

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

Same here.</pre>

    </blockquote>

    The primary use case here is Docker. In Docker, the best practice for
    passing configuration to containers is to use environment
    variables. This approach is partly used by the existing Docker image
    we created.<br>

    For now, we only implemented this for redis_cache_host (<a class="moz-txt-link-freetext" href="https://github.com/certtools/intelmq/blob/develop/intelmq/lib/bot.py#L734-L738">https://github.com/certtools/intelmq/blob/develop/intelmq/lib/bot.py#L734-L738</a>)
    as the bare minimum needed to create the Docker image.<br>
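    <p>Generalised to all parameters, the override could look roughly
      like the sketch below. It is not the code behind the link above:
      the function apply_environment, the INTELMQ_ prefix and the
      upper-casing of parameter names are assumptions for illustration.</p>
    <pre>
```python
import os


def apply_environment(parameters, environ=None, prefix="INTELMQ_"):
    # Hypothetical rule: the variable PREFIX + upper-cased parameter
    # name overrides the value from the configuration file.
    environ = os.environ if environ is None else environ
    result = dict(parameters)
    for name in parameters:
        env_name = prefix + name.upper()
        if env_name in environ:
            # Environment values are always strings; type conversion
            # is left out of this sketch.
            result[name] = environ[env_name]
    return result


file_parameters = {"redis_cache_host": "127.0.0.1", "redis_cache_port": 6379}
container_env = {"INTELMQ_REDIS_CACHE_HOST": "redis"}
effective = apply_environment(file_parameters, container_env)
print(effective)
```
</pre>
    <p>Only parameters that are already known are looked up here, so
      unrelated environment variables cannot introduce new settings.</p>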

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">* There are also configuration files which list settings that are

   not bot specific, i.e. via a reserved key default (successor of

   the defaults.conf file) or group:id, those are also handled like

   other configuration files, but the bot does not compare its name to

   the key of the configuration.

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

So additional default.conf files? (I guess I do not fully understand the 

idea.)</pre>

    </blockquote>

    <p>In order to get rid of the separate defaults.conf file, the

      proposal lists two solutions:</p>

    <p>* the reserved key "default" (or similar). For example, the

      configuration file could look like this:<br>

      ```<br>

      - shodan1:

      <br>

          module: intelmq.bots.collectors.shodan.collector

      <br>

      - mylittlebot23:

      <br>

          module: intelmq.bots.expert.asn_lookup.expert

      <br>

          http:

      <br>

            proxy: <a class="moz-txt-link-freetext"

        href="http://myproxy.tld:80">http://myproxy.tld:80</a><br>

      - default:<br>

        http:<br>

          proxy: <a class="moz-txt-link-freetext" href="http://mydefault.proxy.intern:8080">http://mydefault.proxy.intern:8080</a><br>

      ```<br>

    </p>

    <p>* The other, *additional* solution is group defaults. The

      example given in the proposal is:<br>

      ```<br>

      - group:collectors<br>

        http:<br>

          proxy: <a class="moz-txt-link-freetext" href="http://thirdparty.proxy.tld:9000">http://thirdparty.proxy.tld:9000</a> <br>

      ```</p>

    <p>This would be a new feature and could be handy for, e.g., rate_limit
      or error handling parameters.

    </p>
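    <p>To illustrate how the two kinds of defaults could interact, here
      is a sketch that resolves the effective parameters of a bot from
      the examples above. The precedence order (reserved "default" key,
      then group defaults, then the bot's own settings), deriving the
      group from the module path, and the helper names merge and resolve
      are assumptions; the proposal does not fix these details. The entry
      parser1 is a made-up bot added to show the fallback.</p>
    <pre>
```python
def merge(base, extra):
    # Recursively merge two mappings; values from extra win.
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve(config, bot_id):
    bot = config[bot_id]
    # Assumption: the group is the third part of the module path.
    group = bot["module"].split(".")[2]
    layers = (
        config.get("default", {}),
        config.get("group:" + group, {}),
        bot,
    )
    resolved = {}
    for layer in layers:
        resolved = merge(resolved, layer)
    return resolved


# The examples from above as a Python mapping.
config = {
    "shodan1": {"module": "intelmq.bots.collectors.shodan.collector"},
    "mylittlebot23": {
        "module": "intelmq.bots.expert.asn_lookup.expert",
        "http": {"proxy": "http://myproxy.tld:80"},
    },
    # Hypothetical extra bot without own or group settings.
    "parser1": {"module": "intelmq.bots.parsers.shodan.parser"},
    "default": {"http": {"proxy": "http://mydefault.proxy.intern:8080"}},
    "group:collectors": {"http": {"proxy": "http://thirdparty.proxy.tld:9000"}},
}

for bot_id in ("shodan1", "mylittlebot23", "parser1"):
    print(bot_id, resolve(config, bot_id)["http"]["proxy"])
```
</pre>
    <p>Under these assumptions, shodan1 gets the collectors' proxy,
      mylittlebot23 keeps its own proxy, and parser1 falls back to the
      proxy from the "default" key.</p>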

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">In an ideal setup, the bot should be totally

indifferent as to if it runs in a Docker container, on bare metal,

in a SystemD unit file or with SupervisorD. 

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

I agree in principle.

A potential solution is: the process manager could extract

all the configuration settings and export them all in environment variables.

This way the central configuration files (which were existing in all proposed 

variants) do not have to be shipped to the container, so filesystem access 

would not be mandatory, only access to redis and whatever other resources a 

bot needs.</pre>

    </blockquote>

    That's actually one of the possibilities for deploying each bot in
    its own Docker container, with the central orchestration component
    passing the parameters to the containers. However, this can be addressed
    later.

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <pre class="moz-quote-pre" wrap="">Thinking about this, we could make a redis configuration / control queue

and then bots would only need to connect to the queue system and then request

their current configuration from there. (File that idea in folder *crazy*, it 

is getting close to end of business here. ;) )</pre>

    </blockquote>

    I wouldn't call it crazy, but radical.<br>

    <blockquote type="cite"

      cite="mid:202101221726.58521.bernhard@intevation.de">

      <pre class="moz-quote-pre" wrap="">Overall I've observed much good thinking while reading the storage part of the 

proposal part. The whole problem space does not really segment itself nicely 

in my head up to now, which is a sign that things are more involved than at 

first sight. Hope my mixture of questions and thoughts helps to make it 

better!</pre>

    </blockquote>

    <p>Thank you for all your valuable feedback, insights and thoughts.

      We are very thankful for your detailed responses!</p>

    <p>best regards<br>

      Sebastian<br>

    </p>


    <pre class="moz-signature" cols="72">-- 

// Sebastian Wagner <a class="moz-txt-link-rfc2396E" href="mailto:wagner@cert.at">&lt;wagner@cert.at&gt;</a> - T: +43 1 5056416 7201

// CERT Austria - <a class="moz-txt-link-freetext" href="https://www.cert.at/">https://www.cert.at/</a>

// Eine Initiative der nic.at GmbH - <a class="moz-txt-link-freetext" href="https://www.nic.at/">https://www.nic.at/</a>

// Firmenbuchnummer 172568b, LG Salzburg</pre>

  </body>

</html>