Dear Mika,
Sorry for the late response. I saw your mail, but put off answering and then forgot about it...
On 2/2/21 1:14 PM, Mika Silander wrote:
Trying to assess what safeguards are sufficient: what happens when a bot has some internal failure and it "dies"?
IntelMQ has internal error handling, so an exception thrown in, for example, the bot's process() method does not cause the bot to die. Documentation on this can be found at https://intelmq.readthedocs.io/en/latest/user/configuration-management.html#...
Please let us know if information is missing there so we can improve it.
Will intelmq restart the bot automatically or will it be up to the admin of intelmq to manually restart it?
Currently there is no such automatic restart by default. IntelMQ does not yet have a watcher/supervising daemon of its own, but we have
- integration into supervisord: https://intelmq.readthedocs.io/en/latest/user/configuration-management.html#...
- and a script to generate systemd service files for bots: https://github.com/certtools/intelmq/tree/develop/contrib/systemd (which, as I am reminded just now, is really badly documented)
And if automatic restarts is the norm, how could one stop the bot from processing new incoming messages if say, X consecutive failures like these have happened within the time frame of the last 5 minutes?
The error handling takes care of that. By default, the bot tries to process a message up to three times, then gives up on it, dumps it to disk for later inspection by the administrator, and continues with the next message. The erroneous message is removed from the queue.
For parsers you can lower the error_max_retries parameter: they don't depend on external resources, so temporary failures can't happen and retrying won't help. For experts that make external lookups, retries are perfectly fine.
For more information on the dumping functionality and how to process these dumps, see https://intelmq.readthedocs.io/en/latest/user/configuration-management.html#...
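To illustrate the retry-then-dump behaviour described above, here is a minimal Python sketch. It is not IntelMQ's actual code: the function run_with_retries, the parse function, the dump file name and the JSON-lines dump layout are all made up for this example, and only the parameter name error_max_retries comes from IntelMQ. The sketch treats error_max_retries as the total number of attempts, which is an assumption.

```python
import json
import tempfile
from pathlib import Path


def run_with_retries(messages, process, dump_path, error_max_retries=3):
    """Try each message up to error_max_retries times; dump it if all attempts fail.

    Hypothetical helper for illustration only, not part of IntelMQ.
    """
    forwarded = []
    for message in messages:
        last_error = None
        for _attempt in range(error_max_retries):
            try:
                forwarded.append(process(message))
                break  # success: continue with the next message
            except Exception as exc:  # an error in process() does not end the loop
                last_error = repr(exc)
        else:
            # All attempts failed: write the message to disk and do not forward it.
            with open(dump_path, "a", encoding="utf-8") as handle:
                handle.write(json.dumps({"message": message, "error": last_error}) + "\n")
    return forwarded


def parse(message):
    # Hypothetical parser that rejects malformed input.
    if "ip" not in message:
        raise ValueError("missing ip")
    return {"source.ip": message["ip"]}


# Assumed example input: one malformed message between two valid ones.
dump_file = Path(tempfile.mkdtemp()) / "example-bot.dump"
incoming = [{"ip": "192.0.2.1"}, {"bad": 1}, {"ip": "192.0.2.2"}]
result = run_with_retries(incoming, parse, dump_file)
dumped = [json.loads(line) for line in dump_file.read_text(encoding="utf-8").splitlines()]
```

In this sketch the malformed message ends up only in the dump file, while the two valid messages are forwarded; the loop itself keeps running throughout.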
By writing some log entries at bot startup and then making the bot itself analyze the log at every restart?
I'm trying to make sure a burst of erroneous/malformed events are not accidentally forwarded by a malfunctioning or partially functioning bot.
That won't happen unless you explicitly configure IntelMQ to do so.
Hope that helps. If it doesn't, don't hesitate to ask :)
Best regards,
Sebastian