[IntelMQ-dev] Preventing lost events when stopping intelmq

Mon Jul 26 09:18:03 CEST 2021

Hi Sebastian, all,

 Back from a short holiday now, thanks for the answer. The reason for my question was not actually related to intelmq itself but to the ticketing system we have behind intelmq. This ticketing system will end up with inconsistent information if intelmq is stopped in the midst of event processing, and I'd like to minimize the likelihood of this happening. People familiar with this ticketing system would rightfully argue the system itself is nothing but an inconsistency, but that's another story.

 Continuing with the idea of using signals: would it be possible to implement a signal handling routine (for a signal other than kill) that cleanly shuts down a bot if it is not processing an event? And if it is processing, sets a flag so that the bot shuts down once processing has finished? Still, if I'm not mistaken, Linux doesn't guarantee the delivery of signals, so even this approach isn't foolproof.
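
 A minimal sketch of the flag idea in Python, just to make the question concrete. Everything in it is an assumption on my side: Bot and DemoBot are hypothetical stand-ins and not IntelMQ's actual Bot class, and SIGUSR1 is an arbitrary choice of signal. The handler only sets a flag, and the loop checks the flag between events, so an event that has been started is always finished.

```python
import signal


class Bot:
    """Hypothetical stand-in, not IntelMQ's actual Bot class."""

    def __init__(self):
        self.stop_requested = False
        self.processed = []

    def on_stop_signal(self, signum, frame):
        # Only set a flag; never abort an event that is in progress.
        self.stop_requested = True

    def process(self, event):
        self.processed.append(event.upper())

    def run(self, events):
        # SIGUSR1 is an arbitrary choice for "stop when idle".
        previous = signal.signal(signal.SIGUSR1, self.on_stop_signal)
        try:
            for event in events:
                if self.stop_requested:  # checked only between events
                    break
                self.process(event)
        finally:
            signal.signal(signal.SIGUSR1, previous)
        return self.processed


class DemoBot(Bot):
    def process(self, event):
        if event == "b":
            # Simulate the stop signal arriving in mid-processing.
            signal.raise_signal(signal.SIGUSR1)
        super().process(event)


def demo():
    # "b" is still completed; "c" is never started.
    return DemoBot().run(["a", "b", "c"])
```

 The sketch does not cover a bot that is blocked waiting for input when the signal arrives; such a bot would additionally have to be woken up from the wait.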

 One can envision other approaches to implement shutdown functionality but they all tend to violate the KISS principle. 

Br, Mika

----- Original Message -----
From: "Sebastian Wagner" <wagner at cert.at>
To: "Mika Silander" <mika.silander at csc.fi>, "intelmq-dev" <intelmq-dev at lists.cert.at>
Sent: Friday, 2 July, 2021 17:24:31
Subject: Re: [IntelMQ-dev] Preventing lost events when stopping intelmq

Hi Mika,

On 7/1/21 3:05 PM, Mika Silander wrote:
> Returning to a similar issue but from a different angle: for maintenance I'd like to be able to cleanly shut down the server running intelmq. Is there a way to guarantee that none of the bots is in a processing state (i.e. processing an event in the process method) before server shutdown? Can "intelmqctl stop" for example stop the bot chain in such a way that none of the bots is in the midst of processing an event?
>  If not, what would be the best approach for achieving this?

I have two answers to offer:

The kill signal is destructive and interrupts syscalls. So after the
reception, the bot cannot just continue where it stopped. As far as I
know it's currently not possible to circumvent this except for
threading, where the main thread receives the signal and then could wait
for the other threads to finish processing. Would be a cool feature :)
Related feature request (my own):
https://github.com/certtools/intelmq/issues/1298
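
A rough sketch of the threading idea in Python, under assumptions: none
of this is existing IntelMQ code, and worker, demo and the use of
SIGTERM are made up for illustration. The main thread receives the
signal and only sets an event; the worker finishes the message it is
currently processing and then exits, and the main thread waits for it
with join().

```python
import queue
import signal
import threading
import time

stop = threading.Event()


def worker(inbox, done, started):
    # Hypothetical bot loop running in a worker thread.
    while not stop.is_set():
        try:
            event = inbox.get(timeout=0.05)
        except queue.Empty:
            continue
        started.set()
        time.sleep(0.2)     # simulate slow processing
        done.append(event)  # counted only once fully processed


def demo():
    stop.clear()
    inbox, done, started = queue.Queue(), [], threading.Event()
    for event in ("first", "second"):
        inbox.put(event)
    # Signal handlers always run in the main thread.
    previous = signal.signal(signal.SIGTERM,
                             lambda signum, frame: stop.set())
    thread = threading.Thread(target=worker, args=(inbox, done, started))
    thread.start()
    started.wait()                       # worker is now in mid-processing
    signal.raise_signal(signal.SIGTERM)  # simulate the stop request
    thread.join()                        # wait for the in-flight event
    signal.signal(signal.SIGTERM, previous)
    return done, inbox.qsize()
```

In this sketch "first" is completed although the signal arrives while it
is being processed, and "second" stays untouched in the queue.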

The other answer is: You may simply ignore this. You won't lose any
data, as the message on the input side is only deleted after the message
has been processed completely and sent to the next queue. But you can
end up with duplicated messages, especially if you kill a parser that is
in the middle of parsing a large report. In principle it could happen
for any bot, if you kill it after it has sent the message and just
before it has acknowledged it, but I consider that very improbable. You
can keep such duplicates from reaching your outputs by placing another
deduplicator just before your output bot(s).
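
To illustrate the send-then-acknowledge order, here is a simplified
Python simulation. It is only a sketch under assumptions: it uses plain
in-memory queues, and run_bot, deduplicate and Killed are hypothetical
names, not the IntelMQ pipeline API or the deduplicator bot's
implementation.

```python
from collections import deque


class Killed(Exception):
    """Simulates the bot being killed."""


def run_bot(source, destination, kill_before_ack=False):
    while source:
        message = source[0]          # peek: message stays in the input queue
        destination.append(message)  # send to the next queue
        if kill_before_ack:
            raise Killed             # killed after send, before acknowledge
        source.popleft()             # acknowledge: only now remove it


def deduplicate(messages):
    # Keep the first occurrence of every message, drop repeats.
    seen, unique = set(), []
    for message in messages:
        if message not in seen:
            seen.add(message)
            unique.append(message)
    return unique


def demo():
    source, destination = deque(["e1", "e2"]), []
    try:
        run_bot(source, destination, kill_before_ack=True)
    except Killed:
        pass
    run_bot(source, destination)  # restart: "e1" is processed again
    return destination, deduplicate(destination)
```

After the simulated kill and restart nothing is lost, but "e1" arrives
twice in the destination; the deduplication step removes the repeat.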

I assume these are not the answers you were looking for and hope they
don't spoil your mood just before the weekend =)

best regards
Sebastian

-- 
// Sebastian Wagner <wagner at cert.at> - T: +43 676 898 298 7201
// CERT Austria - https://www.cert.at/
// Eine Initiative der nic.at GmbH - https://www.nic.at/
// Firmenbuchnummer 172568b, LG Salzburg