[IntelMQ-dev] Preventing lost events when stopping intelmq

Mika Silander mika.silander at csc.fi
Fri Jul 30 11:22:01 CEST 2021


Hi Sebastian, all,

 I didn't mention the name of our ticketing system :-). Anyway, I was referring to stopping the entire intelmq bot infra on a server. I think it is better not to require the admins to remember to stop one specific bot before server maintenance, reboots etc. The ultimate solution in our case is to switch to a ticketing system that truly supports transactions (in the db sense).

 Having a meeting is fine, although I fear I don't have much to offer apart from the idea of using signals the way I already explained. The cleanest alternative for implementing graceful shutdown could be to have the bots accept a control message ("graceful shutdown") sent by intelmqctl to all bots, but the risk of this turning into over-engineering and/or a KISS violation is high, right?

Br, Mika

 
----- Original Message -----
From: "Sebastian Wagner" <wagner at cert.at>
To: "Mika Silander" <mika.silander at csc.fi>, "intelmq-dev" <intelmq-dev at lists.cert.at>
Sent: Friday, 30 July, 2021 11:17:55
Subject: Re: [IntelMQ-dev] Preventing lost events when stopping intelmq

Hi Mika,

On 7/26/21 9:18 AM, Mika Silander wrote:
>  Back from short holidays now, thanks for the answer. The reason for my question was not actually related to intelmq but to the ticketing system we have behind intelmq. This ticketing system will end up having inconsistent information if intelmq is stopped in the midst of event processing and I'd like to minimize the likelihood of this happening. People familiar with this ticketing system would rightfully argue the system itself is nothing but an inconsistency, but that's another story.

Are you referring to the interruption of the RT collector alone, or of
the whole IntelMQ instance?

>  Continuing with the idea of using signals: would it be possible to implement a signal handling routine (for a signal other than kill) that cleanly shuts down a bot if it is not processing an event? And if it is processing, set a flag so that once processing is finished, the bot will shut down? Still, if I'm not mistaken, Linux doesn't guarantee the delivery of signals so even this approach isn't foolproof.

I think so, yes. But I'm not an expert with Linux's and Python's signal
handling and I have already misunderstood it in the past.
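
The flag idea from the quoted paragraph can be sketched roughly as
follows. This is only an illustration under assumptions, not IntelMQ
code: GracefulWorker, handle_event and the choice of SIGUSR1 are
hypothetical, and the sketch assumes a POSIX system with the handler
installed from the main thread.

```python
import os
import signal


class GracefulWorker:
    """Toy stand-in for a bot (hypothetical, not IntelMQ's Bot class)."""

    def __init__(self, handle_event, stop_signal=signal.SIGUSR1):
        self.handle_event = handle_event
        self.stop_requested = False
        self.done = []
        # Must be called from the main thread; SIGUSR1 is an arbitrary choice.
        signal.signal(stop_signal, self._request_stop)

    def _request_stop(self, signum, frame):
        # Only record the request; the event in progress is not interrupted.
        self.stop_requested = True

    def run(self, events):
        for event in events:
            # The flag is checked between events, never in the middle of one.
            if self.stop_requested:
                break
            self.done.append(self.handle_event(event))
        return self.done


def demo():
    def handle_event(event):
        if event == "b":
            # Simulate a stop request arriving while "b" is being processed.
            os.kill(os.getpid(), signal.SIGUSR1)
        return event.upper()

    worker = GracefulWorker(handle_event)
    return worker.run(["a", "b", "c"])
```

In demo() the event "b" is still completed after the signal arrives,
and "c" is never started. A real bot would also have to deal with
blocking reads on its input queue, which this sketch ignores.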

See also: https://github.com/certtools/intelmq/issues/1247

We can also do a short meeting on this topic, if you'd like. Is anyone
else also interested in signals/graceful shutdowns?

Sebastian

>  One can envision other approaches to implement shutdown functionality but they all tend to violate the KISS principle. 
>
> Br, Mika
>
> ----- Original Message -----
> From: "Sebastian Wagner" <wagner at cert.at>
> To: "Mika Silander" <mika.silander at csc.fi>, "intelmq-dev" <intelmq-dev at lists.cert.at>
> Sent: Friday, 2 July, 2021 17:24:31
> Subject: Re: [IntelMQ-dev] Preventing lost events when stopping intelmq
>
> Hi Mika,
>
> On 7/1/21 3:05 PM, Mika Silander wrote:
>> Returning to a similar issue but from a different angle: for maintenance I'd like to be able to cleanly shut down the server running intelmq. Is there a way to guarantee that none of the bots is in a processing state (i.e. processing an event in the process method) before server shutdown? Can "intelmqctl stop" for example stop the bot chain in such a way that none of the bots is in the midst of processing an event?
>>  If not, what would be the best approach for achieving this?
> I have two answers to offer:
>
> The kill signal is destructive and interrupts syscalls. So after the
> reception, the bot cannot just continue where it stopped. As far as I
> know it's currently not possible to circumvent this except for
> threading, where the main thread receives the signal and then could wait
> for the other threads to finish processing. Would be a cool feature :)
> Related feature request (of myself):
> https://github.com/certtools/intelmq/issues/1298
>
> The other answer is: You may simply ignore this. You won't lose any
> data, as the message on the input side is only deleted after the message
> is processed completely and sent to the next queue. But you can end up
> with messages being duplicated, especially if you kill a parser which is
> in the middle of parsing a large report. It could happen for all bots in principle,
> if you kill them after they sent the message and just before they
> acknowledged it - but I consider that very improbable. You can prevent
> this by placing another deduplicator just before your output bot(s).
>
> I assume these are not the answers you were looking for and hope they
> don't spoil your mood just before the weekend =)
>
> best regards
> Sebastian
>
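
The behaviour described in the quoted answer above (the input message is
removed only after the result has been sent on, so a kill in between
leads to a duplicate rather than a loss) can be illustrated with a small
sketch. It uses plain in-memory queues; forward_once and deduplicate are
hypothetical helpers for illustration and do not reflect how IntelMQ's
pipeline or its deduplicator bot are actually implemented.

```python
from collections import deque


def forward_once(source, destination, crash_before_ack=False):
    """Send the head of `source` onward, then acknowledge (remove) it."""
    message = source[0]              # peek only, the message stays queued
    destination.append(message)      # "send to the next queue"
    if crash_before_ack:
        return False                 # simulated kill: message never acknowledged
    source.popleft()                 # acknowledge only after a successful send
    return True


def deduplicate(messages):
    """Drop repeated messages while keeping the original order."""
    seen = set()
    unique = []
    for message in messages:
        if message not in seen:
            seen.add(message)
            unique.append(message)
    return unique


def demo():
    source = deque(["e1", "e2"])
    destination = []
    # First attempt is killed after sending but before acknowledging.
    forward_once(source, destination, crash_before_ack=True)
    # After a restart the same message is processed again.
    while source:
        forward_once(source, destination)
    return destination, deduplicate(destination)
```

Here demo() ends with "e1" delivered twice and nothing lost, and the
deduplication step in front of the output removes the repeat.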
-- 
// Sebastian Wagner <wagner at cert.at> - T: +43 676 898 298 7201
// CERT Austria - https://www.cert.at/
// Eine Initiative der nic.at GmbH - https://www.nic.at/
// Firmenbuchnummer 172568b, LG Salzburg

