[IntelMQ-dev] Preventing lost events when stopping intelmq

Sebastian Wagner wagner at cert.at
Fri Jul 30 12:13:51 CEST 2021


Hi,

On 7/30/21 11:22 AM, Mika Silander wrote:
>  I didn't tell the name of our ticketing system :-).
IntelMQ only has a collector for one ticketing system, so I assumed you
are using that one.
> Anyway, I was referring to stopping the entire intelmq bot infra on a server. I think it is better not to require the admins to remember to stop one specific bot before server maintenance/reboots etc. The ultimate solution in our case is to switch to a ticketing system that truly supports transactions (in the db sense).
>
>  Having a meeting is fine, although I fear I haven't very much to offer apart from the idea of using signals the way I already explained. The cleanest other option for implementing graceful shutdown could be to have the bots accept some control message ("graceful shutdown") sent out by intelmqctl to all bots, but the risks of this turning into over-engineering and/or KISS violation are high, right?

That would require some control channel for sending commands. Maybe we
need to rely more on threading anyway, because the main thread catches
all the signals and can handle them gracefully, while worker threads
continue their work uninterrupted. That already exists in our
multi-threading feature, which I implemented a while ago. E.g. the
main thread handles SIGHUP and sets an event for all the instances:
https://github.com/certtools/intelmq/blob/343432f7aea7a59a578762aa961b635a032a5922/intelmq/lib/bot.py#L171
Extending that might mean some cleanup and always running in threading
mode, even if it's not necessary to do so (having only 1 instance). But in
principle, that's already kind of a PoC for such an implementation.
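
To illustrate the idea, here is a minimal sketch. It is not the code in
bot.py; all names (stop_requested, worker, run) are made up for this
example, SIGTERM is chosen arbitrarily, and it assumes a POSIX system.
The main thread catches the signal and only sets a flag, and each worker
checks that flag between events, so an event that is being processed is
always finished first:

```python
import os
import signal
import threading
import time

stop_requested = threading.Event()   # set by the main thread on SIGTERM
started = []                         # events whose processing began
finished = []                        # events whose processing completed


def handle_sigterm(signum, frame):
    # Python runs signal handlers in the main thread only, so the worker
    # threads are not interrupted in the middle of an event.
    stop_requested.set()


def worker(name, delay=0.05):
    # Hypothetical bot instance: it checks the flag only *between* events.
    count = 0
    while not stop_requested.is_set():
        count += 1
        started.append((name, count))
        time.sleep(delay)            # stands in for the real process() call
        finished.append((name, count))


def run(num_workers=2):
    stop_requested.clear()
    previous = signal.signal(signal.SIGTERM, handle_sigterm)
    threads = [threading.Thread(target=worker, args=(f"instance-{i}",))
               for i in range(num_workers)]
    try:
        for thread in threads:
            thread.start()
        # The main thread does no event processing; it only waits for the flag.
        while not stop_requested.wait(timeout=0.1):
            pass
        # Graceful part: wait until every worker has finished its current event.
        for thread in threads:
            thread.join()
    finally:
        signal.signal(signal.SIGTERM, previous)
    return threads


if __name__ == "__main__":
    # Simulate an external stop request by sending SIGTERM to ourselves.
    threading.Timer(0.3, os.kill, args=(os.getpid(), signal.SIGTERM)).start()
    run()
    print(f"{len(finished)} events finished, "
          f"none interrupted: {sorted(started) == sorted(finished)}")
```

This only covers signals that can be caught; SIGKILL cannot be handled
this way, and a real bot would also need a timeout for workers that are
stuck in a long-running event.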

best regards
Sebastian

>
> Br, Mika
>
>  
> ----- Original Message -----
> From: "Sebastian Wagner" <wagner at cert.at>
> To: "Mika Silander" <mika.silander at csc.fi>, "intelmq-dev" <intelmq-dev at lists.cert.at>
> Sent: Friday, 30 July, 2021 11:17:55
> Subject: Re: [IntelMQ-dev] Preventing lost events when stopping intelmq
>
> Hi Mika,
>
> On 7/26/21 9:18 AM, Mika Silander wrote:
>>  Back from short holidays now, thanks for the answer. The reason for my question was not actually related to intelmq but to the ticketing system we have behind intelmq. This ticketing system will end up having inconsistent information if intelmq is stopped in the midst of event processing and I'd like to minimize the likelihood of this happening. People familiar with this ticketing system would rightfully argue the system itself is nothing but an inconsistency, but that's another story.
> Do you refer to the interruption of the RT collector alone, or of the
> IntelMQ Instance in total?
>
>>  Continuing with the idea of using signals: would it be possible to implement a signal handling routine (for another signal than kill) that cleanly shuts down a bot if it is not processing an event? And if it is processing, set a flag so that once processing is finished, the bot will shut down? Still, if I'm not mistaken, Linux doesn't guarantee the delivery of signals so even this approach isn't foolproof.
> I think so, yes. But I'm not an expert with Linux' and Python's signal
> handling and I have already misunderstood it in the past.
>
> See also: https://github.com/certtools/intelmq/issues/1247
>
> We can also do a short meeting on this topic, if you'd like. Is anyone
> else also interested in signals/graceful shutdowns?
>
> Sebastian
>
>>  One can envision other approaches to implement shutdown functionality but they all tend to violate the KISS principle. 
>>
>> Br, Mika
>>
>> ----- Original Message -----
>> From: "Sebastian Wagner" <wagner at cert.at>
>> To: "Mika Silander" <mika.silander at csc.fi>, "intelmq-dev" <intelmq-dev at lists.cert.at>
>> Sent: Friday, 2 July, 2021 17:24:31
>> Subject: Re: [IntelMQ-dev] Preventing lost events when stopping intelmq
>>
>> Hi Mika,
>>
>> On 7/1/21 3:05 PM, Mika Silander wrote:
>>> Returning to a similar issue but from a different angle: for maintenance I'd like to be able to cleanly shut down the server running intelmq. Is there a way to guarantee that none of the bots is in a processing state (i.e. processing an event in the process method) before server shutdown? Can "intelmqctl stop" for example stop the bot chain in such a way that none of the bots is in the midst of processing an event?
>>>  If not, what would be the best approach for achieving this?
>> I have two answers to offer:
>>
>> The kill signal is destructive and interrupts syscalls. So after the
>> reception, the bot cannot just continue where it stopped. As far as I
>> know it's currently not possible to circumvent this except for
>> threading, where the main thread receives the signal and then could wait
>> for the other threads to finish processing. Would be a cool feature :)
>> Related feature request (of myself):
>> https://github.com/certtools/intelmq/issues/1298
>>
>> The other answer is: You may simply ignore this. You won't lose any
>> data, as the message on the input side is only deleted after the message
>> is processed completely and sent to the next queue. But you can end up
>> with messages being duplicated, especially if you kill a parser which is
>> currently parsing a large report. It could happen for all bots in principle,
>> if you kill them after they sent the message and just before they
>> acknowledged it - but I consider that very improbable. You can prevent
>> this by placing another deduplicator just before your output bot(s).
>>
>> I assume these are not the answers you were looking for and hope they
>> don't spoil your mood just before the weekend =)
>>
>> best regards
>> Sebastian
>>
-- 
// Sebastian Wagner <wagner at cert.at> - T: +43 676 898 298 7201
// CERT Austria - https://www.cert.at/
// Eine Initiative der nic.at GmbH - https://www.nic.at/
// Firmenbuchnummer 172568b, LG Salzburg

