[IntelMQ-dev] Expected behaviour of bot reloading after log rotation

Mika Silander mika.silander at csc.fi
Tue Oct 24 10:31:28 CEST 2023


Morning all,

 Ignore my two messages from yesterday. Signaling works fine, the problem in our case is the very long wait between the signaling and when the reload is effectuated. This morning I had a workable interleaving of events such that it was easy to see. From my point of view it would be handy if all bots could handle the reload also if they happen to be idle waiting for new incoming events since that would make the delay between signaling and reloading short. However, even that does leave a small time window where one of our monitoring scripts could be working on void log file handles. So, looks like we have to devise a way to detect those void handles in our monitoring scripts.

Br, Mika

----- Original Message -----
From: "Mika Silander" <mika.silander at csc.fi>
To: "intelmq-dev" <intelmq-dev at lists.cert.at>
Sent: Monday, 23 October, 2023 16:48:45
Subject: Re: [IntelMQ-dev] Expected behaviour of bot reloading after log rotation

... and one small addition to the below: it appears psutil 5.9.0 invokes os.kill() internally, and this latter succeeds, but no signal is delivered.

Br, Mika

----- Original Message -----
From: "Mika Silander" <mika.silander at csc.fi>
To: "intelmq-dev" <intelmq-dev at lists.cert.at>
Sent: Monday, 23 October, 2023 16:37:17
Subject: Re: [IntelMQ-dev] Expected behaviour of bot reloading after log rotation

Hi Sebastian, all,

 Started the week debugging. First I answer my question earlier about "expected [signaling] behaviour" with respect to reloading:

"Yes, all the bots in the configured botnet are sent a SIGHUP signal one-by-one when intelmqctl reload is run."

 Then the bad news: I see SIGHUP successfully sent to the Mail Attachment Collector bot _only_. Why on earth does the signaling of the other bots fail, I don't have a clue. The other bots are sent the same signal, but apparently it does not get delivered. All PIDs were correct. So, the problem must be inside psutil 5.9.0 that offers POSIX signaling.

 If someone is on a

Ubuntu 22.04 LTS
Python 3.10.12
intelmq 3.2.+
psutil 5.9.0 (from python3-psutil apt package)    

 I would appreciate to know what is the outcome of your invocation of (and running as root):

sudo -u intelmq /usr/bin/intelmqctl reload

 and let me know what hits you get with

grep SIGHUP /var/log/intelmq/*.log

 after the reload if any (you may need to adapt the log file paths based on your particular installation of intelmq).

Thank you and br, Mika

 

----- Original Message -----
From: "Sebix" <sebix at sebix.at>
To: "intelmq-dev" <intelmq-dev at lists.cert.at>, "Mika Silander" <mika.silander at csc.fi>
Sent: Thursday, 19 October, 2023 22:19:17
Subject: Re: [IntelMQ-dev] Expected behaviour of bot reloading after log rotation

Hi Mika

To explain the behavior (partly): Signals have an interrupting nature on 
connections to redis, sleep()[0], and other things. The interrupt itself 
is caught by the method __handle_sighup_signal.

Image a parser bot is processing a (potentially large) report. Then, 
logrotate sends the SIGHUP signal. If the parser now reloaded, the Bot 
needed to re-initialize, interrupting the processing. There are also 
other circumstances when a re-initialization is "sub-optimal".

Thus, handling the signal is postponed until it is safe to do so, for 
example, after every process call and after receiving a message from the 
pipeline.

This logic can be revisited now. The core received quite a few updates 
since the implementation of reloading, and there are possibly more 
places in the Bot's code where the reloading can be triggered if necessary.

[0]: That's why e.g. Bot.__sleep, Bot.receive_message contain 
while-loops, in case a SIGHUP interrupted them

Sebastian

Institute for Common Good Technology
gemeinnütziger Kulturverein - nonprofit cultural society
https://commongoodtechnology.org/
ZVR 1510673578

On 10/19/23 14:35, Kamil Mankowski via IntelMQ-dev wrote:
> Hey,
>
> There is a bot's flag ("_sighup_delay") controlling weather a bot 
> reloads immediate or waits. Looking at the source code, it looks like 
> an expected behaviour, see:
>
> https://github.com/certtools/intelmq/blob/d624988693894e71defdd8b18618f7af98be852c/intelmq/lib/bot.py#L289C27-L297 
>
>
> and:
>
> https://github.com/certtools/intelmq/blob/d624988693894e71defdd8b18618f7af98be852c/intelmq/lib/bot.py#L713-L719 
>
>
> However, I understand your concern: this is okay if the reload happens 
> during the one day (logrotate keeps the old log file untouched until 
> next run), but it can cause troubles, if there is no message on one day.
>
> Best regards,
> Kamil Mankowski
> CERT.at GmbH
> www.cert.at
>
> On 10/19/23 13:29, Mika Silander wrote:
>> Hi all,
>>
>>   Apologies for a lengthy message, but I had a set of monitoring 
>> scripts that worked like a charm on a physical machine but when moved 
>> to a virtual machine with a newer Ubuntu, I see odd failures in the 
>> form of monitoring script processes left hanging on void file 
>> handles. I suspect the problem is somehow related to log rotation in 
>> connection to bot reloading but I can't figure out how. A few 
>> questions at the end.
>>
>>   Our setup for reference:
>>
>> Ubuntu 22.04 LTS
>> python 3.10.12
>> intelmq 3.2.2a1
>> intelmq-api 3.1.0-rc1
>> intelmq-manager 3.2.0
>>
>>   We have a simple bot chain: Mail Attachment Collector -> 
>> Shadowserver Parser -> N expert bots -> 1 output bot.
>>
>> /etc/logrotate.d/intelmq contains:
>>
>> /var/log/intelmq/*.log {
>>      su intelmq intelmq
>>      daily
>>      maxsize 10M
>>      rotate 60
>>      notifempty
>>      compress
>>      delaycompress
>>      create 644 intelmq intelmq
>>      sharedscripts
>>      postrotate
>>          sudo -u intelmq /usr/bin/intelmqctl --quiet reload
>>      endscript
>> }
>>
>>   The cron daemon runs run-parts for targets under /etc/cron.daily/ 
>> at 06:25, including /etc/cron.daily/logrotate. Thus, "intelmqctl 
>> --quiet reload" gets also executed after all the intelmq log files 
>> have been rotated.
>>
>>   What I see is that no bot reloads soon after run-parts has 
>> finished. The first bot to perform a reload is the Mail Attachment 
>> Collector immediately after midnight. The others remain as is until 
>> the Mail Attachment Collector reads the day's first report and 
>> forwards it to the Shadowserver Parser bot. The parser bot then 
>> performs a reload and forwards the events to the first expert in the 
>> chain. Once the first event from the parser trickles through the 
>> entire bot chain, every bot in turn first performs a reload and then 
>> processes the event. In practise, the logrotate triggered bot reload 
>> can thus be delayed almost a full day - our first mail reports tend 
>> to arrive during the very early morning hours.
>>
>>   My question is, is the above reload behaviour as expected? I found 
>> it strange that the Mail Attachment Collector reloads at midnight 
>> even though it has processed new mail reports after the "intelmqctl 
>> reload" call done by logrotate. I would have expected the Mail 
>> Attachment Collector to reload itself when the first report arrives 
>> for processing after 06:25, i.e. the first report after the call to 
>> intelqmctl reload. And, what controls the timing of the Mail 
>> Attachment Collector's reload? All its reloads I see in older log 
>> files indicate that the reload is consistently run after midnight 
>> irrespective of incoming reports and the timing of the intelmqctl 
>> reload command.
>>
>>   Thanks again and best regards from a puzzled Mika
>>
>>
>>
>>
>>
>> _______________________________________________
>> IntelMQ-dev mailing list
>> https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev
>> https://intelmq.readthedocs.io/
>
> _______________________________________________
> IntelMQ-dev mailing list
> https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev
> https://intelmq.readthedocs.io/
_______________________________________________
IntelMQ-dev mailing list
https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev
https://intelmq.readthedocs.io/
_______________________________________________
IntelMQ-dev mailing list
https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev
https://intelmq.readthedocs.io/


More information about the IntelMQ-dev mailing list