[IntelMQ-dev] Expected behaviour of bot reloading after log rotation
Sebix
sebix at sebix.at
Thu Oct 19 21:19:17 CEST 2023
Hi Mika
To explain the behavior (partly): Signals have an interrupting nature on
connections to redis, sleep()[0], and other things. The interrupt itself
is caught by the method __handle_sighup_signal.
Image a parser bot is processing a (potentially large) report. Then,
logrotate sends the SIGHUP signal. If the parser now reloaded, the Bot
needed to re-initialize, interrupting the processing. There are also
other circumstances when a re-initialization is "sub-optimal".
Thus, handling the signal is postponed until it is safe to do so, for
example, after every process call and after receiving a message from the
pipeline.
This logic can be revisited now. The core received quite a few updates
since the implementation of reloading, and there are possibly more
places in the Bot's code where the reloading can be triggered if necessary.
[0]: That's why e.g. Bot.__sleep, Bot.receive_message contain
while-loops, in case a SIGHUP interrupted them
Sebastian
Institute for Common Good Technology
gemeinnütziger Kulturverein - nonprofit cultural society
https://commongoodtechnology.org/
ZVR 1510673578
On 10/19/23 14:35, Kamil Mankowski via IntelMQ-dev wrote:
> Hey,
>
> There is a bot's flag ("_sighup_delay") controlling weather a bot
> reloads immediate or waits. Looking at the source code, it looks like
> an expected behaviour, see:
>
> https://github.com/certtools/intelmq/blob/d624988693894e71defdd8b18618f7af98be852c/intelmq/lib/bot.py#L289C27-L297
>
>
> and:
>
> https://github.com/certtools/intelmq/blob/d624988693894e71defdd8b18618f7af98be852c/intelmq/lib/bot.py#L713-L719
>
>
> However, I understand your concern: this is okay if the reload happens
> during the one day (logrotate keeps the old log file untouched until
> next run), but it can cause troubles, if there is no message on one day.
>
> Best regards,
> Kamil Mankowski
> CERT.at GmbH
> www.cert.at
>
> On 10/19/23 13:29, Mika Silander wrote:
>> Hi all,
>>
>> Apologies for a lengthy message, but I had a set of monitoring
>> scripts that worked like a charm on a physical machine but when moved
>> to a virtual machine with a newer Ubuntu, I see odd failures in the
>> form of monitoring script processes left hanging on void file
>> handles. I suspect the problem is somehow related to log rotation in
>> connection to bot reloading but I can't figure out how. A few
>> questions at the end.
>>
>> Our setup for reference:
>>
>> Ubuntu 22.04 LTS
>> python 3.10.12
>> intelmq 3.2.2a1
>> intelmq-api 3.1.0-rc1
>> intelmq-manager 3.2.0
>>
>> We have a simple bot chain: Mail Attachment Collector ->
>> Shadowserver Parser -> N expert bots -> 1 output bot.
>>
>> /etc/logrotate.d/intelmq contains:
>>
>> /var/log/intelmq/*.log {
>> su intelmq intelmq
>> daily
>> maxsize 10M
>> rotate 60
>> notifempty
>> compress
>> delaycompress
>> create 644 intelmq intelmq
>> sharedscripts
>> postrotate
>> sudo -u intelmq /usr/bin/intelmqctl --quiet reload
>> endscript
>> }
>>
>> The cron daemon runs run-parts for targets under /etc/cron.daily/
>> at 06:25, including /etc/cron.daily/logrotate. Thus, "intelmqctl
>> --quiet reload" gets also executed after all the intelmq log files
>> have been rotated.
>>
>> What I see is that no bot reloads soon after run-parts has
>> finished. The first bot to perform a reload is the Mail Attachment
>> Collector immediately after midnight. The others remain as is until
>> the Mail Attachment Collector reads the day's first report and
>> forwards it to the Shadowserver Parser bot. The parser bot then
>> performs a reload and forwards the events to the first expert in the
>> chain. Once the first event from the parser trickles through the
>> entire bot chain, every bot in turn first performs a reload and then
>> processes the event. In practise, the logrotate triggered bot reload
>> can thus be delayed almost a full day - our first mail reports tend
>> to arrive during the very early morning hours.
>>
>> My question is, is the above reload behaviour as expected? I found
>> it strange that the Mail Attachment Collector reloads at midnight
>> even though it has processed new mail reports after the "intelmqctl
>> reload" call done by logrotate. I would have expected the Mail
>> Attachment Collector to reload itself when the first report arrives
>> for processing after 06:25, i.e. the first report after the call to
>> intelqmctl reload. And, what controls the timing of the Mail
>> Attachment Collector's reload? All its reloads I see in older log
>> files indicate that the reload is consistently run after midnight
>> irrespective of incoming reports and the timing of the intelmqctl
>> reload command.
>>
>> Thanks again and best regards from a puzzled Mika
>>
>>
>>
>>
>>
>> _______________________________________________
>> IntelMQ-dev mailing list
>> https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev
>> https://intelmq.readthedocs.io/
>
> _______________________________________________
> IntelMQ-dev mailing list
> https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev
> https://intelmq.readthedocs.io/
More information about the IntelMQ-dev
mailing list