[Intelmq-users] Bots stopping

Sebastian Wagner wagner at cert.at
Thu Oct 11 13:30:56 CEST 2018


Hi,

The traceback in the email shows an exception in a custom bot. Without
the code, it's hard to say what's going on.
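
One guess, based only on the quoted traceback below: if connection_blacklist is a Redis client, get() returns None for a key that does not exist, and calling .decode() on that None raises exactly this AttributeError. A minimal sketch of a guard follows. The helper name decode_value and the dict used as a stand-in store are hypothetical and not taken from the bot's code.

```python
def decode_value(raw, encoding="utf-8"):
    """Return the decoded string, or None if the lookup found nothing."""
    if raw is None:
        # Missing key: let the caller decide what "not listed" means.
        return None
    return raw.decode(encoding, errors="ignore")


# Stand-in for a Redis-like store: dict.get also returns None for a
# missing key, which is the situation assumed here.
store = {"example.com": b"listed"}

print(decode_value(store.get("example.com")))      # existing key
print(decode_value(store.get("missing.example")))  # missing key, no exception
```

Whether a missing key should be treated as "not blacklisted" or as an error depends on the bot's logic, which I can't judge without the code.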

The exceptions attached do contain the following redis error message:
> redis.exceptions.BusyLoadingError: Redis is loading the dataset in memory

It looks like Redis was still starting up and loading its dataset. In that
case the bot could keep waiting (up to a maximum time) for as long as this
error occurs and then continue. Someone still needs to implement this.
I opened https://github.com/certtools/intelmq/issues/1334 for it.
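
The waiting idea could be sketched roughly like this. It is only an illustration, not code from intelmq: the function name retry_while_loading and its parameters are made up, and in a real bot retry_on would be redis.exceptions.BusyLoadingError.

```python
import time


def retry_while_loading(operation, retry_on, max_wait=60.0, delay=1.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call operation() until it succeeds or max_wait seconds have passed.

    Only exceptions of type retry_on are retried; anything else propagates
    immediately. Once the deadline is reached, the last retry_on exception
    is re-raised.
    """
    deadline = clock() + max_wait
    while True:
        try:
            return operation()
        except retry_on:
            if clock() >= deadline:
                raise  # waited long enough, give up
            sleep(delay)
```

A call might then look like retry_while_loading(client.ping, redis.exceptions.BusyLoadingError), assuming client is a redis-py connection. The sleep and clock parameters exist only so the loop can be tested without real waiting.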

Sebastian

On 11/10/2018 08.47, Vaclav Bruzek wrote:
> Hi Sebastian,
> I've added the Redis exception to the attachment. That is a case where
> I would expect the bot to keep trying to connect to Redis instead of
> giving up and exiting.
>
> I use continuous run mode for all bots. 
>
> I've also extracted an example of the other behaviour, that is,
> exiting without logging that the bot stopped. That is indeed what I
> meant (your last point): the bot logs the exception but doesn't
> log the line "Bot stopped", and then stops, which is what the status
> check is reporting.
>
> 2018-10-02 02:43:13,744 - output - ERROR - Bot has found a problem.
> Traceback (most recent call last):
>   File
> "/usr/local/lib/python3.6/dist-packages/intelmq-1.1.0-py3.6.egg/intelmq/lib/bot.py",
> line 167, in start
>     self.process()
>   File
> "/usr/local/lib/python3.6/dist-packages/intelmq-1.1.0-py3.6.egg/intelmq/bots/outputs/bot/output.py",
> line 67, in process
>     status = self.db_check()
>   File
> "/usr/local/lib/python3.6/dist-packages/intelmq-1.1.0-py3.6.egg/intelmq/bots/outputs/bot/output.py",
> line 53, in db_check
>     payload = self.connection_blacklist.get(key).decode("utf-8",
> errors="ignore")
> AttributeError: 'NoneType' object has no attribute 'decode'
> 2018-10-02 02:43:13,744 - output - INFO - Current Message(event):
> {"some event"}.
> 2018-10-02 02:43:13,745 - output - INFO - Bot will continue in 0 seconds.
> 2018-10-02 02:43:35,997 - whitelist-output - ERROR - Bot has found a
> problem.
> Traceback (most recent call last):
>
> AttributeError: 'NoneType' object has no attribute 'decode'
> 2018-10-02 02:43:35,998 - whitelist-output - INFO - Current
> Message(event): {'feed.accuracy': 100.0, 'feed.name': 'whalebone',
> 'feed.url': 'http://wb-whitelist.azurewebsites.net/whitelist.txt',
> 'time.observation': '2018-10-02T02:41:58+00:00', 'source.fqdn':
> 'com.bd', 'raw': 'Y29tLmJkDQo='}.
> 2018-10-02 02:43:35,998 - whitelist-output - INFO - Dumping message
> from pipeline to dump file.
>
>
> Sincerely,
> Václav Brůžek
>
>
> On Wed, 10 Oct 2018 at 15:36, Sebastian Wagner <wagner at cert.at> wrote:
>
>     Hi Václav,
>
>     I can't estimate the implications of the docker usage on redis and
>     intelmq.
>
>     Concerning the redis problem: There were no changes in the code
>     handling redis problems and the only case when intelmq's bots do
>     not log anything is when there are not enough resources to
>     shutdown cleanly (memory, disk). Even then, there's output on
>     stdout. You could log stdout and see if there are any errors shown
>     at the end.
>
>     Concerning the error handling and sudden stops: There haven't been
>     code changes there either. Do you use the scheduled run mode? If the
>     error_procedure is pass and there are pipeline problems, the bot
>     stops (in bot.py search for "error_procedure: pass and pipeline
>     problem"). AFAIR the reasoning for this was/is that if the bot
>     would not stop, the pipeline would be kind of DOS'ed. But as
>     problems with memory and snapshots in redis are handled better
>     now, that could be relaxed. I'll do some experiments.
>
>     Concerning "encounters an exception and logs nothing but status
>     check reports that the bot is not running": How do you know that
>     the bot encountered an exception if nothing is logged? Is the bot
>     then still running or not?
>
>     Sebastian
>
>     On 09/10/2018 12.58, Vaclav Bruzek wrote:
>>     Hi,
>>     no, there are no modifications to the intelmq code. The situation
>>     occurs with my custom bots as well as the default ones. As an
>>     example of this behaviour: recently the Redis broker wasn't available
>>     for some time, and as a result almost all bots stopped without any
>>     log message indicating that the bot stopped.
>>
>>     Sincerely,
>>     Václav Brůžek
>>
>>
>>     On Tue, 9 Oct 2018 at 12:01, Sebastian Wagner <wagner at cert.at> wrote:
>>
>>         Hi,
>>
>>         I don't know of any such problems yet. Do you use any custom
>>         modifications in the code? If yes, which?
>>
>>         Sebastian
>>
>>         On 09/10/2018 10.42, Vaclav Bruzek wrote:
>>>         Hi,
>>>         since upgrading to version 1.1.0, the stability of bots has
>>>         become quite a big problem. It often happens that a bot
>>>         encounters an exception and logs that the bot is stopped, or
>>>         encounters an exception and logs nothing but the status check
>>>         reports that the bot is not running. I'm using the
>>>         'error_procedure' parameter set to 'pass'
>>>         (with error_max_retries and error_retry_delay set to 0) and
>>>         I've always thought that this is a sort of 'run forever'
>>>         parameter, so that even when an exception occurs the bot will
>>>         keep on doing its job. I'm using intelmq in a Docker
>>>         environment with Ubuntu 18.04 as the base.
>>>
>>>         Sincerely,
>>>         Václav Brůžek
>>>
>>         -- 
>>         // Sebastian Wagner <wagner at cert.at> - T: +43 1 5056416 7201
>>         // CERT Austria - https://www.cert.at/
>>         // Eine Initiative der nic.at GmbH - https://www.nic.at/
>>         // Firmenbuchnummer 172568b, LG Salzburg
>>
>
-- 
// Sebastian Wagner <wagner at cert.at> - T: +43 1 5056416 7201
// CERT Austria - https://www.cert.at/
// Eine Initiative der nic.at GmbH - https://www.nic.at/
// Firmenbuchnummer 172568b, LG Salzburg
