Hi, since upgrading to version 1.1.0, the stability of bots has become quite a big problem. Often a bot encounters an exception and logs that it has stopped, or it encounters an exception and logs nothing further, yet the status check reports that the bot is not running. I'm using the 'error_procedure' parameter set to 'pass' (with error_max_retries and error_retry_delay set to 0), and I've always thought of this as a sort of 'run forever' setting: even when an exception occurs, the bot keeps doing its job. I'm running intelmq in a Docker environment with Ubuntu 18.04 as the base image.
Sincerely, Václav Brůžek
Hi,
I'm not aware of any such problems so far. Have you made any custom modifications to the code? If so, which?
Sebastian
Hi, no, there are no modifications to the intelmq code. The problem occurs with my custom bots as well as the default ones. An example of this behaviour: recently the Redis broker was unavailable for some time, and as a result almost all bots stopped without any log message indicating that they had stopped.
Sincerely, Václav Brůžek
--
// Sebastian Wagner <wagner@cert.at> - T: +43 1 5056416 7201
// CERT Austria - https://www.cert.at/
// An initiative of nic.at GmbH - https://www.nic.at/
// Company register number 172568b, LG Salzburg
Hi Václav,
I can't estimate the implications of the docker usage on redis and intelmq.
Concerning the redis problem: There were no changes in the code handling redis problems, and the only case in which intelmq's bots do not log anything is when there are not enough resources (memory, disk) to shut down cleanly. Even then, there is output on stdout. You could log stdout and see whether any errors are shown at the end.
Concerning the error handling and sudden stops: There haven't been any code changes here either. Do you use the scheduled run mode? If the error_procedure is pass and there are pipeline problems, the bot stops (in bot.py search for "error_procedure: pass and pipeline problem"). AFAIR the reasoning for this was/is that if the bot did not stop, the pipeline would be more or less DoS'ed. But as problems with memory and snapshots in redis are handled better now, that could be relaxed. I'll do some experiments.
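To make the distinction above concrete, here is a minimal Python sketch of a bot loop that treats message-processing errors and pipeline errors differently when the error procedure is 'pass'. It is a simplified model of the behaviour described in this message, not intelmq's actual bot.py; `PipelineError`, `run_bot` and `demo_process` are hypothetical names used only for illustration.

```python
class PipelineError(Exception):
    """Hypothetical stand-in for a broker (e.g. Redis) failure."""


def run_bot(messages, process):
    """Simplified model of error_procedure 'pass'; not intelmq code.

    Returns a tuple (handled, skipped, stopped).
    """
    handled, skipped = [], []
    for message in messages:
        try:
            process(message)
        except PipelineError:
            # Pipeline problem: the bot stops even though the procedure is 'pass'.
            return handled, skipped, True
        except Exception:
            # Message-specific problem: skip the message and keep running.
            skipped.append(message)
            continue
        handled.append(message)
    return handled, skipped, False


def demo_process(message):
    """Hypothetical processing step that fails for two special messages."""
    if message == "bad":
        raise ValueError("cannot handle this message")
    if message == "broker-down":
        raise PipelineError("broker unavailable")


result = run_bot(["ok", "bad", "ok2", "broker-down", "never-seen"], demo_process)
print(result)
```

In this model the message "bad" is skipped and processing continues, while "broker-down" ends the loop, so "never-seen" is never processed.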
Concerning "encounters an exception and logs nothing but status check reports that the bot is not running": How do you know that the bot encountered an exception if nothing is logged? Is the bot then still running or not?
Sebastian
Hi Sebastian, I've attached the Redis exception. In that case I would expect the bot to keep trying to connect to Redis rather than give up and exit.
I use continuous run mode for all bots.
I've also extracted an example of the other behaviour, namely exiting without logging that the bot stopped. That is indeed what I meant (your last point): the bot logs the exception but never logs the line "Bot stopped", and then it stops, which is what the status check reports.

2018-10-02 02:43:13,744 - output - ERROR - Bot has found a problem.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/intelmq-1.1.0-py3.6.egg/intelmq/lib/bot.py", line 167, in start
    self.process()
  File "/usr/local/lib/python3.6/dist-packages/intelmq-1.1.0-py3.6.egg/intelmq/bots/outputs/bot/output.py", line 67, in process
    status = self.db_check()
  File "/usr/local/lib/python3.6/dist-packages/intelmq-1.1.0-py3.6.egg/intelmq/bots/outputs/bot/output.py", line 53, in db_check
    payload = self.connection_blacklist.get(key).decode("utf-8", errors="ignore")
AttributeError: 'NoneType' object has no attribute 'decode'
2018-10-02 02:43:13,744 - output - INFO - Current Message(event): {"some event"}.
2018-10-02 02:43:13,745 - output - INFO - Bot will continue in 0 seconds.
2018-10-02 02:43:35,997 - whitelist-output - ERROR - Bot has found a problem.
Traceback (most recent call last):
AttributeError: 'NoneType' object has no attribute 'decode'
2018-10-02 02:43:35,998 - whitelist-output - INFO - Current Message(event): {'feed.accuracy': 100.0, 'feed.name': 'whalebone', 'feed.url': 'http://wb-whitelist.azurewebsites.net/whitelist.txt', 'time.observation': '2018-10-02T02:41:58+00:00', 'source.fqdn': 'com.bd', 'raw': 'Y29tLmJkDQo='}.
2018-10-02 02:43:35,998 - whitelist-output - INFO - Dumping message from pipeline to dump file.
Sincerely, Václav Brůžek
Sorry, I accidentally sent the mail early, without the exception and the attachment. Here is the last line of the log that I wanted to add.
2018-10-08 07:24:01,332 - whitelist-output - INFO - WhitelistOutputBot initialized with id whitelist-output and intelmq 1.1.0 and python 3.6.5 (default, Apr 1 2018, 05:46:30) as process 1150.
Sincerely, Václav Brůžek
I have run into this scenario. My guess is that your redis is running low on memory. Try increasing the memory.
--
List settings: https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-users
Hi,
The traceback in the email shows an exception in a custom bot. Without the code, it's hard to say what's going on.
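One thing the traceback does show is that `.get(key)` returned `None` before `.decode()` was called on it. If `connection_blacklist` is a redis-py client (an assumption, since the custom bot's code is not available), `get` returns `None` for a key that does not exist, and a guard along the following lines may avoid the AttributeError. `lookup_payload` and `FakeConnection` are hypothetical names for this sketch.

```python
def lookup_payload(connection, key):
    """Return the decoded value stored under key, or None if the key is missing."""
    raw = connection.get(key)
    if raw is None:
        # Missing key: report it to the caller instead of raising AttributeError.
        return None
    return raw.decode("utf-8", errors="ignore")


class FakeConnection:
    """Hypothetical in-memory stand-in so the sketch runs without a server."""

    def __init__(self, data):
        self._data = data

    def get(self, key):
        # Mirrors the "None for a missing key" behaviour assumed above.
        return self._data.get(key)


conn = FakeConnection({"example.com": b"listed"})
print(lookup_payload(conn, "example.com"))
print(lookup_payload(conn, "com.bd"))
```

The caller can then decide what a missing key means for the event, rather than letting the bot's error procedure deal with an unexpected exception.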
The exceptions attached do contain the following redis error message:
redis.exceptions.BusyLoadingError: Redis is loading the dataset in memory
Looks like redis is just starting. In this case we could wait (up to a maximum time) as long as this error occurs and then continue. Still requires someone to implement it. I opened https://github.com/certtools/intelmq/issues/1334 for it.
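The idea of waiting up to a maximum time while this error occurs could look roughly like the sketch below. It is not the change tracked in the issue above, only an illustration; `wait_while_busy`, `FakeBusyError` and `fake_ping` are hypothetical names, and the fake exception stands in for `redis.exceptions.BusyLoadingError` so that the example runs without a Redis server.

```python
import time


def wait_while_busy(ping, busy_exception, max_wait=60.0, interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call ping() until it stops raising busy_exception.

    Returns True once ping() succeeds, or False if max_wait seconds
    have passed first. Any other exception propagates to the caller.
    """
    deadline = clock() + max_wait
    while True:
        try:
            ping()
            return True
        except busy_exception:
            if clock() >= deadline:
                return False
            sleep(interval)


class FakeBusyError(Exception):
    """Stand-in for redis.exceptions.BusyLoadingError in this demo."""


attempts = {"count": 0}


def fake_ping():
    # Fails twice, as if the server were still loading its dataset.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise FakeBusyError("Redis is loading the dataset in memory")


ready = wait_while_busy(fake_ping, FakeBusyError, max_wait=5.0,
                        sleep=lambda seconds: None)
print(ready, attempts["count"])
```

With redis-py, one would presumably pass the client's `ping` method and `redis.exceptions.BusyLoadingError` instead of the fakes used here.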
Sebastian
Hi,
On 11/10/2018 13.30, Sebastian Wagner wrote:
The exceptions attached do contain the following redis error message:
redis.exceptions.BusyLoadingError: Redis is loading the dataset in memory
Looks like redis is just starting. In this case we could wait (up to a maximum time) as long as this error occurs and then continue. Still requires someone to implement it. I opened https://github.com/certtools/intelmq/issues/1334 for it.
I implemented this and it will be part of the next release.
Hi,
On 10/10/2018 15.36, Sebastian Wagner wrote:
If the error_procedure is pass and there are pipeline problems, the bot stops (in bot.py search for "error_procedure: pass and pipeline problem").
Which behavior would you (all readers) expect in this case, after the maximum configured retries are reached?
Some possibilities:
- Retrying forever. (Comment: The procedure is pass, not retry)
- Dumping the message. (Comment: The error is most probably not specific to the message)
- Stopping. (Current behavior)
Sebastian
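The three possibilities in this message can be sketched as a small policy switch. This is only a model for discussion, not intelmq code: `OnExhausted`, `send_with_policy` and `make_flaky_send` are hypothetical names, Python's built-in `ConnectionError` stands in for a pipeline problem, and there is no delay between retries.

```python
from enum import Enum


class OnExhausted(Enum):
    """What to do once the configured number of retries has failed."""
    RETRY = "retry forever"
    DUMP = "dump the message"
    STOP = "stop the bot"


def send_with_policy(send, message, max_retries, policy):
    """Try send(message) and apply policy once max_retries retries have failed.

    Returns 'sent', 'dumped' or 'stopped'.
    """
    failures = 0
    while True:
        try:
            send(message)
            return "sent"
        except ConnectionError:
            failures += 1
            if failures <= max_retries or policy is OnExhausted.RETRY:
                continue
            return "dumped" if policy is OnExhausted.DUMP else "stopped"


def make_flaky_send(failures_before_success):
    """Build a send() that fails a fixed number of times, then succeeds."""
    state = {"left": failures_before_success}

    def send(message):
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("pipeline unavailable")

    return send


outcomes = {
    policy.name: send_with_policy(make_flaky_send(3), "event",
                                  max_retries=1, policy=policy)
    for policy in OnExhausted
}
print(outcomes)
```

Note that with `RETRY` the loop only ends when the pipeline recovers, which is the trade-off behind the first option.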
Hi, I think the 'pass' behaviour should apply here, and retrying forever is the correct approach. In my case, monitoring the botnet is already an issue, and bots stopping abruptly are a big headache.
Sincerely, Václav Brůžek