[Intelmq-dev] Proposal (Request For Comments) - IntelMQ with Run Modes & Process Management

Navtej Singh nsbuttar at hotmail.com
Fri May 5 19:06:59 CEST 2017


> Hi Navtej,
> 
> Am Freitag 21 April 2017 21:00:11 schrieb Navtej Singh:
> > I would like to share some insights from working with intelmq with roughly
> > 70 feeds. I have frequently run into these problems and tried to solve
> > these on my own.
> 
> thanks for adding your experiences and approaches.
> I believe in coming up with a number of ideas, trying some and then finding a
> good solution, so it is good to see your approaches.
> 
> > There are some concerns if systemd is a right solution. I believe it is.
> > There are some aspects of systemd which are appealing and helpful. Running
> > the bots as intelmq user is a breeze, with User and Group directives.
> > However one of the biggest gains is with RandomizedDelaySec directive.
> 
> If we had a process manager that knows how the bots are wired, it could just
> queue some one time collectors behind each other if the insertion point
> before experts is already loaded. So I don't think this is coupled to systemd
> in particular, though the RandomizedDelaySec sounds interesting for some
> simple use cases.
> 
> > I understand that I am about to expand the discussion here, however I feel
> > it is connected issue. There should be a way to prevent running multiple
> > instances of bot with same id. As I see it, collectors and parsers though
> > different are tightly coupled.
> 
> To me this sounds like a use case that should be considered in this
> discussion. See my other post (a few minutes ago) where I explain why I
> consider this kind of "flow control" relevant with your example.
> 
> > a. Replace redis as queue with something persistent. As present redis uses
> > a lot of memory since it keeps the events in memory. if your feeds are
> > getting data frequently and, in the chain, you have a slow processing
> > expert, queue size keeps growing and so does the redis memory usage.
> 
> I also consider this a "flow control" issue, stop inserting stuff if the
> downstream pipe is full. Which technically could mean that redis has used the
> configured memory.
> 
> > b. multiple events processing by single bot, This has been discussed a lot
> > in issues and mailing lists. I have an implementation using gevents[2].
> > However there are problems with this, those trade-offs I am fine with. c &
> > d might help to resolve these issues.
> 
> Can you point me to a more elaborate outline of the problem?
> (I always thought that a bot can already process several events, but you mean
> per network event?)

I meant that bots should be able to process messages in parallel. At present a
single bot processes messages sequentially. That is too slow for expert bots
which query external services, e.g. gethostbyname. The throughput of such bots
can be increased if the expert bot processes multiple events in parallel.
My implementation of this uses gevent-based green threads.
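As a rough sketch of the idea (using the standard library's concurrent.futures thread pool as a stand-in for gevent green threads; the resolve function and its return value are illustrative, not IntelMQ's actual API):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a slow expert lookup such as gethostbyname;
# in a real bot this would perform a blocking network call.
def resolve(event):
    return {"source.fqdn": event, "source.ip": "192.0.2.1"}

def process_parallel(events, max_workers=5):
    # Process several events concurrently instead of one at a time.
    # With gevent, the pool would be a gevent.pool.Pool instead.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(resolve, events))

results = process_parallel(["a.in", "b.in", "c.in"])
```

The slow external lookups then overlap in time instead of running back to back, which is where the throughput gain comes from.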

> 
> > c. events should have IDs. This will help in acknowledging the correct
> > message in case of multi processing wrt to b.
> 
> My mental model tells me that the information about an abuse sighting is the
> same, it shall be the "same" for intelmq, so an ID wouldn't help. Somehow
> intelmq must record the contents of the "events" and deduplicate anyway.
>

This needs a bit of explanation: the current acknowledge implementation for the
redis back-end uses RPOP. It indiscriminately pops the rightmost event and
acknowledges it. In multiprocessing mode this is undesirable, because threads
can return in a non-linear order. For example, let us assume there are the
following five hostnames to be resolved in the -internal queue and we spawn
five threads, with a.in being the rightmost:
a.in  goes to thread1
b.in  goes to thread2
c.in  goes to thread3
d.in  goes to thread4
e.in  goes to thread5
Now if thread2 returns first, it will end up acknowledging a.in instead of
b.in, and at the end e.in will remain in the -internal queue even though it
was processed successfully.

If we don't want an ID, something else has to identify the correct message to
acknowledge in a multiprocessing environment.
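The mismatch can be demonstrated with a plain list standing in for the redis queue (a minimal sketch; the ID scheme shown is only one possible fix, not an existing IntelMQ mechanism):

```python
import uuid

# Simulate the -internal queue; the rightmost element is what RPOP removes.
internal = ["e.in", "d.in", "c.in", "b.in", "a.in"]

# Blind RPOP: whichever thread finishes first removes the rightmost entry.
finished_first = "b.in"          # thread2 returns before thread1
acked = internal.pop()           # removes "a.in", not "b.in"
assert acked != finished_first   # the wrong message was acknowledged

# With a per-message ID, a worker can acknowledge exactly its own message.
internal = [(uuid.uuid4().hex, h)
            for h in ["e.in", "d.in", "c.in", "b.in", "a.in"]]
done_id, done_host = internal[3]                     # "b.in" finished first
internal = [m for m in internal if m[0] != done_id]  # ack by ID
assert all(h != "b.in" for _, h in internal)
```

With IDs the acknowledge operation becomes independent of queue position, so threads may complete in any order.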

The load_balance option does not scale. I think gevent/asyncio are probably the
best options at present.

> > d. bots should be able to peek at message count in the source queue. This
> > will help with b. as well as the backoff algorithm discussed at other places;
> > IIRC Sebastian proposed it in some GitHub issues. This is really simple; I
> > had written the peek function, however I cannot locate it as of now.
> 
> This sounds like the bots implementing some "flow control" themselves.
> From a design perspective I think the bot shall know and somehow register
> what it wants to do or handle, however the control seems feasible from an
> oversight process from my perspective.
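The peek-plus-backoff idea quoted above could look roughly like this (a sketch only: a plain list stands in for a real LLEN call against redis, and the watermark value and function names are made up for illustration):

```python
# Hypothetical high-watermark; a real deployment would make this configurable.
QUEUE_HIGH_WATERMARK = 1000

def peek(queue):
    # Return the number of pending messages without consuming any.
    # Against redis this would be an LLEN on the destination queue.
    return len(queue)

def should_backoff(destination_queue):
    # A bot, or an oversight process, can pause sending while the
    # downstream queue sits above the watermark.
    return peek(destination_queue) >= QUEUE_HIGH_WATERMARK
```

Whether the bot checks this itself or an oversight process does it on the bot's behalf is exactly the design question raised above.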
> 
> Best Regards,
> Bernhard
> 
> --
> www.intevation.de/~bernhard   +49 541 33 508 3-3
> 
