Hi,
Some comments:
(I'm not fully up to date on IntelMQ internals, so I might be off.)
On 21.04.2017 21:00, Navtej Singh wrote:
> With RandomizedDelaySec, systemd will spread the execution over a period, thus preventing this sudden rush for memory. This was very helpful.
I would be wary of relying on randomization. Random numbers have the property that every now and then they all come out identical.
So I'd consider this a CPU load-distribution measure rather than a fix for the RAM usage.
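For reference, the quoted suggestion corresponds to a systemd timer roughly like the following; the unit name and interval here are made up for illustration, only the RandomizedDelaySec= option is the actual mechanism being discussed:

```ini
# Hypothetical /etc/systemd/system/intelmq-collector.timer
[Unit]
Description=Run an IntelMQ collector periodically

[Timer]
OnCalendar=hourly
# Each activation is delayed by a random amount up to 15 minutes,
# so several collectors do not all start (and allocate RAM) at once.
RandomizedDelaySec=900

[Install]
WantedBy=timers.target
```

Note that the delay is re-randomized per activation, which is exactly why it smooths load on average but gives no hard guarantee for any single run.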
> Another thing which might be worth discussing: collectors should have a flag to save the collected input to a file, and the parser could then pick it up from either the queue or the file. This will help in cases where the input size is relatively large, e.g. blueliv or alienvault (subscribed to a lot of pulses; reminds me I need to submit a PR for this enhancement). Maybe some enhancements to the fileinput/fileoutput bots could do that; I haven't really explored it, but an integrated approach would be much better, imo.
IMHO there are multiple issues:
a) how to pass huge amounts of data between bots
b) how to process larger data-sets
Ad a)
Yes, passing a reference to a file (a filename?) instead of the content of the file is one option. It may well be that a different message-passing backend (e.g. RabbitMQ) would also solve the issue.
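A minimal sketch of the file-reference idea, with all names (spool directory, message keys, helper functions) invented for illustration, not taken from IntelMQ:

```python
import json
import os
import tempfile

# Stand-in for a configured spool directory shared by collector and parser.
SPOOL_DIR = tempfile.mkdtemp(prefix="intelmq-spool-")

def enqueue_large_report(raw_bytes: bytes) -> str:
    """Collector side: write the payload to disk and return a small
    JSON message that carries only a reference (the file path)."""
    fd, path = tempfile.mkstemp(dir=SPOOL_DIR, suffix=".raw")
    with os.fdopen(fd, "wb") as f:
        f.write(raw_bytes)
    return json.dumps({"payload_path": path})

def dequeue_large_report(message: str) -> bytes:
    """Parser side: resolve the reference and read the file; the queue
    itself never holds more than the few bytes of the path."""
    path = json.loads(message)["payload_path"]
    with open(path, "rb") as f:
        data = f.read()
    os.unlink(path)  # whoever consumes the reference cleans up
    return data

msg = enqueue_large_report(b"a" * 1024)
assert len(msg) < 200  # the queued message stays tiny regardless of payload size
```

The open questions this sketch glosses over are exactly the hard ones: who deletes the file if the parser crashes, and how the spool directory is shared across hosts.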
Ad b)
IMHO much more tricky is the issue of actually processing huge data-sets. Once you reach file sizes in the GB range, one needs to switch from "load everything into a data-structure in RAM, then process it" to "load the next few KB from a data-stream, process it, then fetch the next slice".
My worry is that the current bot API cannot be easily converted to stream processing.
We need to think this through.
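To make the contrast concrete, a stream-processing loop along these lines (a generic sketch, not IntelMQ's bot API) keeps only one chunk in RAM at a time:

```python
import io

def process_stream(stream, chunk_size=64 * 1024):
    """Consume a feed incrementally: read a slice, split off complete
    lines, process them, keep only the trailing partial line buffered."""
    line_count = 0
    buffer = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # All complete lines go to processing; the last (possibly
        # partial) piece stays in the buffer for the next iteration.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            line_count += 1  # real per-event parsing would happen here
    if buffer:
        line_count += 1  # final line without trailing newline
    return line_count
```

The point of the sketch is the shape of the loop: peak memory is bounded by chunk_size plus one line, independent of total feed size, which is what the current "receive whole report, parse whole report" bot API cannot easily express.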
> a. Replace redis as queue with something persistent. At present redis uses a lot of memory, since it keeps the events in memory. If your feeds receive data frequently and the chain contains a slow expert, the queue size keeps growing, and so does redis's memory usage.
Yes.
> b. Multiple events processed by a single bot. This has been discussed a lot in issues and on the mailing lists. I have an implementation using gevents[2]. There are problems with this, but those are trade-offs I am fine with. c & d might help to resolve these issues.
Yes, some experts would be a **lot** more efficient if they can do bulk processing.
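A sketch of what "bulk processing" in an expert could look like; the class and method names are hypothetical, and the batch lookup is a placeholder for e.g. one bulk DNS or database query covering the whole batch:

```python
class BulkExpert:
    """Hypothetical bulk-capable expert: buffers incoming events and
    enriches them in batches instead of one query per event."""

    def __init__(self, batch_size=100):
        self.batch_size = batch_size
        self.buffer = []
        self.processed = []

    def lookup_batch(self, events):
        # Placeholder for a single round-trip covering all buffered
        # events (one bulk DNS query, one SQL IN-clause, ...).
        return [{**e, "enriched": True} for e in events]

    def process(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.processed.extend(self.lookup_batch(self.buffer))
            self.buffer = []
```

The efficiency win is that N events cost one round-trip instead of N; the trade-off, which ties into points c and d below, is that you now need per-event IDs to acknowledge the right messages when a batch partially fails.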
> c. Events should have IDs. This will help in acknowledging the correct message in the case of multiprocessing, wrt b.
Yes, but for a different reason: assume several CERTs that run IntelMQ-to-IntelMQ cross-connects. You need a way to avoid building forwarding loops. Persistent IDs can help (analogous to Message-IDs in the Usenet context).
(Btw, on the Usenet analogy: some sort of Path: header would also be helpful, i.e. a list of systems that this event has already passed through.)
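In code, the loop-avoidance idea sketched above could look like this; the field names ("path", "event_id") and the system identifier are invented for illustration, not existing IntelMQ fields:

```python
# Hypothetical identifier for this IntelMQ instance.
OUR_SYSTEM_ID = "cert-a.example"

def should_forward(event: dict) -> bool:
    """Drop events that already passed through us: a forwarding loop,
    detected via the Usenet-style Path header."""
    return OUR_SYSTEM_ID not in event.get("path", [])

def stamp(event: dict) -> dict:
    """Append our system to the path before forwarding, analogous to
    how each Usenet node prepends itself to the Path: header."""
    event.setdefault("path", []).append(OUR_SYSTEM_ID)
    return event

# An event arriving from a peer CERT: forward once, never twice.
event = {"event_id": "abc123@cert-b.example", "path": ["cert-b.example"]}
assert should_forward(event)
stamp(event)
assert not should_forward(event)
```

The persistent event ID handles exact-duplicate detection; the path list additionally catches loops through chains of instances that never saw the ID before.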
otmar