Dear community,
I invite you to discuss a new IEP (IntelMQ Enhancement Proposal):
IEP007: Running IntelMQ as Python Library
Have you ever wondered if you can write a Python script, call a bot's process method, pass it some data and get back the enriched/modified data? In pseudo code:
    bot_instance = Bot(parameters)
    bot_instance.process_message(input message) -> output messages
Strictly speaking, it *is* possible with the current version, but it requires some bizarre hacks like re-defining Bot's methods and overwriting internal values. The feature has been on the wishlist for quite a while, and we intend to implement it now. I started the IEP007 draft and need some input from you to maximize the benefit for the whole IntelMQ (developer) community:
https://github.com/certtools/ieps/pull/7/files
Or in a readable display:
https://github.com/certtools/ieps/tree/iep-007/007
What features and interfacing options would you expect when starting the bot as a library?
Do you think the `Bot.process` method should be rewritten entirely now, removing the calls to receive_message/send_message and converting the method into a generator (an API-breaking change)? And if yes, should this be done in one step, or separately from this bot-as-library feature, to reduce the complexity of the development steps?
Looking forward to your ideas,
Sebastian
Hi,
Reading through the IEP in question, I thought I would find reasons or motivations as to why having a library is desired. It's possible I've missed discussions or mails and the reasons have been discussed/documented elsewhere.
Having worked hard for more than two years to get IntelMQ up in production, and now only waiting for the required servers to arrive, I'm reluctant to see any major changes to the code base (= my vote). If the implementation you choose for the IEP is the API-breaking generator one, please, if at all possible, consider implementing wrappers, decorators or the like to keep the old process method (and friends) available for some time. This would give bot developers (me included) time to adapt our own bots to the new approach.
Br, Mika
P.S.: IntelMQ was the easier part; most of the two years mentioned above were spent on getting the other interconnected systems and their interfaces working.
----- Original Message ----- From: "Sebix" sebix@sebix.at To: "intelmq-dev" intelmq-dev@lists.cert.at Sent: Monday, 24 April, 2023 19:31:39 Subject: [IntelMQ-dev] RFC on IEP007: Running IntelMQ as Python Library
Dear Mika,
On 4/25/23 11:28 AM, Mika Silander wrote:
> Reading through the IEP in question, I thought I would find reasons or motivations as to why having a library is desired. It's possible I've missed discussions or mails and the reasons have been discussed/documented elsewhere.
Thanks for the input. I'll try to find some references from the 3.0 discussions, when we first discussed this feature request in more detail, or otherwise expand this section on my own.
> If the implementation you choose for IEP is the API-breaking generator one
I think there was a misunderstanding: whether we also want to address this topic is precisely the /question/. It has its pros and cons, the biggest downside being the API-breaking part with all its implications.
You know that it was a long and bumpy road towards IntelMQ 3.0, and accumulating multiple significant development steps in a short period was challenging (OTOH, we had the temporary development capacity at that time to make this leap). I'm fully aware that breaking changes can mean a lot of trouble, and I'm all for small, separated steps (hence this IEP), which allow more straightforward discussion, review and maintenance. When writing PoCs for this IEP, I found a similar area for improvement in IntelMQ which could also be interesting to tackle. It is not up to me but to the /community/ to decide the direction of the IntelMQ roadmap. Hence, I posed this question and asked for comments, and I am very grateful that you are contributing so actively through code and discussion.
best regards Sebastian
Hi Sebastian,
References to the discussions on motivations for the library proposal are welcome, but if and when you find them, IMHO the best location for them is the IEP itself. The document could, for example, start with a "Background" or similar section that tries to answer why this is desirable, what benefits it would bring, and maybe give some examples. A long time ago I wrote a wrapper script that allowed chaining bots on the command line. It did of course not provide all the features outlined in your IEP, but it let me do simple sampling of events and test bots with e.g.
wrapper BotA | wrapper BotB | wrapper BotC ...
so I was struggling to find good use cases to support a library implementation.
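The wrapper script itself is not shown in this thread. As a rough illustration of the chaining idea only, each stage of such a pipe could read one JSON event per line and write the transformed events back out as JSON lines. All names below (`run_filter`, `tag_event`) are hypothetical and not part of IntelMQ; the transform stands in for whatever a real bot would do.

```python
import json
import sys
from typing import Callable, Dict, Iterable, Iterator, TextIO


def run_filter(transform: Callable[[Dict], Iterable[Dict]],
               stdin: TextIO = sys.stdin,
               stdout: TextIO = sys.stdout) -> None:
    """Read one JSON event per line, apply transform, write results as JSON lines."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue  # skip blank lines between events
        for result in transform(json.loads(line)):
            stdout.write(json.dumps(result, sort_keys=True) + "\n")


def tag_event(event: Dict) -> Iterator[Dict]:
    """Hypothetical stand-in for a bot: adds one field to each event."""
    yield {**event, "tagged": True}
```

A script that calls `run_filter(tag_event)` at its end could then be placed between two pipes in the same way as the `wrapper` command above.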
As for the API-breaking vs. less intrusive implementation, I think I did understand that these were two separate options, but I found the API-breaking one a bit scary and reacted to that only. And yes, I can imagine it's been a big effort to reach versions 3.0 and 3.1. Had I known how much effort I would have to put into the interconnected systems related to our IntelMQ setup (IntelMQ itself was not that hard), I would never have even started :-). Still, as said, my vote is for the less intrusive implementation option and sticking to KISS, but if it turns out the API-breaking one wins, please consider techniques that could provide some degree of backward compatibility, as I suggested.
I don't think I'm a very active debater on this forum, so it would be nice to read what other developers think. Especially the ones like me who do not form part of the core developers but just need to add some components and features of their own to IntelMQ.
Br, Mika
From: "Sebix" sebix@sebix.at To: "intelmq-dev" intelmq-dev@lists.cert.at Cc: "Mika Silander" mika.silander@csc.fi Sent: Tuesday, 25 April, 2023 13:05:50 Subject: Re: [IntelMQ-dev] RFC on IEP007: Running IntelMQ as Python Library
Hi Mika,
Adding a background/motivation section is a good suggestion; I will add that to the IEP.
I know I discussed this idea with Sebastian (Sebix) years ago already, and it came from the need to easily wrap intelmq functionality and bring it into other tools.
I hear you regarding API breakage. No one wants that and I guess we'll have to think carefully how to achieve this. Good point.
But having intelmq be librarized would actually help a lot for integration into other tools. And these other tools could rely on the hard work of intelmq / parsers mostly to "get it right once and for all". So, in a sense, it would benefit the whole community.
That's my stance on it so far. Hope my comments help to explain the motivation a bit (?)
Best, Aaron.
Addendum:
I think it may also help to clearly explain what Sebix meant by "API-breaking change". I.e., does the API of the Bot class change internally (which can be searched & replaced and refactored across all code, with all developers being informed)? Or... does this also break the (hug/fastapi) API? ;-)
In other words: what does this change really entail for installations out there? @Sebix: would it make sense to clarify?
Best, a.
Hi,
On Tuesday, 25 April 2023, 13:47:56, L. Aaron Kaplan wrote:
> it came from the need to easily wrap intelmq functionality and bring it into other tools.
If there are specific use cases for those envisioned integrations, I would put them in the IEP as well. Detailed problem descriptions make it possible to judge whether the proposed solution, using IntelMQ as a library, is a good fit for each.
(I have often seen major changes being undertaken just for an abstract idea of "reuse" or "integration", which made the situation worse. So when in doubt, it is good to have real use cases to decide if it is worth it.)
My 2 €¢ Bernhard
Dear Mika, dear Aaron,
On 4/25/23 1:43 PM, Mika Silander wrote:
> References to the discussions on motivations for the library proposal are welcome, but when and if you find them, IMHO I think the best location for them is the IEP itself. The document could for example, start with a "Background" or similar section that tries to answer why this is a desired thing, what benefits it would bring and maybe some examples.
I have added a new section with another example and a proposal for how it can/will be used in the IntelMQ Webinput:
https://github.com/certtools/ieps/tree/iep-007/007#user-content-use-cases
The code examples are not yet consistent, as they are still to be discussed and I am working on proofs of concept (i.e. researching which interface options are possible/doable) in parallel.
> A long time ago I wrote a wrapper script that allowed chaining bots on the command line. It did of course not provide all the features that are outlined in your IEP, but it gave me the possibility to simple sampling of events and testing bots with e.g.
> wrapper BotA | wrapper BotB | wrapper BotC ...
Oooh, that's interesting as well! Although not the same as this proposal, it goes in a similar direction.
> What comes to the API-breaking vs. less intrusive implementation, I think I did understand these were two separate options, but the API-breaking I found a bit scary and reacted to that only. And yes, I can imagine it's been a big effort to reach versions 3.0 and 3.1. Had I known all the effort I've had to do put to the interconnected systems related to our IntelMQ setup (IntelMQ itself was not that hard), I would have never even started :-).
Hehe, oh yeah, I feel your pain. Getting systems interconnected, months of fiddling around with various scripts, and setting up proper management, monitoring and other supporting systems around all that, plus docs: lots of work.
> Still, as said, my vote is on the less intrusive implementation option and sticking to KISS, but if it turns out the API-breaking one wins, please consider techniques that could provide some degree of backward compatibility as I suggested.
ACK
> I don't think I'm a very active debater on this forum, so it would be nice to read what other developers think. Especially the ones like me who do not form part of the core developers but just need to add some components and features of their own to IntelMQ.
+1
On 4/25/23 1:52 PM, L. Aaron Kaplan wrote:
> I think maybe it also helps to clearly explain what Sebix meant by "API breaking change"?
The word "API" has two meanings here:
1. The program "IntelMQ API", the manager's interface
2. The application programming interface of IntelMQ (Core) itself, i.e. the methods, their parameters etc.
> I.e. internally the API for the Bot class changes (and can be search&replaced & refactored for all code and all the developers get informed) Or... this also breaks the (hug/fastapi) API ? ;-)
In this case I meant the programming interface of IntelMQ, and more specifically the function signature of Bot.process. A change here would mean that the process() methods need to be adapted, or a compatibility layer introduced (which is possible with Python's built-in code inspection).
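One way such a compatibility layer might look, as a minimal sketch only: the standard library's `inspect.isgeneratorfunction` can tell a generator-style `process` apart from the existing argument-less one, so a caller can support both. The classes and the helper `run_process` below are toy stand-ins invented for illustration, not IntelMQ's actual Bot class or the IEP's design.

```python
import inspect
from typing import Dict, List


class OldStyleBot:
    """Toy bot in the existing style: process() receives and sends by itself."""

    def receive_message(self) -> Dict:
        return self._incoming

    def send_message(self, message: Dict) -> None:
        self._outgoing.append(message)

    def process(self) -> None:
        event = self.receive_message()
        self.send_message({**event, "style": "old"})


class NewStyleBot:
    """Toy bot in the hypothetical generator style: input in, outputs yielded."""

    def process(self, event: Dict):
        yield {**event, "style": "new"}


def run_process(bot, event: Dict) -> List[Dict]:
    """Call bot.process in whichever of the two styles it is written in."""
    if inspect.isgeneratorfunction(bot.process):
        # Generator style: pass the message in and collect what is yielded.
        return list(bot.process(event))
    # Existing style: stage the message, run process(), collect what was sent.
    bot._incoming = event
    bot._outgoing = []
    bot.process()
    return bot._outgoing
```

With this approach, existing bots would keep their `process()` methods untouched, and only the calling code would need to know about both styles.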
So far, we all agree that we do not want to make this step (now).
best regards Sebastian
Institute for Common Good Technology gemeinnütziger Kulturverein - nonprofit cultural society https://sebix.at/ ZVR 1510673578
Dear community,
tl;dr Your opinion on the programming interface matters! Please have a look and share your thoughts by the end of May, preferably before. Links below.
Thank you for contributing to the discussion around this proposal so far.
I want to take the liberty of summarising the previous discussion as follows:
* the feature itself is welcome and nobody objected to it
* the benefits are not clear to all
* the existing programming interfaces (especially for bots) must not change
Whereas the earlier comments focused on the nature of the proposal itself, we can now dive deeper and discuss technical details of the programming interfaces, such as:
* how to instantiate bots in "library mode" and pass settings (parameters)
* how to pass messages to the bot/source pipeline
* how to receive resulting messages from the bot/destination pipeline
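To make the three questions concrete, here is a toy sketch of one possible shape for such an interface. It is not the IEP's or the PoC's actual API: the class `LibraryBot`, its constructor arguments and the `process_message` method (borrowed from the pseudo code in the original announcement) are assumptions for illustration. Settings are passed as a plain dict, and an in-memory list stands in for the destination pipeline.

```python
from typing import Any, Dict, List


class LibraryBot:
    """Hypothetical bot that can run without a message broker."""

    def __init__(self, bot_id: str, settings: Dict[str, Any]) -> None:
        self.bot_id = bot_id
        self.settings = settings      # parameters passed directly by the caller
        self._current: Dict = {}      # in-memory stand-in for the source pipeline
        self._sent: List[Dict] = []   # in-memory stand-in for the destination pipeline

    def receive_message(self) -> Dict:
        return self._current

    def send_message(self, message: Dict) -> None:
        self._sent.append(message)

    def process(self) -> None:
        # Unchanged bot logic in the existing receive/send style.
        event = dict(self.receive_message())
        event["feed.name"] = self.settings["feed_name"]
        self.send_message(event)

    def process_message(self, message: Dict) -> List[Dict]:
        """Feed one message in, run process(), hand back what was sent."""
        self._current, self._sent = message, []
        self.process()
        return self._sent


bot = LibraryBot("example-bot", {"feed_name": "demo"})
results = bot.process_message({"source.ip": "192.0.2.1"})
```

In this sketch the bot's own `process()` stays as it is today; only the plumbing around it is swapped for in-memory structures.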
Below I added a few links with code examples.
We'd welcome your thoughts, especially on these topics.
In today's IntelMQ maintainer meeting, we planned to conclude this IEP, including its implementation, by the end of May with a new feature release of IntelMQ. So it would be great if we could collect all feedback before then.
Programming examples in the current draft of IEP007 itself:
https://github.com/certtools/ieps/tree/iep-007/007/#user-content-examples
and you can also look at the current PoC/draft implementation:
https://github.com/certtools/intelmq/pull/2358/files
or an example use case:
https://github.com/Intevation/intelmq-webinput-csv/blob/f29c6922f3a41a1399b4...
Best regards
Sebastian
P.S.: A bugfix release is envisaged for the end of next week
Institute for Common Good Technology gemeinnütziger Kulturverein - nonprofit cultural society https://sebix.at/ ZVR 1510673578
IntelMQ-dev mailing list https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev https://intelmq.readthedocs.io/