RFC on IEP007: Running IntelMQ as Python Library

Sebix

24 Apr 2023

6:31 p.m.

Dear community,

I invite you to discuss a new IEP (IntelMQ Enhancement Proposal):

IEP007: Running IntelMQ as Python Library

Have you ever wondered if you can write a Python script, call a bot's process method, pass it some data and get back the enriched/modified data? (pseudo code)

    bot_instance = Bot(parameters)
    bot_instance.process_message(input message) -> output messages

Strictly speaking, it *is* actually possible with the current version, but it requires some bizarre hacks like re-defining Bot's methods and overwriting internal values. The feature has been on the wishlist for quite a while, and we intend to implement it now. I started the IEP007 draft and need some input from you to maximize the benefit for the whole IntelMQ community (developers):
https://github.com/certtools/ieps/pull/7/files
Or in a readable display:
https://github.com/certtools/ieps/tree/iep-007/007

What features and interfacing options would you expect when starting the bot as a library?

Do you think the `Bot.process` method should be rewritten entirely now, removing the calls receive_message/send_message and converting the method into a generator (an API-breaking change)? And if yes, should this be done in one step, or separated from this bot-as-library feature, reducing the complexity of development steps?
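To make the two shapes of `process` concrete, here is a minimal sketch using toy classes only. Nothing below is IntelMQ code: the class names, the in-memory queues and the `seen` key are assumptions made purely for illustration, and the real signatures are exactly what is up for discussion.

```python
from collections import deque


class QueueStyleBot:
    """Toy stand-in for the current pattern: process() pulls and pushes itself."""

    def __init__(self):
        self.source = deque()       # hypothetical in-memory input queue
        self.destination = deque()  # hypothetical in-memory output queue

    def receive_message(self):
        return self.source.popleft()

    def send_message(self, message):
        self.destination.append(message)

    def process(self):
        # No arguments, no return value: all I/O goes through the two calls.
        event = self.receive_message()
        event["seen"] = True
        self.send_message(event)


class GeneratorStyleBot:
    """Toy stand-in for the generator variant: data in, data out."""

    def process(self, event):
        # The message arrives as an argument and results are yielded.
        enriched = dict(event)
        enriched["seen"] = True
        yield enriched


old = QueueStyleBot()
old.source.append({"source.ip": "192.0.2.1"})
old.process()
old_result = list(old.destination)

new_result = list(GeneratorStyleBot().process({"source.ip": "192.0.2.1"}))
```

Both variants produce the same data; the difference is only in who owns the input and output, which is why the second shape would change the method's signature for every existing bot.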

Looking forward to your ideas,
Sebastian

-- Institute for Common Good Technology gemeinnütziger Kulturverein - nonprofit cultural society https://sebix.at/ ZVR 1510673578

Mika Silander

25 Apr 2023

11:28 a.m.

Hi,

Reading through the IEP in question, I thought I would find reasons or motivations as to why having a library is desired. It's possible I've missed discussions or mails and the reasons have been discussed/documented elsewhere.

Having worked hard for more than two years to get IntelMQ up in production and now only waiting for the required servers to arrive, I'm reluctant to have any major changes to the code base (=my vote). If the implementation you choose for the IEP is the API-breaking generator one, please, if at all possible, consider implementing wrappers, decorators or the like to keep the old process method (and friends) available for some time onwards. This would give the bot developers (me included) time to adapt our own bots to this new approach.

Br, Mika

P.S: IntelMQ was the easier part, most of those two years mentioned above has been spent on getting the other interconnected systems and interfaces to them working.

Sebix

12:05 p.m.

Dear Mika,

On 4/25/23 11:28 AM, Mika Silander wrote:

> Reading through the IEP in question, I thought I would find reasons or motivations as to why having a library is desired. It's possible I've missed discussions or mails and the reasons have been discussed/documented elsewhere.

Thanks for the input, I'll try to find some references from the 3.0 discussions, when we first discussed this feature request in more detail, or otherwise expand this section on my own.

> If the implementation you choose for IEP is the API-breaking generator one

I think there was a misunderstanding: whether we also want to address this topic is precisely the /question/. It has its pros and cons, and the biggest downside is the API-breaking part with all its implications.

You know that it was a long and bumpy road towards IntelMQ 3.0, and accumulating multiple significant development steps in a short period was challenging (OTOH, we had the temporary development capacity at that time to make this leap). I'm fully aware that breaking changes can mean a lot of trouble and I'm all for small, separated steps (hence this IEP), which allow more straightforward discussion, review and maintenance. When writing PoCs for this IEP, I found a similar area for improvement in IntelMQ which could also be interesting to tackle. It is not up to me to decide the direction of the IntelMQ roadmap but to the /community/. Hence, I posed this question and asked for comments, and I am very grateful that you are contributing so actively through code and discussion.

Best regards,
Sebastian

Mika Silander

1:43 p.m.

Hi Sebastian,

References to the discussions on motivations for the library proposal are welcome, but if and when you find them, IMHO the best location for them is the IEP itself. The document could, for example, start with a "Background" or similar section that tries to answer why this is a desired thing, what benefits it would bring, and maybe give some examples. A long time ago I wrote a wrapper script that allowed chaining bots on the command line. It did of course not provide all the features that are outlined in your IEP, but it let me do simple sampling of events and test bots with e.g.

wrapper BotA | wrapper BotB | wrapper BotC ...

so I was struggling to find good use cases to support a library implementation.
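For readers who have not seen this pattern, the general idea of such a command-line wrapper can be sketched as a stdin-to-stdout filter. This is not the wrapper script mentioned above, which is not shown in this thread; the function names, the one-JSON-object-per-line format and the `tagged` key are all assumptions for illustration.

```python
import io
import json
import sys


def run_filter(transform, infile=sys.stdin, outfile=sys.stdout):
    """Read one JSON event per line, apply transform, write the results."""
    for line in infile:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between events
        for result in transform(json.loads(line)):
            outfile.write(json.dumps(result, sort_keys=True) + "\n")


def tag_event(event):
    """Hypothetical stand-in for one bot's processing step."""
    yield {**event, "tagged": True}


# Simulate a single `wrapper BotA` stage with in-memory streams
# instead of a real shell pipe.
fake_stdin = io.StringIO('{"source.ip": "192.0.2.1"}\n')
fake_stdout = io.StringIO()
run_filter(tag_event, fake_stdin, fake_stdout)
```

Because each stage reads and writes the same line format, several such stages can be joined with shell pipes, which is what makes the approach handy for sampling and testing.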

As for the API-breaking vs. less intrusive implementation, I think I did understand these were two separate options, but I found the API-breaking one a bit scary and reacted to that only. And yes, I can imagine it's been a big effort to reach versions 3.0 and 3.1. Had I known all the effort I'd have to put into the interconnected systems related to our IntelMQ setup (IntelMQ itself was not that hard), I would never have started :-). Still, as said, my vote is on the less intrusive implementation option and sticking to KISS, but if it turns out the API-breaking one wins, please consider techniques that could provide some degree of backward compatibility as I suggested.

I don't think I'm a very active debater on this forum, so it would be nice to read what other developers think. Especially the ones like me who do not form part of the core developers but just need to add some components and features of their own to IntelMQ.

Br, Mika

L. Aaron Kaplan

1:47 p.m.

Hi Mika,

I will add that to the IEP, that's a good suggestion to add a background/motivation section.

But I know I have been discussing this idea with Sebastian (Sebix) years ago already and it came from the need to easily wrap intelmq functionality and bring it into other tools.

I hear you regarding API breakage. No one wants that and I guess we'll have to think carefully how to achieve this. Good point.

But having intelmq be librarized would actually help a lot for integration into other tools. And these other tools could rely on the hard work of intelmq / parsers mostly to "get it right once and for all". So, in a sense, it would benefit the whole community.

That's my stance on it so far. Hope my comments help to explain the motivation a bit (?)

Best, Aaron.

L. Aaron Kaplan

1:52 p.m.

Addendum:

I think maybe it also helps to clearly explain what Sebix meant by "API breaking change"? I.e. internally the API for the Bot class changes (and can be search&replaced & refactored for all code and all the developers get informed)? Or... does this also break the (hug/fastapi) API? ;-)

In other words: what does this change really entail for installations out there? @Sebix: would it make sense to clarify?

Best, a.

Bernhard Reiter

2:54 p.m.

Hi,

On Tuesday, 25 April 2023, 13:47:56, L. Aaron Kaplan wrote:

> it came from the need to easily wrap intelmq functionality and bring it into other tools.

If there are specific use cases for those envisioned integrations, I would put them in the IEP as well. Detailed problem descriptions make it possible to judge whether the proposed solution, using IntelMQ as a library, is a good fit for each.

(I have often seen major changes being undertaken just for an abstract idea of "reuse" or "integration", which made the situation worse. So when in doubt, it is good to have real use cases to decide if it is worth it.)

My 2 €¢ Bernhard

-- https://intevation.de/~bernhard +49 541 33 508 3-3 Intevation GmbH, Osnabrück, DE; Amtsgericht Osnabrück, HRB 18998 Geschäftsführer Frank Koormann, Bernhard Reiter

Sebix

26 Apr 2023

10:58 p.m.

Dear Mika, dear Aaron,

On 4/25/23 1:43 PM, Mika Silander wrote:

> References to the discussions on motivations for the library proposal are welcome, but if and when you find them, IMHO the best location for them is the IEP itself. The document could, for example, start with a "Background" or similar section that tries to answer why this is a desired thing, what benefits it would bring, and maybe give some examples.

I have added a new section with another example and a proposal for how it can/will be used in the IntelMQ Webinput:

https://github.com/certtools/ieps/tree/iep-007/007#user-content-use-cases

The code examples are not yet consistent, as they are still up for discussion and I am working on proofs of concept (i.e. researching which interface options are possible/doable) in parallel.

> A long time ago I wrote a wrapper script that allowed chaining bots on the command line. It did of course not provide all the features that are outlined in your IEP, but it let me do simple sampling of events and test bots with e.g.
>
> wrapper BotA | wrapper BotB | wrapper BotC ...

Oooh, that's interesting as well! Although not the same as this proposal, it somehow goes in a similar direction

> As for the API-breaking vs. less intrusive implementation, I think I did understand these were two separate options, but I found the API-breaking one a bit scary and reacted to that only. And yes, I can imagine it's been a big effort to reach versions 3.0 and 3.1. Had I known all the effort I'd have to put into the interconnected systems related to our IntelMQ setup (IntelMQ itself was not that hard), I would never have started :-).

Hehe, oh yeah, I feel your pain. Getting systems interconnected, months of fiddling around with various scripts, and setting up proper management, monitoring and other supporting systems around all that, plus docs: lots of work.

> Still, as said, my vote is on the less intrusive implementation option and sticking to KISS, but if it turns out the API-breaking one wins, please consider techniques that could provide some degree of backward compatibility as I suggested.

ACK

> I don't think I'm a very active debater on this forum, so it would be nice to read what other developers think. Especially the ones like me who do not form part of the core developers but just need to add some components and features of their own to IntelMQ.

On 4/25/23 1:52 PM, L. Aaron Kaplan wrote:

> I think maybe it also helps to clearly explain what Sebix meant by "API breaking change"?

The word "API" has two meanings here:
1. The program "IntelMQ API", the manager's interface
2. The application programming interface of IntelMQ (Core) itself, i.e. the methods, their parameters etc.

> I.e. internally the API for the Bot class changes (and can be search&replaced & refactored for all code and all the developers get informed) Or... this also breaks the (hug/fastapi) API ? ;-)

In this case I meant the programming interface of IntelMQ, and more specifically the function signature of Bot.process. A change here would mean that the process() methods need to be adapted, or that a compatibility layer is introduced (which is possible with Python's built-in code inspection).
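As a rough illustration of what such an inspection-based compatibility layer could look like, the sketch below uses the standard library's `inspect.signature` to decide how to call a bot. The two bot classes, the `inbox`/`outbox` attributes and `run_once` are hypothetical names invented for this example and are not part of IntelMQ.

```python
import inspect


class LegacyBot:
    """Hypothetical bot using the parameterless process() style."""

    def __init__(self):
        self.inbox = []
        self.outbox = []

    def process(self):
        event = self.inbox.pop(0)
        self.outbox.append({**event, "style": "legacy"})


class NewStyleBot:
    """Hypothetical bot whose process() takes a message and yields results."""

    def process(self, message):
        yield {**message, "style": "generator"}


def run_once(bot, message):
    """Call bot.process in whichever style its signature indicates."""
    # For a bound method, inspect.signature() does not list `self`.
    parameters = inspect.signature(bot.process).parameters
    if len(parameters) >= 1:
        # New style: the method accepts the message directly.
        return list(bot.process(message))
    # Legacy style: feed the message in, then collect what was emitted.
    bot.inbox = [message]
    bot.outbox = []
    bot.process()
    return bot.outbox


legacy_result = run_once(LegacyBot(), {"n": 2})
new_result = run_once(NewStyleBot(), {"n": 2})
```

With a dispatcher of this kind, bots written against the old signature could keep working unmodified while new bots adopt the other style.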

So far, we all agree that we do not want to make this step (now).

Best regards,
Sebastian

Sebix

4 May 2023

10:09 p.m.

Dear community,

tl;dr Your opinion on the programming interface matters! Please have a look and share your thoughts by the end of May, preferably before. Links below.

Thank you for contributing to the discussion around this proposal so far.

I want to take the liberty of summarising the previous discussion as follows:

* the feature itself is welcome and not objected to
* the benefits are not clear to all
* the existing programming interfaces (especially for bots) must not change

Whereas the earlier comments focused on the nature of the proposal itself, we can now dive deeper and discuss technical details of the programming interfaces, such as:

* how to instantiate bots in "library mode" and pass settings (parameters)
* how to pass messages to the bot/source pipeline
* how to receive resulting messages from the bot/destination pipeline
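To give the three questions something tangible to react to, here is one possible shape, sketched with a toy class. It is not the interface from the IEP draft or the PoC: the class, its constructor arguments and the `feed_name` setting are assumptions for illustration, and only the method name `process_message` is borrowed from the pseudo code in the original announcement.

```python
class ToyLibraryBot:
    """Hypothetical bot wrapper; not the IEP007 interface."""

    def __init__(self, bot_id, settings):
        self.bot_id = bot_id
        self.settings = dict(settings)  # parameters passed directly, no config file
        self._source = []               # in-memory stand-in for the source pipeline
        self._destination = []          # in-memory stand-in for the destination pipeline

    def process(self):
        # Unchanged "classic" shape: take one message, emit one message.
        message = self._source.pop(0)
        message = {**message, "feed.name": self.settings["feed_name"]}
        self._destination.append(message)

    def process_message(self, *messages):
        """Queue the inputs, run process() once per input, return the outputs."""
        self._source.extend(messages)
        self._destination.clear()
        while self._source:
            self.process()
        return list(self._destination)


bot = ToyLibraryBot("example-bot", {"feed_name": "demo"})
results = bot.process_message(
    {"source.ip": "192.0.2.1"},
    {"source.ip": "192.0.2.2"},
)
```

In this shape the existing `process` method stays untouched; library mode only swaps the pipelines for in-memory lists, which matches the requirement above that existing bot interfaces must not change.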

Below I added a few links with code examples.

We'd welcome your thoughts, especially on these topics.

In today's IntelMQ maintainer meeting, we planned to conclude this IEP, including its implementation, by the end of May with a new feature release of IntelMQ. So it would be great if we could collect all feedback before then.

Programming examples in the current draft of IEP007 itself:

https://github.com/certtools/ieps/tree/iep-007/007/#user-content-examples

and you can also look at the current PoC/draft implementation:

https://github.com/certtools/intelmq/pull/2358/files

or an example use case:

https://github.com/Intevation/intelmq-webinput-csv/blob/f29c6922f3a41a1399b4...

Best regards

Sebastian

P.S.: A bugfix release is envisaged for the end of next week

IntelMQ-dev mailing list https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev https://intelmq.readthedocs.io/
