[IntelMQ-dev] Question on a few feeds' field mappings in Shadowserver parser

Mika Silander mika.silander at csc.fi
Mon Nov 11 12:31:01 CET 2024


Hi Sebastian,

Thanks for the clarification. Yes, I would also prefer explicit behaviour over implicit. In addition, this implicit mapping makes it hard to use the intelmq.json description of the feeds as the baseline for checking whether the reported events truly contain the expected fields and nothing else. I'd guess it also indirectly hampers the adoption of IntelMQ. Maybe someone else can clarify what benefits the implicit mapping brings.
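
To illustrate the kind of baseline check meant here, a minimal sketch follows. It assumes that the first element of each mapping entry is the literal output field name; the function name and the sample event are hypothetical and only mirror the behaviour described in this thread.

```python
# One optional-field entry as quoted from intelmq.json (Sandbox URL feed):
# [output field, input column, conversion function]
OPTIONAL_FIELDS = [
    ["user_agent", "user_agent", "validate_to_none"],
]


def unexpected_keys(event, mappings):
    """Return the event keys that the mapping does not name literally."""
    declared = {entry[0] for entry in mappings}
    return sorted(set(event) - declared)


# Hypothetical event, shaped like the parser output described in this thread.
event = {"extra.user_agent": "ExampleAgent/1.0"}

# A strict reading of the mapping flags the key the parser actually emitted.
print(unexpected_keys(event, OPTIONAL_FIELDS))
```

With the implicit "extra." prefix, such a literal comparison reports "extra.user_agent" as unexpected even though the parser worked as designed.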

Br, Mika

----- Original Message -----
From: "Sebix" <sebix at sebix.at>
To: "Mika Silander" <mika.silander at csc.fi>, "intelmq-dev" <intelmq-dev at lists.cert.at>
Sent: Monday, 11 November, 2024 12:09:55
Subject: Re: [IntelMQ-dev] Question on a few feeds' field mappings in Shadowserver parser

Hi Mika

On 11/11/24 8:06 AM, Mika Silander via IntelMQ-dev wrote:
>   In the intelmq.json file mentioned above, the Sandbox URL feed defines the optional input field "user_agent" to be parsed on output to "user_agent" (right?):
>
>           [
>              "user_agent",
>              "user_agent",
>              "validate_to_none"
>           ],
>
>   However, the parser bot appears to output "extra.user_agent" instead.
Yes, because "user_agent" is used as a shortcut for "extra.user_agent",
since the field "user_agent" does not exist in IntelMQ.
This behavior is specific to the Shadowserver parser; it is not a
default in IntelMQ.

https://github.com/certtools/intelmq/blob/e86912f6740ea1592f531fbaa9713e1f6049b1bf/intelmq/bots/parsers/shadowserver/parser.py#L221-L222
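
The shortcut can be sketched roughly as follows. This is not the parser's actual code: KNOWN_FIELDS is a small hypothetical stand-in for the field names IntelMQ defines, and resolve_output_key is a name made up for this illustration.

```python
# Hypothetical stand-in for the field names IntelMQ defines.
KNOWN_FIELDS = {"destination.url", "source.ip", "time.source"}


def resolve_output_key(key):
    """Return the key an event would carry under the described shortcut."""
    # Keys that are already explicit, or that IntelMQ knows, stay unchanged.
    if key.startswith("extra.") or key in KNOWN_FIELDS:
        return key
    # Anything else is treated as shorthand for a field below "extra.".
    return "extra." + key


print(resolve_output_key("user_agent"))       # extra.user_agent
print(resolve_output_key("destination.url"))  # destination.url
```

Under this reading, writing "extra.user_agent" in the mapping directly would give the same result without the hidden step.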

However, I think "explicit is better than implicit" and the behavior 
does not bring any advantages, only potential confusion, as in this case.

>   The other mapping that seemed odd was in Sinkhole Events HTTP IPv4 & IPv6 (and in Microsoft Sinkhole Events HTTP IPv4):
>
>          [
>              "destination.url",
>              "http_url",
>              "convert_http_host_and_url",
>              true
>           ],
>
>   I interpret here that the optional input field "http_url" should be mapped by the Shadowserver parser bot to "destination.url" on output, but we seem to get it mapped to "extra.http_url" instead.

That's also how I'd interpret it, but I don't have any further insight
(or data/examples) here.

Best regards

Sebastian

-- 
Institute for Common Good Technology
gemeinnütziger Kulturverein - nonprofit cultural society
https://commongoodtechnology.org/
ZVR 1510673578

