Hi Sebastian,
Thanks for the clarification. Yes, I would also prefer explicit behaviour over implicit. In addition, this implicitness makes it hard to use the feed descriptions in intelmq.json as a baseline for checking whether the reported events truly contain the expected fields and nothing else. I'd guess it also indirectly hampers the adoption of IntelMQ. Maybe someone else can clarify what benefits this implicitness brings.
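As a rough illustration of the kind of baseline check I mean, here is a minimal Python sketch. It assumes the first element of each mapping entry is the output field name, uses only the two entries quoted in this thread, and the function name and the sample event are made up for illustration:

```python
# Two optional-field entries quoted in this thread. The first element is read
# as the output field and the second as the input column (an interpretation,
# not something confirmed by the parser code).
OPTIONAL_FIELDS = [
    ["user_agent", "user_agent", "validate_to_none"],
    ["destination.url", "http_url", "convert_http_host_and_url", True],
]


def unexpected_fields(event: dict, mappings: list) -> set:
    """Return event keys that the mapping entries do not declare as output fields."""
    declared = {entry[0] for entry in mappings}
    return set(event) - declared


# Hypothetical event, shaped like what the parser currently seems to emit.
event = {"extra.user_agent": "curl/8.0", "extra.http_url": "/index.html"}
print(sorted(unexpected_fields(event, OPTIONAL_FIELDS)))
```

With the implicit "extra." prefix in place, both keys of the sample event are reported as unexpected, even though the parser did what it was designed to do.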
Br, Mika
----- Original Message -----
From: "Sebix" <sebix@sebix.at>
To: "Mika Silander" <mika.silander@csc.fi>, "intelmq-dev" <intelmq-dev@lists.cert.at>
Sent: Monday, 11 November, 2024 12:09:55
Subject: Re: [IntelMQ-dev] Question on a few feeds' field mappings in Shadowserver parser
Hi Mika
On 11/11/24 8:06 AM, Mika Silander via IntelMQ-dev wrote:
In the intelmq.json file mentioned above, the Sandbox URL feed defines that the optional input field "user_agent" is mapped to "user_agent" on output (right?):
[ "user_agent", "user_agent", "validate_to_none" ],
However, the parser bot appears to output "extra.user_agent" instead.
Yes, because "user_agent" is used as a shortcut for "extra.user_agent", since the field "user_agent" does not exist in IntelMQ. This behavior is specific to the Shadowserver parser, not a default in IntelMQ.
https://github.com/certtools/intelmq/blob/e86912f6740ea1592f531fbaa9713e1f60...
However, I think "explicit is better than implicit" and the behavior does not bring any advantages, only potential confusion, as in this case.
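To make the shortcut concrete, here is a minimal Python sketch of the implicit rule as described above. It is not the parser's actual code: KNOWN_FIELDS is a hypothetical stand-in for the field names IntelMQ defines, and resolve_output_field is a made-up helper name.

```python
# Hypothetical stand-in for the field names known to IntelMQ;
# the real list is much longer.
KNOWN_FIELDS = {"source.ip", "destination.url", "time.source"}


def resolve_output_field(name: str) -> str:
    """Return the field name an event would end up with under the implicit rule."""
    if name in KNOWN_FIELDS or name.startswith("extra."):
        return name
    # Unknown names are silently treated as a shortcut for "extra.<name>".
    return "extra." + name


print(resolve_output_field("user_agent"))       # extra.user_agent
print(resolve_output_field("destination.url"))  # destination.url
```

An explicit variant would drop the fallback branch and require the mapping itself to say "extra.user_agent".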
The other mapping that seemed odd was in Sinkhole Events HTTP IPv4 & IPv6 (and in Microsoft Sinkhole Events HTTP IPv4):
[ "destination.url", "http_url", "convert_http_host_and_url", true ],
I interpret here that the optional input field "http_url" should be mapped by the Shadowserver parser bot to "destination.url" on output, but we seem to get it mapped to "extra.http_url" instead.
That's also how I'd interpret it, but I don't have any more insight (or data/examples) here.
Best regards
Sebastian