Hello,
I will update the user_agent in the Sandbox URL mapping to be more explicit in the next schema update.
The reason http_url is commonly not mapped to destination.url in the Sinkhole Events HTTP reports is that the value is often not a fully qualified URL. A value such as "/index.php", without any other context, fails validation for the type "URL" as specified in the harmonization configuration. When a value fails validation, it is added to extra instead.
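As a rough illustration of that fallback, here is a minimal sketch. It is not the parser bot's actual code: the helper names are hypothetical, and the check uses Python's urllib.parse as a stand-in for the real harmonization validation, which may be stricter.

```python
from urllib.parse import urlparse


def is_fully_qualified_url(value: str) -> bool:
    """Stand-in check (assumption): a URL needs both a scheme and a host."""
    parts = urlparse(value)
    return bool(parts.scheme and parts.netloc)


def map_field(event: dict, target_key: str, source_key: str, value: str) -> None:
    """Hypothetical helper: store under target_key if valid, else under extra."""
    if is_fully_qualified_url(value):
        event[target_key] = value
    else:
        # Validation failed, so the raw value is kept under "extra." instead.
        event["extra." + source_key] = value


event = {}
map_field(event, "destination.url", "http_url", "/index.php")
print(event)
```

With "/index.php" the value lands in extra.http_url; a value such as "http://example.com/index.php" would pass this stand-in check and land in destination.url.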
Regards,
Jason
On 11/10/24 11:06 PM, Mika Silander via IntelMQ-dev wrote:
Hi,
We discovered a few Shadowserver feeds with field mappings that look somewhat odd. We use the feed mapping file https://github.com/The-Shadowserver-Foundation/report_schema/blob/main/intel... and our Shadowserver parser bot is configured to update its own copy of this file dynamically. Our intelmq and its Shadowserver parser bot are from the release-3.3.1 branch of the intelmq git repo.
In the intelmq.json file mentioned above, the Sandbox URL feed defines the optional input field "user_agent" to be parsed on output to "user_agent" (right?):
[ "user_agent", "user_agent", "validate_to_none" ],
However, the parser bot appears to output "extra.user_agent" instead.
The other mapping that seemed odd was in Sinkhole Events HTTP IPv4 & IPv6 (and in Microsoft Sinkhole Events HTTP IPv4):
[ "destination.url", "http_url", "convert_http_host_and_url", true ],
I interpret here that the optional input field "http_url" should be mapped by the Shadowserver parser bot to "destination.url" on output, but we seem to get it mapped to "extra.http_url" instead.
Is this a hiccup in intelmq.json or the parser, or have I (again) missed something? Is anyone else seeing this?
Best regards,
Mika