I think the URL parsing is fixed by Thomas' PR
https://github.com/certtools/intelmq/pull/1243
It has already been included in the most recent releases.
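For reference, the field mapping requested in the report quoted below could be sketched roughly as follows. This is only an illustration, not the code from the PR or from the IntelMQ parser: the helper names `build_source_url` and `map_row` are hypothetical, the `http` scheme is an assumption (the report does not say which scheme applies), and the event keys follow the IntelMQ field naming as I understand it.

```python
import ipaddress
from typing import Optional


def _is_ip(value: str) -> bool:
    """Return True if value is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def build_source_url(http_host: str, url: str, scheme: str = "http") -> Optional[str]:
    """Join 'http_host' and 'url' into a full URL (hypothetical helper).

    The default scheme is an assumption made for this sketch.
    """
    host = (http_host or "").strip()
    if not host:
        return None  # nothing to build a URL from
    # IPv6 literals must be bracketed inside a URL.
    if _is_ip(host) and ipaddress.ip_address(host).version == 6:
        host = f"[{host}]"
    path = (url or "").strip()
    if not path.startswith("/"):
        path = "/" + path
    return f"{scheme}://{host}{path}"


def map_row(row: dict) -> dict:
    """Map one report row to event fields (hypothetical helper)."""
    event = {}
    host = (row.get("http_host") or "").strip()
    # 'hostname' holds the PTR record, so it is never used for the URL.
    if row.get("hostname"):
        event["source.reverse_dns"] = row["hostname"]
    # Only treat 'http_host' as an FQDN when it is not an IP address.
    if host and not _is_ip(host):
        event["source.fqdn"] = host
    source_url = build_source_url(host, row.get("url", ""))
    if source_url:
        event["source.url"] = source_url
    return event
```

With this approach, a row whose 'http_host' is an IP address still yields a 'source.url'; it only skips 'source.fqdn' instead of failing the FQDN validation.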
I took a look at the other reports where there is a domain under 'http_host', but the main problem is that the parser joins the wrong fields from the Shadowserver report. It joins 'hostname' with 'url', which it shouldn't do, because 'hostname' actually holds the DNS PTR record (source.reverse_dns). It should instead join 'http_host' (source.fqdn) + 'url' to get the real source.url.

Regards,
--
Tomislav

On 07.01.2018 00:02, Tomislav Protega wrote:

Hi,

I ran into this error:

Shadowserver-Compromised-Website-Parser - ERROR - Could not convert shadowkey: 'http_host', value: '' via conversion function 'validate_fqdn'.

A more detailed log is attached.

This happens when the "http_host" field in the original Shadowserver report contains an IP address instead of a domain, which is not unusual. In the end, IntelMQ does produce the output data, but without the 'source.url' field, which should contain the merged 'http_host' and 'url' values from the original report.

Regards,
--
// Sebastian Wagner <wagner@cert.at> - T: +43 1 5056416 7201
// CERT Austria - https://www.cert.at/
// An initiative of nic.at GmbH - https://www.nic.at/
// Company register number 172568b, LG Salzburg