On 2018-07-06 12:50, Ole Kristoffer Dybvik Apeland wrote:
We had an issue with shadowserver data today where those feeds that only provide a download link would yield a PHP script. I have spoken with someone connected to the project and they are aware of the problem.
Thanks!
Is it possible to let the shadowserver parser drop the messages in these instances? In its current form only a warning is given (line 92 in bots/parsers/shadowserver/parser.py) even though required_fields are missing in the data.
The (in my opinion) correct behavior would be to log the error, dump the line if configured to do so, and continue with the next one. I pushed a fix which does this to all branches: https://github.com/certtools/intelmq/commit/f33d26da3c3e0ad90187e1c88a00ff67... As it is neither critical, urgent, nor important, I won't do a release right now. Silently ignoring data is quite problematic, as the program can never know why the fields are missing. Shadowserver does change the format regularly.
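To illustrate the behavior described above (log the error, dump the line if configured, continue with the next one), here is a minimal standalone sketch. It is not the code from the commit and does not use the IntelMQ API; the names parse_report, REQUIRED_FIELDS and dump_failed, as well as the example column names, are assumptions made up for this illustration.

```python
import csv
import io
import logging

logger = logging.getLogger("shadowserver_sketch")

# Hypothetical required columns; the real parser takes them from its feed config.
REQUIRED_FIELDS = ("timestamp", "ip")


def parse_report(raw, dump_failed=True):
    """Parse CSV text and return (events, dumped).

    Rows lacking a required field are logged as errors, optionally kept
    in `dumped`, and skipped, so one bad row does not stop the run.
    """
    events, dumped = [], []
    reader = csv.DictReader(io.StringIO(raw))
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        missing = [field for field in REQUIRED_FIELDS if not row.get(field)]
        if missing:
            logger.error("Line %d: missing required fields %s",
                         line_number, missing)
            if dump_failed:
                dumped.append(row)
            continue  # go on with the next line
        events.append(row)
    return events, dumped
```

With a report where the second data row has an empty timestamp, the first row ends up in events and the second one in dumped, while the error is visible in the log.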
What we could further do is to check if the first line at least contains some commas. If not, the input is nonsense. On the other hand: How often does this really happen?
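A rough sketch of such a first-line check, purely as an illustration: looks_like_csv and min_commas are hypothetical names, nothing like this exists in the parser as far as this mail is concerned. It is only a heuristic, as a non-CSV first line could contain commas too.

```python
def looks_like_csv(raw, min_commas=1):
    """Return True if the first non-empty line has at least `min_commas` commas."""
    for line in raw.splitlines():
        if line.strip():
            # Only the first non-empty line (the supposed header) is inspected.
            return line.count(",") >= min_commas
    # Empty input is treated as nonsense as well.
    return False
```

A parser could call this once per report and reject the whole input early, instead of logging one error per line.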
Sebastian