Thanks Patrick for sharing this!
On 28.02.2020, at 10:31, Patrick Forsberg fors@cert.sunet.se wrote:
Hi,
I'm pretty sure some of you (at least Sebastian) are aware of this, but there is currently a problem with intelmq.bots.collectors.mail.collector_mail_attach and attachments with long filenames. The error is not in the bot itself, but in the underlying imbox library, which doesn't handle long filenames spread over multiple "filename*=" lines in Content-Disposition.
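For anyone who hasn't seen such a header: below is a minimal sketch of a filename split over numbered "filename*N=" parameters (RFC 2231 continuations) and of how Python's standard email package joins the segments back together. The message and the filename are made up for illustration, and the helper name attachment_filename is hypothetical. This is not the imbox code and not the attached patch, only a way to see what the reassembled name should look like.

```python
from email import message_from_string

# Hypothetical message: the header layout follows RFC 2231 parameter
# continuations, but the addresses and the filename are invented.
RAW_MESSAGE = (
    "From: reports@example.org\n"
    "Subject: example report\n"
    "MIME-Version: 1.0\n"
    "Content-Type: application/zip\n"
    "Content-Disposition: attachment;\n"
    ' filename*0="example-long-report-name-";\n'
    ' filename*1="sv.zip"\n'
    "\n"
    "placeholder body\n"
)


def attachment_filename(raw):
    """Return the filename from Content-Disposition, segments joined."""
    msg = message_from_string(raw)
    # get_filename() reads the "filename" parameter and lets the stdlib
    # collect and concatenate the numbered continuation segments.
    return msg.get_filename()


if __name__ == "__main__":
    print(attachment_filename(RAW_MESSAGE))
```

A parser that looks at only one "filename*N=" segment would see just a fragment of the name, which would then be compared against the regex configured in the collector.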
The end result is that some attachments will probably fail to be extracted, and you will see a line similar to the following in the Collector log:
Shadowserver-Mail-Attachment-Fetcher-Collector - INFO - Attachment sv.zip didn't match regex.
There is an open pull request in the imbox Git repository that handles this, but it hasn't been merged into the main repo yet.
The solution is to patch imbox/parser.py.
I've attached my patch against the pip3 version of imbox (it differs slightly from the unmerged Git pull request).
Regards,
Patrick Forsberg SUNET CERT
<parser.py.patch>
--
// L. Aaron Kaplan kaplan@cert.at - T: +43 1 5056416 78
// CERT Austria - https://www.cert.at/
// Eine Initiative der nic.at GmbH - http://www.nic.at/
// Firmenbuchnummer 172568b, LG Salzburg