Hello,
according to https://github.com/certtools/intelmq/blob/develop/docs/user/bots.md events collected using a "Generic Mail URL Fetcher" should include this information:
feed.url extra.email_date extra.email_subject extra.email_from extra.email_message_id extra.file_name
In our database, the events DO include feed.url but DO NOT include any of the extra fields. Events collected using a "Generic Mail Attachment Fetcher" are missing the extra fields as well.
I wonder if this is a bug or caused by some configuration issue with our setup.
- Thomas
Hi Thomas,
AFAIK, the Mail collector bot adds the extra.* fields on the fly to its outgoing messages, i.e. the messages you are supposed to feed to the parser bot that follows in the chain. If you look at the end of https://github.com/certtools/intelmq/blob/develop/intelmq/bots/collectors/ma... you should see the enrichment. For the enrichment to work, the subject, message ID, etc. must of course be present in the email that the Mail collector bot processes.
Disclaimer: the above is based on my assumptions, as I do not know what your database and its entries actually look like.
Br, Mika
----- Original Message -----
From: "Thomas Hungenberg via IntelMQ-users" intelmq-users@lists.cert.at
To: "intelmq-users" intelmq-users@lists.cert.at
Sent: Monday, 8 July, 2024 12:16:29
Subject: [IntelMQ-users] mail collector extra fields
Hi Thomas, Mika, and other readers,
As Mika already correctly assumed, these fields are meant to be used, optionally, by the parser bot.
https://docs.intelmq.org/latest/user/bots/#generic-mail-url-fetcher says "The resulting reports contain the following special fields:" (This explanation could be made clearer, as you point out.)
Parsers can use this information to process the data and, for example, to decide the type of feed. The Shadowserver parser, for instance, uses extra.file_name. After parsing, this information is no longer needed, because it is additional information about the Report, not about the resulting Event. The mentioned fields are therefore not automatically passed on to Events.
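To illustrate how a parser might use extra.file_name to decide the feed type, here is a minimal sketch. It is not the Shadowserver parser's actual code: a plain dict stands in for the Report, and the mapping FEED_TYPES, the function feed_type_from_report, and the file-name layout are all assumptions made up for this example.

```python
import re

# Hypothetical mapping from a file-name fragment to a feed type.
# These names are illustrative only, not IntelMQ's real configuration.
FEED_TYPES = {
    "scan_http": "open-http",
    "scan_ssh": "open-ssh",
}


def feed_type_from_report(report):
    """Pick a feed type from the report's extra.file_name, or return None."""
    file_name = report.get("extra.file_name", "")
    # Assumed layout: YYYY-MM-DD-<feed>-<anything>.csv
    match = re.match(r"\d{4}-\d{2}-\d{2}-([a-z0-9_]+)", file_name)
    if match is None:
        return None
    return FEED_TYPES.get(match.group(1))


# A plain dict standing in for a Report produced by a mail collector.
report = {
    "feed.url": "https://example.com/report",
    "extra.file_name": "2024-07-08-scan_http-example.csv",
}
print(feed_type_from_report(report))
```

Once the parser has made this decision, the file name has served its purpose, which is why it does not need to travel on with the Event.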
Short demo:
>>> import intelmq.lib.message as msg
>>> # create a report with such a field:
>>> rep = msg.Report()
>>> rep.add('extra.file_name', 'test')
True
>>> rep
{'time.observation': '2024-07-08T11:33:53+00:00', 'extra.file_name': 'test'}
>>> # now create an Event from the Report
>>> ev = msg.Event(rep)
>>> ev
{'time.observation': '2024-07-08T11:33:53+00:00'}
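If the mail fields are wanted in the database, something in the pipeline has to copy them from the Report to the Event explicitly. The sketch below shows the idea with plain dicts standing in for Report and Event; build_event and MAIL_FIELDS are hypothetical names, and this is not an existing IntelMQ function or option.

```python
# Special fields the mail collectors add to a report, as listed in the
# documentation quoted at the start of this thread.
MAIL_FIELDS = (
    "extra.email_date",
    "extra.email_subject",
    "extra.email_from",
    "extra.email_message_id",
    "extra.file_name",
)


def build_event(report, parsed, keep=MAIL_FIELDS):
    """Return an event dict: parsed data plus explicitly kept report fields."""
    # Carry over time.observation and feed.* fields, as observed in the thread.
    event = {
        key: value
        for key, value in report.items()
        if key == "time.observation" or key.startswith("feed.")
    }
    event.update(parsed)
    # Copy the selected mail fields only if the report actually has them.
    for key in keep:
        if key in report:
            event[key] = report[key]
    return event


report = {
    "time.observation": "2024-07-08T11:33:53+00:00",
    "feed.url": "https://example.com/report",
    "extra.file_name": "test",
    "raw": "ignored in this sketch",
}
event = build_event(report, {"source.ip": "192.0.2.1"})
print(event)
```

Whether this copying belongs in a custom parser or in a separate bot placed after it is a design choice; the sketch only shows that it has to happen explicitly.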
Hope this helps,
Sebastian
Institute for Common Good Technology gemeinnütziger Kulturverein - nonprofit cultural society https://commongoodtechnology.org/ ZVR 1510673578
On 7/8/24 11:16 AM, Thomas Hungenberg via IntelMQ-users wrote: