[IntelMQ-users] [IntelMQ] Deduplication on an optional field

Sebastian Wagner wagner at cert.at
Fri Aug 6 13:20:46 CEST 2021


Hi,

On 8/6/21 9:34 AM, Guillaume GRANJON DE LEPINEY wrote:
> Thank you for taking the time to answer all my questions.
>
> I've already learned a few things from reading the email that I’m
> going to apply.
>
> However, during my tests I had the impression that messages were
> dropped when they didn't have the key.
>
Yes, it depends on the values of the other fields. If they are
identical, the events get dropped, because the message-hash algorithm
simply ignores non-existing fields.
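
To illustrate, here is a minimal sketch of that behaviour. It is not
IntelMQ's actual code: the function name whitelist_hash, the sample
events, and the use of SHA-256 over JSON are assumptions made for the
example. Only the "skip keys outside the whitelist" check mirrors the
message.py lines linked in the quoted mail below.

```python
import hashlib
import json


def whitelist_hash(event: dict, filter_keys: set) -> str:
    """Hash only the whitelisted keys that are present in the event."""
    digest = hashlib.sha256()
    for key, value in sorted(event.items()):
        if key not in filter_keys:
            continue  # keys outside the whitelist never reach the hash
        digest.update(json.dumps([key, value]).encode("utf-8"))
    return digest.hexdigest()


# Hypothetical deduplication keys; source.fqdn plays the optional field.
keys = {"source.ip", "source.fqdn"}

a = {"source.ip": "192.0.2.1", "source.fqdn": "a.example.com"}
b = {"source.ip": "192.0.2.1", "source.fqdn": "b.example.com"}
c = {"source.ip": "192.0.2.1", "feed.name": "feed-1"}
d = {"source.ip": "192.0.2.1", "feed.name": "feed-2"}
e = {"source.ip": "198.51.100.7"}

print(whitelist_hash(a, keys) == whitelist_hash(b, keys))  # False
print(whitelist_hash(c, keys) == whitelist_hash(d, keys))  # True
print(whitelist_hash(c, keys) == whitelist_hash(e, keys))  # False
```

Events c and d both lack the optional field and agree on every other
whitelisted field, so they hash identically and the second one would be
treated as a duplicate. Event e also lacks the field but differs in
source.ip, so it would pass.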

Sebastian

> I'll look into the issue when I have more time in the coming weeks.
>
> I will not hesitate to contact you again.
>
>  
>
> Thanks,
>
>  
>
> *Guillaume GRANJON de LÉPINEY*| ggranjon at excellium-services.be
> *CERT-XLM Incident Handler* @ excellium-services.com
>
>  
>
> *From:*Sebastian Wagner <wagner at cert.at>
> *Sent:* Friday, 30 July 2021 09:42
> *To:* Guillaume GRANJON DE LEPINEY <ggranjon at excellium-services.be>;
> 'intelmq-users at lists.cert.at' <intelmq-users at lists.cert.at>
> *Subject:* Re: [IntelMQ-users] [IntelMQ] Deduplication on an optional
> field
>
>  
>
> Hi,
>
> On 7/26/21 3:04 PM, Guillaume GRANJON DE LEPINEY wrote:
>
>     I wonder if there is a simple way to use a Deduplicator bot on an
>     optional field. When I apply the deduplicator on an optional
>     field, a null value appears to be stored in Redis, because all
>     messages (except the first one) that do not contain the field
>     are dropped.
>
>     Is there a workaround please?
>
>      
>
>     I could work around this problem by adding two Sieve bots after
>     the preceding bot that would bypass the Deduplicator bot if the
>     message doesn't have the field, but I don't find that to be
>     optimal. Thus, I am open to any proposal that could help me.
>
> The message-hash method ignores any non-existing key:
> https://github.com/certtools/intelmq/blob/8a8107ec6b332e710626d056b2b0446ab976775f/intelmq/lib/message.py#L404-L405
>
>     if filter_type == "whitelist" and key not in filter_keys:
>         continue
>
> You could filter these messages out just before the deduplicator,
> but I don't see a reason for /two/ sieve bots. One should be
> sufficient if you combine it with paths (see
> https://intelmq.readthedocs.io/en/latest/user/bots.html#sieve).
>
> (btw: If someone tackles
> https://github.com/certtools/intelmq/issues/1250, the simpler filter
> expert would also work)
>
> If that's not viable for you, then you'd need to adapt the
> deduplicator's code a bit, probably also introducing additional
> parameters. Using the Message.set_default_value is not possible
> either, as that would set a constant, leading to the same behavior as
> you have now.
>
> I hope that helps a bit
>
> Sebastian
>
> -- 
> // Sebastian Wagner <wagner at cert.at> <mailto:wagner at cert.at>- T: +43 676 898 298 7201
> // CERT Austria - https://www.cert.at/
> // Eine Initiative der nic.at GmbH - https://www.nic.at/
> // Firmenbuchnummer 172568b, LG Salzburg

-- 
// Sebastian Wagner <wagner at cert.at> - T: +43 676 898 298 7201
// CERT Austria - https://www.cert.at/
// Eine Initiative der nic.at GmbH - https://www.nic.at/
// Firmenbuchnummer 172568b, LG Salzburg
