[IntelMQ-dev] Shift of focus, new fields in IDF needed for scanning results?

Thu Aug 17 20:34:12 CEST 2023

> On 17.08.2023, at 18:34, L. Aaron Kaplan <aaron at lo-res.org> wrote:
> 
> The right way to do this is to write up an IEP (IntelMQ Enhancement Proposal) on GitHub. It’s simple.
> You can basically reuse the proposal below.

Forgot to add: here are some examples: https://github.com/certtools/ieps/tree/main/

As said, I will add a new field for shadowserver as well here.

Best,
a.
> 
> This idea aligns nicely with some extensions which I need for processing new shadowserver feeds. 
> We could add these fields in one go. 
> 
> I would propose that you copy and paste the idea below, specify the fields in detail (including data types) in the IEP, and add a few sentences on whether this affects older installations (I assume it does not, since these are additional fields only). Then we wait out a to-be-defined silence procedure on the mailing list, and we (Kamil? CERT.at? Someone else?) implement it.
> 
> I will then add the fields for shadowserver to the IEP and we can do it in one go. 
> 
> 
> Best,
> Aaron. 
> 
> ---
> Mobile
> 
>> On 17.08.2023, at 17:32, Otmar Lendl via IntelMQ-dev <intelmq-dev at lists.cert.at> wrote:
>> 
>> 
>> Folks,
>> 
>> I've been looking at some of the data that we recently processed with our IntelMQ setup (leading to https://cert.at/de/aktuelles/2023/8/verwundbare-webserver-status-in-osterreich), and I found that we need a few changes to IntelMQ to better process these kinds of feeds.
>> 
>> What happened? Initially, our focus with IntelMQ was on botnet drones which were detected via sinkholing. That's why we have the source.*, destination.*, and malware.* parameters in the data model. A C2 communication was the main inspiration.
>> 
>> These days, a majority of the data feeds we process are generated by scanning the Internet. That can be Shodan, that can be Shadowserver, that can be from our own local scans. This is fine and good, it gives us valuable telemetry on what's going on in our constituency.
>> 
>> But the fields needed to store this data in a consistent way are missing in the current iteration of the IDF. I don't want to pack that into extra.* or any non-standard field; I think we should come to an agreement on what those elements are and how they should be handled.
>> 
>> Here are some of the data fields I'm missing:
>> 
>> What did the scanning find?
>> 
>> * vendor
>> * product
>> * software version
>> 
>> Are there any problems with it?
>> 
>> * Vulnerability identifier (CVE or other, potentially multi-valued, which is another can of worms)
>> * Severity information (e.g. associated CVSS score)
>> * CWE IDs (https://cwe.mitre.org/index.html)
>> 
>> Generic tagging
>> 
>> * I see e.g. "iot", "ics" or similar ones popping up.
>> 
>> As a minimum I think we need "vendor", "product" and "vulnerability".
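
To make the field list above concrete, here is a minimal sketch of how such a scan finding could be modelled. Everything in it is an assumption for illustration: the class name `ScanFinding`, the attribute names, the `finding.` key prefix and the `;` join for multi-valued fields are hypothetical and not part of the current IDF or of any agreed IEP. The CVE IDs in the usage note are placeholders.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ScanFinding:
    """Hypothetical container for the fields listed above (not an IDF type)."""

    # What did the scanning find?
    vendor: Optional[str] = None
    product: Optional[str] = None
    version: Optional[str] = None
    # Are there any problems with it? Identifiers may be multi-valued.
    vulnerabilities: List[str] = field(default_factory=list)
    cvss_score: Optional[float] = None
    cwe_ids: List[str] = field(default_factory=list)
    # Generic tagging, e.g. "iot" or "ics".
    tags: List[str] = field(default_factory=list)

    def to_flat(self, prefix: str = "finding") -> dict:
        """Return dotted keys, skipping unset fields and joining lists with ';'."""
        flat = {}
        for name, value in asdict(self).items():
            if value is None or value == []:
                continue  # unset field: leave it out entirely
            if isinstance(value, list):
                value = ";".join(value)  # assumed encoding for multi-valued fields
            flat[f"{prefix}.{name}"] = value
        return flat
```

For example, `ScanFinding(vendor="ExampleVendor", product="ExampleServer", vulnerabilities=["CVE-2023-0001", "CVE-2023-0002"]).to_flat()` yields three keys, with both placeholder CVE IDs joined into one string. Whether multi-valued identifiers should be joined, repeated, or stored as a list is exactly the open question an IEP would have to settle.
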
>> 
>> What do you all think?
>> 
>> otmar
>> 
>> -----------
>> 
>> References:
>> 
>> https://datapedia.shodan.io/
>> 
>> There is e.g.:
>> 
>> os        string  Operating system
>> platform  string
>> product   string  Name of the software that powers the service.
>> vendor    string
>> 
>> 
>> https://www.shadowserver.org/what-we-do/network-reporting/vulnerable-http-report/
>> 
>> info on the vendor and CVEs is stored in the tag field.
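
As a rough illustration of why a combined tag field is awkward to work with, the sketch below separates CVE identifiers from the remaining tags. It assumes the tag field is a `;`-separated string, which is an assumption about the feed format and not taken from the report description; the function name `split_tags` is hypothetical. The only fixed piece is the usual CVE ID shape (`CVE-`, a four-digit year, then four or more digits).

```python
import re

# CVE IDs have the form CVE-YYYY-NNNN, with four or more digits in the last part.
CVE_RE = re.compile(r"^CVE-\d{4}-\d{4,}$", re.IGNORECASE)


def split_tags(tag_field: str, sep: str = ";"):
    """Split a combined tag string into (cve_ids, other_tags).

    The separator is an assumption; adjust it to the actual feed format.
    """
    cves, others = [], []
    for raw in tag_field.split(sep):
        tag = raw.strip()
        if not tag:
            continue  # skip empty entries such as ";;"
        if CVE_RE.match(tag):
            cves.append(tag.upper())  # normalise to the canonical upper-case form
        else:
            others.append(tag.lower())
    return cves, others
```

With dedicated fields for the vulnerability identifier and the vendor, this kind of guessing would not be needed: the remaining tags would still contain vendor names mixed with generic labels, which a regex cannot tell apart.
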
>> 
>> -- 
>> // Otmar Lendl <lendl at cert.at> - T: +43 1 5056416 711
>> // CERT Austria - https://www.cert.at/
>> // CERT.at GmbH, FB-Nr. 561772k, HG Wien
>> _______________________________________________
>> IntelMQ-dev mailing list
>> https://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev
>> https://intelmq.readthedocs.io/