Hi all,
In addition to the destination IP and port, malware sinkhole feeds commonly include information on the destination 'host'. This value is usually taken from the 'Host:' header of the HTTP request sent to the sinkhole by the infected client. The corresponding columns in the feeds are usually called 'dsthost', 'http_host' (e.g. Shadowserver HTTP-Sinkhole-Drone) or 'cc_dns' (Shadowserver Botnet-Drone). The name 'cc_dns' is probably misleading.
In most cases, the value of the destination host field in the feeds is an FQDN, but depending on how the malware sets the 'Host:' header in its HTTP requests, it may also be FQDN:Port, an IP address or IP:Port (or just some crap). Some malware families first perform a DNS lookup for the (DGA-based) C2 domain and then use only the IP address instead of the domain name in the HTTP requests.
I noticed that the IntelMQ parsers for the Shadowserver feeds simply try to map the value of 'http_host' to 'destination.fqdn', e.g.
https://github.com/certtools/intelmq/blob/master/intelmq/bots/parsers/shadow...
Line 295: ('destination.fqdn', 'http_host'),
The scripts we currently use for generating notifications to network owners use the following logic (pseudo code here) to extract the destination host information:
if (feed.dstip is a valid ip) {
    destination.ip = dstip
} else {
    discard event    # destination IP is required
}
dsthost = feed.http_host
dsthost =~ s/:[0-9]+$//    # remove potential port information (we already have this information in dstport)
if (dsthost is a fqdn) {
    destination.fqdn = dsthost
} else {
    do nothing    # dsthost is either an IP (which we already have in destination.ip) or useless data
}
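For illustration only, the pseudo code above could be sketched in Python roughly as follows. The function name `extract_destination`, the dict-based feed row and the simplified FQDN pattern are assumptions made for this sketch; it is not code taken from IntelMQ or from the scripts mentioned above.

```python
import ipaddress
import re

# Trailing ":<port>" as in the pseudo code (s/:[0-9]+$//).
_PORT_SUFFIX_RE = re.compile(r":[0-9]+$")
# Simplified FQDN check (assumption): dot-separated labels, and the last
# label must start with a letter so that plain IPv4 addresses do not match.
_FQDN_RE = re.compile(
    r"(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z][a-z0-9-]{0,61}[a-z0-9]",
    re.IGNORECASE,
)


def extract_destination(row):
    """Return destination fields for one feed row, or None to discard it."""
    try:
        dst_ip = str(ipaddress.ip_address((row.get("dstip") or "").strip()))
    except ValueError:
        return None  # destination IP is required

    event = {"destination.ip": dst_ip}

    # Remove potential port information; the port is already in dstport.
    dsthost = _PORT_SUFFIX_RE.sub("", (row.get("http_host") or "").strip())
    if _FQDN_RE.fullmatch(dsthost):
        event["destination.fqdn"] = dsthost.lower()
    # Otherwise dsthost is an IP (already in destination.ip) or useless data.
    return event
```

With this sketch, a row whose `http_host` is `example.com:8080` keeps `example.com` as destination.fqdn, while a row whose `http_host` is `192.0.2.1:80` keeps only destination.ip.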
I'd like to suggest using similar logic in the IntelMQ parsers for the Shadowserver (and other malware sinkhole) feeds, so that destination.fqdn information arriving in the form FQDN:Port is not missed, instead of writing the information to an extras field which is most probably not used by the output generators, or even dropping the complete event.
- Thomas
CERT-Bund Incident Response & Malware Analysis Team
Hi,
we have this IP vs. FQDN problem in several parsers, not only the Shadowserver ones. Stripping the port can be achieved simply by using a conversion function. But the main problem is IP vs. FQDN.
Instead of implementing the logic in many parsers, we could add this "intelligence" to the libs. One possibility: if the parser tries to add a value as an FQDN but it is actually an IP address, save it as the IP. But I don't like this simple approach, as this implicitness raises other problems. Another possibility: use a new "logical" (currently non-existent) field, e.g. `destination.host-info`; the same applies to source. If some data is added to this field, the data will be parsed and added to ip, fqdn, port (, network?).
Example 1: event['destination.host-info'] = 'example.com:8080' results in: {'destination.fqdn': 'example.com', 'destination.port': 8080}
Example 2: event['destination.host-info'] = '10.0.0.1' results in: {'destination.ip': '10.0.0.1'}
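A minimal sketch of how such a "host-info" value could be split, purely to make the idea concrete. The function `split_host_info`, its `prefix` parameter and both regular expressions are hypothetical; nothing like this is claimed to exist in the IntelMQ libs.

```python
import ipaddress
import re

# Simplified FQDN check (assumption): the last label must start with a letter.
_FQDN_RE = re.compile(
    r"(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z][a-z0-9-]{0,61}[a-z0-9]",
    re.IGNORECASE,
)
# "host:port", where host is a bracketed IPv6 literal or contains no colon.
_HOST_PORT_RE = re.compile(r"(\[[^\]]+\]|[^:]+):([0-9]{1,5})")


def split_host_info(value, prefix="destination"):
    """Split a host-info string into ip/fqdn/port keys; {} if unusable."""
    host, port = value.strip(), None
    match = _HOST_PORT_RE.fullmatch(host)
    if match:
        host, port = match.group(1), int(match.group(2))
    if host.startswith("[") and host.endswith("]"):
        host = host[1:-1]  # unwrap a bracketed IPv6 literal

    result = {}
    try:
        result[prefix + ".ip"] = str(ipaddress.ip_address(host))
    except ValueError:
        if not _FQDN_RE.fullmatch(host):
            return {}  # neither an IP address nor an FQDN
        result[prefix + ".fqdn"] = host.lower()
    if port is not None and 1 <= port <= 65535:
        result[prefix + ".port"] = port
    return result
```

Called with 'example.com:8080' this yields the fqdn and port keys from Example 1; called with '10.0.0.1' it yields only the ip key. Passing `prefix="source"` would cover the source side.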
Sebastian
Any comments on this proposal?
Intelmq-dev mailing list Intelmq-dev@lists.cert.at http://lists.cert.at/cgi-bin/mailman/listinfo/intelmq-dev
On 12 Apr 2017, at 12:54, Sebastian Wagner wagner@cert.at wrote:
> Hi,
> we have this IP vs. FQDN problem in several parsers, not only the Shadowserver ones. Stripping the port can be achieved simply by using a conversion function. But the main problem is IP vs. FQDN.
> Instead of implementing the logic in many parsers, we could add this "intelligence" to the libs.
I am not sure if I like that approach. Usually the particularities of the "messiness" are best placed in the parser, even if the logic repeats itself a bit amongst different parsers. We could of course have a function in lib/ to clear this up, but then each parser which thinks it needs that cleanup part must call the cleanup function in lib/. But: other parsers MUST NOT call that cleanup function.
Because the http host dest fields might contain something totally different (crap) in other feeds. So... I would *not* try to impose a default behaviour for all parsers here.
I believe the Shadowserver parser should be extended in the way Thomas suggested.
> One possibility: if the parser tries to add a value as an FQDN but it is actually an IP address, save it as the IP. But I don't like this simple approach, as this implicitness raises other problems.
yup
> Another possibility: use a new "logical" (currently non-existent) field, e.g. `destination.host-info`,
How about calling it destination.http_host?
> same applies to source. If some data is added to this field, the data will be parsed and added to ip, fqdn, port (, network?)
> Example 1: event['destination.host-info'] = 'example.com:8080' results in: {'destination.fqdn': 'example.com', 'destination.port': 8080}
> Example 2: event['destination.host-info'] = '10.0.0.1' results in: {'destination.ip': '10.0.0.1'}
But again, if you have destination.http_host there, then it would make sense to parse it and put the info into destination.ip, destination.port, etc.
> Sebastian
> --
> // Sebastian Wagner wagner@cert.at - T: +43 1 50564167201
> // CERT Austria - https://www.cert.at/
> // An initiative of nic.at GmbH - https://www.nic.at/
> // Company register number 172568b, Salzburg Regional Court
--
// L. Aaron Kaplan kaplan@cert.at - T: +43 1 5056416 78
// CERT Austria - https://www.cert.at/
// An initiative of nic.at GmbH - http://www.nic.at/
// Company register number 172568b, Salzburg Regional Court
On 06/08/2017 12:20 AM, L. Aaron Kaplan wrote:
> On 12 Apr 2017, at 12:54, Sebastian Wagner wagner@cert.at wrote:
>> Instead of implementing the logic in many parsers, we could add this "intelligence" to the libs.
> I am not sure if I like that approach. Usually the particularities of the "messiness" are best placed in the parser, even if the logic repeats itself a bit amongst different parsers. We could of course have a function in lib/ to clear this up, but then each parser which thinks it needs that cleanup part must call the cleanup function in lib/. But: other parsers MUST NOT call that cleanup function.
Yes, that was the intention of my proposal (and Thomas' too, I think).
> Because the http host dest fields might contain something totally different (crap) in other feeds. So... I would *not* try to impose a default behaviour for all parsers here.
As I wrote, I do not like magic either.
>> Another possibility: use a new "logical" (currently non-existent) field, e.g. `destination.host-info`,
> How about calling it destination.http_host?
How is this related to HTTP? FQDN, IP and Port apply to all protocols AFAIK.
>> same applies to source. If some data is added to this field, the data will be parsed and added to ip, fqdn, port (, network?)
>> Example 1: event['destination.host-info'] = 'example.com:8080' results in: {'destination.fqdn': 'example.com', 'destination.port': 8080}
>> Example 2: event['destination.host-info'] = '10.0.0.1' results in: {'destination.ip': '10.0.0.1'}
> But again, if you have destination.http_host there, then it would make sense to parse it and put the info into destination.ip, destination.port, etc.
Yes, that was the intention of my proposal.
Sebastian
I would favor Aaron's approach. A single function in lib, called by parsers on-demand.
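To make the "on demand" part concrete, here is a rough sketch of how a parser could opt in to a shared helper through its column mapping while other parsers stay untouched. The mapping layout is only loosely modelled on the ('destination.fqdn', 'http_host') tuple quoted earlier in the thread; `strip_port`, `apply_mapping` and the column names are hypothetical and not the actual IntelMQ API.

```python
import re


def strip_port(value):
    """Hypothetical shared helper: drop a trailing ':<port>' from a host value."""
    return re.sub(r":[0-9]+$", "", value.strip())


# (event key, feed column[, conversion function]); illustrative layout only.
SINKHOLE_MAPPING = [
    ("destination.ip", "dst_ip"),
    ("destination.fqdn", "http_host", strip_port),  # this parser opts in
]
OTHER_FEED_MAPPING = [
    ("destination.fqdn", "host"),  # this parser does not call the helper
]


def apply_mapping(row, mapping):
    """Build an event dict from a feed row according to a mapping table."""
    event = {}
    for key, column, *conversion in mapping:
        value = row.get(column)
        if value is None or value == "":
            continue
        # Apply the conversion only where the mapping asked for it.
        event[key] = conversion[0](value) if conversion else value
    return event
```

Only the sinkhole mapping strips the port; the other mapping passes its value through unchanged, so no default behaviour is imposed on all parsers.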
On Fri, Jun 9, 2017 at 7:36 PM, Sebastian Wagner wagner@cert.at wrote:
I opened https://github.com/certtools/intelmq/issues/1007 for this
On 04/12/2017 11:17 AM, Thomas Hungenberg wrote: