[Intelmq-dev] Destination host in malware feeds
Thomas Hungenberg
th at cert-bund.de
Wed Apr 12 11:17:33 CEST 2017
Hi all,
in addition to the destination IP and port, malware sinkhole feeds commonly also
include information on the destination 'host'. This value is usually taken
from the 'Host:' header of the HTTP request sent to the sinkhole by the
infected client. The corresponding columns in the feeds are usually called
'dsthost', 'http_host' (e.g. Shadowserver HTTP-Sinkhole-Drone) or 'cc_dns'
(Shadowserver Botnet-Drone). The name 'cc_dns' is probably misleading.
In most cases, the value of the destination host field in the feeds is a FQDN -
but depending on how the malware sets up the "Host:" header in the HTTP requests,
it may also be FQDN:Port, an IP address or IP:Port (or just some crap).
Some malware families perform a DNS lookup for the (DGA based) C2 domain first
and then only use the IP address in the HTTP requests instead of the domain name.
I noticed that the IntelMQ parsers for the Shadowserver feeds simply try to
map the value of 'http_host' to 'destination.fqdn', e.g.
https://github.com/certtools/intelmq/blob/master/intelmq/bots/parsers/shadowserver/config.py
Line 295: ('destination.fqdn', 'http_host'),
The scripts we currently use for generating notifications to network owners
use the following logic (pseudo code here) to extract the destination host information:
    if (feed.dstip is a valid ip) {
        destination.ip = dstip
    }
    else {
        discard event    # destination IP is required
    }

    dsthost = feed.http_host
    # remove potential port information (we already have this information in dstport)
    dsthost =~ s/:[0-9]+$//

    if (dsthost is a fqdn) {
        destination.fqdn = dsthost
    }
    else {
        do nothing    # dsthost is either an IP (which we already have in destination.ip) or useless data
    }
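For illustration, the logic above could be sketched in Python roughly as follows.
This is only a minimal sketch, not the actual IntelMQ parser code: the helper names
(is_valid_ip, is_fqdn, extract_destination) are made up for this example, and the
FQDN check is a simplified assumption (at least two dot-separated labels consisting
of letters, digits and hyphens, last label not purely numeric).

```python
import ipaddress
import re

# Hypothetical helpers for illustration; none of these names exist in IntelMQ.
PORT_SUFFIX = re.compile(r":[0-9]+$")
LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def is_fqdn(value):
    """Simplified FQDN check (an assumption, not a full RFC validation)."""
    if value.endswith("."):
        value = value[:-1]  # tolerate a single trailing root dot
    if not value or len(value) > 253 or is_valid_ip(value):
        return False
    labels = value.split(".")
    if len(labels) < 2 or labels[-1].isdigit():
        return False
    return all(LABEL.fullmatch(label) for label in labels)


def extract_destination(dstip, http_host):
    """Return a dict of destination fields, or None to discard the event."""
    if not is_valid_ip(dstip):
        return None  # destination IP is required
    event = {"destination.ip": dstip}
    # Remove potential port information (already available in dstport).
    dsthost = PORT_SUFFIX.sub("", (http_host or "").strip())
    if is_fqdn(dsthost):
        event["destination.fqdn"] = dsthost.rstrip(".").lower()
    # Otherwise dsthost is an IP (already in destination.ip) or useless data.
    return event
```

With this sketch, a feed row with http_host "evil.example.com:8080" would yield
destination.fqdn "evil.example.com", while "192.0.2.1:80" would only set destination.ip.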
I'd like to suggest using similar logic in the IntelMQ parsers for the Shadowserver
(and other malware sinkhole) feeds, so that destination.fqdn information which comes
in the form FQDN:Port is not missed. Currently such a value is either written to an
extras field, which is most probably not used by the output generators, or the
complete event is dropped.
- Thomas
CERT-Bund Incident Response & Malware Analysis Team