Dustin, All,
On 16.06.2016 12:27, Dustin Demuth wrote:
> The file contains a bunch of mappings of the feeds below. We are not sure if the mappings are correct.
> Can someone verify this and, if possible, remove the appropriate todos, or correct the mapping?
I've been maintaining parsers (in a different context) for Shadowserver feeds for the last 3 years. Based on that experience, a few comments:
* Don't assume that the field names will stay constant. Be prepared to support logic like "use 'ip' or 'srcip' for the IntelMQ 'source.ip'".
For the Drone feed, for example, I have the following mapping rules in our old system:
    # Mapping from local CSV column names to eventDB column names
    $self->{eventdb_map} = {
        asn          => "reported_asn",
        ip           => "src_ip",
        hostname     => "src_hostname",
        port         => "src_port",
        cc           => "dst_ip",
        cc_ip        => "dst_ip",
        cc_port      => "dst_port",
        cc_dns       => "dst_fqdn",
        timestamp    => "ts",
        url          => "dst_url",
        geo          => "reported_iso2cc",
        infection    => "malware",
        machine_name => "local_hostname",
        # older names
        "Timestamp"  => "ts",
        "Drone"      => "src_ip",
        "ASN"        => "reported_asn",
        "Geo"        => "reported_iso2cc",
        "Hostname"   => "src_hostname",
        "C&C"        => "dst_ip",
        "C&C DNS"    => "dst_fqdn",
        "C&C Port"   => "dst_port",
        "Infection"  => "malware",
    };
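Since IntelMQ itself is written in Python, the same "several possible column names per target field" idea could look roughly like the sketch below. This is only an illustration, not IntelMQ's actual parser API: FIELD_ALIASES and map_row are hypothetical names, and the target field names other than 'source.ip' are my assumption of what the corresponding IntelMQ fields would be.

```python
# Hypothetical alias table: for each target field, the feed column names
# that may carry it, preferred (current) name first, older names after.
FIELD_ALIASES = {
    "source.ip": ["ip", "srcip", "Drone"],
    "source.asn": ["asn", "ASN"],
    "source.port": ["port"],
    "destination.ip": ["cc_ip", "cc", "C&C"],
    "destination.port": ["cc_port", "C&C Port"],
    "time.source": ["timestamp", "Timestamp"],
}


def map_row(row, aliases=FIELD_ALIASES):
    """Return a new dict keyed by target field.

    For each target field, the first alias that is present in the row
    with a non-empty value wins; missing fields are simply left out.
    """
    event = {}
    for target, candidates in aliases.items():
        for name in candidates:
            value = row.get(name)
            if value not in (None, ""):
                event[target] = value
                break
    return event
```

With this, a row using the older header names ("Drone", "ASN", "C&C") and a row using the newer ones ("ip", "asn", "cc_ip") end up with the same target keys, so a renamed column only needs a new entry in the alias list.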
* I see you support a fixup function for each attribute. This is needed, but it may not be enough: sometimes you need to manipulate multiple fields together. For example, it varies by feed whether C&C URLs are transmitted as a full URL or split up into proto/port/hostname/path. If you want to unify these fields, a single function per attribute will not do.
Here is code from one of my parsers (not Shadowserver, this is for Virustracker) to demonstrate this point.
    if (exists($row->{reported_asn})) {
        $row->{reported_asn} =~ s/^AS(\d+)\s*.*/$1/;
    }

    if (($row->{Type} eq 'HTTP') and $row->{RequestPath} and $row->{dst_fqdn} and
        $row->{"dst_port"} and ($row->{"dst_port"} =~ /^(?:80|443)$/)) {
        $row->{RequestPath} =~ s,^/?,/,;    # make sure request starts with /
        $row->{"dst_url"} = (($row->{"dst_port"} eq '443') ? 'https://' : 'http://') .
            ($row->{dst_fqdn} ? $row->{dst_fqdn} : "") .
            ($row->{RequestPath} ? $row->{RequestPath} : "");
    }

    # for udp p2p botnets, the destination IP address is encoded in the Domain parameter
    if (($row->{Type} =~ /P2P/) and $row->{dst_fqdn} and
        ($row->{"dst_fqdn"} =~ /^([\d.]+):/)) {
        $row->{"dst_ip"} = $1;
        delete($row->{"dst_fqdn"});
    }

    # move IP-addresses from fqdn to ip field
    if ($row->{dst_fqdn} and ($row->{"dst_fqdn"} =~ /^([\d.]+)$/)) {
        $row->{"dst_ip"} = $1;
        delete($row->{"dst_fqdn"});
    }
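For an IntelMQ parser, which would be Python, a row-level hook that runs after the per-attribute fixups might look like the sketch below. It ports two of the rules above (URL assembly and moving bare IPs out of the hostname field). The function name fixup_row is hypothetical and the row keys are just the ones from my Perl code; nothing here is existing IntelMQ API.

```python
import re

# Same loose check as the Perl code: digits and dots only.
# It does not validate that the string is a well-formed IPv4 address.
_IPV4_LIKE = re.compile(r"[\d.]+")


def fixup_row(row):
    """Hypothetical row-level hook: adjust several fields together, in place."""
    fqdn = row.get("dst_fqdn")
    port = str(row.get("dst_port", ""))
    path = row.get("RequestPath")

    # Assemble a full URL when the feed delivers host, port and path separately.
    if row.get("Type") == "HTTP" and path and fqdn and port in ("80", "443"):
        if not path.startswith("/"):
            path = "/" + path  # make sure the request path starts with /
        scheme = "https://" if port == "443" else "http://"
        row["dst_url"] = scheme + fqdn + path

    # Move a bare IP address from the hostname field to the IP field.
    if fqdn and _IPV4_LIKE.fullmatch(fqdn):
        row["dst_ip"] = fqdn
        del row["dst_fqdn"]

    return row
```

The point is the signature: the hook receives the whole row, so it can read one field to decide what to do with another, which a per-attribute function cannot.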
HTH,
otmar