Dear list,
in pull request #944 (netlab 360 enh [0]) by navtej an issue came up
which can't be solved trivially:
The feed Netlab 360 DGA[1] - which is already included in intelmq -
provides a validity time frame for each domain. Most of those (~90%) end
in 2030 while the start date is the current day at 00:00.
So both start and end time are artificial. And the source claims the
event is valid in the future, which is very odd. And does it actually
make sense to forward this kind of information?
Also, we can't really handle this time information using the current
harmonization.
One idea would be to set time.source to time.observation if the
time.source is in the future. That way, time.source <= time.observation
always holds.
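A minimal sketch of that idea, assuming both values are already parsed into
timezone-aware datetime objects. The helper name clamp_source_time is made up
for illustration and is not existing IntelMQ code:

```python
from datetime import datetime, timezone


def clamp_source_time(time_source: datetime, time_observation: datetime) -> datetime:
    """Return time_source, capped at time_observation if it lies in the future.

    Hypothetical helper: both arguments are assumed to be timezone-aware.
    """
    # min() keeps past timestamps untouched and replaces future ones,
    # so the result is never later than time_observation.
    return min(time_source, time_observation)


observation = datetime(2017, 4, 20, 12, 0, tzinfo=timezone.utc)
future_end = datetime(2030, 1, 1, tzinfo=timezone.utc)
past_start = datetime(2017, 4, 20, 0, 0, tzinfo=timezone.utc)

clamped_future = clamp_source_time(future_end, observation)
clamped_past = clamp_source_time(past_start, observation)
```

Note that this throws away the feed's original end date; whether that
information should be kept somewhere else is still the open question.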
What do you think?
Sebastian
[0]: https://github.com/certtools/intelmq/pull/944
[1]: http://data.netlab.360.com/feeds/dga/dga.txt - attention, quite
big! The domains at the beginning have a very near end date.
--
// Sebastian Wagner <wagner(a)cert.at> - T: +43 1 5056416 7201
// CERT Austria - https://www.cert.at/
// Eine Initiative der nic.at GmbH - https://www.nic.at/
// Firmenbuchnummer 172568b, LG Salzburg
= Intelmq-dev-news 05-2016 =
Issue 5/2016
== Topics ==
# Summary of IHAP meeting in April
# Status update Intevation
# Status update CERT.at
# Status update misc
== May 2016 ==
Dear Intelmq-dev mailing list readers,
this is the second issue of intelmq developer news.
We hope it's useful.
TL;DR and important changes
-----------------------------
The syntax of intelmqctl was changed to a new format:
intelmqctl {start,stop} bot_id.
This breaks compatibility with existing scripts.
If you put intelmqctl into some script, please adapt it. Also
please be sure to check out the latest version of the intelmq-manager
in case you use it.
Lots of open issues. Progress with intelmqcli (to connect postgresql to the RT ticket system).
/ TL;DR
=== How to contribute to this newsletter? ===
-> contact Aaron, Dustin for future input
=== Summary of IHAP meeting in April ===
In April the IHAP Meeting took place in Vienna.
* A hack session the night before the meeting was used by Raphael and Aaron in
order to bridge MISP and IntelMQ.
* Connections between Abusehelper and IntelMQ are on some CERTs' wish lists.
XMPP is a good start. Unfortunately the XMPP bot upstream was not fit for
production.
=== Status report Intevation ===
* Still working on the KontaktDB; we appreciate the discussions that started
at the IHAP meeting.
We received a pull request from CERT.at and are currently reviewing it.
* We have scripts to import data into the KontaktDB.
Nevertheless there is some work left.
* Demonstrated installation from packages on Ubuntu 14.04 at the IHAP meeting.
We propose to host the **signed** packages on our public apt-repositories.
* Working on a tool similar to intelmqcli, intended to process events from
the eventdb. Instead of using RT they are sent by e-mail.
The tool has the working title "event-processor" and can be found here
(https://github.com/Intevation/event-processor)
* We did not start with support for IODEF or X-ARF yet.
=== Status report CERT.at developments ===
* We moved to python3 only. IntelMQ dropped python2 support
(https://github.com/certtools/intelmq/commit/2cbb42f1458a7e90539a443ec5e50ee…).
This does not apply yet to the certat repo (github.com/certat/intelmq), which
still supports python2.7 but only for the intelmqcli tool.
* New active contributor: pedro m. reis! Welcome and thanks for working so
hard on the Bitsight collector
(https://github.com/certtools/intelmq/pull/493)
* intelmqcli tool now supports a lot of new flags:
https://github.com/certat/intelmq/issues/52 This was necessary for CERT.at
since we use intelmqcli via cron job to connect to the (postgresql) eventDB,
pull out all of the new data and use RT (ticket system) to send stuff out.
Added flags --quiet --batch. Now intelmqcli sends via cronjob.
These flags now allow CERT.at to run intelmq in full auto-mode! intelmqcli is
started via cron and sends out all events to all ISPs.
=== Requests ===
* Intevation searches for testers for the packages.
* We'd like to have some nice graphs in the intelmq-manager: events/sec, parse-failures/sec, etc.
* implementation of whitelisting of events (filter out events based on whitelists). See
https://github.com/certtools/intelmq/issues/426
* A good CSS design for the web page
=== Community ===
* RIPE abuse-c contacts can be done locally. RIPE might be able to export
abuse-c infos publicly (fingers crossed).
* more command line options for intelmqcli (see the
https://github.com/certat/intelmq repo)
* Aaron gave a presentation at the ENISA workshop "CSIRTs in Europe", 11th of May.
Slides will be shared on the ENISA page.
==== intelmq.org ====
The website intelmq.org is now online, but we would like to have more content and a proper
design.
Do you want to contribute to intelmq, but you are not a programmer? This is
your chance!
Current ToDos:
* Create Website Content: How-Tos / Installation Instructions, Success
Stories
** How-Tos / Instructions: If you are using a special feature of IntelMQ, for
instance an expert bot, try to find some time to write a short article on
how you managed to get it to work and why you are using it.
* Website Design
== Wishlist ==
* **we need more test-cases!!!**
* a specific config logic for ASNs: do this and that (for example set ttl =
1 month) if event is in ASN xyz. Or "ignore" if event is in ASN xyz. This
should support some kind of more-specific-less-specific inheritance,
similarly to Apache directory settings. The most specific setting wins. The
order could be: country code -> ASN -> netblock -> ip (/32). Open questions:
what's more relevant if both domains and numbers (ASN, IPs, net blocks) exist
in an event?
* block based processing: for example block based team cymru lookups
* parallelisation: We need to revisit this topic
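The "most specific setting wins" item in the wishlist above could be sketched
roughly as follows. This is only an illustration under assumptions: the rule
table layout, the function name resolve_settings and the example values are
all hypothetical, and a single IP is treated as a /32 netblock. The open
question about events that carry both domains and numbers is not addressed.

```python
import ipaddress

# Hypothetical rule tables, listed from least to most specific.
RULES = {
    "country": {"AT": {"ttl": "1 week"}},
    "asn": {64496: {"ttl": "1 month"}},
    "netblock": {"192.0.2.0/24": {"action": "ignore"}},
}


def resolve_settings(event: dict, rules: dict) -> dict:
    """Merge matching rules in the order country code -> ASN -> netblock -> IP.

    Later (more specific) levels overwrite keys set by earlier levels.
    """
    settings = {}
    settings.update(rules.get("country", {}).get(event.get("source.geolocation.cc"), {}))
    settings.update(rules.get("asn", {}).get(event.get("source.asn"), {}))

    ip = event.get("source.ip")
    if ip:
        addr = ipaddress.ip_address(ip)
        matching = [
            (net, rule)
            for net, rule in (
                (ipaddress.ip_network(cidr), rule)
                for cidr, rule in rules.get("netblock", {}).items()
            )
            if addr in net
        ]
        # Apply shorter prefixes first so the longest prefix (up to /32) wins.
        for _net, rule in sorted(matching, key=lambda item: item[0].prefixlen):
            settings.update(rule)
    return settings


event = {"source.geolocation.cc": "AT", "source.asn": 64496, "source.ip": "192.0.2.10"}
result = resolve_settings(event, RULES)
```

In this example the ASN rule overrides the country-level ttl, and the
netblock rule adds the "ignore" action on top.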
== Important Discussions ==
In case you missed something, here are the headlines of some discussions we
consider interesting / important.
=== Mailing Lists ===
* [Intelmq-dev] Packaging Strategy for Bots with dependencies
* [Intelmq-dev] Discussion on intelmq output / transformation architecture
* [Intelmq-dev] Output format to syslog/splunk (PR#503)
== Communication ==
Chat: irc #intelmq on freenode or webchat:
[[https://webchat.freenode.net/?channels=intelmq]]
Follow on twitter: @intelmqorg
Weekly Conference Call every Tuesday: Dial in via the known conference bridge number. It is
[[https://en.wikipedia.org/wiki/Telephone_number_mapping|ENUM]] enabled. Ask
Aaron or Dustin for the number if you want to participate.
Hi all,
in addition to the destination IP and port, malware sinkhole feeds commonly also
include information on the destination 'host'. This value is usually taken
from the 'Host:' header of the HTTP request sent to the sinkhole by the
infected client. The corresponding columns in the feeds are usually called
'dsthost', 'http_host' (e.g. Shadowserver HTTP-Sinkhole-Drone) or 'cc_dns'
(Shadowserver Botnet-Drone). The name 'cc_dns' is probably misleading.
In most cases, the value of the destination host field in the feeds is a FQDN -
but depending on how the malware sets up the "Host:" header in the HTTP requests,
it may also be FQDN:Port, an IP address or IP:Port (or just some crap).
Some malware families perform a DNS lookup for the (DGA based) C2 domain first
and then only use the IP address in the HTTP requests instead of the domain name.
I noticed that the IntelMQ parsers for the Shadowserver feeds simply try to
map the value of 'http_host' to 'destination.fqdn', e.g.
https://github.com/certtools/intelmq/blob/master/intelmq/bots/parsers/shado…
Line 295: ('destination.fqdn', 'http_host'),
The scripts we currently use for generating notifications to network owners
use the following logic (pseudo code here) to extract the destination host information:
if (feed.dstip is a valid ip) {
    destination.ip = dstip
}
else {
    discard event # destination IP is required
}
dsthost = feed.http_host
dsthost =~ s/:[0-9]+$// # remove potential port information (we already have this information in dstport)
if (dsthost is a fqdn) {
    destination.fqdn = dsthost
}
else {
    do nothing # dsthost is either an IP (which we already have in destination.ip) or useless data
}
I'd like to suggest using similar logic in the IntelMQ parsers for the Shadowserver
(and other malware sinkhole) feeds so that we do not miss destination.fqdn information
which comes in the form FQDN:Port - instead of writing the information to an extras
field which is most probably not used by the output generators, or even dropping the
complete event.
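For illustration, the logic above might look roughly like this in Python. It
is only a sketch and not the actual parser code: the function name
parse_destination, the dict-based event and the simplified FQDN pattern are
assumptions made for this example.

```python
import ipaddress
import re
from typing import Optional

# Simplified FQDN check (assumption): dot-separated labels of letters, digits
# and hyphens, with a last label that starts with a letter (so plain IPv4
# addresses do not match).
FQDN_RE = re.compile(
    r"^(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+"
    r"[a-z][a-z0-9-]{0,61}[a-z0-9]$",
    re.IGNORECASE,
)


def parse_destination(dstip: str, http_host: str) -> Optional[dict]:
    """Return destination fields for one feed row, or None to discard it."""
    try:
        ip = ipaddress.ip_address(dstip.strip())
    except ValueError:
        return None  # destination IP is required

    event = {"destination.ip": str(ip)}

    # Remove potential port information (already available in dstport).
    host = re.sub(r":[0-9]+$", "", http_host.strip())
    if FQDN_RE.match(host):
        event["destination.fqdn"] = host
    # Otherwise the host is an IP (already in destination.ip) or useless data.
    return event


with_port = parse_destination("192.0.2.1", "example.com:8080")
ip_as_host = parse_destination("192.0.2.1", "198.51.100.7:80")
bad_ip = parse_destination("not-an-ip", "example.com")
```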
- Thomas
CERT-Bund Incident Response & Malware Analysis Team
Hi IntelMQ-ML,
Before opening the discussion, I think it is important to make a few notes:
1) The topic we are opening for discussion has already been discussed between me, Aaron and Sebastian in order to understand the impact of the possible changes, the effort required, the complexity, the overall picture of how it should be implemented, etc. Each of us now has an idea and perspective on it, and it is crucial to have the community involved from now on in order to agree on the way to proceed.
2) The proposal shared here is my own perspective but NOT only my work: much of the structure and many of the technical details were only possible thanks to contributions from Sebastian and Aaron (thank you, Aaron and Sebastian), even if they might see room to do some specific details in another/better way. This thread will be a good place to hear everyone's perspective. :)
3) IMPORTANT: This proposal is just a proposal and does NOT mean that it will be implemented in this way. It is only a basis for discussion, if it helps.
About the Proposal:
---------------------------
The proposal is available at the following link and tries to be clear for readers, although some details may have been left out (by mistake), which will raise some questions.
Proposal: https://github.com/SYNchroACK/intelmq/blob/proposal/docs/proposal.md
We are starting this discussion for two main reasons (I guess):
1) There is a need to configure bots to execute only at specific times; therefore, it seems there is a requirement to configure a bot in different run modes, in this proposal: scheduled and continuous (see the proposal for more details).
2) IntelMQ is now being used by multiple teams and requires stability at run time, etc. It seems that there is a need for integration with tools like systemd.
Please, if you have time, read the proposal and write your thoughts to the mailing list, split into "What you like" and "What you don't like".
I hope that I didn't forget to mention anything important... :)
Thank you in advance,
Regards