Dear list,
in pull request #944 (netlab 360 enh [0]) by navtej an issue came up
which can't be solved trivially:
The feed Netlab 360 DGA[1] - which is already included in intelmq -
provides a validity time frame for each domain. Most of those (~90%) end
in 2030 while the start date is the current day at 00:00.
So both start and end time are artificial, and the source claims the
event is valid in the future, which is very odd. Does it actually
make sense to forward this kind of information?
Also, we can't really handle this time information using the current
harmonization.
One idea would be to set time.source to time.observation if
time.source is in the future, so that time.source <= time.observation
always holds.
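A minimal sketch of that idea, assuming the event is a plain dict with
ISO 8601 timestamp strings (this is not IntelMQ's actual Message class, and
the function name clamp_time_source is made up for illustration):

```python
from datetime import datetime


def clamp_time_source(event: dict) -> dict:
    """Return a copy of the event in which time.source is never later
    than time.observation. Events lacking either field are returned as-is."""
    fixed = dict(event)
    source = fixed.get("time.source")
    observation = fixed.get("time.observation")
    if source is None or observation is None:
        return fixed
    # Assumption: both values are ISO 8601 strings with a UTC offset.
    if datetime.fromisoformat(source) > datetime.fromisoformat(observation):
        fixed["time.source"] = observation
    return fixed
```

With this, a domain whose validity is dated in 2030 would simply carry the
observation time as its source time, and the original end date would be lost
unless it is stored somewhere else.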
What do you think?
Sebastian
[0]: https://github.com/certtools/intelmq/pull/944
[1]: http://data.netlab.360.com/feeds/dga/dga.txt - attention, quite
big! The domains at the beginning have a very near end date.
--
// Sebastian Wagner <wagner(a)cert.at> - T: +43 1 5056416 7201
// CERT Austria - https://www.cert.at/
// Eine Initiative der nic.at GmbH - https://www.nic.at/
// Firmenbuchnummer 172568b, LG Salzburg
Hi,
I recently discovered that a lot of security analysts are actively
participating on Twitter. By participating I mean that they post
quite interesting data (@illegalFawn, for example). I thought that, even
if the amount of data posted there is not that large, it could
provide an interesting source of IoCs which traditional feeds might take a
long time to publish. To explore this, I played a bit with the official
Twitter REST API and produced a demo. I would like to get your feedback on it
and on what you think could be improved. The code can be found here:
https://codeshare.io/aVKXq9. So far the bot works like this: apart from the
necessary parameters for the Twitter API, it requires two lists of users. One
lists the accounts whose timelines will be processed (this is the feed-like
behaviour). The other lists the users who mark the interesting
tweets that should be downloaded (presumably the "owners" of the bot);
"mark" here means "like". This behaviour allows automatic collection of
data from accounts like the one I mentioned at the beginning, which post
feed-like information, and manual selection of interesting tweets from
accounts which post "various" things. The bot gets tweets in bulk: it
fetches all the timeline tweets and liked tweets and passes them on in one
concatenated report. I discussed this bot with Sebastian Wagner and he
pointed out some weaknesses of this approach, mainly regarding data and feed
classification. A better approach is probably to create a report for each
individual tweet, which eases the classification (which could then be done
using hashtags, if present). The
bot lacks a lot of comments and documentation, so ask away if some features
are not clear. Again, I'd like to get your feedback and opinions on this,
since I think it could be an interesting addition to the intelmq ecosystem.
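To illustrate the per-tweet approach, here is a rough sketch. It does not
call the Twitter API at all; it assumes the tweets have already been fetched
and are available as dicts with hypothetical "user" and "text" keys, and the
report layout is made up for the example:

```python
import re

# Matches hashtags such as "#phishing"; captures the tag without the "#".
HASHTAG = re.compile(r"#(\w+)")


def tweets_to_reports(tweets):
    """Build one report per tweet instead of a single concatenated report.
    The extracted hashtags can later drive the classification."""
    reports = []
    for tweet in tweets:
        text = tweet["text"]
        reports.append({
            "user": tweet["user"],
            "raw": text,
            "hashtags": [tag.lower() for tag in HASHTAG.findall(text)],
        })
    return reports
```

Tweets without any hashtag end up with an empty list, so they would still
need some default classification.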
Sincerely,
Václav Brůžek
Hi,
All I wanted in the first place was an easy way to find out whether an
event has any extra.* field, i.e. in Python: 'extra' in event.
But this does not work, because there is no such field, only extra.foo.
So, what about changing the __contains__ method of the Message class? It
could check for sub-fields if necessary.
With the (wrong) algorithm of checking whether some existing key starts with
(in this example) `extra`, it turns out this also leads to the convenient
situation that you can't add e.g. `extra.error.message` if
`extra.error` was previously defined.
And: the alienvault otx parser actually did this with extra.pulse.
This is good, but it also prevents adding `extra.errormessage` if
`extra.error` exists. The correct algorithm is to check whether some field
starts with `extra.`. But that check alone does not prevent
hierarchy conflicts.
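The difference between the two prefix checks can be sketched like this. The
helper names are made up for illustration, and the keys are just a list of
flat, dotted field names:

```python
def has_prefix_wrong(keys, group):
    # Wrong: matches any key that merely starts with the group name,
    # so "extra.errormessage" counts as belonging to "extra.error".
    return any(key.startswith(group) for key in keys)


def has_subfield(keys, group):
    # Correct: matches only real sub-fields, i.e. keys below "group.".
    return any(key.startswith(group + ".") for key in keys)
```

Note that has_subfield(["extra.error"], "extra.error") is False, because the
key itself is not one of its own sub-fields. That is exactly why this check
alone does not catch an existing `extra.error` when `extra.error.message` is
about to be added.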
IMO, we want both:
1) a simple way to check if any field of a group is (not) present
2) a way to prevent adding keys which conflict with others in the hierarchy,
i.e. `extra.error.message` must not be added if `extra.error` is present.
We could adapt the contains operator so that e.g. a search for `extra.`
means: "Is there any field in the extra.* domain?". It's an invalid key
name anyway[1].
And for adding new fields: check whether there is some value at a higher
level of the hierarchy; if so, fail. Overwriting is not allowed in this case.
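Both points together could look roughly like the following sketch. The class
Event is a hypothetical stand-in built on dict, not IntelMQ's Message class,
and rejecting the reverse direction (adding `extra.error` when
`extra.error.message` exists) is my own assumption:

```python
class Event(dict):
    """Hypothetical stand-in for a message with flat, dotted keys."""

    def __contains__(self, key):
        if key.endswith("."):
            # "extra." asks: is there any field in the extra.* domain?
            return any(existing.startswith(key) for existing in self.keys())
        return super().__contains__(key)

    def add(self, key, value):
        parts = key.split(".")
        # Fail if a higher level of the hierarchy already holds a value.
        for i in range(1, len(parts)):
            parent = ".".join(parts[:i])
            if super().__contains__(parent):
                raise KeyError(f"{key!r} conflicts with existing {parent!r}")
        # Assumption: the reverse direction is rejected as well, i.e. the
        # new key must not become the parent of already existing fields.
        if (key + ".") in self:
            raise KeyError(f"{key!r} conflicts with existing sub-fields")
        self[key] = value
```

Used like this, `'extra.' in event` answers the group question, while
`'extra' in event` keeps its current meaning of an exact key lookup.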
Any thoughts on this?
Sebastian
P.S.: In the long run we can probably store the fields hierarchically
internally too, which would avoid some of these shortcomings.
We also had problems with the hierarchy of malware.hash in the past:
https://github.com/certtools/intelmq/issues/732
[1]: actually it is allowed currently, but it shouldn't and does not
make sense, see https://github.com/certtools/intelmq/issues/1104