Maltrail: Malicious Traffic Detection System

Project Overview

Maltrail is a lightweight malicious traffic detection system. It works by collecting open-source blacklists ("trails") of malicious indicators (IPs, domains, and URLs), capturing traffic on the machine being monitored, and matching that traffic against the trails; every hit is then displayed on its web interface.

Project GitHub Repository

https://github.com/stamparm/maltrail

Project Architecture

(Figure: Maltrail system architecture)
The system follows a Traffic -> Sensor <-> Server <-> Client architecture; a minimal configuration sketch follows the component list:
Sensor: captures network traffic, keeps the malicious trail data up to date, and performs the matching/detection
Server: provides the web interface and collects the malicious traffic events reported by sensors
Client: browses and searches the reported malicious traffic events
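
How the pieces find each other is configured in maltrail.conf. The snippet below is only an illustrative sketch using the stock option names (HTTP_ADDRESS, HTTP_PORT, UDP_ADDRESS, UDP_PORT, MONITOR_INTERFACE, USE_FEED_UPDATES, LOG_SERVER); all addresses are placeholders for a deployment where the sensor and the server run on different machines:

# server side: web interface for clients plus a UDP collector
# that receives events from remote sensors
HTTP_ADDRESS 0.0.0.0
HTTP_PORT 8338
UDP_ADDRESS 0.0.0.0
UDP_PORT 8337

# sensor side: which interface to capture on, whether to pull the
# real-time feeds, and where to report detected events (placeholder address)
MONITOR_INTERFACE any
USE_FEED_UPDATES true
LOG_SERVER 192.168.1.10:8337

When sensor and server run on the same host, as in the quick start below, the defaults can be left untouched.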

Project Data Sets

The blacklists collected by Maltrail fall into two broad categories:

  1. Built-in static lists (malicious indicators gathered from malware reports, academic papers, and personal research notes), covering the following malware entities:

aboc, adylkuzz, agaadex, alienspy, almalocker, alureon, android_acecard,
android_adrd, android_alienspy, android_arspam, android_backflash,
android_basebridge, android_boxer, android_chuli, android_claco,
android_coolreaper, android_counterclank, android_cyberwurx,
android_dendoroid, android_dougalek, android_droidjack,
android_droidkungfu, android_enesoluty, android_ewalls, android_ewind,
android_exprespam, android_fakebanco, android_fakedown, android_fakeinst,
android_fakelog, android_fakemart, android_fakemrat, android_fakeneflic,
android_fakesecsuit, android_feabme, android_flexispy, android_frogonal,
android_geinimi, android_ghostpush, android_ginmaster, android_gmaster,
android_godwon, android_golddream, android_gonesixty, android_ibanking,
android_kemoge, android_lockdroid, android_lovetrap, android_maistealer,
android_maxit, android_oneclickfraud, android_opfake,
android_ozotshielder, android_pikspam, android_pjapps, android_qdplugin,
android_repane, android_roidsec, android_samsapo, android_sandorat,
android_selfmite, android_simplocker, android_skullkey, android_sndapps,
android_spytekcell, android_stealer, android_stels, android_teelog,
android_tetus, android_tonclank, android_torec, android_uracto,
android_usbcleaver, android_walkinwat, android_windseeker, android_wirex,
android_xavirad, android_zertsecurity, andromem, androm, angler, anuna,
apt_adwind, apt_aridviper, apt_babar, apt_bisonal, apt_blackenergy,
apt_blackvine, apt_bookworm, apt_carbanak, apt_careto, apt_casper,
apt_chches, apt_cleaver, apt_copykittens, apt_cosmicduke, apt_darkhotel,
apt_darkhydrus, apt_desertfalcon, apt_dragonok, apt_dukes,
apt_equationgroup, apt_fin4, apt_finfisher, apt_gamaredon, apt_gaza,
apt_gref, apt_groundbait, apt_htran, apt_ke3chang, apt_lazarus,
apt_lotusblossom, apt_magichound, apt_menupass, apt_miniduke, apt_naikon,
apt_nettraveler, apt_newsbeef, apt_oceanlotus, apt_pegasus, apt_potao,
apt_quasar, apt_redoctober, apt_russiandoll, apt_sauron, apt_scarletmimic,
apt_scieron, apt_shamoon, apt_snake, apt_snowman, apt_sobaken, apt_sofacy,
apt_stealthfalcon, apt_stonedrill, apt_stuxnet, apt_tibet, apt_turla,
apt_tvrms, apt_volatilecedar, apt_waterbug, apt_weakestlink, apt_xagent,
arec, artro, autoit, avalanche, avrecon, axpergle, azorult, bachosens,
badblock, balamid, bamital, bankapol, bankpatch, banloa, banprox, bayrob,
bedep, blackshades, blockbuster, bredolab, bubnix, bucriv, buterat,
calfbot, camerashy, carbanak, carberp, cerber, changeup, chanitor, chekua,
cheshire, chewbacca, chisbur, cloudatlas, cobalt, conficker, contopee,
corebot, couponarific, criakl, cridex, crilock, cryakl, cryptinfinite,
cryptodefense, cryptolocker, cryptowall, ctblocker, cutwail, defru,
destory, dircrypt, dmalocker, dnsbirthday, dnschanger, dnsmessenger,
dnstrojan, dorifel, dorkbot, dragonok, drapion, dridex, dropnak, dursg,
dyreza, elf_aidra, elf_billgates, elf_darlloz, elf_ekoms, elf_groundhog,
elf_hacked_mint, elf_mayhem, elf_mokes, elf_pinscan, elf_rekoobe,
elf_shelldos, elf_slexec, elf_sshscan, elf_themoon, elf_turla, elf_xnote,
elf_xorddos, elpman, emogen, emotet, evilbunny, expiro, fakben, fakeav,
fakeran, fantom, fareit, fbi_ransomware, fiexp, fignotok, filespider,
findpos, fireball, fraudload, fynloski, fysna, gamarue, gandcrab, gauss,
gbot, generic, glupteba, goldfin, golroted, gozi, hacking_team, harnig,
hawkeye, helompy, hiloti, hinired, immortal, injecto, invisimole,
ios_keyraider, ios_muda, ios_oneclickfraud, ios_specter, ios_xcodeghost,
iron, ismdoor, jenxcus, kegotip, kingslayer, kolab, koobface, korgo,
korplug, kovter, kradellsh, kronos, kulekmoko, locky, lollipop, luckycat,
majikpos, malwaremustdie.org.csv, marsjoke, matsnu, mdrop, mebroot,
mestep, misogow, miuref, modpos, morto, nanocor, nbot, necurs, nemeot,
neshuta, netwire, neurevt, nexlogger, nigelthorn, nivdort, njrat,
nonbolqu, notpetya, nuclear, nuqel, nwt, nymaim, odcodc, oficla, onkods,
optima, osx_keranger, osx_keydnap, osx_mami, osx_mughthesec, osx_salgorea,
osx_wirelurker, padcrypt, palevo, parasite, paycrypt, pdfjsc, pepperat,
pghost, phytob, picgoo, pift, plagent, plugx, ponmocup, poshcoder,
powelike, proslikefan, pushdo, pykspa, qakbot, rajump, ramnit, ransirac,
reactorbot, redsip, remcos, renocide, reveton, revetrat, rincux, rovnix,
runforestrun, rustock, sage, sakurel, sality, satana, sathurbot, satori,
scarcruft, seaduke, sefnit, selfdel, shifu, shimrat, shylock, siesta,
silentbrute, silly, simda, sinkhole_abuse, sinkhole_anubis,
sinkhole_arbor, sinkhole_bitdefender, sinkhole_blacklab,
sinkhole_botnethunter, sinkhole_certgovau, sinkhole_certpl,
sinkhole_checkpoint, sinkhole_cirtdk, sinkhole_conficker,
sinkhole_cryptolocker, sinkhole_drweb, sinkhole_dynadot, sinkhole_dyre,
sinkhole_farsight, sinkhole_fbizeus, sinkhole_fitsec, sinkhole_fnord,
sinkhole_gameoverzeus, sinkhole_georgiatech, sinkhole_gladtech,
sinkhole_honeybot, sinkhole_kaspersky, sinkhole_microsoft, sinkhole_rsa,
sinkhole_secureworks, sinkhole_shadowserver, sinkhole_sidnlabs,
sinkhole_sinkdns, sinkhole_sugarbucket, sinkhole_supportintel,
sinkhole_tech, sinkhole_tsway, sinkhole_unknown, sinkhole_virustracker,
sinkhole_wapacklabs, sinkhole_xaayda, sinkhole_yourtrap,
sinkhole_zinkhole, skeeyah, skynet, skyper, smokeloader, smsfakesky,
snifula, snort.org.csv, sockrat, sohanad, spyeye, stabuniq, synolocker,
tdss, teamspy, teerac, teslacrypt, themida, tinba, torpig, torrentlocker,
troldesh, tupym, unruy, upatre, utoti, vawtrak, vbcheman, vinderuf,
virtum, virut, vittalia, vobfus, vundo, waledac, wannacry, waprox, wecorl,
wecoym, wndred, xadupi, xpay, xtrat, yenibot, yimfoca, zaletelly, zcrypt,
zemot, zeroaccess, zeus, zherotee, zlader, zlob, zombrari, zxshell,
zyklon, etc.

  2. Real-time feeds (malicious indicators downloaded from open-source blacklist sites), drawn from the following data sources:

360chinad, 360conficker, 360cryptolocker, 360gameover, 360locky,
360necurs, 360tofsee, 360virut, alienvault, atmos, badips,
bambenekconsultingc2dns, bambenekconsultingc2ip, bambenekconsultingdga,
bitcoinnodes, blackbook, blocklist, botscout, bruteforceblocker, ciarmy,
cruzit, cybercrimetracker, dataplane, dshielddns, dshieldip, emergingthreatsbot,
emergingthreatscip, emergingthreatsdns, feodotrackerdns, feodotrackerip,
greensnow, loki, malc0de, malwaredomainlistdns, malwaredomainlistip,
malwaredomains, malwarepatrol, maxmind, myip, nothink, openphish,
palevotracker, policeman, pony, proxylists, proxyrss, proxyspy,
ransomwaretrackerdns, ransomwaretrackerip, ransomwaretrackerurl,
riproxies, rutgers, sblam, socksproxy, sslipbl, sslproxies,
talosintelligence, torproject, torstatus, turris, urlvir, voipbl, vxvault,
zeustrackerdns, zeustrackerip, zeustrackermonitor, zeustrackerurl, etc.

Running Maltrail

  1. Download and install Maltrail (packet capture requires the python-pcapy dependency, so install it first):

sudo apt-get install git python-pcapy
git clone https://github.com/stamparm/maltrail.git

  2. Start the sensor to capture traffic and run the malicious-trail detection:

cd maltrail
sudo python sensor.py

On the first start, the sensor automatically downloads the feeds from the internet and generates trails.csv, ipcat.csv and ipcat.sqlite under ~/.maltrail/; a quick way to inspect the generated trails.csv is sketched below.
(Figure: sensor startup output)
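
trails.csv is a plain three-column CSV: the trail itself, its info label, and the reference (the feed it came from), matching the csv.writer call shown later in update.py. A minimal sketch for peeking at the first few rows, assuming the default ~/.maltrail/trails.csv location:

import csv
import itertools
import os

# default location of the generated file (assumption; adjust if yours differs)
TRAILS_FILE = os.path.expanduser("~/.maltrail/trails.csv")

with open(TRAILS_FILE) as f:
    # each row is (trail, info, reference), e.g. a domain, its label and its source feed
    for row in itertools.islice(csv.reader(f), 10):
        if len(row) == 3:
            print("%-45s %-35s %s" % tuple(row))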

  3. Start the server to run the web interface:

cd maltrail
python server.py

Once it is running, browse to http://0.0.0.0:8338/ (port 8338 on the server's address) and log in with the default credentials admin:changeme!
(Figure: server startup output)

Extending the Feeds

Maltrail subscribes to a variety of malicious-sample sites on the internet; all of the collector scripts live in the $MALTRAIL_HOME/trails/feeds directory.
(Figure: feeds directory)
Taking the 360chinad.py feed as an example, the code looks like this:

#!/usr/bin/env python

"""
Copyright (c) 2014-2019 Maltrail developers (https://github.com/stamparm/maltrail/)
See the file 'LICENSE' for copying permission
"""

import re

from core.common import retrieve_content

__url__ = "https://data.netlab.360.com/feeds/dga/chinad.txt"
__check__ = "netlab 360"
__info__ = "chinad dga (malware)"
__reference__ = "360.com"

def fetch():
    retval = {}
    content = retrieve_content(__url__)

    if __check__ in content:
        for match in re.finditer(r"(?m)^([\w.]+)\s+2\d{3}\-", content):
            retval[match.group(1)] = (__info__, __reference__)

    return retval

__url__        the URL of the feed to subscribe to
__check__      a string that the downloaded content must contain
__info__       the label attached to each sample (trail)
__reference__  the source site's domain

  fetch() is the data-collection entry point.

  content = retrieve_content(__url__) downloads the feed content via urllib.

  if __check__ in content validates the downloaded content against the check string; if the check fails, the feed is skipped.

  retval is a dictionary whose keys are the trails themselves (an IP, URL or domain) and whose values are tuples of (sample info, source domain).

When you need to add a custom feed, write a module in the same shape (much like subclassing in Java): implement a fetch() method that returns retval, and place the file in the $MALTRAIL_HOME/trails/feeds directory. Your custom trails will then be merged into trails.csv during the daily update; a minimal sketch follows.
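
As an illustration only, the module below mirrors the structure of 360chinad.py; the URL, check string, info and reference values are placeholders for whichever blacklist you actually want to subscribe to (assumed here to be a plain text file with one entry per line and '#' comments):

#!/usr/bin/env python

from core.common import retrieve_content

# all four attributes are placeholders for your own feed
__url__ = "https://example.com/blacklist.txt"
__check__ = "blacklist"
__info__ = "custom blacklisted"
__reference__ = "example.com"

def fetch():
    retval = {}
    content = retrieve_content(__url__)

    if __check__ in content:                      # sanity-check the downloaded content
        for line in content.split('\n'):
            line = line.strip()
            if not line or line.startswith('#'):  # skip blanks and comments
                continue
            retval[line] = (__info__, __reference__)

    return retval

Saved as e.g. myfeed.py in $MALTRAIL_HOME/trails/feeds, it will be picked up by update_trails() (described below), which imports every *.py in that directory and calls its fetch().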

Extracting the Data-Collection Module

Maltrail's greatest value lies not in detection and display, but in how it collects and uses IOCs, effectively providing a ready-made OpenIOC-style pipeline.

The entry point for Maltrail's data extraction is $MALTRAIL_HOME/core/update.py.

The method of interest is update_trails() in update.py, shown below:

def update_trails(server=None, force=False, offline=False):
    """
    Update trails from feeds
    """

    success = False
    trails = {}
    duplicates = {}

    try:
        if not os.path.isdir(USERS_DIR):
            os.makedirs(USERS_DIR, 0755)
    except Exception, ex:
        exit("[!] something went wrong during creation of directory '%s' ('%s')" % (USERS_DIR, ex))

    _chown(USERS_DIR)

    if server:
        print "[i] retrieving trails from provided 'UPDATE_SERVER' server..."
        content = retrieve_content(server)
        if not content:
            exit("[!] unable to retrieve data from '%s'" % server)
        else:
            with _fopen(TRAILS_FILE, "w+b") as f:
                f.write(content)
            trails = load_trails()

    trail_files = set()
    for dirpath, dirnames, filenames in os.walk(os.path.abspath(os.path.join(ROOT_DIR, "trails"))) :
        for filename in filenames:
            trail_files.add(os.path.abspath(os.path.join(dirpath, filename)))

    if config.CUSTOM_TRAILS_DIR:
        for dirpath, dirnames, filenames in os.walk(os.path.abspath(os.path.join(ROOT_DIR, os.path.expanduser(config.CUSTOM_TRAILS_DIR)))) :
            for filename in filenames:
                trail_files.add(os.path.abspath(os.path.join(dirpath, filename)))

    if not trails and (force or not os.path.isfile(TRAILS_FILE) or (time.time() - os.stat(TRAILS_FILE).st_mtime) >= config.UPDATE_PERIOD or os.stat(TRAILS_FILE).st_size == 0 or any(os.stat(_).st_mtime > os.stat(TRAILS_FILE).st_mtime for _ in trail_files)):
        print "[i] updating trails (this might take a while)..."

        if not offline and (force or config.USE_FEED_UPDATES):
            _ = os.path.abspath(os.path.join(ROOT_DIR, "trails", "feeds"))
            if _ not in sys.path:
                sys.path.append(_)

            filenames = sorted(glob.glob(os.path.join(_, "*.py")))
        else:
            filenames = []

        _ = os.path.abspath(os.path.join(ROOT_DIR, "trails"))
        if _ not in sys.path:
            sys.path.append(_)

        filenames += [os.path.join(_, "static")]
        filenames += [os.path.join(_, "custom")]

        filenames = [_ for _ in filenames if "__init__.py" not in _]

        if config.DISABLED_FEEDS:
            filenames = [filename for filename in filenames if os.path.splitext(os.path.split(filename)[-1])[0] not in re.split(r"[^\w]+", config.DISABLED_FEEDS)]

        for i in xrange(len(filenames)):
            filename = filenames[i]

            try:
                module = __import__(os.path.basename(filename).split(".py")[0])
            except (ImportError, SyntaxError), ex:
                print "[x] something went wrong during import of feed file '%s' ('%s')" % (filename, ex)
                continue

            for name, function in inspect.getmembers(module, inspect.isfunction):
                if name == "fetch":
                    print(" [o] '%s'%s" % (module.__url__, " " * 20 if len(module.__url__) < 20 else ""))
                    sys.stdout.write("[?] progress: %d/%d (%d%%)\r" % (i, len(filenames), i * 100 / len(filenames)))
                    sys.stdout.flush()

                    if config.DISABLED_TRAILS_INFO_REGEX and re.search(config.DISABLED_TRAILS_INFO_REGEX, getattr(module, "__info__", "")):
                        continue

                    try:
                        results = function()
                        for item in results.items():
                            if item[0].startswith("www.") and '/' not in item[0]:
                                item = [item[0][len("www."):], item[1]]
                            if item[0] in trails:
                                if item[0] not in duplicates:
                                    duplicates[item[0]] = set((trails[item[0]][1],))
                                duplicates[item[0]].add(item[1][1])
                            if not (item[0] in trails and (any(_ in item[1][0] for _ in LOW_PRIORITY_INFO_KEYWORDS) or trails[item[0]][1] in HIGH_PRIORITY_REFERENCES)) or (item[1][1] in HIGH_PRIORITY_REFERENCES and "history" not in item[1][0]) or any(_ in item[1][0] for _ in HIGH_PRIORITY_INFO_KEYWORDS):
                                trails[item[0]] = item[1]
                        if not results and "abuse.ch" not in module.__url__:
                            print "[x] something went wrong during remote data retrieval ('%s')" % module.__url__
                    except Exception, ex:
                        print "[x] something went wrong during processing of feed file '%s' ('%s')" % (filename, ex)

            try:
                sys.modules.pop(module.__name__)
                del module
            except Exception:
                pass

        # custom trails from remote location
        if config.CUSTOM_TRAILS_URL:
            print(" [o] '(remote custom)'%s" % (" " * 20))
            for url in re.split(r"[;,]", config.CUSTOM_TRAILS_URL):
                url = url.strip()
                if not url:
                    continue

                url = ("http://%s" % url) if not "//" in url else url
                content = retrieve_content(url)

                if not content:
                    print "[x] unable to retrieve data (or empty response) from '%s'" % url
                else:
                    __info__ = "blacklisted"
                    __reference__ = "(remote custom)"  # urlparse.urlsplit(url).netloc
                    for line in content.split('\n'):
                        line = line.strip()
                        if not line or line.startswith('#'):
                            continue
                        line = re.sub(r"\s*#.*", "", line)
                        if '://' in line:
                            line = re.search(r"://(.*)", line).group(1)
                        line = line.rstrip('/')

                        if line in trails and any(_ in trails[line][1] for _ in ("custom", "static")):
                            continue

                        if '/' in line:
                            trails[line] = (__info__, __reference__)
                            line = line.split('/')[0]
                        elif re.search(r"\A\d+\.\d+\.\d+\.\d+\Z", line):
                            trails[line] = (__info__, __reference__)
                        else:
                            trails[line.strip('.')] = (__info__, __reference__)

                    for match in re.finditer(r"(\d+\.\d+\.\d+\.\d+)/(\d+)", content):
                        prefix, mask = match.groups()
                        mask = int(mask)
                        if mask > 32:
                            continue
                        start_int = addr_to_int(prefix) & make_mask(mask)
                        end_int = start_int | ((1 << 32 - mask) - 1)
                        if 0 <= end_int - start_int <= 1024:
                            address = start_int
                            while start_int <= address <= end_int:
                                trails[int_to_addr(address)] = (__info__, __reference__)
                                address += 1

        # basic cleanup
        for key in trails.keys():
            if key not in trails:
                continue
            if config.DISABLED_TRAILS_INFO_REGEX:
                if re.search(config.DISABLED_TRAILS_INFO_REGEX, trails[key][0]):
                    del trails[key]
                    continue
            if not key or re.search(r"\A(?i)\.?[a-z]+\Z", key) and not any(_ in trails[key][1] for _ in ("custom", "static")):
                del trails[key]
                continue
            if re.search(r"\A\d+\.\d+\.\d+\.\d+\Z", key):
                if any(_ in trails[key][0] for _ in ("parking site", "sinkhole")) and key in duplicates:
                    del duplicates[key]
                if trails[key][0] == "malware":
                    trails[key] = ("potential malware site", trails[key][1])
            if trails[key][0] == "ransomware":
                trails[key] = ("ransomware (malware)", trails[key][1])
            if key.startswith("www.") and '/' not in key:
                _ = trails[key]
                del trails[key]
                key = key[len("www."):]
                if key:
                    trails[key] = _
            if '?' in key:
                _ = trails[key]
                del trails[key]
                key = key.split('?')[0]
                if key:
                    trails[key] = _
            if '//' in key:
                _ = trails[key]
                del trails[key]
                key = key.replace('//', '/')
                trails[key] = _
            if key != key.lower():
                _ = trails[key]
                del trails[key]
                key = key.lower()
                trails[key] = _
            if key in duplicates:
                _ = trails[key]
                others = sorted(duplicates[key] - set((_[1],)))
                if others and " (+" not in _[1]:
                    trails[key] = (_[0], "%s (+%s)" % (_[1], ','.join(others)))

        read_whitelist()

        for key in trails.keys():
            if check_whitelisted(key) or any(key.startswith(_) for _ in BAD_TRAIL_PREFIXES):
                del trails[key]
            elif re.search(r"\A\d+\.\d+\.\d+\.\d+\Z", key) and (bogon_ip(key) or cdn_ip(key)):
                del trails[key]
            else:
                try:
                    key.decode("utf8")
                    trails[key][0].decode("utf8")
                    trails[key][1].decode("utf8")
                except UnicodeDecodeError:
                    del trails[key]

        try:
            if trails:
                with _fopen(TRAILS_FILE, "w+b") as f:
                    writer = csv.writer(f, delimiter=',', quotechar='\"', quoting=csv.QUOTE_MINIMAL)
                    for trail in trails:
                        writer.writerow((trail, trails[trail][0], trails[trail][1]))

                success = True
        except Exception, ex:
            print "[x] something went wrong during trails file write '%s' ('%s')" % (TRAILS_FILE, ex)

        print "[i] update finished%s" % (40 * " ")

        if success:
            print "[i] trails stored to '%s'" % TRAILS_FILE

    return trails

update_trails(server=None, force=False, offline=False) updates all of the subscribed feeds and (re)generates the trails.csv file.
server: defaults to None; when set to the URL of a remote trails collection (the 'UPDATE_SERVER' case in the code), the data retrieved from that URL is written straight into trails.csv.
force: defaults to False; when True the update runs even if trails.csv is still considered fresh, and the remote feeds are pulled regardless of USE_FEED_UPDATES, so the full data set (built-in static lists plus real-time feeds) is refreshed.
offline: defaults to False; when True no remote feeds are fetched, and only the built-in static (and custom) lists are written into trails.csv.

  If you are interested, you can lift this method out of Maltrail to harvest open-source malicious-sample data from the internet and build your own IOC library; a standalone sketch follows.
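
For example, the sketch below skips sensor.py and server.py entirely and drives the feed modules directly; it assumes it is saved in the root of the Maltrail checkout, and ioc.csv is an arbitrary output name:

#!/usr/bin/env python
# Standalone IOC harvesting sketch: reuse Maltrail's feed modules without the sensor/server.

import csv
import glob
import os
import sys

ROOT_DIR = os.path.abspath(os.path.dirname(__file__))   # assumes this file lives in $MALTRAIL_HOME
FEEDS_DIR = os.path.join(ROOT_DIR, "trails", "feeds")
OUTPUT = "ioc.csv"                                       # arbitrary output file

sys.path.append(ROOT_DIR)    # so that 'from core.common import retrieve_content' resolves
sys.path.append(FEEDS_DIR)   # so that each feed file imports as a top-level module

iocs = {}
for filename in sorted(glob.glob(os.path.join(FEEDS_DIR, "*.py"))):
    name = os.path.splitext(os.path.basename(filename))[0]
    if name == "__init__":
        continue
    module = __import__(name)
    if not hasattr(module, "fetch"):
        continue
    try:
        iocs.update(module.fetch())          # {trail: (info, reference)}
    except Exception as ex:
        print("[x] feed '%s' failed ('%s')" % (name, ex))

with open(OUTPUT, "w") as f:
    writer = csv.writer(f)
    for trail, (info, reference) in iocs.items():
        writer.writerow((trail, info, reference))

print("[i] %d IOCs written to '%s'" % (len(iocs), OUTPUT))

From there you can add your own deduplication, whitelisting and prioritisation along the lines of what update_trails() already does.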
