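Extracting IOCs with Python (the extract_iocs module)

Introduction

extract_iocs is a Python module that extracts indicators of compromise (IOCs) from text, including domain names, IPv4 addresses, email addresses, and hashes. It relies on some huge, ugly regular expressions, has special handling to identify domain names with a relatively low false-positive rate, and attempts to extract IOCs that span line breaks.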
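
To give a feel for the regex-based approach described above, here is a minimal sketch. The patterns and the function name `sketch_extract` are assumptions made for illustration and are not taken from extract_iocs itself. The sketch covers only hashes, IPv4 addresses, and email addresses; it does not attempt the module's domain handling or its extraction across line breaks.

```python
import re

# Illustrative patterns only (assumption): these are NOT the module's own regexes.
HASH_PATTERNS = {
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "sha1": re.compile(r"\b[a-fA-F0-9]{40}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
}
# Each octet is limited to 0-255.
IPV4_PATTERN = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\.){3}"
    r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\b"
)
EMAIL_PATTERN = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def sketch_extract(text):
    """Return a dict mapping indicator type to a sorted list of unique matches."""
    results = {}
    for name, pattern in HASH_PATTERNS.items():
        # Normalize hashes to upper case so duplicates collapse.
        results[name] = sorted({match.upper() for match in pattern.findall(text)})
    results["ipv4"] = sorted(set(IPV4_PATTERN.findall(text)))
    results["email"] = sorted(set(EMAIL_PATTERN.findall(text)))
    return results
```

Simple patterns like these tend to produce false positives on real reports, which is presumably why the module invests in much larger expressions and special cases.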

Step 1

Download and install the package

pip install extract_iocs

Step 2

Example code

import os

from extract_iocs import extract_iocs


def _get_iocs(text):
    """Get IOCs from text."""
    try:
        # Python 3
        iocs = extract_iocs.extract_iocs(text)
    except AttributeError:
        # Python 2
        iocs = extract_iocs(text)

    return iocs


def test_apt28_report():
    """Compare the IOCs extracted from the APT28 report sample with the expected values."""
    expected_output = {'md5': ['8B92FE86C5B7A9E34F433A6FBAC8BC3A', 'EAD4EC18EBCE6890D20757BB9F5285B1', '3B0ECD011500F61237C205834DB0E13A', '8C4FA713C5E2B009114ADDA758ADC445', '5882FDA97FDF78B47081CC4105D44F7C', '9EEBFEBE3987FEC3C395594DC57A0C4C', '48656A93F9BA39410763A2196AABC67F', 'DA2A657DC69D7320F2FFC87013F257AD', '272F0FDE35DBDFCCBCA1E33373B3570D', '791428601AD12B9230B9ACE4F2138713', '1259C4FE5EFD9BF07FC4C78466F2DD09'], 'sha1': [], 'sha256': [], 'ipv4': [], 'url': [], 'domain': ['msdn.microsoft.com', 'rnil.am', 'novinite.com', 'police.ge', 'nshq.nato.int', 'login-osce.org', 'online.co.uk', 'baltichost.org', 'kavkazcentr.info', 'voiceofrussia.com', 'qov.hu.com', 'www.kam.lt', 'adobeincorp.com', 'natoexhibition.org', 'uropa.eu', 'mail.ru', 'mia.ge.gov', 'mail.gov.pl', 'windous.kz', 'fireeye.com', 'mail.q0v.pl', 'wind0ws.kz', 'www.freedomhouse.org', 'standartnevvs.com', 'poczta.mon.gov.pl', 'www.nytimes.com', 'nato.nshq.in', 'standartnews.com', 'adawareblock.com', 'q0v.pl', 'kavkazcenter.com', 'mia.gov.ge', 'poczta.mon.q0v.pl', 'www.mil.ee', 'novinitie.com', 'www.upi.com', 'n0vinite.com', 'rt.com', 'malware.prevenity.com', 'ae.norton.com', 'windows-updater.com', 'www.fireeye.com', 'natoexhibitionff14.com'], 'email': ['dr.house@wind0ws.kz', 'nato_pop@mail.ru', 'nato_smtp@mail.ru', 'lisa.cuddy@wind0ws.kz', 'info@fireeye.com']}

    with open(os.path.abspath(os.path.join(os.path.dirname(__file__), "./samples/apt28report.txt"))) as apt28:
        iocs = _get_iocs(apt28.read())

        assert len(iocs) == len(expected_output)

        print(iocs)
        
        for indicator_type in iocs:
            # create sets for the actual and expected values for this indicator type
            actual_set = set(iocs[indicator_type])
            expected_set = set(expected_output[indicator_type])

            # make sure the actual and expected sets match
            try:
                assert len(actual_set - expected_set) == 0
            except AssertionError:
                # show the unexpected values before failing
                print(actual_set - expected_set)
                raise
            try:
                assert len(expected_set - actual_set) == 0
            except AssertionError:
                # show the missing values before failing
                print(expected_set - actual_set)
                raise


def test_simple_report():
    """Compare the IOCs extracted from the simple sample file with the expected values."""
    expected_output = {'md5': ['E2021791428601AD12B9230B9ACE4F21'], 'sha1': ['B2021791428601AD12B9230B9ACE4F219ACE4F21'], 'sha256': ['2011201120112012201220122012201220122013201320132013201320132013'], 'url': [], 'ipv4': ['1.2.3.4'], 'domain': ['example.org', 'example.com', 'gmail.org'], 'email': ['bad@gmail.org']}
    # 'url': ['http://example.com', 'https://example.com', 'http://example.com/test/bingo.php', 'ftp://example.com']

    with open(os.path.abspath(os.path.join(os.path.dirname(__file__), "./samples/simple.txt"))) as simple_file:
        iocs = _get_iocs(simple_file.read())

        assert len(iocs) == len(expected_output)
        
        for indicator_type in iocs:
            # create sets for the actual and expected values for this indicator type
            actual_set = set(iocs[indicator_type])
            expected_set = set(expected_output[indicator_type])

            # make sure the actual and expected sets match
            assert len(actual_set - expected_set) == 0
            assert len(expected_set - actual_set) == 0
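
Both tests above repeat the same set comparison. As a possible refactoring, the comparison could be pulled into one helper that reports, per indicator type, which values are unexpected and which are missing. The name `diff_indicators` is hypothetical and is not part of extract_iocs or its test suite; this is only a sketch of the idea.

```python
def diff_indicators(actual, expected):
    """Return {indicator_type: (unexpected, missing)} for every type that differs.

    Both arguments are dicts mapping an indicator type (e.g. 'md5', 'domain')
    to a list of values, like the output shown in the tests above.
    """
    differences = {}
    for indicator_type in set(actual) | set(expected):
        actual_set = set(actual.get(indicator_type, []))
        expected_set = set(expected.get(indicator_type, []))
        unexpected = actual_set - expected_set  # extracted but not expected
        missing = expected_set - actual_set     # expected but not extracted
        if unexpected or missing:
            differences[indicator_type] = (unexpected, missing)
    return differences
```

A test could then reduce to `assert diff_indicators(iocs, expected_output) == {}`, and on failure the assertion output shows exactly which indicators differ.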

Note: this article is intended for learning purposes only. The original author's repository for the module is at: https://github.com/mosesschwartz/extract_iocs
