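Notes on 《Web安全之机器学习入门》 (Introduction to Machine Learning for Web Security): Chapter 11, Section 11.3, Mining XSS-Related Parameters with the Apriori Algorithm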

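The Apriori algorithm is usually associated with recommender systems. How can it be applied to network security? This section mines XSS-related parameters to uncover latent association relationships among them.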

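1. Dataset

This section uses XSS attack logs as samples, drawn from examples on the xssed website and from WAF block logs. They are stored in data/xss-2000.txt.

A machine cannot work with raw log lines directly, so the data has to be read line by line and vectorized. A simple approach is to split each line on a set of delimiters to obtain a list of word tokens, as in the following code: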

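    import re

    myDat = []
    with open("../data/xss-2000.txt") as f:
        for line in f:
            # Example line:
            # /discuz?q1=0&q3=0&q2=0%3Ciframe%20src=http://xxooxxoo.js%3E
            index = line.find("?")
            if index > 0:
                # Keep only the query string after the first "?"
                line = line[index + 1:]
                # Split on URL and HTML delimiters; a raw string avoids invalid-escape warnings
                tokens = re.split(r'=|&|\?|%3e|%3c|%3E|%3C|%20|%22|<|>|\n|\(|\)|\'|"|;|:|,|%28|%29', line)
                myDat.append(tokens)
    # No explicit f.close() is needed: the with block closes the file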

Print the token lists of the first three lines:

    print(myDat[0])
    print(myDat[1])
    print(myDat[2])

The output is:

['', 'onmouseover', '', 'prompt', '42873', '', 'bad', '', '', '', '']
['op', 'map', 'maptype', '1', 'city', 'test', 'script', 'alert', '/42873/', '', '/script', '', '']
['op', 'map', 'maptype', '1', 'defaultcity', '%e5', '', 'alert', '/42873/', '', '//', '']
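
The empty strings in this output appear wherever two delimiters are adjacent, for example an encoded `>` followed by the trailing newline. A minimal sketch of the same split with the empty tokens dropped is shown here. The helper name `tokenize`, the `keep_empty` flag, and the use of `re.IGNORECASE` are assumptions of this sketch and not part of the original code. Dropping the empty tokens also changes which rules Apriori later reports.

```python
import re

# Same delimiter set as the original snippet; IGNORECASE folds %3c/%3C and %3e/%3E.
DELIMS = re.compile(r'=|&|\?|%3e|%3c|%20|%22|<|>|\n|\(|\)|\'|"|;|:|,|%28|%29',
                    re.IGNORECASE)

def tokenize(line, keep_empty=False):
    """Split the query string of a request line into tokens (hypothetical helper)."""
    index = line.find("?")
    if index <= 0:
        # No query string: nothing to tokenize.
        return []
    tokens = DELIMS.split(line[index + 1:])
    return tokens if keep_empty else [t for t in tokens if t]

sample = "/discuz?q1=0&q3=0&q2=0%3Ciframe%20src=http://xxooxxoo.js%3E\n"
print(tokenize(sample))
print(tokenize(sample, keep_empty=True))
```

With `keep_empty=True` the function reproduces the behavior of the original split, including the trailing empty strings.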

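2. Mining with the Apriori algorithm

Here the minimum support is 0.15 and the minimum confidence is set close to 1 (0.99), so that only near-certain rules are kept:

    L, suppData = apriori(myDat, 0.15)
    rules = generateRules(L, suppData, minConf=0.99)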
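
The two thresholds passed above are a minimum support and a minimum confidence. To make these terms concrete, here is a minimal sketch that computes both by hand on the three token lists printed in section 1. The helper names `support` and `confidence` are illustrative and are not functions of the `apriori` module.

```python
transactions = [
    ['', 'onmouseover', '', 'prompt', '42873', '', 'bad', '', '', '', ''],
    ['op', 'map', 'maptype', '1', 'city', 'test', 'script', 'alert', '/42873/', '', '/script', '', ''],
    ['op', 'map', 'maptype', '1', 'defaultcity', '%e5', '', 'alert', '/42873/', '', '//', ''],
]
# Apriori treats each transaction as a set, so duplicates inside a line do not matter.
sets = [set(t) for t in transactions]

def support(itemset):
    """Fraction of transactions that contain every item of itemset."""
    itemset = set(itemset)
    return sum(itemset <= s for s in sets) / len(sets)

def confidence(antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)

print(support({'alert'}))                   # 'alert' occurs in 2 of the 3 lists
print(confidence({'script'}, {'/script'}))  # every list with 'script' also has '/script'
print(confidence({'alert'}, {'script'}))    # only 1 of the 2 'alert' lists has 'script'
```

A rule is reported only if its itemset reaches the minimum support and its confidence reaches the minimum confidence.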

3. Complete code

from apriori import apriori
from apriori import generateRules
import re

if __name__ == '__main__':
    myDat=[]
    with open("../data/xss-2000.txt") as f:
        for line in f:
            #/discuz?q1=0&q3=0&q2=0%3Ciframe%20src=http://xxooxxoo.js%3E
            index=line.find("?")
            if index>0:
                line=line[index+1:]
                tokens=re.split(r'=|&|\?|%3e|%3c|%3E|%3C|%20|%22|<|>|\n|\(|\)|\'|"|;|:|,|%28|%29',line)
                myDat.append(tokens)

    L, suppData = apriori(myDat, 0.15)
    rules = generateRules(L, suppData, minConf=0.99)
    print('rules:\n', rules)
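
The `apriori` module imported above is not listed in these notes. As a rough, self-contained stand-in for experimenting, here is a minimal sketch of level-wise frequent-itemset mining followed by rule generation. The names `frequent_itemsets` and `rules_from` and the toy data are hypothetical. The sketch is not the module's actual implementation and only loosely mirrors the `(antecedent, consequent, confidence)` tuples it returns.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {frozenset: support} for all itemsets with support >= min_support."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)
    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for s in sets for item in s}
    result = {}
    k = 1
    while current:
        # Count the support of each candidate and keep the frequent ones.
        level = {}
        for cand in current:
            sup = sum(cand <= s for s in sets) / n
            if sup >= min_support:
                level[cand] = sup
        result.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, pruning any candidate
        # that has an infrequent k-subset (the Apriori property).
        k += 1
        current = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                current.add(union)
    return result

def rules_from(supports, min_conf):
    """Return (antecedent, consequent, confidence) tuples meeting min_conf."""
    rules = []
    for itemset, sup in supports.items():
        for r in range(1, len(itemset)):
            for ante in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent itemset is frequent, so it is in supports.
                conf = sup / supports[ante]
                if conf >= min_conf:
                    rules.append((ante, itemset - ante, conf))
    return rules

# Toy transactions, invented for illustration only.
toy = [
    ['script', 'alert', '/script'],
    ['script', 'alert', '/script', '1'],
    ['alert', '//'],
    ['script', '/script'],
]
supports = frequent_itemsets(toy, 0.5)
toy_rules = rules_from(supports, 0.99)
for ante, cons, conf in toy_rules:
    print(set(ante), '-->', set(cons), 'conf:', conf)
```

The subset check in the join step is what keeps the candidate count manageable: an itemset is counted only if all of its smaller subsets were already found frequent.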

4. Results

frozenset({'a'}) --> frozenset({''}) conf: 1.0
frozenset({'c'}) --> frozenset({''}) conf: 1.0
frozenset({'c'}) --> frozenset({'42873'}) conf: 1.0
frozenset({'page'}) --> frozenset({''}) conf: 1.0
frozenset({'/'}) --> frozenset({''}) conf: 1.0
frozenset({'/'}) --> frozenset({'1'}) conf: 0.9936507936507937
frozenset({'//'}) --> frozenset({''}) conf: 1.0
frozenset({'//'}) --> frozenset({'alert'}) conf: 0.9941348973607038
frozenset({'/script'}) --> frozenset({''}) conf: 1.0
frozenset({'1'}) --> frozenset({''}) conf: 1.0
frozenset({'alert'}) --> frozenset({''}) conf: 1.0
frozenset({'script'}) --> frozenset({''}) conf: 1.0
frozenset({'script'}) --> frozenset({'/script'}) conf: 1.0
frozenset({'42873'}) --> frozenset({''}) conf: 1.0
frozenset({'c'}) --> frozenset({'42873', ''}) conf: 1.0
frozenset({'/'}) --> frozenset({'1', ''}) conf: 0.9936507936507937
frozenset({'//'}) --> frozenset({'', 'alert'}) conf: 0.9941348973607038
frozenset({'script'}) --> frozenset({'', '/script'}) conf: 1.0
frozenset({'script', 'alert'}) --> frozenset({'', '/script'}) conf: 1.0
frozenset({'1', 'script'}) --> frozenset({'', '/script'}) conf: 1.0
frozenset({'1', 'script', 'alert'}) --> frozenset({'', '/script'}) conf: 1.0
rules:
 [(frozenset({'a'}), frozenset({''}), 1.0), (frozenset({'c'}), frozenset({''}), 1.0), (frozenset({'c'}), frozenset({'42873'}), 1.0), (frozenset({'page'}), frozenset({''}), 1.0), (frozenset({'/'}), frozenset({''}), 1.0), (frozenset({'/'}), frozenset({'1'}), 0.9936507936507937), (frozenset({'//'}), frozenset({''}), 1.0), (frozenset({'//'}), frozenset({'alert'}), 0.9941348973607038), (frozenset({'/script'}), frozenset({''}), 1.0), (frozenset({'1'}), frozenset({''}), 1.0), (frozenset({'alert'}), frozenset({''}), 1.0), (frozenset({'script'}), frozenset({''}), 1.0), (frozenset({'script'}), frozenset({'/script'}), 1.0), (frozenset({'42873'}), frozenset({''}), 1.0), (frozenset({'c'}), frozenset({'42873', ''}), 1.0), (frozenset({'/'}), frozenset({'1', ''}), 0.9936507936507937), (frozenset({'//'}), frozenset({'', 'alert'}), 0.9941348973607038), (frozenset({'script'}), frozenset({'', '/script'}), 1.0), (frozenset({'script', 'alert'}), frozenset({'', '/script'}), 1.0), (frozenset({'1', 'script'}), frozenset({'', '/script'}), 1.0), (frozenset({'1', 'script', 'alert'}), frozenset({'', '/script'}), 1.0)]

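As shown above, when 'script', '1', and 'alert' appear together, '/script' appears as well with a confidence of 1.0, that is, in essentially 100% of such samples.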
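
Many of the rules above have the empty string `''` as their consequent. Each line read from the file ends with a newline, which is itself a delimiter, so a trailing `''` token is almost always present and those rules carry little information. A minimal sketch of post-filtering follows, assuming the rules are `(antecedent, consequent, confidence)` tuples as printed above. The helper name `drop_trivial` is hypothetical, and the sample list is a short excerpt of the printed rules.

```python
rules = [
    (frozenset({'c'}), frozenset({''}), 1.0),
    (frozenset({'c'}), frozenset({'42873'}), 1.0),
    (frozenset({'//'}), frozenset({'alert'}), 0.9941348973607038),
    (frozenset({'script'}), frozenset({'', '/script'}), 1.0),
    (frozenset({'script'}), frozenset({'/script'}), 1.0),
]

def drop_trivial(rules, noise=frozenset({''})):
    """Keep only rules whose antecedent and consequent contain no noise item."""
    return [(a, c, conf) for a, c, conf in rules if not (a | c) & noise]

for ante, cons, conf in drop_trivial(rules):
    print(sorted(ante), '-->', sorted(cons), 'conf:', conf)
```

An alternative is to remove the empty tokens before mining, which avoids generating these rules in the first place.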
