Study Notes (4): Detecting WebShells with the K-Nearest Neighbors Algorithm

林咚咚

Article link: https://blog.csdn.net/weixin_39878297/article/details/83048202

1. Data collection

Load the normal samples from ADFA-LD:

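import os

def load_adfa_training_files(rootdir):
    # Read every regular file directly under rootdir as one normal sample (label 0).
    x = []
    y = []
    for name in os.listdir(rootdir):
        path = os.path.join(rootdir, name)
        if os.path.isfile(path):
            # load_one_file is not defined in this post
            x.append(load_one_file(path))
            y.append(0)
    return x, y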
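
The loaders in this section call `load_one_file`, which this post does not define. A minimal sketch of what it might look like, assuming each ADFA-LD trace file holds a single line of space-separated system call numbers and that the helper simply returns that line as a string (both are assumptions, not confirmed by this post):

```python
def load_one_file(filename):
    # Assumption: the whole trace sits on the first line of the file.
    with open(filename) as f:
        line = f.readline()
    # Drop the trailing newline so the sample is a clean token string.
    return line.strip("\n")
```

Returning the trace as one string, rather than a list of integers, fits the later feature step, where a text vectorizer expects one string per sample.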

Define a function that recursively lists the files under a directory:

def dirlist(path, allfile):
    # Recursively append the path of every file under path to allfile.
    for filename in os.listdir(path):
        filepath = os.path.join(path, filename)
        if os.path.isdir(filepath):
            dirlist(filepath, allfile)
        else:
            allfile.append(filepath)
    return allfile
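
The same traversal can also be written with the standard library's `os.walk`, which handles the recursion itself. This is only an alternative sketch; `dirlist_walk` is a hypothetical name, and for an ordinary directory tree it should return the same set of paths as `dirlist(path, [])`, though possibly in a different order:

```python
import os

def dirlist_walk(path):
    # Hypothetical alternative to dirlist: os.walk yields one
    # (directory, subdirectories, files) tuple per directory in the tree.
    allfile = []
    for dirpath, _dirnames, filenames in os.walk(path):
        for filename in filenames:
            allfile.append(os.path.join(dirpath, filename))
    return allfile
```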

Select the WebShell-related samples from the attack dataset:

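import re

def load_adfa_webshell_files(rootdir):
    # Collect the attack traces whose path matches the WebShell pattern (label 1).
    x = []
    y = []
    allfile = dirlist(rootdir, [])
    for file in allfile:
        # The pattern below is truncated in the source; it is meant to
        # select only the WebShell-related paths.
        if re.match(r" ..", file):
            x.append(load_one_file(file))
            y.append(1)
    return x, y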
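
The path filter is the step most likely to go wrong, so here is a small sketch of one way to write it. It assumes that WebShell traces sit in directories named like `Web_Shell_1`; that naming, the `WEBSHELL_DIR` pattern, and the `is_webshell_trace` helper are assumptions for illustration, not the pattern used in this post:

```python
import re

# Assumption: WebShell traces sit in directories named like "Web_Shell_1".
WEBSHELL_DIR = re.compile(r"Web_Shell_\d+")

def is_webshell_trace(path):
    # re.search scans the whole path, unlike re.match, which only
    # matches at the beginning of the string.
    return WEBSHELL_DIR.search(path) is not None
```

With `re.match`, the pattern has to cover the leading part of the path as well, which ties the code to one directory layout; `re.search` avoids that.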

2. Feature extraction

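from sklearn.feature_extraction.text import CountVectorizer

# "..." stands for the dataset directories, which are elided in the source
x1, y1 = load_adfa_training_files("...")
x2, y2 = load_adfa_webshell_files("...")
x = x1 + x2
y = y1 + y2
# Bag-of-words: each distinct token becomes a column, each trace a row of counts
vectorizer = CountVectorizer(min_df=1)
x = vectorizer.fit_transform(x)
x = x.toarray()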
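
To see what the vectorization step produces, here is a pure-Python sketch of the bag-of-words idea. It is not scikit-learn's implementation; `bag_of_words` is a hypothetical helper and the sample strings are made up. It mimics `CountVectorizer`'s default token rule, which keeps only tokens of two or more word characters, so single-digit system call numbers are dropped unless `token_pattern` is changed:

```python
import re
from collections import Counter

TOKEN = re.compile(r"\b\w\w+\b")  # same rule as CountVectorizer's default

def bag_of_words(samples):
    # Tokenize each sample, then build a sorted vocabulary over all samples.
    tokenized = [TOKEN.findall(s.lower()) for s in samples]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    # One row per sample, one column per vocabulary term.
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocab])
    return vocab, rows

vocab, rows = bag_of_words(["6 63 63 42", "42 120 6"])
print(vocab)  # ['120', '42', '63']
print(rows)   # [[0, 1, 2], [1, 1, 0]]
```

Note that the call `6` disappears from both samples. Whether that matters for detection quality is not something this post measures.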

3. Model training and validation are the same as in Notes (3)
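
Since the training step is only referenced here, a minimal pure-Python sketch of a k-nearest-neighbors vote may help as a reminder of the idea. It is not the code from Notes (3) and not scikit-learn's `KNeighborsClassifier`; `knn_predict`, `k=3`, and the toy vectors are all illustrative assumptions:

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    # Squared Euclidean distance from the query to every training row.
    dists = []
    for row, label in zip(train_x, train_y):
        d = sum((a - b) ** 2 for a, b in zip(row, query))
        dists.append((d, label))
    # Majority vote among the k closest rows.
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Made-up count vectors: label 0 = normal, label 1 = WebShell.
train_x = [[0, 1, 2], [0, 2, 2], [1, 1, 3], [5, 0, 0], [6, 1, 0], [5, 1, 1]]
train_y = [0, 0, 0, 1, 1, 1]
print(knn_predict(train_x, train_y, [5, 0, 1]))  # 1
print(knn_predict(train_x, train_y, [0, 1, 3]))  # 0
```

In practice the rows would be the count vectors from the feature extraction step, and the labels the `y` list built alongside them.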
