A Simple IoT Device Identification Process

Snow.雪落ღ

Tags: python, IoT

Article link: https://blog.csdn.net/Snow_224/article/details/112472856

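IoT Device Identification Experiment

I. Setting Up the Banner-Grabbing Environment

1. Install Go

1.1 Download the matching binary package from https://golang.google.cn/dl/

1.2 Extract the Go archive into /usr/local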

tar -C /usr/local -xzf go1.15.4.linux-amd64.tar.gz

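1.3 Configure the Go environment variables

Open the global configuration file (/etc/profile, the file reloaded in step 1.4)
[Screenshot: opening the global configuration file]
Append the following two lines to the end of the file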

export PATH=$PATH:/usr/local/go/bin
export GOPATH=/usr/local/go/bin/

1.4 Reload the configuration file so the changes take effect

source /etc/profile

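1.5 Check that Go was installed successfully, for example by running go version
[Screenshot: output of the Go installation check]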

2. Install zmap

apt-get install zmap

3. Install zgrab2

3.1 Download the source archive (zip) from https://github.com/zmap/zgrab2
3.2 Create the directories

mkdir /usr/local/go/bin/src/
mkdir /usr/local/go/bin/src/github.com/
mkdir /usr/local/go/bin/src/github.com/zmap/

3.3 Extract zgrab2.zip into /usr/local/go/bin/src/github.com/zmap/

unzip zgrab2.zip -d /usr/local/go/bin/src/github.com/zmap

3.4 Build zgrab2 from /usr/local/go/bin/src/github.com/zmap/zgrab2

cd /usr/local/go/bin/src/github.com/zmap/zgrab2
make

[Screenshot: output of make]

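II. Probing Live Hosts and Grabbing Banners

1. Use zmap to scan the IP addresses in the whitelist and find the live hosts

While it runs, zmap prints status information in the following format:
%-complete (time remaining); packets-sent (average send rate); recv: packets-recv recv-rate (average receive rate); hits: hit rate
[Screenshot: zmap scan output]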
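
The status format above can be read programmatically. The following is a minimal sketch, assuming a status line shaped like the description; the sample line is hypothetical, written only to match that field order, and real zmap output may differ between versions.

```python
import re

# Hypothetical status line that follows the field order described above.
SAMPLE = ("0:05 42% (7s left); send: 2150 430 p/s (428 p/s avg); "
          "recv: 37 8 p/s (7 p/s avg); hits: 1.72%")

STATUS_RE = re.compile(
    r"(?P<pct>\d+)%.*?"
    r"send:\s*(?P<sent>\d+).*?"
    r"recv:\s*(?P<recv>\d+).*?"
    r"hits:\s*(?P<hits>[\d.]+)%"
)

def parse_status(line):
    """Return the main counters of one status line, or None if it does not match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    return {
        "percent_complete": int(m.group("pct")),
        "packets_sent": int(m.group("sent")),
        "packets_received": int(m.group("recv")),
        "hit_rate_percent": float(m.group("hits")),
    }

print(parse_status(SAMPLE))
```

The hit rate is simply packets received divided by packets sent, so a very low value usually means that few hosts in the whitelist answer on the scanned port.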

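2. Use zgrab2 to grab banners from the live hosts over several protocols

2.1 HTTP
[Screenshot: zgrab2 HTTP banner grab]

2.2 SSH
[Screenshot: zgrab2 SSH banner grab]
2.3 FTP

2.4 Telnet

(No screenshot was taken; omitted.)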
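
Because the screenshots do not survive as text, the four scans can be sketched as command lines. This is only a sketch: the input file name alive.txt is hypothetical, the ports are the usual defaults for each protocol, and the --port, --input-file and --output-file flags should be checked against zgrab2 --help for the installed version. The output names follow the {protocol}result.json pattern that the preprocessing script reads later.

```python
# Usual default ports for the four protocols used in this experiment.
DEFAULT_PORTS = {"http": 80, "ssh": 22, "ftp": 21, "telnet": 23}

def build_zgrab2_cmd(protocol, input_file="alive.txt"):
    """Build the argument list for one zgrab2 scan (alive.txt is a hypothetical name)."""
    return [
        "zgrab2", protocol,
        "--port", str(DEFAULT_PORTS[protocol]),
        "--input-file", input_file,
        "--output-file", "{0}result.json".format(protocol),
    ]

for proto in DEFAULT_PORTS:
    # Print each command; pass the list to subprocess.run() to execute it.
    print(" ".join(build_zgrab2_cmd(proto)))
```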

III. Data Preprocessing

1. Install the nltk package

Open a command prompt with administrator privileges and run the following command

pip install nltk -i https://mirrors.aliyun.com/pypi/simple/

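2. Install the nltk_data package

Download it from: https://github.com/nltk/nltk_data
After downloading, extract the files into a folder that NLTK searches
For example: C:\Users\zhang\AppData\Roaming
[Screenshot: the nltk_data folder in place]
Open the nltk_data folder,
extract punkt.zip in the tokenizers folder,
and extract stopwords.zip in the corpora folder.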
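
A quick way to confirm the layout is to check that the two extracted folders exist. The sketch below uses only the standard library and assumes the folder structure described above (tokenizers/punkt and corpora/stopwords inside nltk_data); the helper name missing_resources is illustrative.

```python
import os

# Folders that should exist after extracting the two archives.
REQUIRED = [
    os.path.join("tokenizers", "punkt"),
    os.path.join("corpora", "stopwords"),
]

def missing_resources(nltk_data_dir):
    """Return the required sub-folders that are absent from nltk_data_dir."""
    return [rel for rel in REQUIRED
            if not os.path.isdir(os.path.join(nltk_data_dir, rel))]

# On Windows, %APPDATA% normally points at the Roaming folder mentioned above.
data_dir = os.path.join(os.environ.get("APPDATA", "."), "nltk_data")
print(missing_resources(data_dir))
```

An empty list means both resources are where the preprocessing script expects them.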

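3. Preprocess the data

Preprocess the results returned by the scans of the four protocols (http, ssh, ftp, telnet).
Preprocessing removes redundant information, which saves space and speeds up the later fingerprinting stage. It consists of the following steps:
(1) Remove non-printable characters
(2) Remove HTML tags
(3) Remove punctuation
(4) Tokenize with NLTK
(5) Remove stop words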
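
To make the five steps concrete, here is a small standard-library sketch applied to a single made-up banner. It is not the script used in this experiment: whitespace splitting stands in for nltk.word_tokenize, and the stop-word set is a tiny illustrative list rather than NLTK's English list.

```python
import re
import string

# Tiny illustrative stop-word list (assumption); NLTK's English list is much longer.
STOP_WORDS = {"the", "a", "an", "is", "to", "of", "and", "in"}

# Maps every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))

def preprocess(text):
    # (1) Drop non-printable characters, keeping ordinary whitespace
    text = "".join(ch for ch in text if ch.isprintable() or ch.isspace())
    # (2) Drop HTML tags
    text = re.sub(r"<[^<]+?>", " ", text)
    # (3) Replace punctuation with spaces
    text = text.translate(PUNCT_TABLE)
    # (4) Tokenize on whitespace (stand-in for nltk.word_tokenize)
    tokens = text.split()
    # (5) Remove stop words, ignoring case
    return [t for t in tokens if t.lower() not in STOP_WORDS]

banner = "<html><title>Welcome to the IP Camera!</title>\x00\x07</html>"
print(preprocess(banner))
```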

The complete code example is as follows:

import re
import nltk

# Characters removed in step (3): whitespace plus the punctuation listed here.
PUNCT_RE = re.compile(r'[\s!\\/|@$&#%^*(+\')"\[\]{}]+')

def nlp(protocol):
    # Build the stop-word set once; "will" and "do" are kept as ordinary words.
    stop_words = set(nltk.corpus.stopwords.words('english'))  # lowercase stop words
    stop_words.discard('will')
    stop_words.discard('do')
    src = "{0}result.json".format(protocol)
    with open(src, 'r', encoding='utf-8') as fr, \
         open("{0}nlpfile.txt".format(protocol), 'w', encoding='utf-8') as fw:
        print("Opened " + src)
        for i, lines in enumerate(fr, start=1):
            # (1) Remove escaped non-printable sequences such as \n, \r, \t
            lines = re.sub(r'\\[nrtvfsScx]', '', lines)
            # (2) Remove HTML tags
            lines = re.sub(r'<[^<]+?>', '', lines)
            # (3) Remove whitespace and punctuation
            lines = PUNCT_RE.sub('', lines)
            # (4) Tokenize
            word = nltk.word_tokenize(lines)
            # (5) Remove stop words
            filtered_sentence = [w for w in word if w not in stop_words]
            # The cleaned line, not the token list, is written out: the
            # fingerprinting script later in this article parses fields such as ip: from it.
            fw.write(lines + "\n")
            print(src + ": line " + str(i) + " processed")
    print(src + ": processing finished")

if __name__ == '__main__':
    for protocol in ("http", "ftp", "ssh", "telnet"):
        nlp(protocol)

Output (using the HTTP protocol as an example)
[Screenshot: contents of httpnlpfile.txt]

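IV. Fingerprinting

This program performs fingerprint identification on the files produced by the natural language processing step, mainly to identify camera brands and types. The fingerprint library is in the file 摄像头指纹库.xlsx ("camera fingerprint library"). The program flowchart is shown below:
[Figure: program flowchart]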
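
The core of the matching step is a substring test: a banner is attributed to a device when it appears inside a fingerprint entry. The sketch below illustrates this with an in-memory table; the rows, labels and IP addresses are hypothetical stand-ins for the spreadsheet contents and the scan results.

```python
# Hypothetical fingerprint rows in a two-column shape: (label, fingerprint text).
FINGERPRINTS = [
    ("BrandA,camera", "BrandAIPCameraWebServer"),
    ("BrandB,DVR", "BrandBDVRftpd"),
]

def identify(banners, fingerprints):
    """Map each IP to 'label,banner' when the banner occurs in a fingerprint."""
    result = {}
    for ip, banner in banners.items():
        if not banner:
            continue  # an empty banner would match every fingerprint
        for label, fp_text in fingerprints:
            if banner in fp_text:
                result[ip] = label + "," + banner
    return result

banners = {"192.0.2.10": "IPCamera", "192.0.2.11": "nginx", "192.0.2.12": ""}
print(identify(banners, FINGERPRINTS))
```

Only 192.0.2.10 is identified here, because "IPCamera" is the only banner contained in a fingerprint entry.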

Install the xlrd package

Open a command prompt with administrator privileges and run the following command

pip install xlrd==1.2.0 -i https://mirrors.aliyun.com/pypi/simple/

The program code is as follows:

import re
import xlrd

def first_match(pattern, line):
    # Return the first capture of pattern in line, or None if there is no match.
    found = re.findall(pattern, line)
    return found[0] if found else None

def extract(protocol, keywords, pattern, skip_none=False):
    # Write one "ip:banner" line to <protocol>_final.txt for each matching input line.
    with open(protocol + 'nlpfile.txt', 'r', encoding='utf-8') as f, \
         open(protocol + '_final.txt', 'w', encoding='utf-8') as fw:
        for l in f:
            # Every keyword must be present (at an index greater than 0).
            if not all(l.find(k) > 0 for k in keywords):
                continue
            res = first_match(pattern, l)
            ip = first_match(r'ip:(.*?),', l)
            if res is None or ip is None or (skip_none and res == 'none'):
                continue
            fw.write(ip + ":" + res + '\n')

def http_res():
    extract('http', ['success', 'www_authenticate'], r'realm=(.*?),')

def ssh_res():
    extract('ssh', ['success', 'raw'], r'raw:(.*?),')

def telnet_res():
    extract('telnet', ['success', 'banner', 'login'], r'banner:(.*?)[lL]ogin', skip_none=True)

def ftp_res():
    extract('ftp', ['success', 'banner'], r'banner:(.*?),')

def check():
    dic = {}
    protocol = ['http', 'ssh', 'ftp', 'telnet']
    rdec = xlrd.open_workbook(r'摄像头指纹库.xlsx')
    sheet = rdec.sheet_by_name('Sheet1')
    rows = sheet.nrows
    for p in protocol:
        with open('{0}_final.txt'.format(p), 'r', encoding='utf-8') as ft:
            for tl in ft:
                # Each line has the form "ip:banner"
                ip, _, itype = tl.rstrip('\n').partition(':')
                if not itype:
                    continue  # an empty banner would match every fingerprint row
                for i in range(1, rows):  # row 0 is skipped
                    datarow1 = sheet.row_values(i, 0)
                    # Column 1 is searched for the banner; column 0 is reported.
                    if str(datarow1[1]).find(itype) >= 0:
                        dic[ip] = str(datarow1[0]) + ',' + itype
    with open('final.json', 'w', encoding='utf-8') as f:
        for num, ip in enumerate(dic, start=1):
            f.write(str(num) + "---" + ip + ":" + dic[ip] + "\n")

if __name__ == '__main__':
    telnet_res()
    http_res()
    ftp_res()
    ssh_res()
    check()
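
The script writes numbered text lines to final.json, so the file is not valid JSON despite its name. If a machine-readable result is preferred, the result dictionary could be serialized with the standard json module instead. This is an optional sketch, not part of the original program; the helper name results_to_json and the sample entry are illustrative.

```python
import json

def results_to_json(dic):
    """Serialize the {ip: 'label,banner'} result dictionary as a JSON string."""
    # ensure_ascii=False keeps any non-ASCII labels from the spreadsheet readable.
    return json.dumps(dic, ensure_ascii=False, indent=2)

# Hypothetical result entry, in the same shape that check() builds.
sample = {"192.0.2.10": "BrandA,camera,IPCamera"}
print(results_to_json(sample))
```

Writing the returned string to final.json with encoding='utf-8' would let other tools load the results with json.load.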

Results

[Screenshot: contents of final.json]