sharkPy: An NSA Traffic Analysis Tool Built on Wireshark and libpcap

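This article covers sharkPy, a traffic analysis tool from the NSA built on Wireshark and libpcap. sharkPy covers five tasks: searching, creating, and modifying packet data, sending data, and writing packets to a new pcap file. All of this can be done from within a single Python program or interactive Python session. It consists of the following modules:

1. sharkPy.file_dissector: dissects packets from a capture file using Wireshark's dissection libraries and hands the dissections to the caller as Python objects.
2. sharkPy.wire_dissector: captures packets from a network interface, dissects them with Wireshark's dissection libraries, and hands the dissections to the caller as Python objects.
3. sharkPy.file_writer: writes modified or unmodified packets to a new pcap file. For example, you can dissect a capture file with sharkPy.file_dissector, build new packets from the dissected data, and write those packets to a new file.
4. sharkPy.wire_writer: writes arbitrary data to a given network interface using libpcap. When using this module, the caller is responsible for building packets that are valid for transmission.
5. sharkPy.utils: a set of general utility functions.
6. sharkPy.protocol_blender: protocol-specific convenience functions, currently covering ipv4 and tcp over ipv4.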

This is the original NSA release of sharkPy. Whether it contains a backdoor is unknown, so use it with caution.

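Design goals

1. Deliver dissected packet data to callers as Python objects.
2. Provide this functionality within a Python environment, whether a Python program or an interactive Python session.
3. Run commands in the background where reasonable, so that they do not block the caller's session.
4. Pack the functionality for handling packets into a small number of commands.
5. Be easy to pick up for users who already know Wireshark and Python.
6. Use as little C code as possible to link against the Wireshark libraries.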

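What is sharkPy?

sharkPy aims to be a library that exposes Wireshark's functionality in finer-grained, more modular pieces that are easier to use in other projects. As that goal suggests, sharkPy is not merely Wireshark combined with Python; it is meant to be a significant improvement in how that functionality is delivered.

The first step is to compile Wireshark's functionality into Python modules that can be called outside the Wireshark environment. This already works on a number of Linux distributions. The second step is to extend support to more Linux distributions and to Windows, and to improve stability. Once that work is complete, sharkPy will expose all of Wireshark's functionality, organized into the six modules listed above.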

Installation

Installing on Linux

With Wireshark 2.0.1 or later installed, sharkPy can be installed with the following commands:

## ubuntu-16.04-desktop-amd64 -- clean install
sudo apt-get install git
git clone https://github.com/NationalSecurityAgency/sharkPy
sudo apt-get install libpcap-dev
sudo apt-get install libglib2.0-dev
sudo apt-get install libpython-dev
sudo apt-get install wireshark-dev       #if you didn't build/install wireshark (be sure wireshark libs are in LD_LIBRARY_PATH)
sudo apt-get install wireshark           #if you didn't build/install wireshark (be sure wireshark libs are in LD_LIBRARY_PATH)
cd sharkPy
sudo ./setup install

Installing in Docker

1. Setup
First, create a directory named SharkPy, place the Dockerfile in it, and cd into that directory.
Then run the following command:

docker build -t "ubuntu16_04:sharkPy" .

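Notes:

The build takes a while and finishes on its own.
The sharkPy source code should be in the SharkPy directory.
The build creates an Ubuntu image and installs sharkPy as a Python module.

2. Running as a Docker container

Run interactively with: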

docker run -it ubuntu16_04:sharkPy /bin/bash

To let the tool capture network traffic, run with host networking:

docker run -it --net=host ubuntu16_04:sharkPy /bin/bash

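sharkPy API

Dissecting packets from a file

dissect_file(file_path, options=[], timeout=10): collects packets from a capture file and delivers packet dissections when they are requested with get_next_from_file.
Parameters and usage:

1. file_path: location and name of the capture file.
2. options: collection and dissection options. The available options are disopt.DECODE_AS and disopt.NAME_RESOLUTION.
3. timeout: how long to wait for the file to open; an error is raised if this is exceeded.
4. Returns a tuple (p, exit_event, shared_pipe). p is the handle of the dissection process, exit_event is the event used to signal that process to stop, and shared_pipe is the shared pipe through which the dissector returns dissection trees.

Note: callers should not use these return values directly. Pass the tuple as the argument to get_next_from_file and close_file instead.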

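get_next_from_file(dissect_process, timeout=None): returns the next available packet dissection.
Parameters and usage:

1. dissect_process: the tuple returned by dissect_file.
2. Returns the root node of the packet's dissection tree.

close_file(dissect_process): stops the dissection and cleans up.

Parameters and usage:

1. dissect_process: the tuple returned by dissect_file.
2. Returns None.

Note: close_file must be called in every session.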

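Dissecting packets from a network interface

dissect_wire(interface, options=[], timeout=None): captures packets from a network interface and delivers packet dissections when they are requested with get_next_from_wire.
Parameters and usage:

1. interface: the network interface to capture from.
2. options: capture and dissection options. The available options are disopt.DECODE_AS, disopt.NAME_RESOLUTION, and disopt.NOT_PROMISCUOUS.
3. Returns a tuple (p, exit_event, shared_queue). p is the handle of the dissection process, exit_event is the event used to signal that process to stop, and shared_queue is the queue into which the dissector places packet dissection trees.

Note: as above, callers should not use these return values directly. Pass the tuple as the argument to get_next_from_wire and close_wire instead.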

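get_next_from_wire(dissect_process, timeout=None): returns the next available packet dissection from the live capture.
Parameters and usage:

1. dissect_process: the tuple returned by dissect_wire.
2. Returns the root node of the packet's dissection tree.

close_wire(dissect_process): stops the live capture and cleans up.

Parameters and usage:

1. dissect_process: the tuple returned by dissect_wire.
2. Returns None.

Note: close_wire must be called in every session.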

Writing data or packets to a network interface or file

wire_writer(write_interface_list): constructor for wire_writer objects, which write arbitrary data to network interfaces.
Parameters and usage:

1. write_interface_list: list of the interfaces to write to.
2. Returns a wire_writer object.
3. wire_writer.cmd: runs a command on the writer object. It is called in one of the following three forms.
4. wr.cmd(command=wr.WRITE_BYTES, command_data=data_to_write, command_timeout=2)
5. wr.cmd(command=wr.SHUT_DOWN_ALL, command_data=None, command_timeout=2)
6. wr.cmd(command=wr.SHUT_DOWN_NAMED, command_data=interface_name, command_timeout=2)
7. wire_writer.get_rst(timeout=1): returns the tuple (success/failure, number_of_bytes_written).
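
The cmd/get_rst pair suggests a non-blocking command pattern: cmd() hands work to a background worker and get_rst() collects the outcome later. The following is a minimal Python 3 sketch of that pattern only, not sharkPy's implementation. ToyWriter, its in-memory sent list, and the close() helper are hypothetical, and nothing is written to a real interface. The (0, byte_count) result mirrors the success tuple shown in the usage examples.

```python
import queue
import threading

WRITE_BYTES = "write_bytes"      # hypothetical command names for this sketch
SHUT_DOWN_ALL = "shut_down_all"


class ToyWriter(object):
    """Toy stand-in: 'writes' are appended to a list, not sent on a network."""

    def __init__(self):
        self.sent = []
        self._commands = queue.Queue()
        self._results = queue.Queue()
        self._worker = threading.Thread(target=self._run)
        self._worker.daemon = True
        self._worker.start()

    def _run(self):
        # Background loop: take one command at a time until told to shut down.
        while True:
            command, data = self._commands.get()
            if command == SHUT_DOWN_ALL:
                break
            self.sent.append(data)
            self._results.put((0, len(data)))  # 0 stands for success here

    def cmd(self, command, command_data=None):
        """Queue a command and return immediately."""
        self._commands.put((command, command_data))

    def get_rst(self, timeout=1):
        """Wait up to timeout seconds for the next result tuple."""
        return self._results.get(timeout=timeout)

    def close(self):
        """Ask the worker to stop and wait for it to finish."""
        self.cmd(SHUT_DOWN_ALL)
        self._worker.join()


writer = ToyWriter()
writer.cmd(WRITE_BYTES, b"hello")
print(writer.get_rst(timeout=1))  # (0, 5)
writer.close()
```

The caller is never blocked by the write itself; it only waits, with a bound, when it asks for the result.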

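file_writer(): creates a new file_writer object for writing packets to a pcap file.
make_pcap_error_buffer(): creates a correctly sized and initialized error buffer and returns it.

pcap_write_file(output_file_path, error_buffer): creates and opens a new pcap file.
Parameters:

1. output_file_path: path of the output file.
2. error_buffer: the buffer returned by make_pcap_error_buffer().
3. Returns a ctypes.c_void_p, the context object required by the other write functions.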

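pcap_write_packet(context, upper_time_val, lower_time_val, num_bytes_to_write, data_to_write, error_buffer): writes a packet to the open pcap file.
Parameters:

1. context: the object returned by pcap_write_file().
2. upper_time_val: the packet's epoch time in seconds. This can be the first value of the tuple returned by the utility function get_pkt_times().
3. lower_time_val: the nanosecond remainder of the packet's epoch time. This can be the second value of the tuple returned by get_pkt_times().
4. num_bytes_to_write: number of bytes to write to the file, i.e. the length of the data buffer.
5. data_to_write: the buffer holding the data to write.
6. error_buffer: the error buffer returned by make_pcap_error_buffer(). Any error message is written to this buffer.
7. Returns 0 on success and -1 on failure. On failure, the error message can be read from error_buffer.

pcap_close(context): must be called in every session. It flushes the write buffer, closes the output file, and frees resources.

Parameters:

1. context: the value returned by pcap_write_file().
2. Returns None.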
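
To make the two time arguments concrete, here is a minimal Python 3 sketch of how a pcap file lays out a packet record. It is not sharkPy's code: it builds the bytes by hand with struct, and the helper names are hypothetical. It assumes the nanosecond-resolution pcap variant (magic number 0xA1B23C4D), little-endian byte order, and the Ethernet link type, so that a (seconds, nanosecond remainder) pair maps directly onto the record header.

```python
import struct

PCAP_MAGIC_NANOSECONDS = 0xA1B23C4D  # pcap variant with nanosecond timestamps
LINKTYPE_ETHERNET = 1


def pcap_global_header(snaplen=65535, linktype=LINKTYPE_ETHERNET):
    """24-byte file header: magic, version 2.4, zone, sigfigs, snaplen, link type."""
    return struct.pack("<IHHiIII", PCAP_MAGIC_NANOSECONDS, 2, 4, 0, 0,
                       snaplen, linktype)


def pcap_record(epoch_seconds, nanosecond_remainder, data):
    """16-byte record header (time, captured length, original length) plus data."""
    header = struct.pack("<IIII", epoch_seconds, nanosecond_remainder,
                         len(data), len(data))
    return header + data


def build_pcap(packets):
    """packets is an iterable of (epoch_seconds, nanosecond_remainder, data)."""
    records = b"".join(pcap_record(s, ns, d) for s, ns, d in packets)
    return pcap_global_header() + records


capture = build_pcap([(1498694400, 123000, b"\x00" * 54)])
print(len(capture))  # 94 = 24 (file header) + 16 (record header) + 54 (data)
```

In the classic pcap variant (magic number 0xA1B2C3D4) the second time field holds microseconds instead, so a nanosecond remainder would have to be divided by 1000 first.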

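Utility functions

do_funct_walk(root_node, funct, aux=None): recursively passes each node of a packet dissection tree to a function, walking the tree depth first.
Parameters:

1. root_node: the first node of the dissection tree to visit.
2. funct: the function to call.
3. aux: an optional auxiliary variable passed as an argument on each call to funct.
4. Returns None.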
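
The walk described above can be sketched in plain Python on a toy tree. This is an illustration under assumptions, not sharkPy's code: the Node class is a hypothetical stand-in for a dissection node, the callback is assumed to take (node, aux), and the sketch visits a node before its children (pre-order), which the article does not specify.

```python
class Node(object):
    """Toy stand-in for a dissection node: a name plus child nodes."""

    def __init__(self, abbrev, children=None):
        self.abbrev = abbrev
        self.children = children or []


def walk_depth_first(root_node, funct, aux=None):
    """Call funct(node, aux) on root_node, then recurse into each child."""
    funct(root_node, aux)
    for child in root_node.children:
        walk_depth_first(child, funct, aux)


def record_abbrev(node, aux):
    """Example callback: aux is a list that collects the visit order."""
    aux.append(node.abbrev)


tree = Node("frame", [
    Node("frame.len"),
    Node("eth", [Node("eth.src")]),
    Node("ip"),
])
visited = []
walk_depth_first(tree, record_abbrev, visited)
print(visited)  # ['frame', 'frame.len', 'eth', 'eth.src', 'ip']
```

Passing a mutable aux, such as the list here, is one way for the callback to accumulate results even though the walk itself returns nothing.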

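get_node_by_name(root_node, name): finds and returns the nodes that have the given name.

Parameters:

1. root_node: the root node of the tree to search.
2. name: the name to search for.
3. Returns a list of matching nodes. A list is used because the name may not be unique within the tree.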
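
A list makes sense because one tree can hold several nodes with the same name; the tcp node in the usage examples, for instance, lists tcp.port twice. The following is a minimal sketch of such a lookup on a toy tree, assuming a hypothetical Node class with abbrev and children attributes. Returning an empty list when nothing matches is this sketch's choice; the article does not say what sharkPy returns in that case.

```python
class Node(object):
    """Toy stand-in for a dissection node: a name plus child nodes."""

    def __init__(self, abbrev, children=None):
        self.abbrev = abbrev
        self.children = children or []


def find_nodes_by_abbrev(root_node, name):
    """Return every node under root_node (inclusive) whose abbrev equals name."""
    matches = []
    stack = [root_node]
    while stack:
        node = stack.pop()
        if node.abbrev == name:
            matches.append(node)
        # Push children in reverse so the first child is visited first.
        stack.extend(reversed(node.children))
    return matches


tcp = Node("tcp", [
    Node("tcp.srcport"),
    Node("tcp.dstport"),
    Node("tcp.port"),
    Node("tcp.port"),
])
print(len(find_nodes_by_abbrev(tcp, "tcp.port")))  # 2
print(find_nodes_by_abbrev(tcp, "udp"))            # []
```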

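get_node_data_details(node): returns a tuple of values describing the data of the given node.

Parameters:

1. node: the node whose details you want.
2. Returns tuple(data_len, first_byte_index, last_byte_index, data, binary_data). data_len is the length of the node's data, first_byte_index is the position in the packet where the node's data starts, last_byte_index is the position in the packet where it ends, data is the node's data as a string, and binary_data is the node's data in binary form.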

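get_pkt_times(pkt=input_packet): returns the timestamp information contained in a packet.

Parameters:

1. pkt: a packet dissection tree returned by one of sharkPy's dissection routines.
2. Returns (epoch_time_seconds, epoch_time_nanosecond_remainder). These two values are passed as arguments to file_writer.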
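
As an illustration of the (seconds, nanosecond remainder) pair, the sketch below splits an epoch timestamp given as decimal text, such as a frame.time_epoch value. The split_epoch helper is hypothetical, and how sharkPy actually derives the pair is not stated in the article. The text is parsed directly rather than converted to a float, because a float cannot hold a ten-digit seconds value together with nine fractional digits.

```python
def split_epoch(epoch_text):
    """Split 'seconds.fraction' text into (seconds, nanosecond remainder)."""
    seconds_text, _, fraction_text = epoch_text.partition(".")
    # Pad or truncate the fraction to exactly nine digits (nanoseconds).
    nanos_text = (fraction_text + "000000000")[:9]
    return int(seconds_text), int(nanos_text)


print(split_epoch("1498694400.123456789"))  # (1498694400, 123456789)
print(split_epoch("1498694400.000123"))     # (1498694400, 123000)
print(split_epoch("1498694400"))            # (1498694400, 0)
```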

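find_replace_data(pkt, field_name, test_val, replace_with=None, condition_funct=condition_data_equals, enforce_bounds=True, quiet=True): searches a packet for a field, tests it for a match, and replaces its data.

Parameters:

1. pkt: a packet dissection tree returned by one of sharkPy's dissection routines.
2. field_name: the name of the field to look for.
3. test_val: the data value or buffer used by the matching function for comparison.
4. replace_with: the data that replaces the matched data.
5. condition_funct: a function that returns a bool and has the form condition_funct(node_val, test_val, pkt_dissection_tree). The default is condition_data_equals(), which returns True if node_val == test_val, i.e. a byte-for-byte match.
6. enforce_bounds: if set to True, enforces len(replace_with) == len(node_data_to_be_replaced).
7. quiet: if set to False, an error message is printed to stdout when the target cannot be found.
8. Returns the data of the modified packet if a match is found; otherwise returns None.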
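
The match, bounds check, and replace steps can be sketched on a hex string, using byte indexes like those reported by get_node_data_details. This is a simplified illustration, not sharkPy's implementation: replace_field_hex is a hypothetical helper, the match is a plain equality test, and raising ValueError on a length mismatch is this sketch's assumption about what enforcing bounds means.

```python
def replace_field_hex(packet_hex, first_byte_index, last_byte_index,
                      test_val, replace_with, enforce_bounds=True):
    """Replace one field of a hex-encoded packet if it equals test_val.

    Returns the modified hex string, or None when the field does not match.
    """
    start = 2 * first_byte_index        # two hex digits per byte
    end = 2 * (last_byte_index + 1)     # last_byte_index is inclusive
    current = packet_hex[start:end]
    if current.lower() != test_val.lower():
        return None
    if enforce_bounds and len(replace_with) != len(current):
        raise ValueError("replacement must be the same length as the field")
    return packet_hex[:start] + replace_with + packet_hex[end:]


packet_hex = "aabbc0a84f01ccdd"  # toy data; the field occupies bytes 2 to 5
print(replace_field_hex(packet_hex, 2, 5, "c0a84f01", "01010101"))
# aabb01010101ccdd
print(replace_field_hex(packet_hex, 2, 5, "c0a84fff", "02020202"))
# None
```

Keeping the replacement the same length as the original field leaves every later field at its original offset, which is why a bounds check is a sensible default.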

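condition_data_equals(node_val, test_val, pkt_dissection_tree=None): a matching function that can be passed to find_replace_data().

Parameters:

1. node_val: the value taken from the packet dissection.
2. test_val: the value to compare node_val against.
3. Returns True if node_val == test_val, otherwise False.

condition_always_true(node_val=None, test_val=None, pkt_dissection_tree=None): a matching function that can be passed to find_replace_data(). It always returns True, so its only use is to check whether the target field is present in the packet dissection.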

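Protocol blender

ipv4_find_replace(pkt_dissection, src_match_value=None, dst_match_value=None, new_srcaddr=None, new_dstaddr=None, update_checksum=True, condition_funct=sharkPy.condition_data_equals): modifies selected ipv4 fields.

Parameters:

1. pkt_dissection: the packet dissection tree.
2. src_match_value: the current source IP address to look for (in hex). This value will be replaced.
3. dst_match_value: the current destination IP address to look for (in hex). This value will be replaced.
4. new_srcaddr: the value that replaces src_match_value.
5. new_dstaddr: the value that replaces dst_match_value.
6. update_checksum: if True (the default), the ipv4 checksum is fixed up.
7. condition_funct: the matching function used to find the right packets to modify.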
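
update_checksum matters because the IPv4 header checksum covers the source and destination addresses, so rewriting either one invalidates it. Below is a minimal Python 3 sketch of the standard Internet checksum (RFC 1071) applied to an IPv4 header. The helper names are hypothetical and this is not sharkPy's code; it assumes a raw IPv4 header with no Ethernet framing in front of it.

```python
import struct


def ipv4_header_checksum(header):
    """One's-complement sum of the header's 16-bit words, inverted."""
    if len(header) % 2:
        header += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_ipv4_source(header, new_src):
    """Return a copy of header with a new source address and a fixed checksum."""
    patched = bytearray(header)
    patched[12:16] = new_src                # source address: bytes 12 to 15
    patched[10:12] = b"\x00\x00"            # zero the checksum before summing
    patched[10:12] = struct.pack("!H", ipv4_header_checksum(bytes(patched)))
    return bytes(patched)


# Sample 20-byte header with its checksum field (bytes 10 and 11) zeroed.
zeroed = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
print("%04x" % ipv4_header_checksum(zeroed))   # b861

patched = rewrite_ipv4_source(zeroed, bytes([1, 1, 1, 1]))
print(ipv4_header_checksum(patched))           # 0, i.e. the header verifies
```

Summing a header that already carries a correct checksum yields 0, which gives a quick way to verify a rewritten packet.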

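tcp_find_replace(pkt_dissection, src_match_value=None, dst_match_value=None, new_srcport=None, new_dstport=None, update_checksum=True, condition_funct=sharkPy.condition_data_equals): modifies selected tcp fields (tcp over ipv4).

Parameters:

1. pkt_dissection: the packet dissection tree.
2. src_match_value: the current source tcp port to look for (in hex). This value will be replaced.
3. dst_match_value: the current destination tcp port to look for (in hex). This value will be replaced.
4. new_srcport: the value that replaces src_match_value.
5. new_dstport: the value that replaces dst_match_value.
6. update_checksum: if True (the default), the tcp checksum is fixed up.
7. condition_funct: the matching function used to find the right packets to modify.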

Usage examples

Dissecting packets from a file:

The options supported so far are DECODE_AS and NAME_RESOLUTION.

import sharkPy
in_options=[(sharkPy.disopt.DECODE_AS, r'tcp.port==8888-8890,http'), (sharkPy.disopt.DECODE_AS, r'tcp.port==9999:3,http')]

Start reading the file and dissecting packets:

dissection = sharkPy.dissect_file(r'/home/me/capfile.pcap', options=in_options)

Use sharkPy.get_next_from_file to get the dissections of the packets that were read:

>>> rtn_pkt_dissections_list = []
>>> for cnt in xrange(13):
...     pkt = sharkPy.get_next_from_file(dissection)
...     rtn_pkt_dissections_list.append(pkt)
Node Attributes: 
 abbrev:     frame.
 name:       Frame.
 blurb:      None.
 fvalue:     None.
 level:      0.
 offset:     0.
 ftype:      1.
 ftype_desc: FT_PROTOCOL.
 repr:       Frame 253: 54 bytes on wire (432 bits), 54 bytes captured (432 bits) on interface 0.
 data:       005056edfe68000c29....<rest edited out>
Number of child nodes: 17
 frame.interface_id
 frame.encap_type
 frame.time
 frame.offset_shift
 frame.time_epoch
 frame.time_delta
 frame.time_delta_displayed
 frame.time_relative
 frame.number
 frame.len
 frame.cap_len
 frame.marked
 frame.ignored
 frame.protocols
 eth
 ip
 tcp
  Node Attributes: 
   abbrev:     frame.interface_id.
   name:       Interface id.
   blurb:      None.
   fvalue:     0.
   level:      1.
   offset:     0.
   ftype:      6.
   ftype_desc: FT_UINT32.
   repr:       Interface id: 0 (eno16777736).
   data:       None.
  Number of child nodes: 0
...<remaining edited out>

Always close the session when done:

>>> sharkPy.close_file(dissection)

Take a packet dissection tree and index all of its nodes by name:

>>> pkt_dict = {}
>>> sharkPy.collect_proto_ids(rtn_pkt_dissections_list[0], pkt_dict)

These are all the keys used to index this packet dissection:

>>> print pkt_dict.keys()
['tcp.checksum_bad', 'eth.src_resolved', 'tcp.flags.ns', 'ip', 'frame', 'tcp.ack', 'tcp', 'frame.encap_type', 'eth.ig', 'frame.time_relative', 'ip.ttl', 'tcp.checksum_good', 'tcp.stream', 'ip.version', 'tcp.seq', 'ip.dst_host', 'ip.flags.df', 'ip.flags', 'ip.dsfield', 'ip.src_host', 'tcp.len', 'ip.checksum_good', 'tcp.flags.res', 'ip.id', 'ip.flags.mf', 'ip.src', 'ip.checksum', 'eth.src', 'text', 'frame.cap_len', 'ip.hdr_len', 'tcp.flags.cwr', 'tcp.flags', 'tcp.dstport', 'ip.host', 'frame.ignored', 'tcp.window_size', 'eth.dst_resolved', 'tcp.flags.ack', 'frame.time_delta', 'tcp.flags.urg', 'ip.dsfield.ecn', 'eth.addr_resolved', 'eth.lg', 'frame.time_delta_displayed', 'frame.time', 'tcp.flags.str', 'ip.flags.rb', 'tcp.flags.fin', 'ip.dst', 'tcp.flags.reset', 'tcp.flags.ecn', 'tcp.port', 'eth.type', 'ip.checksum_bad', 'tcp.window_size_value', 'ip.addr', 'ip.len', 'frame.time_epoch', 'tcp.hdr_len', 'frame.number', 'ip.dsfield.dscp', 'frame.marked', 'eth.dst', 'tcp.flags.push', 'tcp.srcport', 'tcp.checksum', 'tcp.urgent_pointer', 'eth.addr', 'frame.offset_shift', 'tcp.window_size_scalefactor', 'ip.frag_offset', 'tcp.flags.syn', 'frame.len', 'eth', 'ip.proto', 'frame.protocols', 'frame.interface_id']

Note that each value in pkt_dict is a list, because node names are not guaranteed to be unique.

>>> val_list = pkt_dict['tcp']

It turns out that the tcp list has only one element, shown below:

>>> for each in val_list:
...     print each
... 
Node Attributes: 
 abbrev:     tcp.
 name:       Transmission Control Protocol.
 blurb:      None.
 fvalue:     None.
 level:      0.
 offset:     34.
 ftype:      1.
 ftype_desc: FT_PROTOCOL.
 repr:       Transmission Control Protocol, Src Port: 52630 (52630), Dst Port: 80 (80), Seq: 1, Ack: 1, Len: 0.
 data:       cd960050df6129ca0d993e7750107d789f870000.
Number of child nodes: 15
 tcp.srcport
 tcp.dstport
 tcp.port
 tcp.port
 tcp.stream
 tcp.len
 tcp.seq
 tcp.ack
 tcp.hdr_len
 tcp.flags
 tcp.window_size_value
 tcp.window_size
 tcp.window_size_scalefactor
 tcp.checksum
 tcp.urgent_pointer

A shortcut for finding a node by name:

>>> val_list = sharkPy.get_node_by_name(rtn_pkt_dissections_list[0], 'ip')

Every node in a packet dissection tree has attributes and a list of child nodes:

>>> pkt = val_list[0]

This is how to access the attributes:

>>> print pkt.attributes.abbrev
tcp
>>> print pkt.attributes.name
Transmission Control Protocol

This is pkt's list of child nodes:

>>> print pkt.children
[<sharkPy.dissect.file_dissector.node object at 0x10fda90>, <sharkPy.dissect.file_dissector.node object at 0x10fdb10>, <sharkPy.dissect.file_dissector.node object at 0x10fdbd0>, <sharkPy.dissect.file_dissector.node object at 0x10fdc90>, <sharkPy.dissect.file_dissector.node object at 0x10fdd50>, <sharkPy.dissect.file_dissector.node object at 0x10fddd0>, <sharkPy.dissect.file_dissector.node object at 0x10fde50>, <sharkPy.dissect.file_dissector.node object at 0x10fded0>, <sharkPy.dissect.file_dissector.node object at 0x10fdf90>, <sharkPy.dissect.file_dissector.node object at 0x1101090>, <sharkPy.dissect.file_dissector.node object at 0x11016d0>, <sharkPy.dissect.file_dissector.node object at 0x11017d0>, <sharkPy.dissect.file_dissector.node object at 0x1101890>, <sharkPy.dissect.file_dissector.node object at 0x1101990>, <sharkPy.dissect.file_dissector.node object at 0x1101b50>]

Get useful information about the data of a dissection node:

>>> data_len, first_byte_offset, last_byte_offset, data_string_rep, data_binary_rep=sharkPy.get_node_data_details(pkt)
>>> print data_len
54
>>> print first_byte_offset
0
>>> print last_byte_offset
53
>>> print data_string_rep
005056edfe68000c29....<rest edited out>
>>> print data_binary_rep
<prints binary data, edited out>

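Capturing packets from the network and dissecting them

sharkPy's wire_dissector provides an additional option, NOT_PROMISCUOUS.

>>> in_options=[(sharkPy.disopt.DECODE_AS, r'tcp.port==8888-8890,http'), (sharkPy.disopt.DECODE_AS, r'tcp.port==9999:3,http'), (sharkPy.disopt.NOT_PROMISCUOUS, None)]

Start capture and dissection. The caller must have sufficient privileges to capture, but note that running as root can be dangerous: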

>>> dissection = sharkPy.dissect_wire(r'eno16777736', options=in_options)
>>> Running as user "root" and group "root". This could be dangerous.

Use sharkPy.get_next_from_wire to get the dissections of captured packets:

>>> for cnt in xrange(13):
...     pkt=sharkPy.get_next_from_wire(dissection)
...     sharkPy.walk_print(pkt) ## much better idea to save pkts in a list

Always close the capture session:

>>> sharkPy.close_wire(dissection)

Writing data or packets to the network

Create a writer object using the interface name:

>>> wr = sharkPy.wire_writer(['eno16777736'])

Send the command to write data to the network, with the timeout set to 2:

>>> wr.cmd(wr.WRITE_BYTES,'  djwejkweuraiuhqwerqiorh', 2)

Check for failure. If the command succeeded, get the return values:

>>> if(not wr.command_failure.is_set()):
...     print wr.get_rst(1)
... 
(0, 26) ### returned success and wrote 26 bytes. ###

Writing packets to a pcap file

Create a file writer object:

>>> fw = sharkPy.file_writer()

Create the error buffer:

>>> errbuf = fw.make_pcap_error_buffer()

Open or create the new pcap file that packets will be written to:

>>> outfile = fw.pcap_write_file(r'/home/me/test_output_file.pcap', errbuf)

Dissect the packets in an existing capture file:

>>> sorted_rtn_list = sharkPy.dissect_file(r'/home/me/tst.pcap', timeout=20)

Write the first packet to the pcap file.

Get the first packet dissection:

>>> pkt_dissection=sorted_rtn_list[0]

Get the packet information needed for the write operation:

>>> pkt_frame = sharkPy.get_node_by_name(pkt_dissection, 'frame')
>>> frame_data_length, first_frame_byte_index, last_frame_byte_index, frame_data_as_string, frame_data_as_binary = sharkPy.get_node_data_details(pkt_frame[0])
>>> utime, ltime = sharkPy.get_pkt_times(pkt_dissection)

Write the packet to the file:

>>> fw.pcap_write_packet(outfile, utime, ltime, frame_data_length, frame_data_as_binary, errbuf)

Close the output file and clean up:

>>> fw.pcap_close(outfile)

The following program matches and replaces data in each packet before writing it, producing a new pcap file:

import sharkPy, binascii
test_value1 = r'0xc0a84f01'
test_value2 = r'c0a84fff'
test_value3 = r'005056c00008'
fw = sharkPy.file_writer()
errbuf = fw.make_pcap_error_buffer()
outfile = fw.pcap_write_file(r'/home/me/test_output_file.pcap', errbuf)
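# Note: this example treats the result as a list of packet dissections, whereas the
# API section describes dissect_file as returning a (p, exit_event, shared_pipe)
# tuple that is meant to be passed to get_next_from_file and close_file.
sorted_rtn_list = sharkPy.dissect_file(r'/home/me/tst.pcap', timeout=20)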
for pkt in sorted_rtn_list:
    # do replacement
    new_str_data = sharkPy.find_replace_data(pkt, r'ip.src', test_value1, r'01010101')
    new_str_data = sharkPy.find_replace_data(pkt, r'ip.dst', test_value2, r'02020202')
    new_str_data = sharkPy.find_replace_data(pkt, r'eth.src', test_value3, r'005050505050')
    # get details required to write to output pcap file
    pkt_frame = sharkPy.get_node_by_name(pkt, 'frame')
    fdl, ffb, flb, fd, fbd = sharkPy.get_node_data_details(pkt_frame[0])
    utime, ltime = sharkPy.get_pkt_times(pkt)
    if(new_str_data is None):
        new_str_data = fd
    newbd = binascii.a2b_hex(new_str_data)
    fw.pcap_write_packet(outfile, utime, ltime, fdl, newbd, errbuf)
fw.pcap_close(outfile)

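Originally published: June 29, 2017

Author: xnianq

This article comes from 嘶吼, a partner of the Yunqi Community (云栖社区). See the 嘶吼 website for related information.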
