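Packet capture on Linux
1. Capturing the data
Packet capture comes up regularly in network security and traffic-inspection work, and tcpdump and Wireshark are the familiar tools for it. This article shows how a C program can use a raw socket to capture link-layer frames carrying IP packets on Linux.
The key function is socket(). Its description in the man pages reads as follows. (This excerpt is the BSD/macOS version of socket(2), as the PF_SYSTEM and PF_NDRV entries and the connectx(2) reference show; the Linux page lists a different set of domains, including the packet domain used later in this article.)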
#include <sys/socket.h>
int
socket(int domain, int type, int protocol);
DESCRIPTION
socket() creates an endpoint for communication and returns a descriptor.
The domain parameter specifies a communications domain within which com-
munication will take place; this selects the protocol family which should
be used. These families are defined in the include file <sys/socket.h>.
The currently understood formats are
PF_LOCAL Host-internal protocols, formerly called PF_UNIX,
PF_UNIX Host-internal protocols, deprecated, use PF_LOCAL,
PF_INET Internet version 4 protocols,
PF_ROUTE Internal Routing protocol,
PF_KEY Internal key-management function,
PF_INET6 Internet version 6 protocols,
PF_SYSTEM System domain,
PF_NDRV Raw access to network device
The socket has the indicated type, which specifies the semantics of com-
munication. Currently defined types are:
SOCK_STREAM
SOCK_DGRAM
SOCK_RAW
A SOCK_STREAM type provides sequenced, reliable, two-way connection based
byte streams. An out-of-band data transmission mechanism may be sup-
ported. A SOCK_DGRAM socket supports datagrams (connectionless, unreli-
able messages of a fixed (typically small) maximum length). SOCK_RAW
sockets provide access to internal network protocols and interfaces. The
type SOCK_RAW, which is available only to the super-user.
The protocol specifies a particular protocol to be used with the socket.
Normally only a single protocol exists to support a particular socket
type within a given protocol family. However, it is possible that many
protocols may exist, in which case a particular protocol must be speci-
fied in this manner. The protocol number to use is particular to the
communication domain in which communication is to take place; see
protocols(5).
Sockets of type SOCK_STREAM are full-duplex byte streams, similar to
pipes. A stream socket must be in a connected state before any data may
be sent or received on it. A connection to another socket is created
with a connect(2) or connectx(2) call. Once connected, data may be
transferred using read(2) and write(2) calls or some variant of the
send(2) and recv(2) calls. When a session has been completed a close(2)
may be performed. Out-of-band data may also be transmitted as described
in send(2) and received as described in recv(2).
The communications protocols used to implement a SOCK_STREAM insure that
data is not lost or duplicated. If a piece of data for which the peer
protocol has buffer space cannot be successfully transmitted within a
reasonable length of time, then the connection is considered broken and
calls will indicate an error with -1 returns and with ETIMEDOUT as the
specific code in the global variable errno. The protocols optionally
keep sockets ``warm'' by forcing transmissions roughly every minute in
the absence of other activity. An error is then indicated if no response
can be elicited on an otherwise idle connection for a extended period
(e.g. 5 minutes). A SIGPIPE signal is raised if a process sends on a
broken stream; this causes naive processes, which do not handle the sig-
nal, to exit.
SOCK_DGRAM and SOCK_RAW sockets allow sending of datagrams to correspon-
dents named in send(2) calls. Datagrams are generally received with
recvfrom(2), which returns the next datagram with its return address.
An fcntl(2) call can be used to specify a process group to receive a
SIGURG signal when the out-of-band data arrives. It may also enable non-
blocking I/O and asynchronous notification of I/O events via SIGIO.
The operation of sockets is controlled by socket level options. These
options are defined in the file <sys/socket.h>. Setsockopt(2) and
getsockopt(2) are used to set and get options, respectively.
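In short, socket() creates a socket descriptor, and its three arguments determine what kind of socket it is.
Parameters:
- domain: selects the protocol family used for communication. Anyone who has written TCP or UDP code will recognize PF_INET, which places the socket at the IPv4 network layer; PF_INET6 does the same for IPv6. To see MAC addresses, the socket has to operate at the link layer instead, which on Linux means PF_PACKET.
- type: selects the communication semantics: SOCK_STREAM for a TCP byte stream, SOCK_DGRAM for UDP datagrams. The one that matters here is SOCK_RAW, which gives access to the underlying network protocols and interfaces. With PF_PACKET it delivers whole frames, link-layer header included, and it is what we use for capturing.
- protocol: usually 0, but for a PF_PACKET socket it names the link-layer protocol to receive, in network byte order. We pass htons(ETH_P_IP) so that only IPv4 frames are delivered.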
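Why the protocol value is wrapped in htons() can be seen with a small sketch. It is not part of the original program: the helper name ethertype_wire_bytes is made up for illustration, and 0x0800 is the IPv4 EtherType (the value ETH_P_IP stands for).

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the in-memory bytes of htons(proto) into out[2].
 * After htons() the most significant byte comes first on any host, which is
 * the order in which the EtherType appears in an Ethernet frame. */
void ethertype_wire_bytes(uint16_t proto, unsigned char out[2])
{
    uint16_t net = htons(proto);
    memcpy(out, &net, sizeof(net));
}
```

On a little-endian machine, passing ETH_P_IP without htons() would ask for protocol 0x0008 instead of 0x0800, and no IPv4 frames would match.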
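The socket is created like this (opening a packet socket requires root or the CAP_NET_RAW capability):
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <linux/if_ether.h>
#include <netinet/in.h>
#include <unistd.h>
#include <arpa/inet.h>

int main()
{
    int sock;
    /* the protocol must be given in network byte order, hence htons() */
    if ((sock = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP))) < 0)
    {
        /* perror() appends the text for errno itself */
        perror("create socket error");
        exit(EXIT_FAILURE);
    }
    /* ... receive and parse frames here ... */
    close(sock);
    return 0;
}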
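As the man page notes, data on SOCK_RAW sockets is received with recvfrom(), and that is how we read each frame:
    bzero(buffer, sizeof(buffer));
    n_read = recvfrom(sock, buffer, sizeof(buffer), 0, NULL, NULL);
    if (n_read < 0) {
        perror("recvfrom");
        exit(EXIT_FAILURE);
    }
At this point buffer holds one link-layer frame, starting with the Ethernet header.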
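Before writing any parsing code it helps to look at the first bytes of what recvfrom() returned. The following is a minimal sketch of a hex preview; the function name hex_preview and its parameters are assumptions made for this example, not part of the original program.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write up to max_bytes of buf into out as "aa bb cc".
 * Returns the number of bytes actually dumped. The dump stops early when
 * out is too small for the next byte. */
size_t hex_preview(const unsigned char *buf, size_t len, size_t max_bytes,
                   char *out, size_t out_size)
{
    size_t n = len < max_bytes ? len : max_bytes;
    size_t used = 0;
    size_t i;

    if (out_size == 0)
        return 0;
    out[0] = '\0';
    for (i = 0; i < n; ++i) {
        /* room for a separator, two hex digits and the terminating NUL */
        if (used + 4 > out_size)
            break;
        used += (size_t) snprintf(out + used, out_size - used,
                                  i == 0 ? "%02x" : " %02x",
                                  (unsigned) buf[i]);
    }
    return i;
}
```

Calling it with max_bytes set to 14 right after recvfrom() shows exactly the Ethernet header: six bytes of destination MAC, six bytes of source MAC and the two-byte type.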
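2. Parsing TCP packets
As the red box in the figure shows, the captured data consists of:
- Link-layer header: destination address + source address + type
- Link-layer payload: the IP packet
- CRC checksum (the frame check sequence; the NIC or driver usually strips it, so it normally does not appear in the captured buffer)
What we actually want is the TCP data. TCP is a transport-layer protocol whose segments are carried inside network-layer IP packets, so the frame has to be parsed layer by layer, from the bottom up.
Following the link-layer layout above, we first define a struct for the link-layer header so that the code can parse it easily.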
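//MAC header
typedef struct {
    unsigned char DesMacAddr[6];   //6-byte destination MAC address
    unsigned char SrcMacAddr[6];   //6-byte source MAC address
    unsigned short LengthOrType;   //2-byte length/type field, network byte order
}__attribute__((packed)) MAC_HEADER, *PMAC_HEADER;
#define MAC_HEADER_SIZE sizeof(MAC_HEADER)   //6 + 6 + 2 = 14 bytes, used by the parsing code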
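The same fields can also be read straight from the byte buffer, which avoids any dependence on struct packing. This is a sketch under that assumption; mac_to_str, frame_ethertype and ETH_HDR_LEN are names invented for the example.

```c
#include <stddef.h>
#include <stdio.h>

#define ETH_HDR_LEN 14  /* 6 + 6 + 2 bytes */

/* Format a 6-byte MAC address as "aa:bb:cc:dd:ee:ff"; out needs 18 bytes. */
void mac_to_str(const unsigned char mac[6], char out[18])
{
    snprintf(out, 18, "%02x:%02x:%02x:%02x:%02x:%02x",
             mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}

/* Return the type field in host byte order, or -1 if the frame is too
 * short to contain a full Ethernet header. */
int frame_ethertype(const unsigned char *frame, size_t len)
{
    if (len < ETH_HDR_LEN)
        return -1;
    /* bytes 12 and 13 hold the type, most significant byte first */
    return (frame[12] << 8) | frame[13];
}
```

A return value of 0x0800 means the payload is an IPv4 packet. The source address starts at frame + 6 and the destination address at frame.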
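IP header
//IP header
typedef struct {
    unsigned char hdr_len: 4;        //header length in 32-bit words (this bit-field order assumes a little-endian host)
    unsigned char version: 4;        //IP version
    unsigned char tos;               //type of service
    unsigned short total_len;        //total length (header + data), network byte order
    unsigned short identifier;       //identification
    unsigned short frag_and_flags;   //3 flag bits + 13-bit fragment offset
    unsigned char ttl;               //time to live
    unsigned char protocol;          //payload protocol, e.g. IPPROTO_TCP
    unsigned short checksum;         //header checksum
    unsigned int source_ip;          //source address, network byte order
    unsigned int dest_ip;            //destination address, network byte order
}__attribute__((packed)) IP_HEADER, *PIP_HEADER;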
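The two 4-bit fields at the start of the IP header depend on how the compiler orders bit-fields, so a portable alternative is to decode them with shifts and masks. The sketch below assumes the pointer refers to the first byte of the IP header; the helper names ipv4_header_len and ipv4_total_len are hypothetical.

```c
#include <stddef.h>

/* Validate the start of an IPv4 header and return its length in bytes,
 * or -1 if the data is too short, not IPv4, or the length is inconsistent. */
int ipv4_header_len(const unsigned char *ip, size_t len)
{
    unsigned version, ihl;

    if (len < 20)
        return -1;
    version = ip[0] >> 4;        /* high nibble of the first byte */
    ihl = ip[0] & 0x0f;          /* low nibble, in 32-bit words */
    if (version != 4 || ihl < 5)
        return -1;
    if ((size_t) ihl * 4 > len)
        return -1;
    return (int) (ihl * 4);
}

/* Total length field (bytes 2 and 3), converted to host byte order. */
unsigned ipv4_total_len(const unsigned char *ip)
{
    return ((unsigned) ip[2] << 8) | ip[3];
}
```

Because ihl is at least 5 and at most 15, a successful return is always between 20 and 60 bytes.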
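TCP header
//TCP header
//Only m_sSourPort, m_sDestPort and m_uiHeadOff are referenced by the parsing code;
//the other field names are assumed and follow the standard 20-byte TCP header layout.
typedef struct {
    unsigned short m_sSourPort;         //source port, network byte order
    unsigned short m_sDestPort;         //destination port, network byte order
    unsigned int m_uiSequNum;           //sequence number
    unsigned int m_uiAcknowledgeNum;    //acknowledgment number
    unsigned char m_uiHeadOff;          //high 4 bits: header length in 32-bit words
    unsigned char m_ucFlags;            //control flags (FIN, SYN, RST, PSH, ACK, URG)
    unsigned short m_sWindowSize;       //receive window
    unsigned short m_sCheckSum;         //checksum
    unsigned short m_sUrgentPointer;    //urgent pointer
}__attribute__((packed)) TCP_HEADER, *PTCP_HEADER;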
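For the TCP header, the parsing code needs only the two ports and the header length, which sits in the high nibble of the byte at offset 12. As a sketch, these can be decoded directly from bytes; the names tcp_port, tcp_header_len and the TCP_FLAG_* macros are assumptions made for this example.

```c
#include <stddef.h>

#define TCP_FLAG_FIN 0x01
#define TCP_FLAG_SYN 0x02
#define TCP_FLAG_RST 0x04
#define TCP_FLAG_PSH 0x08
#define TCP_FLAG_ACK 0x10

/* Read a 16-bit port stored in network byte order at p. */
unsigned tcp_port(const unsigned char *p)
{
    return ((unsigned) p[0] << 8) | p[1];
}

/* Return the TCP header length in bytes (20..60), or -1 if the segment
 * is shorter than the length it claims. */
int tcp_header_len(const unsigned char *tcp, size_t len)
{
    size_t off;

    if (len < 20)
        return -1;
    off = (size_t) (tcp[12] >> 4) * 4;   /* data offset, in 32-bit words */
    if (off < 20 || off > len)
        return -1;
    return (int) off;
}
```

The source port is tcp_port(tcp), the destination port is tcp_port(tcp + 2), and the flag bits are in tcp[13].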
The code below parses each part in turn.
    /********************mac header*******************/
    PMAC_HEADER pmacHeader = (PMAC_HEADER) buffer;
    printf("Source Mac:");
    for (i = 0; i < 6; ++i) {
        printf("%02x", pmacHeader->SrcMacAddr[i]);
    }
    printf(" ");
    printf("Dest Mac:");
    for (i = 0; i < 6; ++i) {
        printf("%02x", pmacHeader->DesMacAddr[i]);
    }
    printf("\n");
    /********************ip header**********************/
    PIP_HEADER pipHeader = (PIP_HEADER) (buffer + MAC_HEADER_SIZE);
    int total_len = ntohs(pipHeader->total_len);
    ip_header_len = pipHeader->hdr_len * 4;
    /* a valid IPv4 header is between 20 and 60 bytes long */
    if (ip_header_len < 20 || ip_header_len > 60)
    {
        exit(0);
    }
    memcpy(&des_addr, &pipHeader->dest_ip, 4);
    memcpy(&src_addr, &pipHeader->source_ip, 4);
    int proto = pipHeader->protocol;
    switch (proto) {
        case IPPROTO_ICMP:
            printf("ICMP\n");
            break;
        case IPPROTO_IGMP:
            printf("IGMP\n");
            break;
        case IPPROTO_IPIP:
            printf("IPIP\n");
            break;
        case IPPROTO_TCP: {
            printf("TCP:");
            PTCP_HEADER tcpHeader = (PTCP_HEADER) (buffer + MAC_HEADER_SIZE + ip_header_len);
            /* the high 4 bits hold the header length in 32-bit words */
            tcp_header_len = ((tcpHeader->m_uiHeadOff & 0xf0) >> 4) * 4;
            int data_len = total_len - ip_header_len - tcp_header_len;
            /* inet_ntoa() returns one static buffer, so calling it twice in a
               single printf() prints the same address twice; format each
               address into its own buffer instead */
            char src_str[INET_ADDRSTRLEN], des_str[INET_ADDRSTRLEN];
            inet_ntop(AF_INET, &src_addr, src_str, sizeof(src_str));
            inet_ntop(AF_INET, &des_addr, des_str, sizeof(des_str));
            /* ports are stored in network byte order */
            printf("%s.%d-->%s.%d Len:%d\n", src_str, ntohs(tcpHeader->m_sSourPort), des_str,
                   ntohs(tcpHeader->m_sDestPort), data_len);
            int tcp_data_index = MAC_HEADER_SIZE + ip_header_len + tcp_header_len;
            unsigned char *p = (unsigned char *) buffer + tcp_data_index;
            /* never read past what recvfrom() returned; dumping data_len bytes
               rather than everything up to n_read also skips Ethernet padding */
            if (data_len > n_read - tcp_data_index) {
                data_len = (int) (n_read - tcp_data_index);
            }
            if (data_len > 0) {
                printf("Data:");
                for (int k = 0; k < data_len; ++k) {
                    printf("%02x ", p[k]);
                }
                for (int k = 0; k < data_len; ++k) {
                    printf("%c", p[k]);
                }
                printf("\n");
            }
            break;
        }
        case IPPROTO_UDP:
            printf("UDP\n");
            break;
        case IPPROTO_RAW:
            printf("RAW\n");
            break;
        default:
            printf("Unknown\n");
    }
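The length arithmetic in the TCP branch can be gathered into one bounds-checked function. The following is a minimal sketch, not the original author's code: the name tcp_payload is hypothetical, it works on raw bytes instead of the structs, and it assumes an untagged Ethernet frame carrying an unfragmented IPv4 packet.

```c
#include <stddef.h>

/* Locate the TCP payload inside a captured Ethernet frame.
 * Returns the payload length and stores its start in *payload,
 * or returns -1 if the frame is not a well-formed IPv4/TCP packet. */
int tcp_payload(const unsigned char *frame, size_t n_read,
                const unsigned char **payload)
{
    const size_t eth_len = 14;
    const unsigned char *ip, *tcp;
    size_t ip_hl, tcp_hl, total_len;

    if (n_read < eth_len + 20)
        return -1;
    if (frame[12] != 0x08 || frame[13] != 0x00)    /* type is not IPv4 */
        return -1;
    ip = frame + eth_len;
    if ((ip[0] >> 4) != 4)
        return -1;
    ip_hl = (size_t) (ip[0] & 0x0f) * 4;
    if (ip_hl < 20 || ip[9] != 6)                  /* 6 is IPPROTO_TCP */
        return -1;
    total_len = ((size_t) ip[2] << 8) | ip[3];
    if (total_len > n_read - eth_len)              /* truncated capture */
        return -1;
    if (total_len < ip_hl + 20)                    /* no room for TCP */
        return -1;
    tcp = ip + ip_hl;
    tcp_hl = (size_t) (tcp[12] >> 4) * 4;
    if (tcp_hl < 20 || ip_hl + tcp_hl > total_len)
        return -1;
    *payload = tcp + tcp_hl;
    /* derived from the IP total length, so Ethernet padding is excluded */
    return (int) (total_len - ip_hl - tcp_hl);
}
```

Deriving the length from the IP total length rather than from n_read matters for short packets: Ethernet pads frames to a minimum size, and those padding bytes would otherwise be printed as if they were TCP data.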
Output:
IP Source Mac:9ca615de20d0 Dest Mac:94c6919aa8f4
TCP:222.131.155.252.56539-->222.131.155.252.5632 Len:0
IP Source Mac:b888e3dc810e Dest Mac:ffffffffffff
UDP
IP Source Mac:9ca615de20d0 Dest Mac:94c6919aa8f4
TCP:222.131.155.252.56539-->222.131.155.252.5632 Len:52
Data:c7 e4 e1 18 5c b8 44 91 34 bc a2 2d b8 da ae 64 52 8f ab 3d f8 70 db db 65 2c 2d 2c 9a cd 2d 02 54 e5 db d2 5c 54 8c 7d 1d fe 05 1e c7 d8 e4 b9 23 d8 09 fc ���\�D�4��-�ڮdR��=�p��e,-,��-T���\T�}����#� �
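In the output above both sides of the arrow show the same address, and the destination port 5632 is 22 with its two bytes swapped (0x1600 versus 0x0016). Those symptoms are most likely caused by calling inet_ntoa() twice inside one printf(), since it returns a pointer to a single static buffer, and by printing the port fields without ntohs(). One way to avoid both is sketched below; format_endpoint is a hypothetical helper, and its inputs are assumed to be the address and port exactly as they appear in the headers (network byte order).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format "a.b.c.d.port" into out from fields still in network byte order.
 * Returns 0 on success, -1 if the address cannot be formatted. */
int format_endpoint(uint32_t addr_net, uint16_t port_net,
                    char *out, size_t out_size)
{
    char ip[INET_ADDRSTRLEN];
    struct in_addr a;

    a.s_addr = addr_net;
    /* inet_ntop() writes into the caller's buffer, so two endpoints
     * formatted for the same printf() cannot overwrite each other */
    if (inet_ntop(AF_INET, &a, ip, sizeof(ip)) == NULL)
        return -1;
    snprintf(out, out_size, "%s.%u", ip, (unsigned) ntohs(port_net));
    return 0;
}
```

Formatting the source and destination into two separate buffers and then passing both to printf() gives each side its own string.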
The full source code is available on GitHub.