Linux Networking Subsystem
A Detailed Analysis of sk_buff
Author: 小马哥 rstevens (rstevens2008@hotmail.com)
Reposting is welcome; do not use for commercial purposes without permission.
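1. Definitions
Packet: a message sent or received through a network interface card, including the link-layer, network-layer, and transport-layer protocol headers and the data they carry
Data Buffer: the memory area used to store a packet
SKB: short for struct sk_buff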
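2. Overview
struct sk_buff is the structure the Linux TCP/IP stack uses to manage data buffers. It plays a central role in both sending and receiving packets.
To improve network processing performance, copying packet data should be avoided as far as possible. The Linux kernel developers took this fully into account when designing sk_buff. At present, the Linux protocol stack copies data twice on the receive path: once after the packet enters the NIC driver, and once more when it is handed from kernel space to the application in user space.
The sk_buff structure has also been continually improved as kernel versions have advanced.
Studying and understanding sk_buff not only helps in understanding the kernel code better, but also teaches some useful design techniques.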
3. Definition of sk_buff
struct sk_buff {
	struct sk_buff *next;
	struct sk_buff *prev;
	struct sock *sk;
	struct skb_timeval tstamp;
	struct net_device *dev;
	struct net_device *input_dev;
	union {
		struct tcphdr *th;
		struct udphdr *uh;
		struct icmphdr *icmph;
		struct igmphdr *igmph;
		struct iphdr *ipiph;
		struct ipv6hdr *ipv6h;
		unsigned char *raw;
	} h;
	union {
		struct iphdr *iph;
		struct ipv6hdr *ipv6h;
		struct arphdr *arph;
		unsigned char *raw;
	} nh;
	union {
		unsigned char *raw;
	} mac;
	struct dst_entry *dst;
	struct sec_path *sp;
	char cb[40];
	unsigned int len,
		data_len,
		mac_len,
		csum;
	__u32 priority;
	__u8 local_df:1,
		cloned:1,
		ip_summed:2,
		nohdr:1,
		nfctinfo:3;
	__u8 pkt_type:3,
		fclone:2;
	__be16 protocol;
	void (*destructor)(struct sk_buff *skb);
	/* These elements must be at the end, see alloc_skb() for details. */
	unsigned int truesize;
	atomic_t users;
	unsigned char *head,
		*data,
		*tail,
		*end;
};
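The four pointers at the end of the structure (head, data, tail, end) are easier to follow with a small model. The following is a minimal userspace sketch, not kernel code: struct mock_skb and the mock_* helpers are hypothetical names invented for illustration, loosely modeled on the kernel's skb_reserve(), skb_put() and skb_push(). It assumes the usual layout in which head and end bound the allocated buffer, while data and tail bound the packet contents inside it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the buffer pointers of sk_buff. */
struct mock_skb {
    unsigned int len;      /* bytes currently between data and tail */
    unsigned char *head;   /* start of the allocated buffer */
    unsigned char *data;   /* start of the packet contents */
    unsigned char *tail;   /* end of the packet contents */
    unsigned char *end;    /* end of the allocated buffer */
};

/* Allocate `size` bytes; the buffer starts empty (data == tail == head). */
static int mock_alloc(struct mock_skb *skb, size_t size)
{
    skb->head = malloc(size);
    if (!skb->head)
        return -1;
    skb->data = skb->head;
    skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    return 0;
}

/* Leave `n` bytes of headroom for headers; only valid on an empty buffer. */
static void mock_reserve(struct mock_skb *skb, size_t n)
{
    assert(skb->len == 0 && n <= (size_t)(skb->end - skb->tail));
    skb->data += n;
    skb->tail += n;
}

/* Append `n` bytes at the tail; returns the start of the new area. */
static unsigned char *mock_put(struct mock_skb *skb, size_t n)
{
    unsigned char *old_tail = skb->tail;

    assert(n <= (size_t)(skb->end - skb->tail));
    skb->tail += n;
    skb->len += (unsigned int)n;
    return old_tail;
}

/* Prepend `n` bytes in front of data, consuming headroom. */
static unsigned char *mock_push(struct mock_skb *skb, size_t n)
{
    assert(n <= (size_t)(skb->data - skb->head));
    skb->data -= n;
    skb->len += (unsigned int)n;
    return skb->data;
}

/* Release the buffer. */
static void mock_free(struct mock_skb *skb)
{
    free(skb->head);
    skb->head = NULL;
}
```

A typical transmit-style sequence in this model would be mock_alloc(), then mock_reserve() to set aside room for headers, mock_put() for the payload, and one mock_push() per protocol header as the packet moves down the stack. Reserving headroom first is what lets headers be prepended without copying the payload.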
4. Member variables
· struct skb_timeval tstamp;
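This field records the arrival time or transmission time of a packet. Because obtaining the time has some cost, the field is used only when necessary: call net_enable_timestamp() when timestamps are needed, and net_disable_timestamp() when they are no longer needed.
tstamp is used mainly for packet filtering; it is also used to implement certain socket options, and some netfilter modules use this field as well.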
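The enable/disable pairing described above can be pictured as a use counter that gates the cost of reading the clock. The sketch below is a userspace illustration under that assumption, not the kernel implementation: mock_netstamp_needed, struct mock_pkt and the mock_* functions are hypothetical names, and time() stands in for whatever clock source the kernel actually uses.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical counter: how many users currently want timestamps. */
static int mock_netstamp_needed;

/* Stand-in for net_enable_timestamp(): one more user needs timestamps. */
static void mock_enable_timestamp(void)
{
    mock_netstamp_needed++;
}

/* Stand-in for net_disable_timestamp(): one user no longer needs them. */
static void mock_disable_timestamp(void)
{
    assert(mock_netstamp_needed > 0);
    mock_netstamp_needed--;
}

/* Hypothetical packet carrying only a timestamp (0 means "not stamped"). */
struct mock_pkt {
    time_t tstamp;
};

/* On receive, read the clock only if at least one user asked for it. */
static void mock_receive(struct mock_pkt *pkt)
{
    if (mock_netstamp_needed > 0)
        pkt->tstamp = time(NULL);
    else
        pkt->tstamp = 0;
}
```

In this model a packet received while no one has called mock_enable_timestamp() keeps a zero timestamp, so the common case pays nothing for the clock read. A counter rather than a flag lets several independent users enable and disable timestamps without interfering with each other.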
· struct net_device *dev;
· struct net_device *input_dev;
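These fields track the devices associated with a packet. More than one is needed because a packet may be processed by several virtual drivers on the receive path.
When a packet is received, dev and input_dev both point to the original interface. If the packet is later handled by a virtual driver, dev changes while input_dev stays the same.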
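The relationship between dev and input_dev on the receive path can be sketched with a small userspace model. Everything here is hypothetical and for illustration only: struct mock_netdev, struct mock_rx_skb, the upper link and the mock_* functions are not kernel APIs. The sketch assumes a physical interface that may have one virtual device stacked on top of it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device: a name plus an optional virtual device above it. */
struct mock_netdev {
    const char *name;
    struct mock_netdev *upper;   /* virtual device stacked on top, or NULL */
};

/* Hypothetical packet holding only the two device pointers discussed. */
struct mock_rx_skb {
    struct mock_netdev *dev;         /* device currently handling the packet */
    struct mock_netdev *input_dev;   /* interface the packet first arrived on */
};

/* Packet arrives on a physical interface: both pointers start out equal. */
static void mock_rx_from_nic(struct mock_rx_skb *skb, struct mock_netdev *nic)
{
    skb->dev = nic;
    skb->input_dev = nic;
}

/*
 * Hand the packet to the virtual device above the current one, if any.
 * Only dev is updated; input_dev keeps pointing at the original interface.
 * Returns 1 if the packet moved up, 0 if there was no upper device.
 */
static int mock_rx_to_upper(struct mock_rx_skb *skb)
{
    if (skb->dev == NULL || skb->dev->upper == NULL)
        return 0;
    skb->dev = skb->dev->upper;
    return 1;
}
```

For example, with a physical device named "eth0" whose upper device is "bond0", a packet received on eth0 ends up with dev pointing at bond0 after one call to mock_rx_to_upper(), while input_dev still identifies eth0.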
(These three members help keep track of the devices associated with a packet. The reason we have three different device pointers is that the main 'skb->dev' member can change as we encapsulate and decapsulate via a virtual device.
So if we are receiving a packet from a device which is part of a bonding device instance, initially 'skb->dev' will be set to point to the real underlying bonding slave. When the packet enters the networking (via 'netif_receive_skb()') we save 'skb->dev' away in 'skb->real_dev' and update 'skb->dev' to point to the bonding device.
Likewise, the physical device receiving a packet always records itself in 'skb->input_dev'. In this way, no matter how many layers of virtual devices end up being decapsulated, 'skb->input_dev' can always be used to find the top-level device that ac