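An Introduction to the netfilter Mechanism
Note: this article uses Linux kernel 2.6.29 and IPv6 as its example. It is loosely organized and meant as personal study notes; please credit the source when reposting.
netfilter works by attaching hook functions at five points in the Linux IP protocol stack, so that packets (sk_buff structures) can be intercepted and processed there. The hook points are: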
enum nf_inet_hooks {
NF_INET_PRE_ROUTING,
NF_INET_LOCAL_IN,
NF_INET_FORWARD,
NF_INET_LOCAL_OUT,
NF_INET_POST_ROUTING,
NF_INET_NUMHOOKS
};
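Hook functions are kept in nf_hooks, a global two-dimensional array of lists indexed by protocol family and hook point. Each node on a list is an nf_hook_ops structure, which holds the hook function itself together with its registration details.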
struct nf_hook_ops
{
    struct list_head list;   /* list linkage; can usually be initialized to {NULL, NULL} */
    /* User fills in from here down. */
    nf_hookfn *hook;         /* the hook function; its type is described below */
    struct module *owner;    /* THIS_MODULE when registering from your own module */
    u_int8_t pf;             /* protocol family; PF_INET6 for IPv6 */
    unsigned int hooknum;    /* hook point, e.g. NF_INET_PRE_ROUTING */
    /* Hooks are ordered in ascending priority. */
    int priority;            /* priority; smaller values run first */
};
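As for priority, the priorities netfilter currently defines include the following.
(The smaller the value, the earlier the hook runs. You can add or subtract a constant from one of these values to place your hook exactly where you need it.)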
NF_IP6_PRI_FIRST = INT_MIN
NF_IP6_PRI_CONNTRACK = -200
NF_IP6_PRI_MANGLE = -150
NF_IP6_PRI_NAT_DST = -100
NF_IP6_PRI_FILTER = 0
NF_IP6_PRI_NAT_SRC = 100
NF_IP6_PRI_LAST = INT_MAX
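To make the ordering rule concrete, here is a minimal userspace sketch of priority-ordered registration. It is not kernel code: struct model_hook_ops, its name field, and model_register_hook() are names made up for illustration, and a plain singly linked list stands in for the kernel's list_head. It assumes that a new entry with the same priority as an existing one is placed after it.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for nf_hook_ops; only the fields needed here. */
struct model_hook_ops {
    struct model_hook_ops *next;
    const char *name;   /* hypothetical label, not a kernel field */
    int priority;       /* smaller value runs earlier */
};

/*
 * Insert ops into a list kept in ascending priority order.
 * The new entry goes in front of the first entry whose priority
 * is strictly greater, so equal priorities keep registration order.
 */
void model_register_hook(struct model_hook_ops **head,
                         struct model_hook_ops *ops)
{
    struct model_hook_ops **pos = head;

    assert(head != NULL && ops != NULL);
    while (*pos != NULL && (*pos)->priority <= ops->priority)
        pos = &(*pos)->next;
    ops->next = *pos;
    *pos = ops;
}
```

For example, a hook registered with priority NF_IP6_PRI_CONNTRACK + 1 (that is, -199) would end up right after connection tracking and before everything from mangle onwards.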
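Next comes the part most readers care about: the hook function itself, the nf_hookfn *hook field above.
The return value of a hook function is restricted to one of the following verdicts: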
#define NF_DROP 0
#define NF_ACCEPT 1
#define NF_STOLEN 2
#define NF_QUEUE 3
#define NF_REPEAT 4
#define NF_STOP 5
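Their meanings are:
1. NF_DROP (0): drop the packet; no further processing takes place.
2. NF_ACCEPT (1): accept the packet and move on to the next processing step.
3. NF_STOLEN (2): the hook function has taken ownership of the packet; netfilter stops processing it and must not touch the sk_buff again.
4. NF_QUEUE (3): queue the packet to user space and wait for a verdict from there.
5. NF_REPEAT (4): call this hook function again.
6. NF_STOP (5): accept the packet like NF_ACCEPT, but skip the remaining hook functions at this hook point.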
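The way these verdicts steer the walk over a hook list can be sketched in userspace as follows. This is a simplified model under my own assumptions, not the kernel's implementation: struct model_pkt, model_hookfn, model_run_hooks(), model_finish(), and the three sample hooks are all hypothetical names, and queueing to user space is not modelled beyond returning the verdict.

```c
#include <assert.h>
#include <stddef.h>

#define NF_DROP   0
#define NF_ACCEPT 1
#define NF_STOLEN 2
#define NF_QUEUE  3
#define NF_REPEAT 4
#define NF_STOP   5

/* Hypothetical packet stand-in that only records what happened to it. */
struct model_pkt { int calls; int repeated; };

typedef unsigned int model_hookfn(struct model_pkt *pkt);

/* Run hooks[0..n-1] in order and return the verdict that ended the walk. */
unsigned int model_run_hooks(model_hookfn *const hooks[], size_t n,
                             struct model_pkt *pkt)
{
    size_t i = 0;

    assert(pkt != NULL);
    while (i < n) {
        unsigned int verdict = hooks[i](pkt);

        if (verdict == NF_REPEAT)
            continue;          /* call the same hook again */
        if (verdict != NF_ACCEPT)
            return verdict;    /* DROP, STOLEN, QUEUE, STOP end the walk */
        i++;                   /* ACCEPT: go on to the next hook */
    }
    return NF_ACCEPT;
}

/* 1: continue normal processing, -1: packet dropped, 0: handled elsewhere. */
int model_finish(unsigned int verdict)
{
    switch (verdict) {
    case NF_ACCEPT:
    case NF_STOP:
        return 1;
    case NF_DROP:
        return -1;
    default:                   /* NF_STOLEN, NF_QUEUE */
        return 0;
    }
}

/* Sample hooks used to exercise the walk. */
unsigned int model_accept_hook(struct model_pkt *pkt)
{
    pkt->calls++;
    return NF_ACCEPT;
}

unsigned int model_repeat_once_hook(struct model_pkt *pkt)
{
    pkt->calls++;
    if (!pkt->repeated) {
        pkt->repeated = 1;
        return NF_REPEAT;
    }
    return NF_ACCEPT;
}

unsigned int model_drop_hook(struct model_pkt *pkt)
{
    pkt->calls++;
    return NF_DROP;
}
```

Note that a hook which always returns NF_REPEAT would loop forever in this model, so a real hook has to change some state before asking to be called again.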
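The function pointer for a hook function has type nf_hookfn.
It is defined as:
typedef unsigned int nf_hookfn(unsigned int hooknum, struct sk_buff *skb,
                               const struct net_device *in, const struct net_device *out,
                               int (*okfn)(struct sk_buff *));
netfilter passes all of these arguments to our handler.
okfn is the function that continues normal processing of the packet: it is called directly when no hook function is registered at the hook point, and it is also what runs once our handlers have returned NF_ACCEPT.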
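As an illustration of the signature, here is a userspace sketch of a hook function that drops ICMPv6 and accepts everything else. The struct sk_buff and struct net_device definitions below are tiny stand-ins of my own so the typedef compiles outside the kernel; MODEL_IPV6_HDR_LEN, MODEL_ICMPV6, drop_icmpv6_hook, and example_hook are hypothetical names. The sketch assumes skb->data points at the fixed IPv6 header, where byte 6 is the Next Header field, and it chooses to drop packets shorter than that header.

```c
#include <assert.h>
#include <stddef.h>

#define NF_DROP   0
#define NF_ACCEPT 1

/* Minimal stand-ins so the typedef compiles in userspace. */
struct net_device { const char *name; };
struct sk_buff { const unsigned char *data; unsigned int len; };

typedef unsigned int nf_hookfn(unsigned int hooknum, struct sk_buff *skb,
                               const struct net_device *in,
                               const struct net_device *out,
                               int (*okfn)(struct sk_buff *));

#define MODEL_IPV6_HDR_LEN 40   /* size of the fixed IPv6 header */
#define MODEL_ICMPV6       58   /* protocol number of ICMPv6 */

/*
 * Example hook: drop ICMPv6, accept everything else.
 * The hook only returns a verdict; it does not call okfn itself.
 */
unsigned int drop_icmpv6_hook(unsigned int hooknum, struct sk_buff *skb,
                              const struct net_device *in,
                              const struct net_device *out,
                              int (*okfn)(struct sk_buff *))
{
    (void)hooknum;
    (void)in;
    (void)out;
    (void)okfn;

    assert(skb != NULL && skb->data != NULL);
    if (skb->len < MODEL_IPV6_HDR_LEN)
        return NF_DROP;              /* too short to hold an IPv6 header */
    if (skb->data[6] == MODEL_ICMPV6)
        return NF_DROP;              /* Next Header says ICMPv6 */
    return NF_ACCEPT;
}

/* nf_hookfn names a function type, so a hook is stored as a pointer to it. */
nf_hookfn *const example_hook = drop_icmpv6_hook;
```

In a real module, a function like this would be stored in the hook field of an nf_hook_ops structure; in this sketch it is simply called through example_hook.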
The IP protocol stack invokes the hook functions through the NF_HOOK macro. For example, ipv6_rcv() in net/ipv6/ip6_input.c looks like this:
int ipv6_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt, struct net_device *orig_dev)
{
    struct ipv6hdr *hdr;
    u32 pkt_len;
    struct inet6_dev *idev;
    struct net *net = dev_net(skb->dev);
    if (skb->pkt_type == PACKET_OTHERHOST) {
        kfree_skb(skb);
        return 0;
    }
    rcu_read_lock();
    idev = __in6_dev_get(skb->dev);
    IP6_INC_STATS_BH(net, idev, IPSTATS_MIB_INRECEIVES);
    if ((skb = skb_share_check(skb, GFP_ATOMIC)) == NULL ||
        !idev || unlikely(idev->cnf.disable_ipv6)) {
        IP6_INC_STATS_BH(net, idev, IPSTATS_MIB_INDISCARDS);
        rcu_read_unlock();
        goto out;
    }
    memset(IP6CB(skb), 0, sizeof(struct inet6_skb_parm));
    /*
     * Store incoming device index. When the packet will
     * be queued, we cannot refer to skb->dev anymore.
     *
     * BTW, when we send a packet for our own local address on a
     * non-loopback interface (e.g. ethX), it is being delivered
     * via the loopback interface (lo) here; skb->dev = loopback_dev.
     * It, however, should be considered as if it is being
     * arrived via the sending interface (ethX), because of the
     * nature of scoping architecture. --yoshfuji
     */
    IP6CB(skb)->iif = skb->dst ? ip6_dst_idev(skb->dst)->dev->ifindex : dev->ifindex;
    if (unlikely(!pskb_may_pull(skb, sizeof(*hdr))))
        goto err;
    hdr = ipv6_hdr(skb);
    if (hdr->version != 6)
        goto err;
    /*
     * RFC4291 2.5.3
     * A packet received on an interface with a destination address
     * of loopback must be dropped.
     */
    if (!(dev->flags & IFF_LOOPBACK) &&
        ipv6_addr_loopback(&hdr->daddr))
        goto err;
    skb->transport_header = skb->network_header + sizeof(*hdr);
    IP6CB(skb)->nhoff = offsetof(struct ipv6hdr, nexthdr);
    pkt_len = ntohs(hdr->payload_len);
    /* pkt_len may be zero if Jumbo payload option is present */
    if (pkt_len || hdr->nexthdr != NEXTHDR_HOP) {
        if (pkt_len + sizeof(struct ipv6hdr) > skb->len) {
            IP6_INC_STATS_BH(net,
                             idev, IPSTATS_MIB_INTRUNCATEDPKTS);
            goto drop;
        }
        if (pskb_trim_rcsum(skb, pkt_len + sizeof(struct ipv6hdr))) {
            IP6_INC_STATS_BH(net, idev, IPSTATS_MIB_INHDRERRORS);
            goto drop;
        }
        hdr = ipv6_hdr(skb);
    }
    if (hdr->nexthdr == NEXTHDR_HOP) {
        if (ipv6_parse_hopopts(skb) < 0) {
            IP6_INC_STATS_BH(net, idev, IPSTATS_MIB_INHDRERRORS);
            rcu_read_unlock();
            return 0;
        }
    }
    rcu_read_unlock();
    return NF_HOOK(PF_INET6, NF_INET_PRE_ROUTING, skb, dev, NULL,
                   ip6_rcv_finish);
err:
    IP6_INC_STATS_BH(net, idev, IPSTATS_MIB_INHDRERRORS);
drop:
    rcu_read_unlock();
    kfree_skb(skb);
out:
    return 0;
}
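As you can see, the last thing this function does is invoke the hook functions at NF_INET_PRE_ROUTING. Once all hook functions at that point have run and accepted the packet, ip6_rcv_finish() is called: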
inline int ip6_rcv_finish(struct sk_buff *skb)
{
    if (skb->dst == NULL)
        ip6_route_input(skb);
    return dst_input(skb);
}
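ip6_rcv_finish() is short, but it is where the packet's path after PRE_ROUTING is decided: a route is looked up if none is cached, and the route's input handler then takes the packet towards local delivery or forwarding. Below is a userspace sketch of that dispatch. Every name in it is hypothetical (struct model_skb, struct model_dst, the for_us flag, and so on), and the one-line "route lookup" is only a placeholder for what ip6_route_input() really does.

```c
#include <assert.h>
#include <stddef.h>

struct model_skb;

/* Stand-in for a routing cache entry: it carries the input handler. */
struct model_dst { int (*input)(struct model_skb *skb); };

enum { PATH_NONE, PATH_LOCAL_IN, PATH_FORWARD };

struct model_skb {
    const struct model_dst *dst;  /* cached route, NULL if not looked up yet */
    int for_us;                   /* hypothetical flag: addressed to this host? */
    int path;                     /* records which handler ran */
};

int model_local_input(struct model_skb *skb)
{
    skb->path = PATH_LOCAL_IN;
    return 0;
}

int model_forward(struct model_skb *skb)
{
    skb->path = PATH_FORWARD;
    return 0;
}

const struct model_dst model_local_dst = { model_local_input };
const struct model_dst model_forward_dst = { model_forward };

/* Placeholder route lookup: choose a route from the for_us flag. */
void model_route_input(struct model_skb *skb)
{
    skb->dst = skb->for_us ? &model_local_dst : &model_forward_dst;
}

/* Same shape as ip6_rcv_finish(): look up a route if needed, then dispatch. */
int model_rcv_finish(struct model_skb *skb)
{
    if (skb->dst == NULL)
        model_route_input(skb);
    assert(skb->dst != NULL);
    return skb->dst->input(skb);
}
```

A packet that already carries a route skips the lookup entirely, which is why the function checks skb->dst first.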
Now let's look at how the NF_HOOK macro is defined:
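#ifdef CONFIG_NETFILTER_DEBUG
#define NF_HOOK(pf, hook, skb, indev, outdev, okfn) \
    nf_hook_slow((pf), (hook), (skb), (indev), (outdev), (okfn), INT_MIN)
#define NF_HOOK_THRESH nf_hook_slow
#else
#define NF_HOOK(pf, hook, skb, indev, outdev, okfn) \
    (list_empty(&nf_hooks[(pf)][(hook)]) \
     ? (okfn)(skb) \
     : nf_hook_slow((pf), (hook), (skb), (indev), (outdev), (okfn), INT_MIN))
#define NF_HOOK_THRESH(pf, hook, skb, indev, outdev, okfn, thresh) \
    (list_empty(&nf_hooks[(pf)][(hook)]) \
     ? (okfn)(skb) \
     : nf_hook_slow((pf), (hook), (skb), (indev), (outdev), (okfn), (thresh)))
#endif
/* If the list that nf_hooks[pf][hook] points to is empty (no handler is attached at that hook point, for example nf_hooks[PF_INET][NF_IP_FORWARD]), okfn is called directly; otherwise nf_hook_slow() is called to hand the packet over to netfilter. nf_hook_slow() lives in net/netfilter/core.c in 2.6.29 and in net/core/netfilter.c in older kernels. */
Note that the macro text above matches the form used by older kernels, as does the name NF_IP_FORWARD in the comment. As far as I can tell, 2.6.29 performs the same empty-list check inside the inline helper nf_hook_thresh() rather than in the macro itself; the logic is unchanged.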
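The fast path in the macro text quoted above can be tried out with a small userspace model. Everything here is a stand-in of my own (struct model_pkt, struct model_hook_point, model_hook_slow(), MODEL_NF_HOOK, and the sample hooks); the slow path is deliberately simplified to treat any verdict other than NF_ACCEPT as a drop and return -1.

```c
#include <assert.h>
#include <stddef.h>

#define NF_DROP   0
#define NF_ACCEPT 1

/* Hypothetical packet stand-in that records what happened to it. */
struct model_pkt { int hooks_run; int delivered; };

typedef unsigned int model_hookfn(struct model_pkt *pkt);

/* One hook point: an array of hooks, with count == 0 when none is registered. */
struct model_hook_point {
    model_hookfn *const *hooks;
    size_t count;
};

/* Stand-in for okfn: continue normal processing of the packet. */
int model_okfn(struct model_pkt *pkt)
{
    pkt->delivered = 1;
    return 0;
}

/* Simplified slow path: run every hook, call okfn only if all accept. */
int model_hook_slow(const struct model_hook_point *hp, struct model_pkt *pkt,
                    int (*okfn)(struct model_pkt *))
{
    size_t i;

    assert(hp != NULL && pkt != NULL && okfn != NULL);
    for (i = 0; i < hp->count; i++) {
        pkt->hooks_run++;
        if (hp->hooks[i](pkt) != NF_ACCEPT)
            return -1;         /* treated as a drop in this model */
    }
    return okfn(pkt);
}

/* Same shape as the macro: skip netfilter entirely when no hook is registered. */
#define MODEL_NF_HOOK(hp, pkt, okfn) \
    ((hp)->count == 0 ? (okfn)(pkt) : model_hook_slow((hp), (pkt), (okfn)))

/* Sample hooks. */
unsigned int model_accept(struct model_pkt *pkt)
{
    (void)pkt;
    return NF_ACCEPT;
}

unsigned int model_drop(struct model_pkt *pkt)
{
    (void)pkt;
    return NF_DROP;
}
```

With an empty hook point the packet reaches model_okfn() without any hook being run, which is the whole purpose of the list_empty() check: a stack with no netfilter rules pays almost nothing for the hook points.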
The above is my own summary of how I understand netfilter.