BPF filter

Original: https://www.gsp.com/cgi-bin/man.cgi?section=4&topic=bpf#1

 

FILTER MACHINE

A filter program is an array of instructions, with all branches forwardly directed, terminated by a return instruction. Each instruction performs some action on the pseudo-machine state, which consists of an accumulator, index register, scratch memory store, and implicit program counter.

The following structure defines the instruction format:

struct bpf_insn { 
	u_short	code; 
	u_char 	jt; 
	u_char 	jf; 
	u_long k; 
};

The k field is used in different ways by different instructions, and the jt and jf fields are used as offsets by the branch instructions. The opcodes are encoded in a semi-hierarchical fashion. There are eight classes of instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX, BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC. Various other mode and operator bits are or'd into the class to give the actual instructions. The classes and modes are defined in <net/bpf.h>.

Below are the semantics for each defined bpf instruction. We use the convention that A is the accumulator, X is the index register, P[] is the packet data, and M[] is the scratch memory store. P[i:n] gives the data at byte offset “i” in the packet, interpreted as a word (n=4), unsigned halfword (n=2), or unsigned byte (n=1). M[i] gives the i'th word in the scratch memory store, which is only addressed in word units. The memory store is indexed from 0 to BPF_MEMWORDS - 1. k, jt, and jf are the corresponding fields in the instruction definition. “len” refers to the length of the packet.
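To make the semi-hierarchical encoding a little more concrete, the sketch below builds one opcode by adding a class, a size, and a mode, and then splits it back into those fields. The numeric values and bit masks are assumptions modelled on what <net/bpf.h> commonly defines; the EX_ prefix marks every name as local to this example, so check your system's header before relying on them.

```c
#include <stdint.h>

/* Assumed values, modelled on <net/bpf.h>; EX_ names are local to this sketch. */
#define EX_BPF_LD   0x00	/* class: load into accumulator */
#define EX_BPF_LDX  0x01	/* class: load into index register */
#define EX_BPF_H    0x08	/* size: halfword */
#define EX_BPF_B    0x10	/* size: byte */
#define EX_BPF_ABS  0x20	/* mode: fixed packet offset */
#define EX_BPF_MSH  0xa0	/* mode: IP header length hack */

/* Assumed field layout: bits 0-2 class, bits 3-4 size, bits 5-7 mode. */
#define EX_BPF_CLASS(code) ((code) & 0x07)
#define EX_BPF_SIZE(code)  ((code) & 0x18)
#define EX_BPF_MODE(code)  ((code) & 0xe0)

/* Compose the opcode for "load halfword at an absolute offset". */
static uint16_t ex_ldh_abs(void)
{
	return EX_BPF_LD + EX_BPF_H + EX_BPF_ABS;	/* 0x28 under these values */
}
```

Because the fields occupy disjoint bits, adding the constants and or'ing them give the same result.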

BPF_LD

These instructions copy a value into the accumulator. The type of the source operand is specified by an “addressing mode” and can be a constant (BPF_IMM), packet data at a fixed offset (BPF_ABS), packet data at a variable offset (BPF_IND), the packet length (BPF_LEN), or a word in the scratch memory store (BPF_MEM). For BPF_IND and BPF_ABS, the data size must be specified as a word (BPF_W), halfword (BPF_H), or byte (BPF_B). The semantics of all the recognized BPF_LD instructions follow.

BPF_LD+BPF_W+BPF_ABS	A <- P[k:4] 
BPF_LD+BPF_H+BPF_ABS	A <- P[k:2] 
BPF_LD+BPF_B+BPF_ABS	A <- P[k:1] 
BPF_LD+BPF_W+BPF_IND	A <- P[X+k:4] 
BPF_LD+BPF_H+BPF_IND	A <- P[X+k:2] 
BPF_LD+BPF_B+BPF_IND	A <- P[X+k:1] 
BPF_LD+BPF_W+BPF_LEN	A <- len 
BPF_LD+BPF_IMM		A <- k 
BPF_LD+BPF_MEM		A <- M[k]
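The P[i:n] notation used in the table can be sketched as a small helper. This is an illustration only: ex_pkt_load is a hypothetical name, and two behaviours are assumptions rather than something stated above, namely that multi-byte values are read in network (big-endian) byte order and that a read past the end of the packet is reported as an error instead of being attempted.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch of P[i:n]: read n bytes (1, 2, or 4) at offset i, big-endian.
 * Returns 0 and stores the value in *out on success, or -1 if n is not a
 * supported size or the read would run past the end of the packet.
 */
static int ex_pkt_load(const uint8_t *p, size_t len, size_t i, size_t n,
    uint32_t *out)
{
	uint32_t v = 0;
	size_t j;

	if (n != 1 && n != 2 && n != 4)
		return -1;
	if (i > len || n > len - i)	/* written to avoid overflow in i + n */
		return -1;
	for (j = 0; j < n; j++)
		v = (v << 8) | p[i + j];
	*out = v;
	return 0;
}
```

With this reading, a halfword load at offset 12 of an Ethernet frame yields the EtherType as an ordinary integer, which is why the examples can compare it directly against constants such as ETHERTYPE_IP.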
    

BPF_LDX

These instructions load a value into the index register. Note that the addressing modes are more restrictive than those of the accumulator loads, but they include BPF_MSH, a hack for efficiently loading the IP header length.

BPF_LDX+BPF_W+BPF_IMM	X <- k 
BPF_LDX+BPF_W+BPF_MEM	X <- M[k] 
BPF_LDX+BPF_W+BPF_LEN	X <- len 
BPF_LDX+BPF_B+BPF_MSH	X <- 4*(P[k:1]&0xf)
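The BPF_MSH computation is easier to follow with a concrete byte. In an IPv4 header the low nibble of the first byte is the header length counted in 32-bit words, so multiplying it by 4 gives the length in bytes. The helper name ex_msh is hypothetical, and the sketch leaves bounds checking to the caller.

```c
#include <stdint.h>

/* X <- 4*(P[k:1]&0xf): the low nibble counts 32-bit words. */
static uint32_t ex_msh(const uint8_t *p, uint32_t k)
{
	return 4u * (p[k] & 0x0fu);
}
```

For the common first byte 0x45 (version 4, header length 5 words) this yields 20, the size of an IPv4 header without options.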
    

BPF_ST

This instruction stores the accumulator into the scratch memory. We do not need an addressing mode since there is only one possibility for the destination.

BPF_ST			M[k] <- A
    

BPF_STX

This instruction stores the index register in the scratch memory store.

BPF_STX			M[k] <- X
    

BPF_ALU

The ALU instructions perform operations between the accumulator and the index register or a constant, and store the result back in the accumulator. For binary operations, a source mode is required (BPF_K or BPF_X).

BPF_ALU+BPF_ADD+BPF_K	A <- A + k 
BPF_ALU+BPF_SUB+BPF_K	A <- A - k 
BPF_ALU+BPF_MUL+BPF_K	A <- A * k 
BPF_ALU+BPF_DIV+BPF_K	A <- A / k 
BPF_ALU+BPF_MOD+BPF_K	A <- A % k 
BPF_ALU+BPF_AND+BPF_K	A <- A & k 
BPF_ALU+BPF_OR+BPF_K	A <- A | k 
BPF_ALU+BPF_XOR+BPF_K	A <- A ^ k 
BPF_ALU+BPF_LSH+BPF_K	A <- A << k 
BPF_ALU+BPF_RSH+BPF_K	A <- A >> k 
BPF_ALU+BPF_ADD+BPF_X	A <- A + X 
BPF_ALU+BPF_SUB+BPF_X	A <- A - X 
BPF_ALU+BPF_MUL+BPF_X	A <- A * X 
BPF_ALU+BPF_DIV+BPF_X	A <- A / X 
BPF_ALU+BPF_MOD+BPF_X	A <- A % X 
BPF_ALU+BPF_AND+BPF_X	A <- A & X 
BPF_ALU+BPF_OR+BPF_X	A <- A | X 
BPF_ALU+BPF_XOR+BPF_X	A <- A ^ X 
BPF_ALU+BPF_LSH+BPF_X	A <- A << X 
BPF_ALU+BPF_RSH+BPF_X	A <- A >> X 
BPF_ALU+BPF_NEG		A <- -A
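As a rough model of the table above, the following sketch applies one ALU operation to an accumulator. It assumes the registers are 32-bit unsigned values that wrap on overflow. The enum tags are hypothetical and are not the <net/bpf.h> encodings. Refusing a zero divisor or a shift count of 32 or more is a choice made for this sketch (both are undefined in C), not documented filter behaviour.

```c
#include <stdint.h>

/* Hypothetical operator tags for this sketch; not the <net/bpf.h> encodings. */
enum ex_alu_op {
	EX_ADD, EX_SUB, EX_MUL, EX_DIV, EX_MOD,
	EX_AND, EX_OR, EX_XOR, EX_LSH, EX_RSH, EX_NEG
};

/*
 * Apply "A <- A op src", where src stands for k (BPF_K) or X (BPF_X).
 * Returns 0 on success, -1 for a zero divisor or an oversized shift.
 */
static int ex_alu(enum ex_alu_op op, uint32_t *a, uint32_t src)
{
	if ((op == EX_DIV || op == EX_MOD) && src == 0)
		return -1;
	if ((op == EX_LSH || op == EX_RSH) && src >= 32)
		return -1;

	switch (op) {
	case EX_ADD: *a += src; break;
	case EX_SUB: *a -= src; break;
	case EX_MUL: *a *= src; break;
	case EX_DIV: *a /= src; break;
	case EX_MOD: *a %= src; break;
	case EX_AND: *a &= src; break;
	case EX_OR:  *a |= src; break;
	case EX_XOR: *a ^= src; break;
	case EX_LSH: *a <<= src; break;
	case EX_RSH: *a >>= src; break;
	case EX_NEG: *a = 0u - *a; break;	/* src is ignored */
	}
	return 0;
}
```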
    

BPF_JMP

The jump instructions alter the flow of control. Conditional jumps compare the accumulator against a constant (BPF_K) or the index register (BPF_X). If the result is true (or non-zero), the true branch is taken; otherwise the false branch is taken. Jump offsets are encoded in 8 bits, so the longest jump is 256 instructions. However, the jump always (BPF_JA) opcode uses the 32-bit k field as the offset, allowing arbitrarily distant destinations. All conditionals use unsigned comparison conventions.

BPF_JMP+BPF_JA		pc += k 
BPF_JMP+BPF_JGT+BPF_K	pc += (A > k) ? jt : jf 
BPF_JMP+BPF_JGE+BPF_K	pc += (A >= k) ? jt : jf 
BPF_JMP+BPF_JEQ+BPF_K	pc += (A == k) ? jt : jf 
BPF_JMP+BPF_JSET+BPF_K	pc += (A & k) ? jt : jf 
BPF_JMP+BPF_JGT+BPF_X	pc += (A > X) ? jt : jf 
BPF_JMP+BPF_JGE+BPF_X	pc += (A >= X) ? jt : jf 
BPF_JMP+BPF_JEQ+BPF_X	pc += (A == X) ? jt : jf 
BPF_JMP+BPF_JSET+BPF_X	pc += (A & X) ? jt : jf
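Two details of the jump table are easy to miss: the offsets are relative to the instruction that follows the jump (the filters in EXAMPLES use an offset of 0 to mean "fall through"), and the comparisons are unsigned. The sketch below models BPF_JGT with a constant operand. ex_jgt_k is a hypothetical helper that takes the index of the jump instruction itself and returns the index of the next instruction to execute.

```c
#include <stdint.h>

/*
 * Next instruction index for "pc += (A > k) ? jt : jf", where pc is the
 * index of the jump itself and offsets count from the following instruction.
 */
static uint32_t ex_jgt_k(uint32_t pc, uint32_t a, uint32_t k,
    uint8_t jt, uint8_t jf)
{
	return pc + 1 + ((a > k) ? jt : jf);
}
```

Under unsigned comparison, an accumulator holding 0xffffffff is greater than 1, so a value that a signed reading would treat as -1 takes the true branch.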
    

BPF_RET

The return instructions terminate the filter program and specify the amount of packet to accept (i.e., they return the truncation amount). A return value of zero indicates that the packet should be ignored. The return value is either a constant (BPF_K) or the accumulator (BPF_A).

BPF_RET+BPF_A		accept A bytes 
BPF_RET+BPF_K		accept k bytes
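Since the return value is a truncation amount, the number of bytes actually kept can be modelled as the smaller of the return value and the packet length. This is a sketch of that reading, with ex_caplen as a hypothetical helper name. It also suggests why the filters in EXAMPLES return (u_int)-1 to accept a whole packet.

```c
#include <stdint.h>

/* Bytes kept for a packet of length len when the filter returns ret. */
static uint32_t ex_caplen(uint32_t ret, uint32_t len)
{
	return ret < len ? ret : len;
}
```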
    

BPF_MISC

The miscellaneous category was created for anything that does not fit into the above classes, and for any new instructions that might need to be added. Currently, these are the register transfer instructions that copy the index register to the accumulator or vice versa.

BPF_MISC+BPF_TAX	X <- A 
BPF_MISC+BPF_TXA	A <- X
    

The bpf interface provides the following macros to facilitate array initializers: BPF_STMT(opcode, operand) and BPF_JUMP(opcode, operand, true_offset, false_offset).
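The two macros can be pictured as plain brace initializers for the instruction structure. The stand-ins below are an approximation for illustration: struct ex_insn, EX_STMT, and EX_JUMP are local names, fixed-width types replace u_short, u_char, and the type of k, and the opcode numbers in the array are assumed values for the named instructions. The real definitions are in <net/bpf.h>.

```c
#include <stdint.h>

/* Local stand-in for struct bpf_insn, using fixed-width types. */
struct ex_insn {
	uint16_t code;
	uint8_t  jt;
	uint8_t  jf;
	uint32_t k;
};

/* A statement has no branch targets; a jump carries both offsets. */
#define EX_STMT(code, k)         { (uint16_t)(code), 0, 0, (k) }
#define EX_JUMP(code, k, jt, jf) { (uint16_t)(code), (jt), (jf), (k) }

/* Accept IPv4 frames whole, reject everything else. */
static const struct ex_insn ex_prog[] = {
	EX_STMT(0x28, 12),		/* assumed BPF_LD+BPF_H+BPF_ABS */
	EX_JUMP(0x15, 0x0800, 0, 1),	/* assumed BPF_JMP+BPF_JEQ+BPF_K */
	EX_STMT(0x06, 0xffffffffu),	/* assumed BPF_RET+BPF_K */
	EX_STMT(0x06, 0),
};
```

Note the argument order: the operand comes second in both macros, while in the structure k is the last member.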

SYSCTL VARIABLES

A set of sysctl(8) variables controls the behaviour of the bpf subsystem:

net.bpf.optimize_writers: 0

Various programs use BPF to send (but not receive) raw packets (cdpd, lldpd, dhcpd, and DHCP relays are good examples of such programs). They do not need incoming packets to be sent to them. Turning this option on attaches new BPF users to the write-only interface list until the program explicitly specifies a read filter via pcap_set_filter(). This removes any performance degradation for high-speed interfaces.

net.bpf.stats:

Binary interface for retrieving general statistics.

net.bpf.zerocopy_enable: 0

Permits zero-copy to be used with net BPF readers. Use with caution.

net.bpf.maxinsns: 512

Maximum number of instructions that a BPF program can contain. Use the tcpdump(1) -d option to determine the approximate number of instructions for any filter.

net.bpf.maxbufsize: 524288

Maximum buffer size to allocate for the packet buffer.

net.bpf.bufsize: 4096

Default buffer size to allocate for the packet buffer.

EXAMPLES

The following filter is taken from the Reverse ARP Daemon. It accepts only Reverse ARP requests.

struct bpf_insn insns[] = { 
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12), 
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3), 
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20), 
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1), 
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) + 
		 sizeof(struct ether_header)), 
	BPF_STMT(BPF_RET+BPF_K, 0), 
};

This filter accepts only IP packets between host 128.3.112.15 and 128.3.112.35.

struct bpf_insn insns[] = { 
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12), 
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8), 
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26), 
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2), 
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30), 
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4), 
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3), 
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30), 
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1), 
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1), 
	BPF_STMT(BPF_RET+BPF_K, 0), 
};
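The hexadecimal constants in this filter are the two host addresses written as 32-bit integers, most significant octet first. Offsets 26 and 30 are the IPv4 source and destination address fields once the 14-byte Ethernet header is counted. A small hypothetical helper, ex_ipv4_k, shows the conversion:

```c
#include <stdint.h>

/* Pack dotted-quad octets a.b.c.d into the 32-bit value compared against A. */
static uint32_t ex_ipv4_k(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	    ((uint32_t)c << 8) | (uint32_t)d;
}
```

Here ex_ipv4_k(128, 3, 112, 15) gives 0x8003700f and ex_ipv4_k(128, 3, 112, 35) gives 0x80037023, matching the constants in the filter.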

Finally, this filter returns only TCP finger packets. We must parse the IP header to reach the TCP header. The BPF_JSET instruction checks that the IP fragment offset is 0 so we are sure that we have a TCP header.

struct bpf_insn insns[] = { 
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12), 
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10), 
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23), 
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8), 
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20), 
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0), 
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14), 
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14), 
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0), 
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16), 
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1), 
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1), 
	BPF_STMT(BPF_RET+BPF_K, 0), 
};
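To trace how such a filter executes, here is a deliberately small interpreter that handles only the instructions the finger filter uses. It is a sketch under stated assumptions, not the kernel's implementation: struct ex_insn, ex_run, and the EX_ opcode values are local to this example (the numbers are assumed to match <net/bpf.h>), ETHERTYPE_IP and IPPROTO_TCP are written as their usual values 0x0800 and 6, and any out-of-range load simply rejects the packet.

```c
#include <stddef.h>
#include <stdint.h>

struct ex_insn { uint16_t code; uint8_t jt, jf; uint32_t k; };

/* Assumed opcode values; EX_ names are local to this sketch. */
enum {
	EX_LDH_ABS = 0x28,	/* BPF_LD+BPF_H+BPF_ABS */
	EX_LDB_ABS = 0x30,	/* BPF_LD+BPF_B+BPF_ABS */
	EX_LDH_IND = 0x48,	/* BPF_LD+BPF_H+BPF_IND */
	EX_LDX_MSH = 0xb1,	/* BPF_LDX+BPF_B+BPF_MSH */
	EX_JEQ_K   = 0x15,	/* BPF_JMP+BPF_JEQ+BPF_K */
	EX_JSET_K  = 0x45,	/* BPF_JMP+BPF_JSET+BPF_K */
	EX_RET_K   = 0x06	/* BPF_RET+BPF_K */
};

/* Run prog over pkt; returns the accept length, or 0 on any fault. */
static uint32_t ex_run(const struct ex_insn *prog, size_t n,
    const uint8_t *pkt, size_t len)
{
	uint32_t a = 0, x = 0;
	size_t pc = 0;

	while (pc < n) {
		const struct ex_insn *in = &prog[pc++];	/* pc now points past in */
		uint64_t off;

		switch (in->code) {
		case EX_LDB_ABS:
			if (in->k >= len)
				return 0;
			a = pkt[in->k];
			break;
		case EX_LDH_ABS:
		case EX_LDH_IND:
			off = (uint64_t)in->k + (in->code == EX_LDH_IND ? x : 0);
			if (off + 2 > len)
				return 0;
			a = ((uint32_t)pkt[off] << 8) | pkt[off + 1];
			break;
		case EX_LDX_MSH:
			if (in->k >= len)
				return 0;
			x = 4u * (pkt[in->k] & 0x0fu);
			break;
		case EX_JEQ_K:
			pc += (a == in->k) ? in->jt : in->jf;
			break;
		case EX_JSET_K:
			pc += (a & in->k) ? in->jt : in->jf;
			break;
		case EX_RET_K:
			return in->k;
		default:
			return 0;	/* opcode outside this sketch's subset */
		}
	}
	return 0;	/* ran off the end without a return */
}

/* The finger filter above, with symbolic constants written numerically. */
static const struct ex_insn ex_finger[] = {
	{ EX_LDH_ABS, 0, 0, 12 },
	{ EX_JEQ_K, 0, 10, 0x0800 },
	{ EX_LDB_ABS, 0, 0, 23 },
	{ EX_JEQ_K, 0, 8, 6 },
	{ EX_LDH_ABS, 0, 0, 20 },
	{ EX_JSET_K, 6, 0, 0x1fff },
	{ EX_LDX_MSH, 0, 0, 14 },
	{ EX_LDH_IND, 0, 0, 14 },
	{ EX_JEQ_K, 2, 0, 79 },
	{ EX_LDH_IND, 0, 0, 16 },
	{ EX_JEQ_K, 0, 1, 79 },
	{ EX_RET_K, 0, 0, 0xffffffffu },
	{ EX_RET_K, 0, 0, 0 },
};
```

Feeding ex_run a 54-byte frame with EtherType 0x0800, first IP byte 0x45, protocol 6, a zero fragment offset, and destination port 79 (bytes 36 and 37) reaches the accepting return. Changing the port, or setting a non-zero fragment offset, lands on the final return of 0.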