Linux Kernel Packet Generator pktgen: Test Plan and Notes

Introduction

pktgen is a high-performance packet generator included in the Linux kernel, used mainly for network performance testing. In most cases pktgen is sufficient for testing gigabit NICs. Because it runs in kernel space, it can reach very high transmit rates without consuming much in the way of system resources.

pktgen only generates UDP packets (port 9, the discard port). It is a very low-level tool, normally used to test the performance of network devices; it does not touch the application layer. To benchmark higher-level network applications, use other tools.

An advantage of pktgen is that it selects the transmit port by destination MAC address rather than by routing. This makes it useful for measuring the throughput of optical modules / SFP+ cables, and for comparing NIC performance (different NICs in an otherwise identical server configuration).

This test applies the pktgen_rx patch on top of the kernel's stock pktgen module, adding receive-side statistics.

Installation

The Linux kernel ships with the pktgen module, but without RX statistics. For the RX functionality, download the pktgen_rx.tgz patch from the following location:

[screenshot: patch download location]

Test environment:

Machine: Dell R720

CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz (20 cores / 40 threads)

Kernel: 2.6.37

Download pktgen_rx.tgz from the address above, extract it, and enter the pktgen directory.

[screenshot: extracting pktgen_rx.tgz]

Copy the kernel source to /usr/src/kernels/, then build.

[screenshot: build output]

The compiler reports a problem in one function; modify the code as follows.

[screenshot: original code]

Change it to:

[screenshot: fixed code]

Build again:

[screenshot: successful build]

The module can then be loaded directly and is ready to use.

[screenshot: loading pktgen.ko]
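The load step can be sketched as follows (run as root on the test machine; the module path is an assumption based on the build directory above):

```shell
# Unload any stock pktgen first, then load the patched module from the
# build directory (path assumed from the steps above).
rmmod pktgen 2>/dev/null
insmod ./pktgen.ko

# The /proc interface should now contain pgctrl, pgrx, and one
# kpktgend_N thread file per CPU.
ls /proc/net/pktgen/
```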

Testing

Test topology

[diagram: test topology]

Shell script that transmits on eth6 and receives on eth7:

pktgen.sh

#!/bin/bash

# pktgen.conf -- Sample configuration for send on two devices on a UP system

#modprobe pktgen

function pgset() {

local result

echo $1 > $PGDEV

result=`cat $PGDEV | fgrep "Result: OK:"`

if [ "$result" = "" ]; then

cat $PGDEV | fgrep Result:

fi

}

function pg() {

echo inject > $PGDEV

cat $PGDEV

}

#rx config

# Disable autonegotiation of flow control (pause frames) on the interface

/sbin/ethtool -A $1 autoneg off rx off tx off

# Reception configuration (RX statistics)

PGDEV=/proc/net/pktgen/pgrx

echo "Removing old config"

pgset "rx_reset"

echo "Adding rx $1"

pgset "rx $1"

echo "Setting statistics $2(counters/basic/time)"

pgset "statistics $2"

pgset "display $3"

# pgset "display script/human"

# Results can be viewed in /proc/net/pktgen/pgrx

#cat /proc/net/pktgen/pgrx

# We use eth6

echo "Adding devices to run"

PGDEV=/proc/net/pktgen/kpktgend_0

pgset "rem_device_all"

pgset "add_device eth6"  # change the device to match your test setup

pgset "max_before_softirq 10000"

# Configure the individual devices

echo "Configuring devices"

PGDEV=/proc/net/pktgen/eth6   # change the device to match your test setup

pgset "clone_skb 10000"

pgset "pkt_size 1514"  # adjust the packet size for your test

pgset "dst_mac 00:16:31:F0:84:D1"  # change to the receiving NIC's MAC

pgset "count 0"   # count 0 = transmit indefinitely

# Time to run

PGDEV=/proc/net/pktgen/pgctrl

echo "Running... ctrl^C to stop"

pgset "start"

echo "Done"

When running the script, set $2 to time; jitter and latency statistics can then be read from /proc/net/pktgen/pgrx, as shown below.

[screenshot: pgrx jitter and latency output]

Test screenshots:

Running the pktgen script:

[screenshot: pktgen script running]

Traffic during the test:

[screenshot: traffic during the test]

Comparing packets sent on eth6 with packets received on eth7:

[screenshot: eth6 TX counters]

[screenshot: eth7 RX counters]

The highlighted counters show that the received packet count equals the transmitted count, and throughput reaches roughly 9.8 Gbit/s or more. The packet size can of course be changed; if packet loss occurs, rerun the test a few times to confirm.

One conclusion follows from these tests: the faster the CPU, the more packets per second it can emit and the closer it comes to line rate. In this environment the transmit rate reached roughly 4 Mpps.
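For context, the 4 Mpps figure can be compared against the theoretical line rate. On the wire, each Ethernet frame carries 20 extra bytes beyond the frame itself (8-byte preamble plus 12-byte inter-frame gap), which gives a rough upper bound in packets per second:

```shell
# Theoretical max packet rate for a given link speed and frame size.
# $1 = link speed in bit/s, $2 = frame size in bytes including the 4-byte CRC
pps_limit() {
    echo $(( $1 / (($2 + 20) * 8) ))
}

pps_limit 10000000000 64     # minimum-size frames on 10GbE -> 14880952 pps
pps_limit 10000000000 1518   # pkt_size 1514 + 4-byte CRC on 10GbE -> 812743 pps
```

So 10GbE line rate at 64-byte frames is about 14.88 Mpps, well above the ~4 Mpps measured here, while at pkt_size 1514 only about 0.81 Mpps is needed to saturate the link (matching the ~9.8 Gbit/s seen above).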

Shell script that transmits from eth6 to eth7 and from eth7 to eth6 at the same time:

pktgen_eth6_eth7.sh

#! /bin/sh

#modprobe pktgen

pgset() {

local result

echo $1 > $PGDEV

result=`cat $PGDEV | fgrep "Result: OK:"`

if [ "$result" = "" ]; then

cat $PGDEV | fgrep Result:

fi

}

pg() {

echo inject > $PGDEV

cat $PGDEV

}

# Config Start Here -----------------------------------------------------------

# thread config

# Each CPU has its own thread. Two-CPU example: we add eth6 and eth7 respectively.

/sbin/ethtool -A $1 autoneg off rx off tx off

# Reception configuration

PGDEV=/proc/net/pktgen/pgrx

echo "Removing old config"

pgset "rx_reset"

echo "Adding rx $1"

pgset "rx $1"

echo "Setting statistics $2"

pgset "statistics $2"

pgset "display human"

# pgset "display script"

PGDEV=/proc/net/pktgen/kpktgend_0

echo "Removing all devices"

pgset "rem_device_all"

echo "Adding eth6"

pgset "add_device eth6"

echo "Setting max_before_softirq 10000"

pgset "max_before_softirq 10000"

PGDEV=/proc/net/pktgen/kpktgend_1

echo "Removing all devices"

pgset "rem_device_all"

echo "Adding eth7"

pgset "add_device eth7"

echo "Setting max_before_softirq 10000"

pgset "max_before_softirq 10000"

# device config

# delay 0 means maximum speed.

CLONE_SKB="clone_skb 1000000"

# NIC adds 4 bytes CRC

#COUNT="count 0"

PKT_SIZE="pkt_size 1514"

COUNT="count 0"

DELAY="delay 0"

PGDEV=/proc/net/pktgen/eth6

echo "Configuring $PGDEV"

pgset "$COUNT"

pgset "$CLONE_SKB"

pgset "$PKT_SIZE"

pgset "$DELAY"

#pgset "src_min 100.1.1.2"

#pgset "src_max 100.1.1.254"

pgset "dst 200.1.1.2"

pgset "dst_mac 00:16:31:F0:84:D1"

PGDEV=/proc/net/pktgen/eth7

echo "Configuring $PGDEV"

pgset "$COUNT"

pgset "$CLONE_SKB"

pgset "$PKT_SIZE"

pgset "$DELAY"

#pgset "src_min 200.1.1.2"

#pgset "src_max 200.1.1.254"

pgset "dst 100.1.1.2"

pgset "dst_mac 00:16:31:F0:84:D0"

# Time to run

PGDEV=/proc/net/pktgen/pgctrl

echo "Running... ctrl^C to stop"

pgset "start"

echo "Done"

pktgen_eth6_eth7.sh can collect statistics on just one of the ports: ./pktgen_eth6_eth7.sh eth6 counters (choosing counters, basic, or time made no difference here; the reason is unclear).

Statistics can also be collected on both ports by running ./pktgen_eth6_eth7.sh directly.

Note: if pktgen_eth6_eth7.sh is run first and pktgen.sh afterwards, only eth6 should transmit, yet both eth6 and eth7 end up transmitting. The workaround is to unload pktgen.ko and then load it again.

In both tests above, small-packet transmit tops out at around 4 Mpps. To push the packet rate higher, spread the work across multiple cores and threads, as in the script below (still transmitting on eth6 and receiving on eth7):

pktgen_multicore.sh

#!/bin/bash

# $1 Rate in packets per s

# $2 Number of CPUs to use

function pgset() {

local result

echo $1 > $PGDEV

}

# Reception configuration

PGDEV=/proc/net/pktgen/pgrx

echo "Removing old config"

pgset "rx_reset"

echo "Adding rx eth7"

pgset "rx eth7"

echo "Setting statistics counters"

pgset "statistics counters"

pgset "display human"

# pgset "display script"

# Results can be viewed in /proc/net/pktgen/pgrx

#cat /proc/net/pktgen/pgrx

# Config Start Here -----------------------------------------------------------

# thread config

CPUS=$2

#PKTS=`echo "scale=0; $3/$CPUS" | bc`

CLONE_SKB="clone_skb 10000"

PKT_SIZE="pkt_size 60"

COUNT="count 0"

DELAY="delay 0"

MAC="00:16:31:F0:84:D1"

ETH="eth6"

RATEP=`echo "scale=0; $1/$CPUS" | bc`

for processor in {0..14}   # kpktgend_0 through kpktgend_14

do

PGDEV=/proc/net/pktgen/kpktgend_$processor

#  echo "Removing all devices"

pgset "rem_device_all"

done

for ((processor=0; processor<CPUS; processor++))

do

PGDEV=/proc/net/pktgen/kpktgend_$processor

#  echo "Adding $ETH"

pgset "add_device $ETH@$processor"

PGDEV=/proc/net/pktgen/$ETH@$processor

#  echo "Configuring $PGDEV"

pgset "$COUNT"

pgset "flag QUEUE_MAP_CPU"

pgset "$CLONE_SKB"

pgset "$PKT_SIZE"

#pgset "$DELAY"

pgset "ratep $RATEP"

#pgset "dst 10.0.0.1"

pgset "dst_mac $MAC"

#Random address with in the min-max range

#pgset "flag IPDST_RND"

pgset "src_min 1.0.0.0"

pgset "src_max 100.255.255.255"

#enable configuration packet

#pgset "config 1"    # config [0 or 1] enables or disables the configuration packet, which resets the statistics and allows losses to be calculated.

#pgset "flows 1024"

#pgset "flowlen 8"

done

# Time to run

PGDEV=/proc/net/pktgen/pgctrl

echo "Running... ctrl^C to stop"

pgset "start"

echo "Done"

At the same time, bind the NIC's multiple queues to CPUs via IRQ affinity, as follows.

Bind eth6's queue IRQs to CPUs 0-19:

[screenshot: eth6 interrupt numbers]

[root@localhost pktgen]# echo 1 > /proc/irq/122/smp_affinity

[root@localhost pktgen]# echo 2 > /proc/irq/123/smp_affinity

[root@localhost pktgen]# echo 4 > /proc/irq/124/smp_affinity

[root@localhost pktgen]# echo 8 > /proc/irq/125/smp_affinity

[root@localhost pktgen]# echo 10 > /proc/irq/126/smp_affinity

[root@localhost pktgen]# echo 20 > /proc/irq/127/smp_affinity

[root@localhost pktgen]# echo 40 > /proc/irq/128/smp_affinity

[root@localhost pktgen]# echo 80 > /proc/irq/129/smp_affinity

[root@localhost pktgen]# echo 100 > /proc/irq/130/smp_affinity

[root@localhost pktgen]# echo 200 > /proc/irq/131/smp_affinity

[root@localhost pktgen]# echo 400 > /proc/irq/132/smp_affinity

[root@localhost pktgen]# echo 800 > /proc/irq/133/smp_affinity

[root@localhost pktgen]# echo 1000 >/proc/irq/134/smp_affinity

[root@localhost pktgen]# echo 2000 >/proc/irq/135/smp_affinity

[root@localhost pktgen]# echo 4000 >/proc/irq/136/smp_affinity

[root@localhost pktgen]# echo 8000 >/proc/irq/137/smp_affinity

[root@localhost pktgen]# echo 10000 >/proc/irq/138/smp_affinity

[root@localhost pktgen]# echo 20000 >/proc/irq/139/smp_affinity

[root@localhost pktgen]# echo 40000 >/proc/irq/140/smp_affinity

[root@localhost pktgen]# echo 80000 >/proc/irq/141/smp_affinity

Bind eth7's queue IRQs to CPUs 0-19:

[screenshot: eth7 interrupt numbers]

[root@localhost pktgen]# echo 1 > /proc/irq/143/smp_affinity

[root@localhost pktgen]# echo 2 > /proc/irq/144/smp_affinity

[root@localhost pktgen]# echo 4 > /proc/irq/145/smp_affinity

[root@localhost pktgen]# echo 8 > /proc/irq/146/smp_affinity

[root@localhost pktgen]# echo 10 > /proc/irq/147/smp_affinity

[root@localhost pktgen]# echo 20 > /proc/irq/148/smp_affinity

[root@localhost pktgen]# echo 40 > /proc/irq/149/smp_affinity

[root@localhost pktgen]# echo 80 > /proc/irq/150/smp_affinity

[root@localhost pktgen]# echo 100 > /proc/irq/151/smp_affinity

[root@localhost pktgen]# echo 200 > /proc/irq/152/smp_affinity

[root@localhost pktgen]# echo 400 > /proc/irq/153/smp_affinity

[root@localhost pktgen]# echo 800 > /proc/irq/154/smp_affinity

[root@localhost pktgen]# echo 1000 >/proc/irq/155/smp_affinity

[root@localhost pktgen]# echo 2000 >/proc/irq/156/smp_affinity

[root@localhost pktgen]# echo 4000 >/proc/irq/157/smp_affinity

[root@localhost pktgen]# echo 8000 >/proc/irq/158/smp_affinity

[root@localhost pktgen]# echo 10000 >/proc/irq/159/smp_affinity

[root@localhost pktgen]# echo 20000 >/proc/irq/160/smp_affinity

[root@localhost pktgen]# echo 40000 >/proc/irq/161/smp_affinity

[root@localhost pktgen]# echo 80000 >/proc/irq/162/smp_affinity
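The repetitive echo commands above can be scripted. The sketch below prints the commands as a dry run for eth6's IRQs (numbers 122-141 come from this machine; check /proc/interrupts for yours); remove the outer echo to apply them:

```shell
# One queue IRQ per CPU: CPU n gets bitmask 2^n, written as the hex
# string /proc/irq/*/smp_affinity expects.
cpu_mask() { printf '%x' $((1 << $1)); }

cpu=0
for irq in $(seq 122 141); do          # eth6 queue IRQs on this box
    echo "echo $(cpu_mask $cpu) > /proc/irq/$irq/smp_affinity"
    cpu=$((cpu + 1))
done
```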

Note: NIC queue selection hashes on IP address and port, so the IP or port must vary; if they stay fixed, the per-CPU binding has no effect. This test therefore randomizes the source IP across a range.

The results show that binding the queues to CPUs greatly improves both transmit and receive rates. (A single CPU core previously received at most about 2 Mpps; with the binding this rose to roughly 9.5 Mpps, with room for further tuning.)

[screenshot: results after queue/CPU binding]

The NIC's default MTU of 1500 means it cannot receive frames larger than 1518 bytes. Raise the MTU to accept, for example, 8192-byte frames:

ip link set dev eth6 mtu 8174

ip link set dev eth7 mtu 8174

After this change, set pkt_size in the script to 8188 and eth7 receives the traffic (visible in a tcpdump capture; without the MTU change, the capture instead shows the following):

[screenshot: tcpdump capture with default MTU]
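The sizing above follows from how the bytes are counted: pkt_size covers the Ethernet header plus payload, and the NIC appends a 4-byte CRC on the wire. A quick arithmetic check:

```shell
mtu=8174
pkt_size=$((mtu + 14))      # + 14-byte Ethernet header = 8188, the value used in the script
on_wire=$((pkt_size + 4))   # + 4-byte CRC = 8192 bytes per frame on the wire
echo "pkt_size=$pkt_size on_wire=$on_wire"
```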

Appendix: configuration reference

Configuring threads and devices
================================

This is done via the /proc interface, most easily via the pgset helper used in the scripts.

Examples:

pgset "clone_skb 1"    sets the number of copies of the same packet

pgset "clone_skb 0"    use single SKB for all transmits

pgset "pkt_size 9014"   sets packet size to 9014

pgset "frags 5"      packet will consist of 5 fragments

pgset "count 200000"    sets number of packets to send, set to zero for continuous sends until explicitly stopped.

pgset "delay 5000"     adds delay to hard_start_xmit(). nanoseconds

pgset "dst 10.0.0.1"    sets IP destination address (BEWARE! This generator is very aggressive!)

pgset "dst_min 10.0.0.1"  Same as dst

pgset "dst_max 10.0.0.254" Set the maximum destination IP.

pgset "src_min 10.0.0.1"  Set the minimum (or only) source IP.

pgset "src_max 10.0.0.254" Set the maximum source IP.

pgset "dst6 fec0::1"    IPV6 destination address

pgset "src6 fec0::2"    IPV6 source address

pgset "dstmac 00:00:00:00:00:00" sets MAC destination address

pgset "srcmac 00:00:00:00:00:00" sets MAC source address

pgset "queue_map_min 0"  Sets the min value of the tx queue interval.

pgset "queue_map_max 7"  Sets the max value of the tx queue interval, for multiqueue devices. To select queue 1 of a given device, use queue_map_min=1 and queue_map_max=1.

pgset "src_mac_count 1"  Sets the number of MACs we'll range through. The 'minimum' MAC is what you set with srcmac.

pgset "dst_mac_count 1"  Sets the number of MACs we'll range through. The 'minimum' MAC is what you set with dstmac.

pgset "flag [name]"   Set a flag to determine behaviour. Current flags are:

IPSRC_RND      # IP source is random (between min/max)

IPDST_RND, UDPSRC_RND, UDPDST_RND, MACSRC_RND, MACDST_RND

MPLS_RND, VID_RND, SVID_RND

QUEUE_MAP_RND  # queue map random

QUEUE_MAP_CPU  # queue map mirrors smp_processor_id()

pgset "udp_src_min 9"   set UDP source port min. If < udp_src_max, cycle through the port range.

pgset "udp_src_max 9"   set UDP source port max.

pgset "udp_dst_min 9"   set UDP destination port min. If < udp_dst_max, cycle through the port range.

pgset "udp_dst_max 9"   set UDP destination port max.

pgset "mpls 0001000a,0002000a,0000000a" set MPLS labels (in this example outer label=16, middle label=32, inner label=0 (IPv4 NULL)). Note that there must be no spaces between the arguments, and leading zeros are required. Do not set the bottom-of-stack bit; that is done automatically. If you do set the bottom-of-stack bit, it indicates that you want that label generated randomly, and the MPLS_RND flag will be turned on. You can have any mix of random and fixed labels in the label stack.

pgset "mpls 0" turn off mpls (or any invalid argument works too!)

pgset "vlan_id 77"    set VLAN ID 0-4095

pgset "vlan_p 3"     set priority bit 0-7 (default 0)

pgset "vlan_cfi 0"    set canonical format identifier 0-1 (default 0)

pgset "svlan_id 22"   set SVLAN ID 0-4095

pgset "svlan_p 3"    set priority bit 0-7 (default 0)

pgset "svlan_cfi 0"   set canonical format identifier 0-1 (default 0)

pgset "vlan_id 9999"   > 4095 remove vlan and svlan tags

pgset "svlan 9999"    > 4095 remove svlan tag

pgset "tos XX"      set former IPv4 TOS field (e.g. "tos 28" for AF11 no ECN, default 00)

pgset "traffic_class XX" set former IPv6 TRAFFIC CLASS (e.g. "traffic_class B8" for EF no ECN, default 00)

pgset "stop"       aborts injection. Also, ^C aborts the generator.

pgset "rate 300M"     set rate to 300 Mb/s

pgset "ratep 1000000"   set rate to 1Mpps
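rate and ratep express the same throttle in different units. A rough conversion (assuming, as a simplification, that pktgen spaces packets using pkt_size bytes only, without CRC or inter-frame gap):

```shell
# Convert a bandwidth target to an approximate packet rate.
# $1 = rate in bit/s, $2 = pkt_size in bytes
rate_to_ratep() { echo $(( $1 / ($2 * 8) )); }

rate_to_ratep 300000000 1514   # "rate 300M" at pkt_size 1514 -> 24768 pps
```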

Line-rate definition

[screenshot: line-rate definition]
