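Introduction
With only a single NIC, that card cannot be taken over by VPP (DPDK) and used by the operating system at the same time. The usual workaround is to create a virtual machine and run VPP (DPDK) inside it on top of a virtio virtual NIC. I only recently came across SR-IOV, which virtualizes the NIC at the PCI level: no virtual machine is needed, the virtual NIC can be taken over by DPDK directly, and the original physical NIC keeps working normally.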
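SR-IOV overview
SR-IOV stands for Single Root Input/Output Virtualization, a specification and standard for hardware virtualization.
The SR-IOV standard virtualizes one PCIe network controller into multiple PCIe devices, that is, multiple PCI virtual NICs. These virtual NICs can be given to virtual machines, used directly by the operating system, or handed to DPDK running on the physical host.
Advantages of SR-IOV: high efficiency and speed. Drawback: it needs specific hardware support; the motherboard, CPU, and NIC must all support the specification.
PF (Physical Function): the physical NIC
VF (Virtual Function): the virtual NIC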
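Before changing the BIOS or boot parameters, it can help to confirm that the NIC and its driver expose SR-IOV at all. The following is a minimal sketch, assuming the standard Linux sysfs attribute sriov_totalvfs; the function name and the optional sysfs-root argument are my own additions (the second argument exists only so the function can be tried against a mock directory).

```shell
# Print the maximum number of VFs a PF reports through sysfs.
# $1: PF interface name, e.g. enp5s0f1
# $2: optional sysfs root (hypothetical helper argument, defaults to /sys)
sriov_total_vfs() {
    f="${2:-/sys}/class/net/$1/device/sriov_totalvfs"
    if [ -r "$f" ]; then
        cat "$f"
    else
        # No such attribute: the device or its driver does not expose SR-IOV.
        echo 0
    fi
}

# Usage (not executed here):
#   sriov_total_vfs enp5s0f1
```

A result of 0 means the PF cannot create VFs in its current state, so the remaining steps would not apply to that port.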
Supported NICs
- Intel® Ethernet Network Adapter X722 Series
- Intel® Ethernet Network Adapter X722-DA2
- Intel® Ethernet Network Adapter X722-DA4
- Intel® Ethernet Converged Network Adapter XL710 Series
- Intel® Ethernet Converged Network Adapter XL710-QDA1
- Intel® Ethernet Converged Network Adapter XL710-QDA2
- Intel® Ethernet Converged Network Adapter XL710-QDA1 for OCP
- Intel® Ethernet Converged Network Adapter XL710-QDA2 for OCP
- Intel® Ethernet Network Adapter XXV710 Series
- Intel® Ethernet Network Adapter XXV710-DA1
- Intel® Ethernet Network Adapter XXV710-DA2
- Intel® Ethernet Network Adapter XXV710-DA1 for OCP
- Intel® Ethernet Network Adapter XXV710-DA2 for OCP
- Intel® Ethernet Converged Network Adapter X710 Series
- Intel® Ethernet Converged Network Adapter X710-DA2
- Intel® Ethernet Converged Network Adapter X710-DA4
- Intel® Ethernet Converged Network Adapter X710-T4
- Intel® Ethernet Controller X710/X557-AT 10GBASE-T
- Intel® Ethernet Connection X722
- Intel® Ethernet Connection X722 for 10GBASE-T
- Intel® Ethernet Connection X722 for 10GbE backplane
- Intel® Ethernet Connection X722 for 10GbE QSFP+
- Intel® Ethernet Connection X722 for 10GbE SFP+
- Intel® Ethernet Converged Network Adapter X550
- Intel® Ethernet Converged Network Adapter X550-T1
- Intel® Ethernet Converged Network Adapter X550-T2
- Intel® Ethernet Converged Network Adapter X540
- Intel® Ethernet Converged Network Adapter X540-T1
- Intel® Ethernet Converged Network Adapter X540-T2
- Intel® 82599 10 Gigabit Ethernet Controller
- Intel® Ethernet 82599EB 10 Gigabit Ethernet Controller
- Intel® Ethernet 82599ES 10 Gigabit Ethernet Controller
- Intel® Ethernet 82599EN 10 Gigabit Ethernet Controller
- Intel® Ethernet Converged Network Adapter X520
- Intel® Ethernet Converged Network Adapter X520-DA2
- Intel® Ethernet Converged Network Adapter X520-SR1
- Intel® Ethernet Converged Network Adapter X520-SR2
- Intel® Ethernet Converged Network Adapter X520-LR1
- Intel® Ethernet Converged Network Adapter X520-T2
- Intel® Ethernet Controller I350
- Intel® Ethernet Controller I350-AM4
- Intel® Ethernet Controller I350-AM2
- Intel® Ethernet Controller I350-BT2
- Intel® Ethernet Server Adapter I350
- Intel® Ethernet Server Adapter I350-T2
- Intel® Ethernet Server Adapter I350-T4
- Intel® Ethernet Server Adapter I350-F2
- Intel® Ethernet Server Adapter I350-F4
Configuration steps
1. Enable SR-IOV in the BIOS
2. Modify the kernel boot parameters
Add intel_iommu=on iommu=pt igb.max_vfs=1 to the kernel command line, then reboot the machine.
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=vg00/Root rhgb quiet intel_iommu=on iommu=pt igb.max_vfs=1"
igb.max_vfs sets the number of VFs created for each NIC driven by igb. For other drivers, change the parameter name accordingly, for example ixgbe.max_vfs.
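After the reboot it is worth checking that the parameters really reached the kernel. A small sketch, assuming the running command line is read from /proc/cmdline; the function name and the optional file argument are hypothetical and only there to make the check easy to try on a sample file.

```shell
# Succeed if the exact parameter appears as a whole word on the kernel
# command line.
# $1: parameter to look for, e.g. intel_iommu=on
# $2: optional file to read instead of /proc/cmdline (hypothetical helper argument)
has_kernel_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Usage (not executed here):
#   for p in intel_iommu=on iommu=pt igb.max_vfs=1; do
#       has_kernel_param "$p" || echo "missing boot parameter: $p"
#   done
```

On the RHEL-style system that the GRUB line above suggests, the GRUB configuration usually has to be regenerated (for example with grub2-mkconfig) before the reboot, otherwise the edited GRUB_CMDLINE_LINUX is not picked up.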
To find out which driver a NIC uses, run the DPDK tool dpdk-devbind.py and choose the boot parameter that matches that driver.
[root@localhost ~]# dpdk-devbind.py -s
Network devices using DPDK-compatible driver
============================================
0000:05:10.0 'I350 Ethernet Controller Virtual Function' drv=vfio-pci unused=igbvf,uio_pci_generic
0000:05:10.1 'I350 Ethernet Controller Virtual Function' drv=vfio-pci unused=igbvf,uio_pci_generic
Network devices using kernel driver
===================================
0000:01:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=enp1s0 drv=ixgbe unused=vfio-pci,uio_pci_generic
0000:05:00.0 'I350 Gigabit Network Connection' if=enp5s0f0 drv=igb unused=vfio-pci,uio_pci_generic
0000:05:00.1 'I350 Gigabit Network Connection' if=enp5s0f1 drv=igb unused=vfio-pci,uio_pci_generic
0000:0a:00.0 'I210 Gigabit Network Connection' if=eno1 drv=igb unused=vfio-pci,uio_pci_generic
0000:0b:00.0 'I210 Gigabit Network Connection' if=eno2 drv=igb unused=vfio-pci,uio_pci_generic
Other network devices
=====================
<none>
Crypto devices using DPDK-compatible driver
===========================================
<none>
Crypto devices using kernel driver
==================================
<none>
Other crypto devices
====================
<none>
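If dpdk-devbind.py is not at hand, the same driver information can be read from sysfs. This is a sketch under the assumption that each PCI device directory carries a driver symlink when a driver is bound; pci_driver and its optional sysfs-root argument are names I made up for illustration.

```shell
# Print the kernel driver currently bound to a PCI device, or "none".
# $1: PCI address, e.g. 0000:05:00.1
# $2: optional sysfs root (hypothetical helper argument, defaults to /sys)
pci_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/igb
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Usage (not executed here):
#   pci_driver 0000:05:00.1
```

For the PF at 0000:05:00.1 in the listing above this should report igb, which is why igb.max_vfs is the parameter used in this article.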
3. Set the VF MAC address and permissions
ip link set enp5s0f1 vf 0 mac a0:36:9f:aa:64:9d
ip link set dev enp5s0f1 vf 0 trust on
ip link set dev enp5s0f1 vf 0 spoof off
ip link set enp5s0f1 allmulticast on
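There is a pitfall here: by default a VF rejects multicast packets. As a result, IPv6 neighbor solicitation packets are filtered out and the VF cannot be discovered by its neighbors. After digging through a lot of material and swapping drivers several times without success, I finally found that the VF only receives all multicast packets once allmulticast is enabled on the PF, at which point IPv6 ND works as well.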
[root@localhost ~]# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:1b:21:bb:3d:00 brd ff:ff:ff:ff:ff:ff
3: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether a0:36:9f:09:63:9c brd ff:ff:ff:ff:ff:ff
vf 0 MAC b2:6e:0e:68:0f:2b, spoof checking on, link-state auto, trust off
4: enp5s0f1: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether a0:36:9f:09:63:9d brd ff:ff:ff:ff:ff:ff
vf 0 MAC a0:36:9f:aa:64:9d, spoof checking off, link-state auto, trust on
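These settings do not survive a reboot, so it is convenient to keep them in one function. The sketch below repeats the four commands from this step; the function name and the RUN variable are hypothetical (set RUN=echo to print the commands instead of executing them). It spells the keyword out as spoofchk, which is the documented iproute2 name that the shorter spoof used above abbreviates.

```shell
# Apply the VF settings from this step to one VF of a PF.
# $1: PF interface, $2: VF index, $3: MAC address for the VF
# RUN: hypothetical dry-run hook; empty executes, RUN=echo only prints.
setup_vf() {
    pf=$1
    vf=$2
    mac=$3
    ${RUN:-} ip link set "$pf" vf "$vf" mac "$mac"
    ${RUN:-} ip link set dev "$pf" vf "$vf" trust on
    ${RUN:-} ip link set dev "$pf" vf "$vf" spoofchk off
    # Allmulticast goes on the PF; without it the VF misses IPv6 ND multicast.
    ${RUN:-} ip link set "$pf" allmulticast on
}

# Usage (not executed here):
#   setup_vf enp5s0f1 0 a0:36:9f:aa:64:9d
```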
4. Switch the NIC driver
Of the drivers DPDK can use, only igb_uio and vfio-pci support SR-IOV; uio_pci_generic cannot take over a VF. vfio-pci is used here.
modprobe vfio-pci
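vfio-pci normally relies on the device sitting in an IOMMU group, which is what the intel_iommu=on boot parameter is for. As a rough check, assuming the usual iommu_group symlink in the device's sysfs directory, something like the following can confirm it for a VF; iommu_group_of and its optional sysfs-root argument are illustrative names, not part of any tool.

```shell
# Print the IOMMU group number of a PCI device; fail if it has none.
# $1: PCI address, e.g. 0000:05:10.1
# $2: optional sysfs root (hypothetical helper argument, defaults to /sys)
iommu_group_of() {
    link="${2:-/sys}/bus/pci/devices/$1/iommu_group"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        return 1
    fi
}

# Usage (not executed here):
#   iommu_group_of 0000:05:10.1 || echo "no IOMMU group, check intel_iommu=on"
```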
Add uio-driver vfio-pci to the dpdk section of /etc/vpp/startup.conf:
dpdk {
## Change default settings for all interfaces
# dev default {
## Number of receive queues, enables RSS
## Default is 1
# num-rx-queues 3
## Number of transmit queues, Default is equal
## to number of worker threads or 1 if no worker threads
# num-tx-queues 3
## Number of descriptors in transmit and receive rings
## increasing or reducing number can impact performance
## Default is 1024 for both rx and tx
# num-rx-desc 512
# num-tx-desc 512
## VLAN strip offload mode for interface
## Default is off
# vlan-strip-offload on
# }
## Whitelist specific interface by specifying PCI address
# dev 0000:02:00.0
## Blacklist specific device type by specifying PCI vendor:device
## Whitelist entries take precedence
# blacklist 8086:10fb
## Set interface name
# dev 0000:02:00.1 {
# name eth0
# }
## Whitelist specific interface by specifying PCI address and in
## addition specify custom parameters for this interface
# dev 0000:02:00.1 {
# num-rx-queues 2
# }
## Change UIO driver used by VPP, Options are: igb_uio, vfio-pci,
## uio_pci_generic or auto (default)
uio-driver vfio-pci
}
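Stripped of the commented defaults, the part of the dpdk section that matters here is very small. The sketch below prints such a stanza for a list of VF PCI addresses, using only the dev and uio-driver keywords that appear in the excerpt above; the function name is hypothetical, and whether to list VFs explicitly with dev lines is my assumption rather than something the original configuration shows.

```shell
# Print a minimal dpdk stanza that names the given PCI devices and
# selects vfio-pci as the UIO driver.
# $@: PCI addresses of the VFs, e.g. 0000:05:10.0 0000:05:10.1
dpdk_stanza() {
    echo "dpdk {"
    for addr in "$@"; do
        echo "  dev $addr"
    done
    echo "  uio-driver vfio-pci"
    echo "}"
}

# Usage (not executed here):
#   dpdk_stanza 0000:05:10.0 0000:05:10.1
```

The output is meant to be reviewed and merged into /etc/vpp/startup.conf by hand rather than appended blindly, since the file should contain only one dpdk section.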
5. Start VPP
systemctl start vpp
[root@localhost system]# vppctl
_______ _ _ _____ ___
__/ __/ _ \ (_)__ | | / / _ \/ _ \
_/ _// // / / / _ \ | |/ / ___/ ___/
/_/ /____(_)_/\___/ |___/_/ /_/
vpp#
vpp# show int
Name Idx State MTU (L3/IP4/IP6/MPLS) Counter Count
VirtualFunctionEthernet5/10/0 1 down 9000/0/0/0
VirtualFunctionEthernet5/10/1 2 up 9000/0/0/0 rx packets 12
rx bytes 946
tx packets 4
tx bytes 364
drops 12
ip4 9
local0 0 down 0/0/0/0 drops 1
vpp#
vpp#
vpp# show int addr
VirtualFunctionEthernet5/10/0 (dn):
VirtualFunctionEthernet5/10/1 (up):
L3 10.2.1.96/24
L3 240e:ff:e000:8::96/64
local0 (dn):
vpp#
[root@localhost ~]# ping6 240e:ff:e000:8::96
PING 240e:ff:e000:8::96(240e:ff:e000:8::96) 56 data bytes
64 bytes from 240e:ff:e000:8::96: icmp_seq=1 ttl=63 time=0.215 ms
64 bytes from 240e:ff:e000:8::96: icmp_seq=2 ttl=63 time=0.111 ms
64 bytes from 240e:ff:e000:8::96: icmp_seq=3 ttl=63 time=0.110 ms
64 bytes from 240e:ff:e000:8::96: icmp_seq=4 ttl=63 time=0.110 ms
^C
--- 240e:ff:e000:8::96 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2999ms
rtt min/avg/max/mdev = 0.110/0.136/0.215/0.046 ms
[root@localhost ~]#
Other machines on the LAN can find the VF through neighbor discovery and ping it successfully.
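The transcript shows the VF interface already up and addressed but not the commands that got it there. One plausible way, sketched with the standard VPP CLI commands set interface state and set interface ip address, is shown below; the function name and the VPPCTL override are hypothetical (VPPCTL="echo vppctl" prints the commands instead of running them).

```shell
# Bring a VPP interface up and assign one or more addresses to it.
# $1: VPP interface name, remaining arguments: addresses in CIDR form
# VPPCTL: hypothetical override for the vppctl command, used for dry runs.
vpp_if_setup() {
    ifname=$1
    shift
    ${VPPCTL:-vppctl} set interface state "$ifname" up
    for addr in "$@"; do
        ${VPPCTL:-vppctl} set interface ip address "$ifname" "$addr"
    done
}

# Usage (not executed here):
#   vpp_if_setup VirtualFunctionEthernet5/10/1 10.2.1.96/24 240e:ff:e000:8::96/64
```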
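Summary
There are plenty of examples online of using SR-IOV to provide virtual NICs to VMs, but I could not find one where the host itself uses the VF directly. So I am sharing this configuration and testing process here, which also serves as a summary of the pitfalls I ran into.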