Set up an OVN-based SR-IOV test env (by quqi99)

This article describes how to set up an OVN-based SR-IOV test environment, touching MAAS, Juju, DPDK and related tooling. Along the way it covers problems hit in practice - a mismatch between the port's IP and the IP the VM actually received, DHCP issues, and a flow-handling problem in ovn-controller - together with their solutions. It also looks at the different SR-IOV NIC modes and how to configure and inspect the network at each stage.

Author: Zhang Hua, published 2021-05-16
Copyright: this article may be reproduced freely, but please credit the original source and author with a hyperlink and keep this copyright notice.

The earlier articles on sr-iov and ovn are:
sr-iov: https://blog.csdn.net/quqi99/article/details/53488243
ovn: https://blog.csdn.net/quqi99/article/details/114375273

Prepare the machines

First, find a machine in MAAS that supports SR-IOV for testing.
Sometimes MAAS shows a NIC as offline, yet after deploying and logging in, 'ip addr show' reports it as up. So how do you tell whether a NIC actually has a cable plugged in in the machine room? The answer:
ethtool eno50 |grep 'Link detected'
With that, two machines (duduo and crustle) were picked for the SR-IOV test. Next, set up GRUB for them in MAAS.

# make sure the sr-iov grub configuration is present on both crustle and duduo
# Per-node kernel boot options
# https://github.com/CanonicalLtd/maas-docs/blob/master/en/nodes-kernel-options.md
# https://maas.io/docs/snap/2.9/cli/maas-tags
sudo apt-get install -y maas-cli
echo '<key-in-<MAASIP>/MAAS/r/account/prefs/api-keys>' > ~/maas-apikey
maas login admin http://10.230.56.2/MAAS `cat ~/maas-apikey`
maas admin tag delete crustle
maas admin tags create name="crustle" comment="" kernel_opts="transparent_hugepage=never hugepagesz=1G hugepages=32 default_hugepagesz=1G iommu=pt intel_iommu=on ixgbe.max_vfs=7"
$ maas admin tags read
...
        "name": "duduo",
        "kernel_opts": "console=ttyS1 transparent_hugepage=never hugepagesz=1G hugepages=32 default_hugepagesz=1G iommu=pt intel_iommu=on ixgbe.max_vfs=7",

Only after the machine is deployed will the following be visible (instead of configuring it in MAAS, you could also edit GRUB directly on the deployed machine and reboot, but then it has to be redone after every MAAS redeploy):

root@duduo:~# cat /sys/class/net/eno50/device/sriov_totalvfs 
63
root@duduo:~# cat /sys/class/net/eno50/device/sriov_numvfs 
7
root@duduo:~# ip link show eno50 |head -n3
17: eno50: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 8c:dc:d4:b3:9c:3d brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off, query_rss off
$ juju run --application ovn-chassis-a 'lspci -nn | grep "Virtual Function"' |head -n1
04:10.0 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
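If the VF count needs changing at runtime rather than via ixgbe.max_vfs on the kernel command line, it can be done through sysfs. A minimal sketch (not part of any charm; `set_numvfs`/`needs_reset` are illustrative helper names): the kernel only accepts transitions 0 -> N and N -> 0 on sriov_numvfs, so a non-zero value must be reset to 0 before writing a different one.

```shell
# does changing sriov_numvfs from $1 to $2 require writing 0 first?
# (the kernel only accepts transitions 0 -> N and N -> 0)
needs_reset() { [ "$1" -ne 0 ] && [ "$1" -ne "$2" ]; }

set_numvfs() {  # set_numvfs <nic> <count>, run as root
    sysfs=/sys/class/net/$1/device/sriov_numvfs
    [ -w "$sysfs" ] || { echo "$1: no writable $sysfs"; return 1; }
    cur=$(cat "$sysfs")
    [ "$cur" -eq "$2" ] && return 0
    if needs_reset "$cur" "$2"; then echo 0 > "$sysfs"; fi
    echo "$2" > "$sysfs"
}

# e.g. on duduo: set_numvfs eno50 7
```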

root@duduo:~# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-4.15.0-143-generic root=UUID=a2cd534b-4d84-481d-baf8-8040f3287f6c ro console=ttyS1 transparent_hugepage=never hugepagesz=1G hugepages=32 default_hugepagesz=1G iommu=pt intel_iommu=on ixgbe.max_vfs=7
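A quick sanity check (a sketch, not part of the MAAS tooling; `check_cmdline` is an illustrative helper) that the boot parameters set via the MAAS tag actually reached the running kernel:

```shell
# check_cmdline "<cmdline>" opt... - report which expected options are present
check_cmdline() {
    cmdline=$1; shift; rc=0
    for opt in "$@"; do
        case "$cmdline" in
            *"$opt"*) echo "ok: $opt" ;;
            *)        echo "MISSING: $opt"; rc=1 ;;
        esac
    done
    return $rc
}

# on a deployed node you would pass "$(cat /proc/cmdline)"; here, duduo's
# cmdline from above:
check_cmdline "console=ttyS1 transparent_hugepage=never hugepagesz=1G hugepages=32 default_hugepagesz=1G iommu=pt intel_iommu=on ixgbe.max_vfs=7" \
    intel_iommu=on iommu=pt ixgbe.max_vfs
```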

Prepare the NICs

The final NIC layout:

Note: nothing extra is needed in MAAS, just keep eno49 and eno50 there; br-eno49 is configured automatically when Juju uses LXD.
eno49  br-eno49, used for LXD and the management IP
veth-ex, a veth device connecting to br-eno49, used for br-data:veth-ex
eno50  the SR-IOV NIC

In OVN, east-west traffic is natively distributed (similar to legacy ML2+DVR), while north-south traffic can be made HA via BFD, which requires deploying some OVN L3 gateways.
The configuration below turns selected nodes into OVN L3 gateways (direct external mapping); not every node needs to be one. Chassis without a direct external mapping to an external Layer 3 network forward traffic through a tunnel to one of the chassis acting as a gateway for that network.

juju config ovn-chassis ovn-bridge-mappings=physnet1:br-data
juju config ovn-chassis bridge-interface-mappings br-data:veth-ex

和SR-IOV有关的配置则是:

juju config ovn-chassis enable-sriov=true
juju config ovn-chassis sriov-device-mappings=physnet1:eno50
juju config ovn-chassis sriov-numvfs=eno50:2
juju config nova-compute pci-passthrough-whitelist='[{"devname":"eno50", "physical_network":"physnet1"}]'
juju config neutron-api supported_pci_vendor_devs='14e4:16af 8086:1515 8086:1528'

So three NICs are really needed, but only two are available: eno50 is used for SR-IOV, and eno49 becomes br-eno49 for LXD and the management IP. To get a third NIC, create a veth pair, attach one end to br-eno49 (a Linux bridge), and use the other end on br-data.
Note (update 2022-06-07): this veth-ex cannot simply be replaced by the SR-IOV PF (see the problem in the appendix), i.e. do NOT use: juju config ovn-chassis bridge-interface-mappings="br-data:eno50"

#on each compute node
# create a veth pair between br-eno49 and veth-ex
# br-eno49 is created automatically when an LXD container is launched, e.g.: juju deploy ubuntu --to lxd:7
ip l add name veth-br-eno49 type veth peer name veth-ex
#ip l set dev veth-br-eno49 mtu 9000
#ip l set dev veth-ex mtu 9000
ip l set dev veth-br-eno49 up
ip l set dev veth-ex up
ip l set veth-br-eno49 master br-eno49
juju config ovn-chassis-a bridge-interface-mappings="br-data:veth-ex"
juju config ovn-chassis-b bridge-interface-mappings="br-data:veth-ex"
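Note that a veth pair created with `ip link` disappears on reboot. If the nodes run systemd-networkd (as MAAS-deployed Ubuntu typically does underneath netplan), one way to make the pair persistent is a `.netdev`/`.network` drop-in; the file names below are illustrative:

```ini
# /etc/systemd/network/25-veth-ex.netdev
[NetDev]
Name=veth-br-eno49
Kind=veth

[Peer]
Name=veth-ex

# /etc/systemd/network/25-veth-ex.network
[Match]
Name=veth-br-eno49

[Network]
Bridge=br-eno49

# veth-ex itself only needs to be brought up;
# ovn-chassis plugs it into br-data per bridge-interface-mappings
```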

Note: to add a fourth NIC and build a second flat network for debugging:

ip l add name veth2-br-eno49 type veth peer name veth2-ex
#ip l set dev veth2-br-eno49 mtu 9000
#ip l set dev veth2-ex mtu 9000
ip l set dev veth2-br-eno49 up
ip l set dev veth2-ex up
ip l set veth2-br-eno49 master br-eno49
juju config ovn-chassis-a ovn-bridge-mappings="physnet1:br-data physnet2:br-data2"
juju config ovn-chassis-b ovn-bridge-mappings="physnet1:br-data physnet2:br-data2"
juju config ovn-chassis-a bridge-interface-mappings="br-data:veth-ex br-data2:veth2-ex"
juju config ovn-chassis-b bridge-interface-mappings="br-data:veth-ex br-data2:veth2-ex"
juju config neutron-api flat-network-providers="physnet1 physnet2"
ovs-vsctl get open . external-ids
neutron net-create ext_net2 --provider:network_type flat --provider:physical_network physnet2 --router:external=True --shared
neutron subnet-create --name ext_net2_subnet --enable_dhcp=True --allocation_pool start=10.10.0.60,end=10.10.0.70 --gateway=10.10.0.1 ext_net2 10.10.0.0/24
neutron port-create --name port1 ext_net2
nova interface-attach --port-id $(neutron port-show port1 -c id -f value) i1
juju ssh 0 -- sudo ip netns exec ovnmeta-ea8a92b2-40b4-4e48-a470-120af8956dbb ping 10.10.0.69

Installation steps

# https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-ovn.html
./generate-bundle.sh -s bionic -r ussuri
# modify bundle to add ovn related stuffs
juju deploy ./bundles/openstack-sriov-ovn.yaml
~/stsstack-bundles/openstack/tools/vault-unseal-and-authorise.sh

cd openstack-sriov; source novarc; alias openstack='openstack --insecure'
export OS_PASSWORD=$(juju run --unit keystone/0 leader-get admin_passwd)
openstack --insecure project list
# https://blog.csdn.net/quqi99/article/details/107182847
export OS_CACERT=/etc/ssl/certs/
openstack project list
# After the commands above, keystone may still not work, failing with SSLError; surprisingly the cause was haproxy.
# Workaround at the time: log into keystone/0, stop the haproxy service (port 5000), change the apache2 port from 4990 to 5000 and restart apache2.
# That workaround was wrong: after disabling keystone's haproxy, the haproxy on glance and neutron-api had to be disabled too, and errors remained.
# The real cause was a regression introduced by this patch - https://review.opendev.org/c/openstack/charm-keystone/+/803502/2/hooks/keystone_context.py

# Create user, image, flavor, network, port, vm etc
./configure.sh
./tools/sec_groups.sh
# NOTE: before running this step, first make sure: juju config ovn-chassis-b bridge-interface-mappings="br-data:veth-ex"
./configure

# NOTE: --dhcp is required here, because OVN treats an SR-IOV port as an external port precisely so that OVN itself can provide DHCP and metadata for it.
openstack network show sriov_net || \
  openstack network create --provider-network-type flat --external --share \
                           --provider-physical-network physnet1 sriov_net
openstack subnet show sriov_subnet || \
    openstack subnet create --allocation-pool start=10.230.58.100,end=10.230.58.200 \
                            --subnet-range 10.230.56.0/21 --dhcp --gateway 10.230.56.1 \
                            --ip-version 4 --network sriov_net sriov_subnet
openstack subnet set sriov_subnet --dhcp
openstack port show sriov-port0 || \
  openstack port create --network sriov_net --vnic-type direct sriov-port0
# variant for switchdev-capable NICs (hardware offload):
#openstack port create --network sriov_net --vnic-type direct \
#  --binding-profile '{"capabilities": ["switchdev"]}' sriov-port0

cat << EOF > user-data
#cloud-config
user: ubuntu
password: password
chpasswd: { expire: False }
EOF
nova availability-zone-list #--availability-zone=duduo.segmaas.1ss
openstack keypair create --public-key ~/.ssh/id_rsa.pub mykey
openstack server create --wait --image bionic --flavor m1.small --key-name mykey --nic \
  port-id=$(openstack port show sriov-port0 -f value -c id) --user-data ./user-data --config-drive true i1

Note: there was a problem here. A port created with the 'openstack port create' above might get e.g. 10.230.58.178, while the console-log showed 10.230.63.102, so the commands below had to be used to force the IP to 10.230.63.102. The root cause was unclear at first, and the commands below work around it temporarily. (The actual fix is described in the appendix.)

neutron port-create sriov_net --name sriov-port0 --fixed-ip ip_address=10.230.63.102 \
  --binding:host_id duduo.segmaas.1ss --binding:vnic-type direct --binding:profile \
  type=dict pci_vendor_info=8086:1515,pci_slot=0000:04:11.5,physical_network=physnet1
#neutron port-update --fixed-ip ip_address=10.230.63.102 sriov-port0
# nova interface-attach --port-id $(neutron port-show sriov-port0 |grep ' id ' |awk -F '|' '{print $3}') i1

Some data

$ openstack port show sriov-port0 |grep binding
| binding_host_id         | duduo.segmaas.1ss                                                                                                                                                     |
| binding_profile         | pci_slot='0000:04:11.5', pci_vendor_info='8086:1515', physical_network='physnet1'                                                                                     |
| binding_vif_details     | connectivity='l2', port_filter='False', vlan='0'                                                                                                                      |
| binding_vif_type        | hw_veb                                                                                                                                                                |
| binding_vnic_type       | direct

OVN creates an external port for the SR-IOV port:

root@z-rotomvm17:~# ovn-nbctl find Logical_Switch_Port type=external
_uuid               : c8ff2b37-754f-4939-9f6d-189d1198bb67
addresses           : ["fa:16:3e:d8:2e:c9 10.230.63.102"]
dhcpv4_options      : []
dhcpv6_options      : []
dynamic_addresses   : []
enabled             : true
external_ids        : {"neutron:cidrs"="10.230.63.102/21", "neutron:device_id"="5c5f0eca-2a36-4631-9bb9-92ebe7a50f7c", "neutron:device_owner"="compute:nova", "neutron:network_name"=neutron-8247f7a8-8efa-457a-b326-b147ec656843, "neutron:port_name"=sriov-port0, "neutron:project_id"=fe80865dbc3f42dcafd31bf6ba3e9ff2, "neutron:revision_number"="6", "neutron:security_group_ids"="b5ee4c4c-39a6-450e-868d-9d196655b8a5"}
ha_chassis_group    : 1c61528a-d4f4-4a93-8ecb-c99bf27f4ce4
name                : "52bf9bf6-d2e9-4d2e-8a0d-15c971cfa397"
options             : {mcast_flood_reports="true"}
parent_name         : []
port_security       : []
tag                 : []
tag_request         : []
type                : external
up                  : true

The external port is associated with an ha_chassis_group (1c61528a-d4f4-4a93-8ecb-c99bf27f4ce4), which in turn references two HA_Chassis entries, so the following must also be set:
juju config ovn-chassis ovn-bridge-mappings="physnet1:br-data"
# or, with the second flat network:
juju config ovn-chassis ovn-bridge-mappings="physnet1:br-data physnet2:br-data2"

root@z-rotomvm17:~# ovn-nbctl list HA_Chassis_Group
_uuid               : 1c61528a-d4f4-4a93-8ecb-c99bf27f4ce4
external_ids        : {}
ha_chassis          : [447cf2b7-1565-4811-a1c6-a93907f88fca, c5bc821b-62fd-4635-a59a-fdf49441c920]
name                : default_ha_chassis_group

root@z-rotomvm17:~# ovn-nbctl list HA_Chassis
_uuid               : 447cf2b7-1565-4811-a1c6-a93907f88fca
chassis_name        : crustle.segmaas.1ss
external_ids        : {}
priority            : 32766

_uuid               : c5bc821b-62fd-4635-a59a-fdf49441c920
chassis_name        : duduo.segmaas.1ss
external_ids        : {}
priority            : 32765

The external connectivity of the two nodes:

root@duduo:~# ovs-vsctl get open . external-ids
{hostname=duduo.segmaas.1ss, ovn-bridge-mappings="=physnet1:br-data", ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="10.230.59.27", ovn-encap-type=geneve, ovn-remote="ssl:10.230.59.34:6642,ssl:10.230.59.38:6642,ssl:10.230.59.28:6642", rundir="/var/run/openvswitch", system-id=duduo.segmaas.1ss}
root@crustle:~# ovs-vsctl get open . external-ids
{hostname=crustle.segmaas.1ss, ovn-bridge-mappings="=physnet1:br-data", ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="10.230.57.58", ovn-encap-type=geneve, ovn-remote="ssl:10.230.59.34:6642,ssl:10.230.59.38:6642,ssl:10.230.59.28:6642", rundir="/var/run/openvswitch", system-id=crustle.segmaas.1ss}

root@duduo:~# ovs-vsctl show
ca5fa149-b09b-4341-8ead-9b965bafda07
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port ovn-crustl-0
            Interface ovn-crustl-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.230.57.58"}
                bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}
        Port br-int
            Interface br-int
                type: internal
    Bridge br-data
        fail_mode: standalone
        datapath_type: system
        Port br-data
            Interface br-data
                type: internal
        Port veth-ex
            Interface veth-ex
                type: system
    ovs_version: "2.13.1"

root@crustle:~# ovs-vsctl show
87d62492-3d36-4667-af23-8ecc1050a9d1
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-data
        fail_mode: standalone
        datapath_type: system
        Port br-data
            Interface br-data
                type: internal
        Port veth-ex
            Interface veth-ex
                type: system
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port br-int
            Interface br-int
                type: internal
        Port ovn-duduo.-0
            Interface ovn-duduo.-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.230.59.27"}
                bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}

ovn-sbctl and ovn-nbctl show the following:

root@z-rotomvm17:~# ovn-nbctl show
switch 8c6e778b-1dc3-453f-baf5-aac15d539e2e (neutron-8247f7a8-8efa-457a-b326-b147ec656843) (aka sriov_net)
    port provnet-4b4b5e6c-ae3c-4c66-b5a1-305ebf00612a
        type: localnet
        addresses: ["unknown"]
    port 6081a0b9-933e-4f51-b1d7-28a48275245d
        type: localport
        addresses: ["fa:16:3e:58:6d:b1"]
    port 52bf9bf6-d2e9-4d2e-8a0d-15c971cfa397 (aka sriov-port0)
        type: external
        addresses: ["fa:16:3e:d8:2e:c9 10.230.63.102"]
root@z-rotomvm17:~# 
root@z-rotomvm17:~# ovn-sbctl show
Chassis duduo.segmaas.1ss
    hostname: duduo.segmaas.1ss
    Encap geneve
        ip: "10.230.59.27"
        options: {csum="true"}
Chassis crustle.segmaas.1ss
    hostname: crustle.segmaas.1ss
    Encap geneve
        ip: "10.230.57.58"
        options: {csum="true"}
    Port_Binding "52bf9bf6-d2e9-4d2e-8a0d-15c971cfa397"

The port binding is:

root@z-rotomvm17:~# ovn-sbctl list Port_Binding 7454132d-ada0-4f87-b7b6-51fd6c3fb20a
_uuid               : 7454132d-ada0-4f87-b7b6-51fd6c3fb20a
chassis             : 1c71641e-1fc5-440d-a02e-94d472323dd9
datapath            : 90dbeff4-784e-44e2-911b-dcf41a40ab17
encap               : []
external_ids        : {name=sriov-port0, "neutron:cidrs"="10.230.63.102/21", "neutron:device_id"="5c5f0eca-2a36-4631-9bb9-92ebe7a50f7c", "neutron:device_owner"="compute:nova", "neutron:network_name"=neutron-8247f7a8-8efa-457a-b326-b147ec656843, "neutron:port_name"=sriov-port0, "neutron:project_id"=fe80865dbc3f42dcafd31bf6ba3e9ff2, "neutron:revision_number"="6", "neutron:security_group_ids"="b5ee4c4c-39a6-450e-868d-9d196655b8a5"}
gateway_chassis     : []
ha_chassis_group    : 95760231-de64-42bd-a1b5-67d6211249f7
logical_port        : "52bf9bf6-d2e9-4d2e-8a0d-15c971cfa397"
mac                 : ["fa:16:3e:d8:2e:c9 10.230.63.102"]
nat_addresses       : []
options             : {mcast_flood_reports="true"}
parent_port         : []
tag                 : []
tunnel_key          : 3
type                : external
virtual_parent      : []

Why does this port binding sometimes flap back and forth between the two machines?

Appendix - why does the VM booted from the port get a different IP than the one on the port?

We hit the following problem above: the IP of sriov-port0 created directly with the commands below did not match the IP seen in 'nova console-log' after instance i1 booted.

openstack port create --network sriov_net --vnic-type direct sriov-port0
openstack server create --wait --image bionic --flavor m1.small --key-name mykey --nic port-id=$(openstack port show sriov-port0 -f value -c id)  i1

For example, i2 below shows 10.230.58.150 while the actual IP is 10.230.63.103:

$ nova list
+--------------------------------------+------+--------+------------+-------------+-------------------------+
| ID                                   | Name | Status | Task State | Power State | Networks                |
+--------------------------------------+------+--------+------------+-------------+-------------------------+
| 5c5f0eca-2a36-4631-9bb9-92ebe7a50f7c | i1   | ACTIVE | -          | Running     | sriov_net=10.230.63.102 |
| 491fd4ca-fc10-4767-9676-2c232aa47cdd | i2   | ACTIVE | -          | Running     | sriov_net=10.230.58.150 |
+--------------------------------------+------+--------+------------+-------------+-------------------------
$ nova console-log i2 |grep '10.230'
[   32.024202] cloud-init[861]: ci-info: |  ens3  | True |        10.230.63.103         | 255.255.248.0 | global | fa:16:3e:6b:f4:c1 |
[   32.097507] cloud-init[861]: ci-info: |   0   |   0.0.0.0   | 10.230.56.1 |     0.0.0.0     |    ens3   |   UG  |
[   32.104159] cloud-init[861]: ci-info: |   1   | 10.230.56.0 |   0.0.0.0   |  255.255.248.0  |    ens3   |   U   |
[   32.107280] cloud-init[861]: ci-info: |   2   | 10.230.56.1 |   0.0.0.0   | 255.255.255.255 |    ens3   |   UH  |

Clearly, the VM got its lease from the physical environment's DHCP server when it should have used OVN's own DHCP server, so the subnet must be created with "--dhcp" (note: on a non-OVN SR-IOV setup, by contrast, only --no-dhcp can be used):

openstack subnet create --allocation-pool start=10.230.58.100,end=10.230.58.200 \
                            --subnet-range 10.230.56.0/21 --dhcp --gateway 10.230.56.1 \
                            --ip-version 4 --network sriov_net sriov_subnet
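As a side note, 10.230.63.102 looks surprising next to an allocation pool of 10.230.58.100-200, but it is still inside the 10.230.56.0/21 subnet range, which is why the external DHCP server could hand it out. A small sketch (illustrative helpers, pure shell) to verify CIDR membership:

```shell
# ip_to_int: convert a dotted-quad IPv4 address to an integer
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# ip_in_cidr <ip> <cidr> -> exit 0 if <ip> lies inside <cidr>
ip_in_cidr() {
    net=${2%/*}; bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

ip_in_cidr 10.230.63.102 10.230.56.0/21 && echo "inside"  # the external DHCP lease
ip_in_cidr 10.230.58.150 10.230.56.0/21 && echo "inside"  # the neutron-assigned IP
```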

Yet the IP recorded in OVN is actually correct:

root@z-rotomvm17:~# ovn-nbctl show
switch 8c6e778b-1dc3-453f-baf5-aac15d539e2e (neutron-8247f7a8-8efa-457a-b326-b147ec656843) (aka sriov_net)
    ...
    port fc3d6839-a2c0-45f7-802b-c5049bdfb225 (aka sriov-port1)
        type: external
        addresses: ["fa:16:3e:6b:f4:c1 10.230.58.150"]
root@z-rotomvm17:~# ovn-sbctl show
Chassis duduo.segmaas.1ss
    hostname: duduo.segmaas.1ss
    Encap geneve
        ip: "10.230.59.27"
        options: {csum="true"}
Chassis crustle.segmaas.1ss
    hostname: crustle.segmaas.1ss
    Encap geneve
        ip: "10.230.57.58"
        options: {csum="true"}
    Port_Binding "fc3d6839-a2c0-45f7-802b-c5049bdfb225"
    Port_Binding "52bf9bf6-d2e9-4d2e-8a0d-15c971cfa397"
root@z-rotomvm17:~# ovn-sbctl list Port_Binding "fc3d6839-a2c0-45f7-802b-c5049bdfb225"
_uuid               : 3a02719d-789e-4524-9584-4a70bb7973aa
chassis             : 1c71641e-1fc5-440d-a02e-94d472323dd9
datapath            : 90dbeff4-784e-44e2-911b-dcf41a40ab17
encap               : []
external_ids        : {name=sriov-port1, "neutron:cidrs"="10.230.58.150/21", "neutron:device_id"="491fd4ca-fc10-4767-9676-2c232aa47cdd", "neutron:device_owner"="compute:nova", "neutron:network_name"=neutron-8247f7a8-8efa-457a-b326-b147ec656843, "neutron:port_name"=sriov-port1, "neutron:project_id"=fe80865dbc3f42dcafd31bf6ba3e9ff2, "neutron:revision_number"="5", "neutron:security_group_ids"="b5ee4c4c-39a6-450e-868d-9d196655b8a5"}
gateway_chassis     : []
ha_chassis_group    : 95760231-de64-42bd-a1b5-67d6211249f7
logical_port        : "fc3d6839-a2c0-45f7-802b-c5049bdfb225"
mac                 : ["fa:16:3e:6b:f4:c1 10.230.58.150"]
nat_addresses       : []
options             : {mcast_flood_reports="true"}
parent_port         : []
tag                 : []
tunnel_key          : 4
type                : external
virtual_parent      : []

Continue by confirming that the relevant OVN settings are fine as well:

root@z-rotomvm17:~# ovn-nbctl list logical_switch_port b945c6c1-918b-4353-a855-ad5565ec9742 |grep dhcpv4
dhcpv4_options      : 57e0127c-1d3b-458c-afd2-8c4b4f93a99e

root@z-rotomvm17:~# ovn-nbctl find dhcp_options cidr="10.230.56.0/21"
_uuid               : 57e0127c-1d3b-458c-afd2-8c4b4f93a99e
cidr                : "10.230.56.0/21"
external_ids        : {"neutron:revision_number"="1", subnet_id="6b919920-610c-45ea-b351-bdd957e54f6e"}
options             : {classless_static_route="{169.254.169.254/32,10.230.58.100, 0.0.0.0/0,10.230.56.1}", dns_server="{127.0.0.53}", lease_time="43200", mtu="1500", router="10.230.56.1", server_id="10.230.56.1", server_mac="fa:16:3e:72:6e:44"}

# ovn-sbctl lflow-list |grep '127.0.0.53'
...

When virtual machines are booted on hypervisors supporting SR-IOV nics, the local ovn-controllers are unable to reply to the VM’s DHCP, internal DNS, IPv6 router solicitation requests, etc… since the hypervisor is bypassed in the SR-IOV case. OVN then introduced the idea of having external ports which are able to reply on behalf of those VM ports external to the hypervisor that they are running on
Based on the description above, we keep digging: crustle has the highest priority (32766) ( ovn-nbctl -f csv list ha_chassis |egrep -v '^_uuid' |sort -t ',' -k 4 ), so the DHCP flow rules should be on it.

root@z-rotomvm17:~# ovn-nbctl list ha_chassis_group
_uuid               : 1c61528a-d4f4-4a93-8ecb-c99bf27f4ce4
external_ids        : {}
ha_chassis          : [447cf2b7-1565-4811-a1c6-a93907f88fca, c5bc821b-62fd-4635-a59a-fdf49441c920]
name                : default_ha_chassis_group
root@z-rotomvm17:~# ovn-nbctl list ha_chassis
_uuid               : 447cf2b7-1565-4811-a1c6-a93907f88fca
chassis_name        : crustle.segmaas.1ss
external_ids        : {}
priority            : 32766

_uuid               : c5bc821b-62fd-4635-a59a-fdf49441c920
chassis_name        : duduo.segmaas.1ss
external_ids        : {}
priority            : 32765

At this point we see the following key facts:

  • veth-ex is attached to br-data
  • the SR-IOV VM is not connected to br-int, yet the DHCP flow rules clearly live on br-int
  • on crustle, "ovs-ofctl -O OpenFlow13 dump-flows br-int | grep 67" shows n_packets=0, i.e. no traffic hits the flows
  • accordingly, running: tcpdump -l -i veth-ex "(port 67 or port 68)" shows only DHCP requests, never a DHCP reply

Based on the above, we suspect that br-data and br-int are not connected to each other.
root@crustle:~# ovs-ofctl -O OpenFlow13 dump-flows br-data
 cookie=0x0, duration=178048.471s, table=0, n_packets=429348, n_bytes=29134154, priority=0 actions=NORMAL
 
root@crustle:~# ovs-vsctl show
87d62492-3d36-4667-af23-8ecc1050a9d1
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-data
        fail_mode: standalone
        datapath_type: system
        Port br-data
            Interface br-data
                type: internal
        Port veth-ex
            Interface veth-ex
                type: system
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port tap90dbeff4-70
            Interface tap90dbeff4-70
        Port br-int
            Interface br-int
                type: internal
        Port ovn-duduo.-0
            Interface ovn-duduo.-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.230.59.27"}
                bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}

At this point we spot the following, and clearly the problem is the stray equals sign:

$ juju config ovn-chassis-a ovn-bridge-mappings
=physnet1:br-data

root@crustle:~# systemctl status ovn-controller
...
May 18 04:56:30 crustle ovn-controller[19330]: ovs|00011|patch|ERR|bridge not found for localnet port 'provnet-4b4b5e6c-ae3c-4c66-b5a1-305ebf00612a' with network name 'physnet1'
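A stray '=' like this is easy to miss. A hedged helper (not part of the charms; `check_mappings` is an illustrative name) to sanity-check a mappings value before ovn-controller rejects it:

```shell
# check_mappings "<value>" - every space/comma separated entry must be
# <network>:<bridge>, with no stray characters such as a leading '='
check_mappings() {
    bad=$(echo "$1" | tr ' ,' '\n\n' | sed '/^$/d' \
          | grep -vE '^[A-Za-z0-9_.-]+:[A-Za-z0-9_.-]+$' || true)
    if [ -n "$bad" ]; then
        echo "malformed entry: $bad"
        return 1
    fi
    echo "ok: $1"
}

check_mappings "physnet1:br-data physnet2:br-data2"
check_mappings "=physnet1:br-data" || true   # the broken value above
```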

After correcting it, everything works and the patch port peer appears ( patch-provnet-xxx-to-br-int ):

juju config ovn-chassis ovn-bridge-mappings="physnet1:br-data"
juju config ovn-chassis bridge-interface-mappings="br-data:veth-ex"

Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port patch-br-int-to-provnet-4b4b5e6c-ae3c-4c66-b5a1-305ebf00612a
            Interface patch-br-int-to-provnet-4b4b5e6c-ae3c-4c66-b5a1-305ebf00612a
                type: patch
                options: {peer=patch-provnet-4b4b5e6c-ae3c-4c66-b5a1-305ebf00612a-to-br-int}
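To verify the fix on each chassis, check that ovn-controller created the patch port pair; a small sketch (`has_provnet_patch` is an illustrative helper) that parses `ovs-vsctl list-ports` output:

```shell
# has_provnet_patch "<port list>" -> 0 if a patch port to/from a provnet exists
has_provnet_patch() {
    echo "$1" | grep -q '^patch-.*provnet'
}

# on a live chassis (hypothetical usage):
#   has_provnet_patch "$(ovs-vsctl list-ports br-int)" \
#       || echo "br-int is not patched to the provider bridge"

# with the port name shown above:
has_provnet_patch "patch-br-int-to-provnet-4b4b5e6c-ae3c-4c66-b5a1-305ebf00612a" \
    && echo "patched"
```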

After that, the VM obtains the IP from the port and is reachable. However, tcpdump on crustle still showed no DHCP reply, and it turned out crustle was no longer present in ha_chassis:

root@z-rotomvm17:~# ovn-nbctl list ha_chassis