Basic concepts
Kube-OVN
Compared with other CNIs, Kube-OVN lets users create custom VPCs. Networks in different VPCs are independent of one another, and the IP ranges, subnets, and routes inside each VPC can be configured freely.
A user-defined VPC is logically a strongly isolated network: VPCs cannot communicate with each other directly, nor can a VPC directly reach host addresses.
That said, two VPCs can now communicate with each other through a VPC peering connection (see the sketch at the end of this section).
Egress from a custom VPC requires a dedicated gateway; a VPC external gateway can serve this purpose.
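For reference, a peering is declared on the Vpc resource itself. The sketch below follows the field names in the upstream Kube-OVN documentation; the VPC names, the link CIDR, and the peer subnet are hypothetical, and the feature requires a Kube-OVN release that ships VPC peering:
kind: Vpc
apiVersion: kubeovn.io/v1
metadata:
  name: vpc-1
spec:
  vpcPeerings:
    - remoteVpc: vpc-2               # name of the peer VPC (hypothetical)
      localConnectIP: 169.254.0.1/30 # local address on the peering link
  staticRoutes:
    - cidr: 192.168.100.0/24         # a subnet CIDR inside vpc-2 (hypothetical)
      nextHopIP: 169.254.0.2         # the peer's address on the link
      policy: policyDst
The peer VPC needs the mirror-image configuration (remoteVpc: vpc-1, localConnectIP: 169.254.0.2/30).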
VPP
FD.io's Vector Packet Processor (VPP) is a fast, scalable layer 2-4 multi-platform network stack.
The VPP platform is an extensible framework that provides production-quality switch/router functionality out of the box. It is the open-source version of Cisco's Vector Packet Processing technology: a high-performance packet-processing stack that runs on commodity CPUs.
It runs in Linux user space and supports multiple architectures, including x86, ARM, and Power.
In this article, Kube-OVN is connected to VPP so that VPP provides the network functions for custom VPCs.
Environment
Software versions
Kubernetes: v1.23.6
Kube-OVN: v1.9.1
VPP: 21.06
A two-node Kubernetes cluster, with Kube-OVN installed using the community install.sh script.
In the environment topology, the 10 on ens9 is a VLAN subinterface of the NIC; VPC isolation is achieved through VLAN isolation on this subinterface. In VPP the interface is named GigabitEthernet0/9/0.10.
VPP startup
VPP configuration
Create the configuration file /etc/vpp/startup.conf on the master node:
unix {
  nodaemon
  log /var/log/vpp/vpp.log
  full-coredump
  cli-listen /run/vpp/cli.sock
  gid vpp
}
api-trace {
  on
}
api-segment {
  gid vpp
}
socksvr {
  default
}
cpu {
}
dpdk {
  dev 0000:00:09.0 {
    num-rx-queues 1
  }
}
Binding the NIC to DPDK
Bind the NIC with the dpdk-devbind.py tool:
$ modprobe uio_pci_generic
$ ./dpdk-devbind.py --bind=uio_pci_generic 0000:00:09.0
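To confirm the bind took effect, the same tool can print the current driver bindings; 0000:00:09.0 should now be listed under the DPDK-compatible drivers:
$ ./dpdk-devbind.py --status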
Starting VPP
$ docker run -d --name vpp -P --net=host -v /run/vpp:/run/vpp -v /etc/vpp/:/etc/vpp/ --privileged -it ligato/vpp-base:21.06
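Once the container is running, a quick sanity check (using the container name vpp from the command above) is to open the VPP CLI and confirm that the DPDK NIC shows up as GigabitEthernet0/9/0:
$ docker exec -it vpp vppctl show interface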
Creating the Kube-OVN resources
Create the namespace
$ kubectl create ns user-xu
Create the VPC
kind: Vpc
apiVersion: kubeovn.io/v1
metadata:
  name: vpc-user-xu
spec:
  namespaces:
    - user-xu
  staticRoutes:
    - cidr: 0.0.0.0/0
      nextHopIP: 192.168.99.254  # static route: next hop for traffic leaving the custom VPC
      policy: policyDst
Create the subnet
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: subnet1
spec:
  vpc: vpc-user-xu
  protocol: IPv4
  default: false
  cidrBlock: 192.168.99.0/24
  excludeIps:
    - 192.168.99.1
  gateway: 192.168.99.1
  namespaces:
    - user-xu
  gatewayNode: ""
  gatewayType: distributed
  natOutgoing: true
  private: false
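Before moving on, it is worth checking that both custom resources were accepted and that subnet1 is bound to vpc-user-xu:
$ kubectl get vpc vpc-user-xu
$ kubectl get subnet subnet1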
Create a test pod
apiVersion: v1
kind: Pod
metadata:
  name: pod1
  namespace: user-xu
  annotations:
    ovn.kubernetes.io/logical_switch: subnet1
spec:
  containers:
    - name: pod1
      command: ["/bin/ash", "-c", "trap : TERM INT; sleep 36000 & wait"]
      image: rancher/curl
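After the pod starts, its IP should come from subnet1's 192.168.99.0/24 range, which can be confirmed with:
$ kubectl get pod pod1 -n user-xu -o wide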
Inspecting the OVN state
$ kubectl ko nbctl show
switch ac8aba07-05a1-4660-8d1a-a2d3fbe2cb23 (subnet1)
    port subnet1-vpc-user-xu
        type: router
        router-port: vpc-user-xu-subnet1
    port pod1.user-xu
        addresses: ["00:00:00:27:41:C0 192.168.99.3"]
router 2979bc74-d9ca-45e7-ba0f-577f9468d097 (vpc-user-xu)
    port vpc-user-xu-subnet1
        mac: "00:00:00:85:C1:D7"
        networks: ["192.168.99.1/24"]
$ kubectl ko nbctl lr-route-list vpc-user-xu
IPv4 Routes
0.0.0.0/0 192.168.99.254 dst-ip
VPP-side configuration
Configure the following from the VPP CLI:
vpp# ip table add 1                                      # create VRF 1
vpp# create tap id 10 host-if-name eth0                  # create tap10 (for VLAN 10); the host side appears as eth0
vpp# set interface ip table tap10 1                      # move tap10 into VRF 1
vpp# set interface mac address tap10 00:00:00:16:FA:4A   # set the interface MAC
vpp# set interface ip address tap10 192.168.99.254/24    # set the interface IP
vpp# set interface state tap10 up
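Before wiring anything to OVS, the address assignment and VRF placement can be double-checked from the VPP CLI:
vpp# show interface address
vpp# show ip fib table 1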
The host side of the tap is now visible on the master node:
$ ip a
111: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UNKNOWN group default qlen 1000
    link/ether 02:fe:a5:c7:b3:c5 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fe:a5ff:fec7:b3c5/64 scope link
       valid_lft forever preferred_lft forever
OVN-side configuration
Create a logical switch port and set its address to the same IP and MAC used on the VPP tap interface:
$ kubectl ko nbctl lsp-add subnet1 subnet1.vpp
$ kubectl ko nbctl lsp-set-addresses subnet1.vpp "00:00:00:16:FA:4A 192.168.99.254"
Find the ovs pod on the master node (ovs-ovn-rkzrm here) and add eth0 to the br-int bridge:
$ kubectl exec -it ovs-ovn-rkzrm -n kube-system -- sh
# ovs-vsctl add-port br-int eth0 -- set interface eth0 external_ids:iface-id=subnet1.vpp
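Still inside the ovs pod, the binding can be verified; OVN matches this external_ids:iface-id against the subnet1.vpp logical switch port created above:
# ovs-vsctl --columns=name,external_ids list interface eth0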
VPP external network configuration
Create the subinterface, configure its IP, and add a default route.
vpp# create sub-interfaces GigabitEthernet0/9/0 10 dot1q 10
vpp# set interface ip table GigabitEthernet0/9/0.10 1
vpp# set interface ip address GigabitEthernet0/9/0.10 192.168.10.2/24
vpp# set interface state GigabitEthernet0/9/0.10 up
VPP default route in VRF 1:
vpp# ip route add 0.0.0.0/0 table 1 via 192.168.10.1 GigabitEthernet0/9/0.10
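The resulting FIB for VRF 1 should now contain the default route via 192.168.10.1:
vpp# show ip fib table 1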
Testing
Ping an external address
$ kubectl exec -it pod1 -n user-xu sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ # ping 114.114.114.114
PING 114.114.114.114 (114.114.114.114): 56 data bytes
64 bytes from 114.114.114.114: seq=0 ttl=65 time=11.497 ms
64 bytes from 114.114.114.114: seq=1 ttl=77 time=10.487 ms
64 bytes from 114.114.114.114: seq=2 ttl=62 time=12.263 ms
64 bytes from 114.114.114.114: seq=3 ttl=63 time=10.315 ms
^C
--- 114.114.114.114 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 10.315/11.140/12.263 ms
Capturing on the 192.168.10.1/24 side shows the traffic tagged with VLAN 10:
09:32:03.780390 52:54:00:8b:37:eb > 52:54:00:18:5f:80, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 192.168.99.3 > 114.114.114.114: ICMP echo request, id 11008, seq 0, length 64
09:32:03.790260 52:54:00:18:5f:80 > 52:54:00:8b:37:eb, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 114.114.114.114 > 192.168.99.3: ICMP echo reply, id 11008, seq 0, length 64
09:32:04.780633 52:54:00:8b:37:eb > 52:54:00:18:5f:80, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 192.168.99.3 > 114.114.114.114: ICMP echo request, id 11008, seq 1, length 64
09:32:04.790078 52:54:00:18:5f:80 > 52:54:00:8b:37:eb, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 114.114.114.114 > 192.168.99.3: ICMP echo reply, id 11008, seq 1, length 64
09:32:05.780678 52:54:00:8b:37:eb > 52:54:00:18:5f:80, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 192.168.99.3 > 114.114.114.114: ICMP echo request, id 11008, seq 2, length 64
09:32:05.791946 52:54:00:18:5f:80 > 52:54:00:8b:37:eb, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 114.114.114.114 > 192.168.99.3: ICMP echo reply, id 11008, seq 2, length 64
09:32:06.780870 52:54:00:8b:37:eb > 52:54:00:18:5f:80, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 192.168.99.3 > 114.114.114.114: ICMP echo request, id 11008, seq 3, length 64
09:32:06.790365 52:54:00:18:5f:80 > 52:54:00:8b:37:eb, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 114.114.114.114 > 192.168.99.3: ICMP echo reply, id 11008, seq 3, length 64
SNAT
SNAT the egress traffic of this VPC to 192.168.10.3:
vpp# nat44 enable                                                # turn on the NAT44 plugin
vpp# set interface nat44 in tap10 out GigabitEthernet0/9/0.10    # inside: tap10, outside: the VLAN subinterface
vpp# nat44 add interface address GigabitEthernet0/9/0.10         # use the subinterface address as a NAT pool address
vpp# nat44 add address 192.168.10.3 - 192.168.10.3 tenant-vrf 0  # add 192.168.10.3 to the NAT pool
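The NAT pool and, once the pod sends traffic, the active translations can be inspected from the VPP CLI:
vpp# show nat44 addresses
vpp# show nat44 sessions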
Run the same ping test from the earlier pod; the capture on the external side now shows the SNATed source address 192.168.10.3:
10:15:06.011863 52:54:00:8b:37:eb > 52:54:00:18:5f:80, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 192.168.10.3 > 114.114.114.114: ICMP echo request, id 14336, seq 0, length 64
10:15:06.020984 52:54:00:18:5f:80 > 52:54:00:8b:37:eb, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 114.114.114.114 > 192.168.10.3: ICMP echo reply, id 14336, seq 0, length 64
10:15:07.011424 52:54:00:8b:37:eb > 52:54:00:18:5f:80, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 192.168.10.3 > 114.114.114.114: ICMP echo request, id 14336, seq 1, length 64
10:15:07.020894 52:54:00:18:5f:80 > 52:54:00:8b:37:eb, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 114.114.114.114 > 192.168.10.3: ICMP echo reply, id 14336, seq 1, length 64
10:15:08.015413 52:54:00:8b:37:eb > 52:54:00:18:5f:80, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 192.168.10.3 > 114.114.114.114: ICMP echo request, id 14336, seq 2, length 64
10:15:08.024991 52:54:00:18:5f:80 > 52:54:00:8b:37:eb, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 114.114.114.114 > 192.168.10.3: ICMP echo reply, id 14336, seq 2, length 64
10:15:09.011814 52:54:00:8b:37:eb > 52:54:00:18:5f:80, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 192.168.10.3 > 114.114.114.114: ICMP echo request, id 14336, seq 3, length 64
10:15:09.020859 52:54:00:18:5f:80 > 52:54:00:8b:37:eb, ethertype 802.1Q (0x8100), length 102: vlan 10, p 0, ethertype IPv4, 114.114.114.114 > 192.168.10.3: ICMP echo reply, id 14336, seq 3, length 64
DNAT
Start a new pod that has tcpdump available:
$ kubectl get pods -n user-xu -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod9 1/1 Running 0 20s 192.168.99.4 worker <none> <none>
To reach the pod from outside, add a static mapping that DNATs an external address to the pod IP:
vpp# nat44 add static mapping local 192.168.99.4 external 192.168.10.99 vrf 1
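The mapping can be confirmed before testing:
vpp# show nat44 static mappings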
Ping 192.168.10.99 from the external network and capture the traffic inside pod9:
$ kubectl exec -it pod9 -n user-xu sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ # tcpdump -i eth0 -ne
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:24:19.174668 00:00:00:16:fa:4a > 00:00:00:5c:a8:1d, ethertype IPv4 (0x0800), length 98: 192.168.10.1 > 192.168.99.4: ICMP echo request, id 7454, seq 57, length 64
14:24:19.175998 00:00:00:5c:a8:1d > 00:00:00:85:c1:d7, ethertype IPv4 (0x0800), length 98: 192.168.99.4 > 192.168.10.1: ICMP echo reply, id 7454, seq 57, length 64
14:24:20.171272 00:00:00:16:fa:4a > 00:00:00:5c:a8:1d, ethertype IPv4 (0x0800), length 98: 192.168.10.1 > 192.168.99.4: ICMP echo request, id 7454, seq 58, length 64
14:24:20.171337 00:00:00:5c:a8:1d > 00:00:00:85:c1:d7, ethertype IPv4 (0x0800), length 98: 192.168.99.4 > 192.168.10.1: ICMP echo reply, id 7454, seq 58, length 64
14:24:21.171681 00:00:00:16:fa:4a > 00:00:00:5c:a8:1d, ethertype IPv4 (0x0800), length 98: 192.168.10.1 > 192.168.99.4: ICMP echo request, id 7454, seq 59, length 64
14:24:21.171747 00:00:00:5c:a8:1d > 00:00:00:85:c1:d7, ethertype IPv4 (0x0800), length 98: 192.168.99.4 > 192.168.10.1: ICMP echo reply, id 7454, seq 59, length 64
Summary
In this article, VPP was connected to Kube-OVN to provide network functions for Kube-OVN custom VPCs, steering the traffic of pods in a custom VPC into VPP. We tested external access from pods inside a custom VPC, as well as SNAT and DNAT (floating IP).
With this approach, VPP's existing features, or features we develop on top of VPP, can be used to meet our SDN needs. In a follow-up, I will show how to use VPP to provide allow/deny lists, rate limiting, and DNS caching for pods inside Kube-OVN custom VPCs.