Networking in Nova

Networking is hard. No, seriously. When I started working on Nova over a year ago, my networking skills were good enough to configure a home router. I understood basic packet structure. I once used some libpcap-based packet sniffing tool and manually decoded the authentication packets that a game was using. A year of bridging, routes, vlans, tap devices, vpns, and tcpdumping later, and I've rewritten the networking code in Nova at least three times. I understand a lot more now. And I still feel like a n00b.

Of all of the nasty bugs I've had to solve in the lifetime of Nova, I would say 90% are networking related. When someone pops on the irc channel and says: "I installed Nova but I can't ssh into my instance," it's almost always a networking misconfiguration. In the world of vms, hosts, and bridges, there is just so much that can go wrong.

Despite the complications and rewrites, Nova does some pretty intelligent network configuration. It does its best to automatically set up your cloud in a usable fashion from a minimal set of configuration parameters. Additionally, Nova's modular structure and the awesome hackability of python make it very easy to adapt the system.

In this post I'm going to start by showing a common FlatDHCP setup, the flags used, and the resulting network layout. Next, I will discuss current options for High Availability. Then I'm going to show you how I rapidly created a new HA networking mode, give instructions on using it, and go over its tradeoffs. Finally I'm going to talk a bit about the future of networking in Nova.

FlatDHCP

The basic premise of FlatDHCP mode in Nova is pretty simple. Each vm network is owned by one network host (which is simply a linux box running the nova-network daemon). This network host will act as the gateway for all the nics bridged into that network. Nova creates a bridge (by default br100) on the network host and assigns the gateway ip for the network to that bridge. In order for this vm network to span multiple machines, it needs to be bridged into a raw ethernet device or vlan (set by the --flat_interface flag). If the device already has an ip address, Nova is smart enough to move that ip to the bridge and repair the default gateway route if necessary.
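
On the network host, the setup Nova performs is conceptually equivalent to the commands below. This is only a sketch; the interface names and addresses are examples, not values from the post.

    # create the bridge and enslave the flat interface (e.g. eth0)
    brctl addbr br100
    brctl addif br100 eth0
    ip link set br100 up
    # assign the gateway ip for the vm network to the bridge
    ip addr add 10.0.0.1/24 dev br100
    # if eth0 already had an ip, move it to the bridge and repair the
    # default route so the host stays reachable
    ip addr del 192.168.1.10/24 dev eth0
    ip addr add 192.168.1.10/24 dev br100
    ip route replace default via 192.168.1.1 dev br100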

Nova then sets up a dhcp server to hand out addresses to the vms in the network. It will track leases and releases in the database so it knows if a vm has stopped dhcping properly. Finally, it sets up iptables rules to allow the vms to communicate with the outside world and contact a special metadata server to retrieve information from the cloud.
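
Roughly speaking (the addresses, interfaces, and ports below are illustrative assumptions), the dhcp and iptables work boils down to something like:

    # dnsmasq serving addresses on the vm network, listening on the bridge
    dnsmasq --strict-order --bind-interfaces --interface=br100 \
            --dhcp-range=10.0.0.2,static,120s \
            --dhcp-hostsfile=/var/lib/nova/networks/nova-br100.conf
    # let the vms reach the outside world
    iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -j MASQUERADE
    # redirect metadata requests from the vms to the metadata service
    iptables -t nat -A PREROUTING -d 169.254.169.254 -p tcp --dport 80 \
             -j DNAT --to-destination 192.168.1.10:8775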

Compute hosts in this model are responsible for bringing up a matching bridge and bridging the vm tap devices into the same ethernet device that the network host is on. The compute hosts don't need an ip on the vm network, because the bridging puts the vms and the network host on the same logical network. When vms boot, they send out dhcp packets, and the dhcp server on the network host responds with their ip address.
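
Concretely, the compute host's share of the work might look like this (again a sketch with example device names):

    # matching bridge on the compute host; no ip address is needed on it
    brctl addbr br100
    brctl addif br100 eth0
    ip link set br100 up
    # when a vm boots, its tap device (e.g. vnet0) gets plugged into the bridge
    brctl addif br100 vnet0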

Visually, the setup looks like the diagram below:

As you can see from the diagram, traffic from the vm to the public internet has to go through the host running nova-network. Dhcp is handled by nova-network as well, listening on the gateway address of the fixed_range network. The compute hosts can optionally have their own public ips, or they can use the network host as their gateway. This mode is pretty simple and it works in the majority of situations, but it has one major drawback: the network host is a single point of failure! If the network host goes down for any reason, it is impossible to communicate with the vms.
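
Before moving on to the HA options, here is roughly what the flag file for a setup like this looks like. The values are illustrative; your interface names and address ranges will differ.

    # /etc/nova/nova.conf (relevant flags only)
    --network_manager=nova.network.manager.FlatDHCPManager
    --flat_interface=eth0
    --flat_network_bridge=br100
    --fixed_range=10.0.0.0/24
    --public_interface=eth1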

Existing HA Options

Option 1: Failover

The folks at NTT labs came up with an ha-linux configuration that allows for a 4-second failover to a hot backup of the network host. Details on their approach can be found in the following post to the openstack mailing list: https://lists.launchpad.net/openstack/msg02099.html

This solution is definitely an option, although it requires a second host that essentially does nothing unless there is a failure. Also, four seconds can be too long for some real-time applications.

Option 2: Multi-nic

Recently, nova gained support for multi-nic, which allows us to bridge a given vm into multiple networks and gives us some more options for high availability. It is possible to set up two networks on separate vlans (or even separate ethernet devices on the host) and give the vms a nic and an ip on each network. Each of these networks could have its own network host acting as the gateway.

In this case, the vm has two possible routes out. If one of them fails, it has the option of using the other one. The disadvantage of this approach is that it offloads management of failure scenarios to the guest. The guest needs to be aware of multiple networks and have a strategy for switching between them. It also doesn't help with floating ips: one would have to set up a floating ip associated with each of the ips on the private networks to achieve some type of redundancy.
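
One guest-side strategy, purely as an illustration (the addresses are made up), is to keep a default route on each network with different metrics and drop the preferred one when its gateway stops responding:

    # inside the guest: prefer the gateway on the first network,
    # fall back to the second when the first route is removed
    ip route add default via 10.0.0.1 dev eth0 metric 100
    ip route add default via 10.1.0.1 dev eth1 metric 200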

Option 3: HW Gateway

It is possible to tell dnsmasq to use an external gateway instead of acting as the gateway for the vms. You can pass dhcp-option=3,<ip of gateway> to make the vms use an external gateway. This will require some manual setup. The metadata ip forwarding rules will need to be set on the hardware gateway instead of the nova-network host. You will also have to make sure to set up routes properly so that the subnet that you use for vms is routable.
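
In dnsmasq configuration terms this is a single line (the address is an example), since dhcp option 3 is the default router handed out to clients:

    # hand out an external hardware gateway instead of the nova-network host
    dhcp-option=3,10.0.0.254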

This offloads HA to standard switching hardware and it has some strong benefits. Unfortunately, nova-network is still responsible for floating ip natting and dhcp, so some failover strategy needs to be employed for those options.

New HA Option

Essentially, what the current options are lacking is the ability to specify different gateways for different vms. An agnostic approach to a better model might propose allowing multiple gateways per vm. Unfortunately this rapidly leads to some serious networking complications, especially when it comes to the natting for floating ips. With a few assumptions about the problem domain, we can come up with a much simpler solution that is just as effective.

The key realization is that there is no need to isolate the failure domain away from the host where the vm is running. If the host itself goes down, losing networking to the vm is a non-issue: the vm is already gone. So the simple solution involves allowing each compute host to do all of the networking jobs for its own vms. This means each compute host does NAT, dhcp, and acts as a gateway for all of its own vms. While we still have a single point of failure in this scenario, it is the same point of failure that applies to all virtualized systems, and so it is about the best we can do.

So the next question is: how do we modify the Nova code to provide this option? One possibility would be to add code to the compute worker to do complicated networking setup. This turns out to be a bit painful, and leads to a lot of duplicated code between compute and network. Another option is to modify nova-network slightly so that it can run successfully on every compute node, and change the message passing logic to pass the network commands to a local network worker.

Surprisingly, the code is relatively simple. A couple of fields needed to be added to the database in order to support these new types of "multihost" networks without breaking the functionality of the existing system. All in all, it is a pretty small set of changes for a lot of added functionality: about 250 lines, including quite a bit of cleanup. You can see the branch (still under review) here: https://code.launchpad.net/~vishvananda/nova/ha-net/+merge/67078

The drawbacks here are relatively minor. It requires adding an ip on the vm network to each host in the system, and it implies a little more overhead on the compute hosts. It is also possible to combine this with option 3 above to remove the need for your compute hosts to act as a gateway. In that hybrid version they would no longer gateway for the vms, and their only responsibilities would be dhcp and nat.

The resulting layout for the new ha networking option looks like the following diagram:

In contrast with the earlier diagram, all the hosts in the system are running both nova-compute and nova-network. Each host does dhcp and nat for public traffic for the vms running on that particular host. In this model every compute host requires a connection to the public internet, and each host is also assigned an address from the vm network where it listens for dhcp traffic.
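
Since the branch is still under review, the exact flag names may change, but a deployment sketch looks something like the following. The multi_host flag is my shorthand for however the branch ends up exposing "multihost" networks; treat it, and the specific values, as assumptions.

    # flag file additions on every host (flag name assumed, see above)
    --multi_host=True
    --flat_interface=eth1
    --flat_network_bridge=br100
    --public_interface=eth0

    # every host runs both workers
    nova-compute --flagfile=/etc/nova/nova.conf
    nova-network --flagfile=/etc/nova/nova.conf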

Future of Networking

With the existing multi-nic code and the (soon to be merged?) HA networking code, we have a pretty robust system with a lot of deployment options. This should give deployers enough room to solve today's networking problems. Ultimately, we want to provide users the ability to create arbitrary networks and have real and virtual network appliances managed automatically. The efforts underway in the Quantum and Melange projects will help us reach this lofty goal, but with the current additions we should have enough flexibility to get us by until those projects can take over.

Original article link:

http://unchainyourbrain.com/openstack/13-networking-in-nova
