OpenStack Source Code Analysis [2021-12-06]

2021SC@SDUSC

A First Look at Neutron

What is Neutron?

According to the OpenStack documentation,

Neutron is a networking project focused on delivering
Networking-as-a-Service(NaaS) in virtual compute environments.

Others say,

  • Networking project in OpenStack
  • Sets up virtual network infrastructure
  • Switching and routing
  • Also specialized virtual network functions like VPNaaS, FWaaS, LBaaS
  • Flexibility through plugins, drivers and agents

A video on bilibili provided by HUAWEI states that

Neutron is the component in OpenStack that provides network services.
It is based on the idea of software-defined networking.

In conclusion, Neutron is OpenStack's network service provider: it is based on the SDN concept and delivers NaaS.

Neutron Features

  1. Ability to logically separate provisioning and tenant networks
  2. Neutron can be scaled to thousands of nodes and networks
  3. Neutron is primarily driven through Networking API v2.0
  4. Auto allocation for creating network topologies

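Here is part of a glossary:

  • Br-int: the bridge that implements internal network functions
  • Br-ex: the bridge that communicates with the external network
  • Neutron-server: exposes the API and passes API calls on to the configured plugin for further processing
  • Neutron-L2-agent: the layer-2 agent, which manages VLANs; it receives instructions from Neutron-server and creates VLANs
  • Neutron-DHCP-agent: the component that creates subnets and automatically hands out IP addresses for each subnet
  • Neutron-L3-agent: handles address translation between tenant networks and floating IPs, implemented with the NAT feature of Linux iptables
  • Neutron-metadata-agent: runs on the network node and responds to Nova's metadata requests
  • LBaaS agent: provides load balancing for multiple instances and the Open vSwitch agent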

Hope these help you understand better 😃

The Procedure

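When Neutron receives a network service request from a user or another component through its API, it hands the request to the layer-2 or layer-3 agents over a message queue. The DHCP-agent creates subnets and automatically assigns IP addresses, the L2-agent enables communication between networks in the same VLAN, and the L3-agent enables communication between different subnets of the same tenant network.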
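
The hand-off described above can be sketched with a toy in-process queue. This is only an illustration under stated assumptions: real Neutron sends RPC messages over a message bus, not a Python `queue.Queue`, and the names `submit`, `drain`, `AGENT_FOR_TOPIC` and the topic strings are hypothetical, not Neutron identifiers.

```python
import queue

# Hypothetical topic-to-agent mapping; real Neutron routes RPC messages
# over a message bus rather than an in-process queue.
AGENT_FOR_TOPIC = {
    "dhcp": "DHCP-agent",
    "l2": "L2-agent",
    "l3": "L3-agent",
}


def submit(bus, topic, payload):
    """Server side: enqueue a request instead of calling the agent directly."""
    bus.put((topic, payload))


def drain(bus):
    """Agent side: consume queued requests and record which agent handles each."""
    handled = []
    while not bus.empty():
        topic, payload = bus.get()
        handled.append((AGENT_FOR_TOPIC[topic], payload))
    return handled


bus = queue.Queue()
submit(bus, "dhcp", "allocate IP on subnet-a")
submit(bus, "l3", "route subnet-a <-> subnet-b")
handled = drain(bus)
```

The point of the queue is decoupling: the server returns as soon as the request is enqueued, and each agent picks up only the work addressed to it.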

The Architecture

[Figure: Neutron network architecture]

We can see that networking in OpenStack consists of the following main components: Network, Subnet and Router. A tenant network can contain several Networks. Each Network contains several Subnets. A Subnet has several ports, with instances or DHCP bound to them respectively.
Before we set up the router, instances in one subnet can ping each other, but they cannot ping instances located in another subnet. After creating a router and attaching the subnets to it, cross-subnet ping becomes available.
To connect to the external network, we need to set a gateway IP. To let the external network reach the inner network, we have to assign floating IP addresses to the instances and add the external network to the same security group as the inner network. (However, this will definitely bring risks.)
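
The ping behavior described above can be modeled in a few lines. This is a toy reachability check, not Neutron code; `Router`, `add_interface` and `can_ping` are hypothetical names, and security groups are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy router: remembers which subnets have an interface on it."""
    subnets: set = field(default_factory=set)

    def add_interface(self, subnet):
        self.subnets.add(subnet)


def can_ping(src_subnet, dst_subnet, routers):
    """Same subnet always works; different subnets need a shared router."""
    if src_subnet == dst_subnet:
        return True
    return any(src_subnet in r.subnets and dst_subnet in r.subnets
               for r in routers)


router = Router()
before = can_ping("subnet-a", "subnet-b", [router])  # router has no interfaces yet
router.add_interface("subnet-a")
router.add_interface("subnet-b")
after = can_ping("subnet-a", "subnet-b", [router])   # both subnets attached
```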

Now It's Source Code Time!

The entry point for understanding network creation is neutron/services/auto_allocate/db.py. This file defines the AutoAllocatedTopologyMixin class, which has only two public methods: get_auto_allocated_topology(self, context, tenant_id, fields=None) and delete_auto_allocated_topology(self, context, tenant_id).

Let's first look at what get_auto_allocated_topology(self, context, tenant_id, fields=None) does:

def get_auto_allocated_topology(self, context, tenant_id, fields=None):
    fields = fields or []
    tenant_id = self._validate(context, tenant_id)
    if CHECK_REQUIREMENTS in fields:
        return self._check_requirements(context, tenant_id)
    elif fields:
        raise n_exc.BadRequest(resource='auto_allocate',
                               msg=_("Unrecognized field"))
    network_id = self._get_auto_allocated_network(context, tenant_id)
    if network_id:
        return self._response(network_id, tenant_id, fields=fields)
    default_external_network = self._get_default_external_network(
        context)
    network_id = self._build_topology(
        context, tenant_id, default_external_network)
    return self._response(network_id, tenant_id, fields=fields)

The purpose of this function is to return the auto-allocated topology bound to the tenant.

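It first validates the tenant id. If no tenant id is passed to _validate, the tenant id from the context is used and returned. If a tenant id is passed, validation succeeds when the current context is admin or when the given tenant id matches the tenant id in the context; otherwise a NotAuthorized exception is raised.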
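
The body of `_validate` is not shown in this post, so here is a minimal sketch of the rule as described above. `FakeContext`, `validate` and the local `NotAuthorized` class are stand-ins invented for illustration; the real method may differ in detail, for example in how a missing tenant id is detected.

```python
from dataclasses import dataclass


class NotAuthorized(Exception):
    """Stand-in for the NotAuthorized exception named in the text."""


@dataclass
class FakeContext:
    """Hypothetical request context with only the two fields we need."""
    tenant_id: str
    is_admin: bool = False


def validate(context, tenant_id=None):
    """Return the tenant the topology belongs to, or raise NotAuthorized."""
    if not tenant_id:
        # No tenant supplied: fall back to the caller's own tenant.
        return context.tenant_id
    if context.is_admin or tenant_id == context.tenant_id:
        return tenant_id
    raise NotAuthorized(tenant_id)
```

For example, `validate(FakeContext("t1"), "t2")` raises `NotAuthorized`, while the same call with `is_admin=True` returns `"t2"`.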

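If the call is in test mode (dry-run), it calls _check_requirements and returns whether the requirements are met. If it is not in dry-run mode but fields is non-empty, it raises a BadRequest error with the message "Unrecognized field".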
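
The three-way branch on `fields` can be isolated as follows. This is a sketch: the value `"dry-run"` for `CHECK_REQUIREMENTS` is an assumption based on the wording above, and `classify_request` and the local `BadRequest` class are hypothetical stand-ins.

```python
CHECK_REQUIREMENTS = "dry-run"  # assumed value of the constant


class BadRequest(Exception):
    """Stand-in for n_exc.BadRequest."""


def classify_request(fields=None):
    """Decide what a request wants based on its `fields` argument."""
    fields = fields or []
    if CHECK_REQUIREMENTS in fields:
        # Dry run: only report whether provisioning would succeed.
        return "check-only"
    if fields:
        # Any other field is unsupported.
        raise BadRequest("Unrecognized field")
    # No fields: go ahead and return (or build) the topology.
    return "provision"
```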

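Next, it calls _get_auto_allocated_network(context, tenant_id) to check whether a network has already been created. Internally, that function calls _get_auto_allocated_topology(context, tenant_id) to fetch the allocated topology and returns the topology's network id; _get_auto_allocated_topology itself simply returns the allocated topology object.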

If this chain of calls returns a non-empty network_id, the tenant already has a network, so tenant_id and network_id are put into res and returned.

If network_id is empty, it calls default_external_network = self._get_default_external_network(context) to get a default external network. That function looks like this:

def _get_default_external_network(self, context):
    default_external_networks = net_obj.ExternalNetwork.get_objects(
        context, is_default=True)

    if not default_external_networks:
        LOG.error("Unable to find default external network "
                  "for deployment, please create/assign one to "
                  "allow auto-allocation to work correctly.")
        raise exceptions.AutoAllocationFailure(
            reason=_("No default router:external network"))
    if len(default_external_networks) > 1:
        LOG.error("Multiple external default networks detected. "
                  "Network %s is true 'default'.",
                  default_external_networks[0]['network_id'])
    return default_external_networks[0].network_id

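It calls net_obj.ExternalNetwork.get_objects(context, is_default=True) to look up the external networks flagged as default. If none is found, it logs an error and raises AutoAllocationFailure; otherwise it returns the network id of the first one (logging an error if more than one was found).

Back in get_auto_allocated_topology(…), it next calls _build_topology(context, tenant_id, default_external_network) to build the tenant's topology on top of that external network. The function looks like this: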

def _build_topology(self, context, tenant_id, default_external_network):
    """Build the network topology and returns its network UUID."""
    try:
        subnets = self._provision_tenant_private_network(
            context, tenant_id)
        network_id = subnets[0]['network_id']
        router = self._provision_external_connectivity(
            context, default_external_network, subnets, tenant_id)
        network_id = self._save(
            context, tenant_id, network_id, router['id'], subnets)
        return network_id
    except exceptions.UnknownProvisioningError as e:
        # Clean partially provisioned topologies, and reraise the
        # error. If it can be retried, so be it.
        LOG.error("Unknown error while provisioning topology for "
                  "tenant %(tenant_id)s. Reason: %(reason)s",
                  {'tenant_id': tenant_id, 'reason': e})
        self._cleanup(
            context, network_id=e.network_id,
            router_id=e.router_id, subnets=e.subnets)
        raise e.error

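The second half is all exception handling, so we'll skip it for now. The first half creates two important pieces of the network: it calls _provision_tenant_private_network(context, tenant_id) to create the private subnets, and _provision_external_connectivity(context, default_external_network, subnets, tenant_id) to create the virtual router.

_provision_tenant_private_network looks like this: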

def _provision_tenant_private_network(self, context, tenant_id):
    """Create a tenant private network/subnets."""
    network = None
    try:
        network_args = {
            'name': 'auto_allocated_network',
            'admin_state_up': False,
            'tenant_id': tenant_id,
            'shared': False
        }
        network = p_utils.create_network(
            self.core_plugin, context, {'network': network_args})
        subnets = []
        for pool in self._get_supported_subnetpools(context):
            subnet_args = {
                'name': 'auto_allocated_subnet_v%s' % pool['ip_version'],
                'network_id': network['id'],
                'tenant_id': tenant_id,
                'ip_version': pool['ip_version'],
                'subnetpool_id': pool['id'],
            }
            subnets.append(p_utils.create_subnet(
                self.core_plugin, context, {'subnet': subnet_args}))
        return subnets
    except (n_exc.SubnetAllocationError, ValueError,
            n_exc.BadRequest, n_exc.NotFound) as e:
        LOG.error("Unable to auto allocate topology for tenant "
                  "%(tenant_id)s due to missing or unmet "
                  "requirements. Reason: %(reason)s",
                  {'tenant_id': tenant_id, 'reason': e})
        if network:
            self._cleanup(context, network['id'])
        raise exceptions.AutoAllocationFailure(
            reason=_("Unable to provide tenant private network"))
    except Exception as e:
        network_id = network['id'] if network else None
        raise exceptions.UnknownProvisioningError(e, network_id=network_id)

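It first creates a network, then iterates over the supported subnet pools, creates one subnet on that network for each pool, and returns the list of created subnets.

_provision_external_connectivity connects the subnets to the external network. It looks like this: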
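
To make the per-pool loop concrete, the sketch below builds only the request bodies, without calling any plugin. The dictionary keys mirror the `subnet_args` in the code above; the function name `subnet_requests` and the sample pool ids are hypothetical.

```python
def subnet_requests(network_id, tenant_id, pools):
    """Build one subnet request body per subnet pool."""
    return [
        {
            "name": "auto_allocated_subnet_v%s" % pool["ip_version"],
            "network_id": network_id,
            "tenant_id": tenant_id,
            "ip_version": pool["ip_version"],
            "subnetpool_id": pool["id"],
        }
        for pool in pools
    ]


# Hypothetical pools: one IPv4 and one IPv6.
pools = [{"id": "pool-v4", "ip_version": 4}, {"id": "pool-v6", "ip_version": 6}]
requests = subnet_requests("net-1", "t1", pools)
names = [r["name"] for r in requests]
```

With one IPv4 and one IPv6 pool, the tenant network ends up with two subnets, one per address family.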

def _provision_external_connectivity(self, context,
                                     default_external_network, subnets,
                                     tenant_id):
    """Uplink tenant subnet(s) to external network."""
    router_args = {
        'name': 'auto_allocated_router',
        l3_apidef.EXTERNAL_GW_INFO: {
            'network_id': default_external_network},
        'tenant_id': tenant_id,
        'admin_state_up': True
    }
    router = None
    attached_subnets = []
    try:
        router = self.l3_plugin.create_router(
            context, {'router': router_args})
        for subnet in subnets:
            self.l3_plugin.add_router_interface(
                context, router['id'], {'subnet_id': subnet['id']})
            attached_subnets.append(subnet)
        return router
    except n_exc.BadRequest as e:
        LOG.error("Unable to auto allocate topology for tenant "
                  "%(tenant_id)s because of router errors. "
                  "Reason: %(reason)s",
                  {'tenant_id': tenant_id, 'reason': e})
        router_id = router['id'] if router else None
        self._cleanup(context,
                      network_id=subnets[0]['network_id'],
                      router_id=router_id, subnets=attached_subnets)
        raise exceptions.AutoAllocationFailure(
            reason=_("Unable to provide external connectivity"))
    except Exception as e:
        router_id = router['id'] if router else None
        raise exceptions.UnknownProvisioningError(
            e, network_id=subnets[0]['network_id'],
            router_id=router_id, subnets=subnets)

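It calls l3_plugin.create_router(…) to create the router, adds an interface for each subnet to that router, and finally returns the router. Everything else (including the attached_subnets data structure) is there for exception handling, so we'll skip it for now.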
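
Although the exception paths are skipped here, the role of attached_subnets is easy to show in isolation: it records how far the loop got, so a failure can report exactly what needs to be undone. The sketch below is a simplified stand-in, not the Neutron implementation; `ProvisioningError`, `attach_all` and `flaky_attach` are hypothetical names.

```python
class ProvisioningError(Exception):
    """Hypothetical error that carries what was built before the failure."""

    def __init__(self, cause, attached):
        super().__init__(str(cause))
        self.attached = attached


def attach_all(subnets, attach):
    """Attach each subnet; on failure, report exactly which ones succeeded."""
    attached = []
    try:
        for subnet in subnets:
            attach(subnet)
            attached.append(subnet)
    except Exception as exc:
        raise ProvisioningError(exc, attached) from exc
    return attached


def flaky_attach(subnet):
    """Test double: fails on the third subnet."""
    if subnet == "s3":
        raise ValueError("no free port")


try:
    attach_all(["s1", "s2", "s3"], flaky_attach)
    partial = None
except ProvisioningError as err:
    partial = err.attached  # only the subnets attached before the failure
```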

Back in get_auto_allocated_topology: having successfully created the default topology, it contentedly puts network_id and tenant_id into res and returns it.

Now let's look at delete_auto_allocated_topology. Its code is:
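
Taken as a whole, get_auto_allocated_topology follows a get-or-create pattern: return what exists, build only when nothing is there. A minimal sketch of that pattern, with a plain dict standing in for the database and hypothetical names throughout (`get_or_build`, `fake_build`, and the response keys):

```python
def get_or_build(store, tenant_id, build):
    """Return the tenant's existing network id, building one only if absent."""
    network_id = store.get(tenant_id)
    if not network_id:
        network_id = build(tenant_id)
        store[tenant_id] = network_id
    return {"network_id": network_id, "tenant_id": tenant_id}


calls = []


def fake_build(tenant_id):
    """Test double that records every time a topology is actually built."""
    calls.append(tenant_id)
    return "net-for-%s" % tenant_id


store = {}
first = get_or_build(store, "t1", fake_build)
second = get_or_build(store, "t1", fake_build)  # served from the store
```

Calling it twice for the same tenant builds the topology once; the second call just returns the stored result.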

def delete_auto_allocated_topology(self, context, tenant_id):
    tenant_id = self._validate(context, tenant_id)
    topology = self._get_auto_allocated_topology(context, tenant_id)
    if topology:
        subnets = self.core_plugin.get_subnets(
            context,
            filters={'network_id': [topology['network_id']]})
        self._cleanup(
            context, network_id=topology['network_id'],
            router_id=topology['router_id'], subnets=subnets)

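It also validates the tenant id first, then fetches the topology. If one exists, it fetches the subnets and passes the subnets, the router, and the network itself to _cleanup to be removed one by one.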
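
The body of _cleanup is not shown in this post. The sketch below is therefore an assumption about a sensible teardown order, the reverse of the build order: detach router interfaces, delete the router, then delete the network (which is assumed to take its subnets with it). It only records the steps; `cleanup` and the operation names are hypothetical.

```python
def cleanup(ops, network_id=None, router_id=None, subnets=None):
    """Record teardown steps in dependency order (a toy, not Neutron code)."""
    if router_id:
        for subnet in subnets or []:
            # A router cannot be deleted while interfaces are still attached.
            ops.append(("detach", router_id, subnet["id"]))
        ops.append(("delete_router", router_id))
    if network_id:
        ops.append(("delete_network", network_id))
    return ops


ops = cleanup([], network_id="net-1", router_id="r-1",
              subnets=[{"id": "s-1"}, {"id": "s-2"}])
```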

See you in the next blog!

The next blog will focus on how other features in the client are implemented.
The one after that will look at routing algorithms and more.
