SR-IOV Explained in Detail

Overview of Single Root I/O Virtualization (SR-IOV)

 

The single root I/O virtualization (SR-IOV) interface is an extension to the PCI Express (PCIe) specification. SR-IOV allows a device, such as a network adapter, to separate access to its resources among various PCIe hardware functions. These functions consist of the following types:

  • A PCIe Physical Function (PF). This function is the primary function of the device and advertises the device's SR-IOV capabilities. The PF is associated with the Hyper-V parent partition in a virtualized environment.

  • One or more PCIe Virtual Functions (VFs). Each VF is associated with the device's PF. A VF shares one or more physical resources of the device, such as memory and a network port, with the PF and other VFs on the device. Each VF is associated with a Hyper-V child partition in a virtualized environment.

Each PF and VF is assigned a unique PCI Express Requester ID (RID) that allows an I/O memory management unit (IOMMU) to differentiate between different traffic streams and apply memory and interrupt translations between the PF and VFs. This allows traffic streams to be delivered directly to the appropriate Hyper-V parent or child partition. As a result, nonprivileged data traffic flows from the PF to VF without affecting other VFs.
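Concretely, the PCIe SR-IOV specification defines each VF's RID in terms of the PF's RID plus the First VF Offset and VF Stride values advertised in the device's SR-IOV Extended Capability. The following minimal C sketch illustrates that arithmetic; the bus/device/function numbers, offset, stride, and VF count used here are made up for illustration.

    /* Minimal sketch: computing the 16-bit Requester ID (RID) of each Virtual
     * Function from the Physical Function's RID, per the PCIe SR-IOV spec.
     * First VF Offset and VF Stride normally come from the device's SR-IOV
     * Extended Capability; the values below are fabricated for illustration. */
    #include <stdint.h>
    #include <stdio.h>

    /* A PCIe RID packs bus (8 bits), device (5 bits), and function (3 bits). */
    static uint16_t make_rid(uint8_t bus, uint8_t dev, uint8_t fn)
    {
        return (uint16_t)((bus << 8) | ((dev & 0x1F) << 3) | (fn & 0x07));
    }

    int main(void)
    {
        uint16_t pf_rid          = make_rid(0x03, 0x00, 0x0); /* example PF at 03:00.0 */
        uint16_t first_vf_offset = 0x0080;                    /* from SR-IOV capability */
        uint16_t vf_stride       = 0x0002;                    /* from SR-IOV capability */
        uint16_t num_vfs         = 4;                         /* NumVFs currently enabled */

        for (uint16_t n = 1; n <= num_vfs; n++) {
            /* RID of VF n = PF RID + First VF Offset + (n - 1) * VF Stride,
             * using 16-bit wraparound arithmetic as defined by the spec. */
            uint16_t vf_rid = (uint16_t)(pf_rid + first_vf_offset + (n - 1) * vf_stride);
            printf("VF %u -> RID %02x:%02x.%x\n",
                   (unsigned)n, vf_rid >> 8, (vf_rid >> 3) & 0x1F, vf_rid & 0x07);
        }
        return 0;
    }

With these made-up values, the sketch prints VF RIDs 03:10.0, 03:10.2, 03:10.4, and 03:10.6, showing how the IOMMU can tell each VF's traffic stream apart from the PF's.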

SR-IOV enables network traffic to bypass the software switch layer of the Hyper-V virtualization stack. Because the VF is assigned to a child partition, the network traffic flows directly between the VF and the child partition. As a result, the I/O overhead in the software emulation layer is reduced, and network performance is nearly the same as in nonvirtualized environments.

For more information, see the following topics:

SR-IOV Architecture

SR-IOV Data Paths

 

SR-IOV Architecture

This section provides a brief overview of the single root I/O virtualization (SR-IOV) interface and its components.

The following figure shows the components of the SR-IOV interface, starting with NDIS 6.30 in Windows Server 2012.

[Figure: SR-IOV architecture]

 

The SR-IOV interface consists of the following components:

Hyper-V Extensible Switch Module

The extensible switch module that configures the NIC switch on the SR-IOV network adapter to provide network connectivity to the Hyper-V child partitions.

Note  Hyper-V child partitions are known as virtual machines (VMs).

If a child partition is connected to a PCI Express (PCIe) Virtual Function (VF), the extensible switch module does not participate in data traffic between the VM and the network adapter. Instead, data traffic is passed directly between the VM and the VF to which it is attached.

For more information about the extensible switch, see Hyper-V Extensible Switch.

Physical Function (PF)

The PF is a PCI Express (PCIe) function of a network adapter that supports the SR-IOV interface. The PF includes the SR-IOV Extended Capability in the PCIe Configuration space. The capability is used to configure and manage the SR-IOV functionality of the network adapter, such as enabling virtualization and exposing VFs.

For more information, see SR-IOV Physical Function (PF).
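As a rough illustration of the capability layout, the following user-mode C sketch decodes a few SR-IOV Extended Capability fields (VF Enable, TotalVFs, NumVFs, First VF Offset, VF Stride) from a raw configuration-space dump. The field offsets follow the PCIe SR-IOV specification, but the capability location and the dump contents are fabricated for illustration; a real tool would first walk the extended-capability list for capability ID 0x0010.

    /* Minimal sketch: decoding a few fields of the SR-IOV Extended Capability
     * from a raw PCIe configuration-space dump. Offsets follow the PCIe SR-IOV
     * specification; the dump bytes and the capability offset are fabricated. */
    #include <stdint.h>
    #include <stdio.h>

    static unsigned rd16(const uint8_t *cfg, unsigned off)
    {
        return (unsigned)cfg[off] | ((unsigned)cfg[off + 1] << 8);  /* little-endian */
    }

    int main(void)
    {
        uint8_t  cfg[4096] = {0};
        unsigned cap = 0x160;   /* assumed capability offset; a real tool walks the list */

        /* Fabricated example values: VF Enable set, TotalVFs=64, NumVFs=8,
         * First VF Offset=0x80, VF Stride=2. */
        cfg[cap + 0x08] = 0x01;
        cfg[cap + 0x0E] = 64;
        cfg[cap + 0x10] = 8;
        cfg[cap + 0x14] = 0x80;
        cfg[cap + 0x16] = 0x02;

        printf("VF Enable       : %u\n", rd16(cfg, cap + 0x08) & 0x1);
        printf("TotalVFs        : %u\n", rd16(cfg, cap + 0x0E));
        printf("NumVFs          : %u\n", rd16(cfg, cap + 0x10));
        printf("First VF Offset : 0x%04x\n", rd16(cfg, cap + 0x14));
        printf("VF Stride       : %u\n", rd16(cfg, cap + 0x16));
        return 0;
    }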

PF Miniport Driver

The PF miniport driver is responsible for managing resources on the network adapter that are used by one or more VFs. Because of this, the PF miniport driver is loaded in the management operating system before any resources are allocated for a VF. The PF miniport driver is halted after all resources that were allocated for VFs are freed.

For more information, see Writing SR-IOV PF Miniport Drivers.
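As an illustration of that responsibility, the following sketch shows the kind of per-VF bookkeeping a PF miniport driver might keep so that it knows when all VF resources have been freed. Every type and function name in this example is hypothetical; none of it is part of NDIS.

    /* Illustrative sketch only: a simple per-VF bookkeeping table of the kind a
     * PF miniport driver might keep. All names (VF_SLOT, PF_ADAPTER,
     * pf_allocate_vf, pf_free_vf) are hypothetical and not part of NDIS. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define MAX_VFS 64

    typedef struct VF_SLOT {
        bool     in_use;      /* VF currently allocated to a child partition */
        uint16_t vport_id;    /* VPort on the NIC switch assigned to this VF */
    } VF_SLOT;

    typedef struct PF_ADAPTER {
        VF_SLOT  vfs[MAX_VFS];
        unsigned active_vfs;  /* must reach 0 before the PF miniport is halted */
    } PF_ADAPTER;

    static int pf_allocate_vf(PF_ADAPTER *pf, uint16_t vport_id)
    {
        for (unsigned i = 0; i < MAX_VFS; i++) {
            if (!pf->vfs[i].in_use) {
                pf->vfs[i].in_use   = true;
                pf->vfs[i].vport_id = vport_id;
                pf->active_vfs++;
                return (int)i;      /* VF identifier handed back to the stack */
            }
        }
        return -1;                  /* no free VF resources left */
    }

    static void pf_free_vf(PF_ADAPTER *pf, int vf_id)
    {
        if (vf_id >= 0 && vf_id < MAX_VFS && pf->vfs[vf_id].in_use) {
            pf->vfs[vf_id].in_use = false;
            pf->active_vfs--;
        }
    }

    int main(void)
    {
        PF_ADAPTER pf = {0};
        int vf = pf_allocate_vf(&pf, /*vport_id=*/1);
        printf("allocated VF %d, active VFs: %u\n", vf, pf.active_vfs);
        pf_free_vf(&pf, vf);
        printf("after free, active VFs: %u (PF may now be halted)\n", pf.active_vfs);
        return 0;
    }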

Virtual Function (VF)

A VF is a lightweight PCIe function on a network adapter that supports the SR-IOV interface. The VF is associated with the PF on the network adapter and represents a virtualized instance of the network adapter. Each VF has its own PCI Configuration space. Each VF also shares one or more physical resources on the network adapter, such as an external network port, with the PF and other VFs.

For more information, see SR-IOV Virtual Functions (VFs).

VF Miniport Driver

The VF miniport driver is installed in the VM to manage the VF. Any operation that is performed by the VF miniport driver must not affect any other VF or the PF on the same network adapter.

For more information, see Writing SR-IOV VF Miniport Drivers.

Network Interface Card (NIC) Switch

The NIC switch is a hardware component of the network adapter that supports the SR-IOV interface. The NIC switch forwards network traffic between the physical port on the adapter and internal virtual ports (VPorts). Each VPort is attached to either the PF or a VF.

For more information, see NIC Switches.

Virtual Ports (VPorts)

A VPort is a data object that represents an internal port on the NIC switch of a network adapter that supports the SR-IOV interface. Similar to a port on a physical switch, a VPort on the NIC switch delivers packets to and from a PF or VF to which the port is attached.

For more information, see NIC Switches.
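To make the roles of the NIC switch and its VPorts concrete, the following sketch models the forwarding decision in user-mode C: a frame whose destination matches a VF VPort's filter is delivered directly to that VPort, and anything else goes to the default VPort attached to the PF, where the extensible switch handles it over the synthetic data path. A real NIC switch implements this in hardware, and every name in the sketch is hypothetical.

    /* Illustrative sketch only: the NIC switch forwarding decision modeled in
     * user-mode C. The types and functions here (NIC_SWITCH, nic_switch_forward,
     * VPORT_FILTER) are hypothetical; real NIC switches do this in hardware. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define DEFAULT_PF_VPORT 0   /* default VPort attached to the PF */
    #define MAX_VPORTS       8

    typedef struct VPORT_FILTER {
        uint8_t mac[6];          /* MAC filter programmed for the attached VF */
        int     in_use;
    } VPORT_FILTER;

    typedef struct NIC_SWITCH {
        VPORT_FILTER vports[MAX_VPORTS];   /* index == VPort ID; 0 is the PF VPort */
    } NIC_SWITCH;

    /* Return the VPort that should receive a frame with this destination MAC.
     * Frames that match no VF VPort filter go to the default PF VPort, where the
     * Hyper-V extensible switch delivers them over the synthetic data path. */
    static int nic_switch_forward(const NIC_SWITCH *sw, const uint8_t dst_mac[6])
    {
        for (int vport = 1; vport < MAX_VPORTS; vport++) {
            if (sw->vports[vport].in_use &&
                memcmp(sw->vports[vport].mac, dst_mac, 6) == 0) {
                return vport;              /* direct delivery over the VF data path */
            }
        }
        return DEFAULT_PF_VPORT;           /* synthetic data path via the PF */
    }

    int main(void)
    {
        NIC_SWITCH sw = {0};
        uint8_t vf_mac[6]    = {0x00, 0x15, 0x5D, 0x01, 0x02, 0x03};
        uint8_t other_mac[6] = {0x00, 0x15, 0x5D, 0xAA, 0xBB, 0xCC};

        sw.vports[2].in_use = 1;                    /* VPort 2 attached to a VF */
        memcpy(sw.vports[2].mac, vf_mac, 6);

        printf("frame for VF MAC    -> VPort %d\n", nic_switch_forward(&sw, vf_mac));
        printf("frame for other MAC -> VPort %d\n", nic_switch_forward(&sw, other_mac));
        return 0;
    }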

Physical Port

The physical port is a hardware component of the network adapter that supports the SR-IOV interface. The physical port provides the interface on the adapter to the external networking medium.

SR-IOV Data Paths

This section describes the possible data paths between a network adapter that supports single root I/O virtualization (SR-IOV) and the Hyper-V parent and child partitions.

This section includes the following topics:

Overview of SR-IOV Data Paths

SR-IOV VF Data Path

SR-IOV Synthetic Data Path

SR-IOV VF Failover and Live Migration Support

Overview of SR-IOV Data Paths

When a Hyper-V child partition is started and the guest operating system is running, the virtualization stack starts the Network Virtual Service Client (NetVSC). NetVSC exposes a virtual machine (VM) network adapter by providing a miniport driver edge to the protocol stacks that run in the guest operating system. In addition, NetVSC provides a protocol driver edge that allows it to bind to the underlying miniport drivers.

NetVSC also communicates with the Hyper-V extensible switch that runs in the management operating system of the Hyper-V parent partition. The extensible switch component operates as a Network Virtual Service Provider (NetVSP). The interface between the NetVSC and NetVSP provides a software data path that is known as the synthetic data path. For more information about this data path, see SR-IOV Synthetic Data Path.

If the physical network adapter supports the single root I/O virtualization (SR-IOV) interface, it can enable one or more PCI Express (PCIe) Virtual Functions (VFs). Each VF can be attached to a Hyper-V child partition. When this happens, the virtualization stack performs the following steps:

  1. The virtualization stack exposes a network adapter for the VF in the guest operating system. This causes the PCI driver that runs in the guest operating system to start the VF miniport driver. This driver is provided by the independent hardware vendor (IHV) for the SR-IOV network adapter.

  2. After the VF miniport driver is loaded and initialized, NDIS binds the protocol edge of the NetVSC in the guest operating system to the driver.

    Note  NetVSC only binds to the VF miniport driver. No other protocol stacks in the guest operating system can bind to the VF miniport driver.

    After the NetVSC successfully binds to the driver, network traffic in the guest operating system occurs over the VF data path. Packets are sent or received over the underlying VF of the network adapter instead of the synthetic data path.

    For more information about the VF data path, see SR-IOV VF Data Path.

The following figure shows the various data paths that are supported over an SR-IOV network adapter.

[Figure: SR-IOV data paths]

 

After the Hyper-V child partition is started and before the VF data path is established, network traffic flows over the synthetic data path. After the VF data path is established, network traffic can revert to the synthetic data path if the following conditions are true:

  • The VF is detached from the Hyper-V child partition. For example, the virtualization stack could detach a VF from one child partition and attach it to another child partition. This might occur when there are more Hyper-V child partitions running than there are VF resources on the underlying SR-IOV network adapter.

    The process of failing over to the synthetic data path from the VF data path is known as VF failover.

  • The Hyper-V child partition is being live migrated to a different host.

SR-IOV VF Data Path

 

If the physical network adapter supports the single root I/O virtualization (SR-IOV) interface, it can enable one or more PCI Express (PCIe) Virtual Functions (VFs). Each VF can be attached to a Hyper-V child partition. When this happens, the virtualization stack performs the following steps:

  1. Once resources for the VF are allocated, the virtualization stack exposes a network adapter for the VF in the guest operating system. This causes the PCI driver that runs in the guest operating system to start the VF miniport driver. This driver is provided by the independent hardware vendor (IHV) for the SR-IOV network adapter.

    Note  Resources for the VF must be allocated by the miniport driver for the PCIe Physical Function (PF) before the VF can be attached to the Hyper-V child partition. VF resources include assigning a virtual port (VPort) on the NIC switch to the VF. For more information, see SR-IOV Virtual Functions.

  2. After the VF miniport driver is loaded and initialized, NDIS binds the protocol edge of the Network Virtual Service Client (NetVSC) in the guest operating system to the driver.

    Note  NetVSC only binds to the VF miniport driver. No other protocol stacks in the guest operating system can bind to the VF miniport driver.

    After the NetVSC successfully binds to the driver, network traffic in the guest operating system occurs over the VF data path. Packets are sent or received over the underlying VF of the network adapter instead of the software-based synthetic data path. For more information about the synthetic data path, see SR-IOV Synthetic Data Path.

The following diagram shows the components of the VF data path over an SR-IOV network adapter.

[Figure: SR-IOV VF data path]

 

The use of the VF data path provides the following benefits:

  • All data packets flow directly between the networking components in the guest operating system and the VF. This eliminates the overhead of the synthetic data path in which data packets flow between the Hyper-V child and parent partitions.

    For more information about the synthetic data path, see SR-IOV Synthetic Data Path.

  • The VF data path bypasses any involvement by the management operating system in packet traffic from a Hyper-V child partition. The VF provides independent memory space, interrupts, and DMA streams for the child partition to which it is attached. This achieves networking performance that is comparable to nonvirtualized environments.

  • The routing of packets over the VF data path is performed by the NIC switch on the SR-IOV network adapter. Packets are sent or received over the external network through the physical port of the adapter. Packets are also forwarded to or from other child partitions to which a VF is attached.

    Note  Packets to or from child partitions to which no VF is attached are forwarded by the NIC switch to the Hyper-V extensible switch module. This module runs in the Hyper-V parent partition and delivers these packets to the child partition by using the synthetic data path.

SR-IOV Synthetic Data Path

 

When a Hyper-V child partition is started and the guest operating system is running, the virtualization stack starts the Network Virtual Service Client (NetVSC). NetVSC exposes a virtual machine (VM) network adapter that provides a miniport driver edge to the protocol stacks that run in the guest operating system.

NetVSC also communicates with the Hyper-V extensible switch that runs in the management operating system of the Hyper-V parent partition. The extensible switch component operates as a Network Virtual Service Provider (NetVSP). The interface between the NetVSC and NetVSP provides a software data path that is known as the synthetic data path.

The following diagram shows the components of the synthetic data path over an SR-IOV network adapter.

[Figure: SR-IOV synthetic data path]

 

If the underlying SR-IOV network adapter allocates resources for PCI Express (PCIe) Virtual Functions (VFs), the virtualization stack will attach a VF to a Hyper-V child partition. Once attached, packet traffic within the child partition will occur over the hardware-optimized VF data path instead of the synthetic data path. For more information about the VF data path, see SR-IOV VF Data Path.

The virtualization stack may still enable the synthetic data path for a Hyper-V child partition if one of the following conditions is true:

  • The SR-IOV network adapter has insufficient VF resources to accommodate all of the Hyper-V child partitions that were started. After all VFs on the network adapter are attached to child partitions, the remaining partitions use the synthetic data path.

    The process of failing over to the synthetic data path from the VF data path is known as VF failover.

  • A VF was attached to a Hyper-V child partition but becomes detached. For example, the virtualization stack could detach a VF from one child partition and attach it to another child partition. This might occur when there are more Hyper-V child partitions that are running than there are VF resources on the underlying SR-IOV network adapter.

  • The Hyper-V child partition is being live migrated to a different host.

Although the synthetic data path over an SR-IOV network adapter is not as efficient as the VF data path, it can still be hardware optimized. For example, if one or more virtual ports (VPorts) are configured and attached to the PCIe Physical Function (PF), the data path can provide offload capabilities that resemble the virtual machine queue (VMQ) interface. For more information, see Nondefault Virtual Ports and VMQ.

 

SR-IOV VF Failover and Live Migration Support

 

After the Hyper-V child partition is started, network traffic flows over the synthetic data path. If the physical network adapter supports the Single Root I/O Virtualization (SR-IOV) interface, it can enable one or more PCI Express (PCIe) Virtual Functions (VFs). Each VF can be attached to a Hyper-V child partition. When this happens, network traffic flows over the hardware-optimized SR-IOV VF Data Path.

After the VF data path is established, network traffic can revert to the synthetic data path if any of the following conditions is true:

  • A VF was attached to a Hyper-V child partition but becomes detached. For example, the virtualization stack could detach a VF from one child partition and attach it to another child partition. This might occur when there are more Hyper-V child partitions that are running than there are VF resources on the underlying SR-IOV network adapter.

    The process of failing over to the synthetic data path from the VF data path is known as VF failover.

  • The Hyper-V child partition is being live migrated to a different host.

The following figure shows the various data paths that are supported over an SR-IOV network adapter.

[Figure: SR-IOV data paths]

 

To support the VF data path, the NetVSC exposes a virtual machine (VM) network adapter that is bound to the VF miniport driver. During the transition to the synthetic data path, the VF network adapter is removed gracefully from the guest operating system if possible. If the VF cannot be removed gracefully and the removal times out, it is surprise-removed. The VF miniport driver is then halted, and the Network Virtual Service Client (NetVSC) is unbound from the VF miniport driver.

The transition between the VF and synthetic data paths occurs with minimal packet loss and prevents the loss of TCP connections. Before the transition to the synthetic data path is complete, the virtualization stack follows these steps (see the example after these steps):

  1. The virtualization stack moves the Media Access Control (MAC) and Virtual LAN (VLAN) filters for the VM network adapter to the default Virtual Port (VPort) that is attached to the PCIe Physical Function (PF). The VM network adapter is exposed in the guest operating system of the child partition.

    After the filters are moved to the default VPort, the synthetic data path is fully operational for network traffic to and from the networking components that run in the guest operating system. The PF miniport driver indicates received packets on the default PF VPort which uses the synthetic data path to indicate the packets to the guest operating system. Similarly, all transmitted packets from the guest operating system are routed through the synthetic data path and transmitted through the default PF VPort.

    For more information about VPorts, see Virtual Ports (VPorts).

  2. The virtualization stack deletes the VPort that is attached to the VF by issuing an Object Identifier (OID) set request of OID_NIC_SWITCH_DELETE_VPORT to the PF miniport driver. The miniport driver frees any hardware and software resources associated with the VPort and completes the OID request.

    For more information, see Deleting a Virtual Port.

  3. The virtualization stack requests a PCIe Function Level Reset (FLR) of the VF before its resources are deallocated. The stack does this by issuing an OID set request of OID_SRIOV_RESET_VF to the PF miniport driver. The FLR brings the VF on the SR-IOV network adapter into a quiescent state and clears any pending interrupt events for the VF.

  4. After the VF has been reset, the virtualization stack requests a deallocation of the VF resources by issuing an OID set request of OID_NIC_SWITCH_FREE_VF to the PF miniport driver. This causes the miniport driver to free the hardware resources associated with the VF.
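The following sketch shows how a PF miniport driver's MiniportOidRequest handler might dispatch the OID set requests issued in steps 2 through 4. It assumes the NDIS_OID_REQUEST and SR-IOV parameter structure definitions from the WDK's ndis.h; the Pf* helper routines are hypothetical placeholders for vendor-specific hardware programming, and revision, length, and BytesRead handling is omitted for brevity.

    /* Sketch only: how a PF miniport driver might dispatch the three OID set
     * requests described in steps 2-4 above. Assumes the WDK's ndis.h
     * definitions; the Pf* helpers are hypothetical placeholders for
     * vendor-specific hardware programming. Revision/length validation and
     * BytesRead updates are omitted for brevity. */
    #include <ndis.h>

    /* Hypothetical vendor helpers -- not part of NDIS. */
    VOID PfDeleteVPort(NDIS_HANDLE Adapter, ULONG VPortId);
    VOID PfResetVf(NDIS_HANDLE Adapter, USHORT VfId);
    VOID PfFreeVfResources(NDIS_HANDLE Adapter, USHORT VfId);

    NDIS_STATUS
    PfMiniportOidRequest(NDIS_HANDLE MiniportAdapterContext, PNDIS_OID_REQUEST OidRequest)
    {
        PVOID buffer;

        if (OidRequest->RequestType != NdisRequestSetInformation) {
            return NDIS_STATUS_NOT_SUPPORTED;   /* other request types omitted */
        }

        buffer = OidRequest->DATA.SET_INFORMATION.InformationBuffer;

        switch (OidRequest->DATA.SET_INFORMATION.Oid) {
        case OID_NIC_SWITCH_DELETE_VPORT:
            /* Step 2: free hardware and software resources tied to the VF's VPort. */
            PfDeleteVPort(MiniportAdapterContext,
                          ((PNDIS_NIC_SWITCH_DELETE_VPORT_PARAMETERS)buffer)->VPortId);
            return NDIS_STATUS_SUCCESS;

        case OID_SRIOV_RESET_VF:
            /* Step 3: issue a Function Level Reset so the VF is quiesced and any
             * pending interrupt events are cleared. */
            PfResetVf(MiniportAdapterContext,
                      ((PNDIS_SRIOV_RESET_VF_PARAMETERS)buffer)->VFId);
            return NDIS_STATUS_SUCCESS;

        case OID_NIC_SWITCH_FREE_VF:
            /* Step 4: release the hardware resources that backed the VF. */
            PfFreeVfResources(MiniportAdapterContext,
                              ((PNDIS_NIC_SWITCH_FREE_VF_PARAMETERS)buffer)->VFId);
            return NDIS_STATUS_SUCCESS;

        default:
            return NDIS_STATUS_NOT_SUPPORTED;   /* all other OIDs omitted from this sketch */
        }
    }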
