Introduction to virtio 1.0


Introduction
This article aims to offer a different view of virtio devices from most of the other articles available online. It is neither a complete reference to the virtio 1.0 spec, nor a high-level overview, but rather aims at providing the reader with an idea of how a virtio driver and device implementation work and communicate with each other, through the use of diagrams. That information could then be used as a starting point in cases where it becomes necessary to delve into a virtio device implementation's code for troubleshooting purposes. Throughout the text, we will assume PCI is the transport mechanism used, ignoring MMIO or channel I/O, and choose virtio-net as our example device type. Accordingly, all the elements that are device type-specific in the figures below pertain to virtio-net devices.

Device Discovery

Below is a diagram of the configuration space of a virtio 1.0 net PCI device. Note that legacy devices may present virtio fields in the guest's native endianness rather than PCI's little-endian.

register (offset) | contents, most significant bits first
00 | Device ID (0x1041), bits 31-16 | Vendor ID (0x1af4), bits 15-0
......
08 | Class Code (2), bits 31-24 | ...... | Revision ID (1), bits 7-0
......
10 | IO BAR
14 | Memory BAR
......
2C | Subsystem ID (0x0001), bits 31-16 | Subsystem Vendor ID (0x1af4), bits 15-0
......
34 | ......, bits 31-8 | Capabilities Pointer, bits 7-0
......

Virtio capability     |     Virtio capability     |     Virtio capability     |     Virtio capability

......

The Capabilities Pointer points to the virtio capability chain. Each virtio capability carries an offset that is relative to the IO/Memory BAR it names. Through these capabilities the device exposes the common configuration area, the device-specific configuration area, the queue notify location, and the ISR status to the driver. Below is a typical layout of the IO/Memory region mapped by the BAR. Again, for legacy devices the layout of the virtio header and the device-specific area is quite different; for details, please refer to the virtio 1.0 spec.
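The discovery steps described so far can be sketched in C. This is a minimal illustration, not code from any particular driver: it works on a 256-byte in-memory copy of the configuration space, and the helper names (`is_modern_virtio_net`, `find_virtio_cap`, `struct cap_location`) are hypothetical. The byte offsets inside each capability follow the `virtio_pci_cap` layout in the virtio 1.0 spec (cfg_type at +3, bar at +4, offset at +8, length at +12).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_CAPABILITY_LIST       0x34   /* Capabilities Pointer register */
#define PCI_CAP_ID_VNDR           0x09   /* vendor-specific capability ID */
#define VIRTIO_PCI_VENDOR_ID      0x1af4
#define VIRTIO_PCI_MODERN_ID_BASE 0x1040 /* modern device ID = base + device type */
#define VIRTIO_ID_NET             1

/* cfg_type values defined by the virtio 1.0 spec */
#define VIRTIO_PCI_CAP_COMMON_CFG 1
#define VIRTIO_PCI_CAP_NOTIFY_CFG 2
#define VIRTIO_PCI_CAP_ISR_CFG    3
#define VIRTIO_PCI_CAP_DEVICE_CFG 4

/* Hypothetical result type: where one virtio structure lives. */
struct cap_location { uint8_t bar; uint32_t offset; uint32_t length; };

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Matches the Vendor ID / Device ID pair shown at offset 00. */
static int is_modern_virtio_net(const uint8_t *cfg)
{
    return read_le16(cfg + 0x00) == VIRTIO_PCI_VENDOR_ID &&
           read_le16(cfg + 0x02) == VIRTIO_PCI_MODERN_ID_BASE + VIRTIO_ID_NET;
}

/* Walks the capability chain in a 256-byte config space copy.
 * Returns 1 and fills *out when a virtio capability of cfg_type is found. */
static int find_virtio_cap(const uint8_t *cfg, uint8_t cfg_type,
                           struct cap_location *out)
{
    assert(cfg != NULL && out != NULL);
    unsigned pos = cfg[PCI_CAPABILITY_LIST] & 0xfcu;
    int guard = 48; /* bound the walk in case the chain loops */

    while (pos >= 0x40 && guard-- > 0) {
        if (cfg[pos] == PCI_CAP_ID_VNDR && pos + 16 <= 256 &&
            cfg[pos + 3] == cfg_type) {
            out->bar = cfg[pos + 4];
            out->offset = read_le32(cfg + pos + 8);
            out->length = read_le32(cfg + pos + 12);
            return 1;
        }
        pos = cfg[pos + 1] & 0xfcu; /* cap_next */
    }
    return 0;
}
```

A real driver would read the configuration space through the platform's PCI accessors and map `bar` before adding `offset`; the sketch only shows how the chain is followed.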

<Common Configuration>
device_feature_select device_feature driver_feature_select driver_feature msix_config num_queues device_status config_generation
queue_select queue_size queue_msix_vector queue_enable queue_notify_off queue_desc queue_avail queue_used
<Queue Notify>
Queue_1_Notify Queue_2_Notify …...
<ISR status>
Bits          |          0                   |                     1                          |  2 to 31
Purpose   | Queue Interrupt    | Device Configuration Interrupt | Reserved
<Device configuration>
mac status max_virtqueue_pairs
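As a rough illustration of the last two regions, the sketch below decodes an ISR status value and models the virtio-net device configuration fields listed above. The `VIRTIO_ISR_*` macro names, `struct isr_events`, and the helper functions are hypothetical names chosen for this example; the field order of `virtio_net_config` and the `VIRTIO_NET_S_LINK_UP` bit follow the virtio 1.0 spec. The struct assumes a little-endian host, since the device stores these fields little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ISR status bits as shown in the table above (macro names are illustrative). */
#define VIRTIO_ISR_QUEUE  0x1u /* bit 0: queue interrupt */
#define VIRTIO_ISR_CONFIG 0x2u /* bit 1: device configuration interrupt */

#define VIRTIO_NET_S_LINK_UP 1

/* Device-specific configuration of a virtio-net device. */
struct virtio_net_config {
    uint8_t  mac[6];
    uint16_t status;              /* meaningful only if VIRTIO_NET_F_STATUS was negotiated */
    uint16_t max_virtqueue_pairs; /* meaningful only if VIRTIO_NET_F_MQ was negotiated */
};

struct isr_events { int queue; int config; };

/* Splits a value read from the ISR status field into its two causes.
 * Reading the real register also clears it, so read once and decode. */
static struct isr_events decode_isr(uint32_t isr)
{
    struct isr_events ev;
    ev.queue = (isr & VIRTIO_ISR_QUEUE) != 0;
    ev.config = (isr & VIRTIO_ISR_CONFIG) != 0;
    return ev;
}

static int net_link_is_up(const struct virtio_net_config *cfg)
{
    assert(cfg != NULL);
    return (cfg->status & VIRTIO_NET_S_LINK_UP) != 0;
}
```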


Device Configuration
The driver first negotiates the supported features with the device. The other important step is setting up the virtqueues. For each virtqueue, the driver writes the virtqueue index to the queue_select field in the common configuration section and reads the virtqueue size from the queue_size field. It then allocates the descriptor table, available ring, and used ring in guest memory (in a single page when the queue is small enough) and writes the address of each part to the queue_desc, queue_avail, and queue_used fields so that the device can find them.
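The queue setup sequence can be sketched as follows. The struct mirrors the common configuration fields listed earlier, in the order the virtio 1.0 spec defines them; `setup_queue` is a hypothetical helper, and the sketch assumes a little-endian host and that `cfg` points at the mapped common configuration area. Setting queue_enable at the end is part of the spec's sequence, although the text above does not mention it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Common configuration layout (little-endian fields). */
struct virtio_pci_common_cfg {
    /* About the whole device. */
    uint32_t device_feature_select;
    uint32_t device_feature;
    uint32_t driver_feature_select;
    uint32_t driver_feature;
    uint16_t msix_config;
    uint16_t num_queues;
    uint8_t  device_status;
    uint8_t  config_generation;
    /* About the virtqueue chosen by queue_select. */
    uint16_t queue_select;
    uint16_t queue_size;
    uint16_t queue_msix_vector;
    uint16_t queue_enable;
    uint16_t queue_notify_off;
    uint64_t queue_desc;
    uint64_t queue_avail;
    uint64_t queue_used;
};

/* Hypothetical helper: selects a queue, reads its size, and publishes the
 * guest physical addresses of its three parts. Returns the queue size,
 * or 0 if the device reports the queue as unavailable. */
static uint16_t setup_queue(volatile struct virtio_pci_common_cfg *cfg,
                            uint16_t index, uint64_t desc_gpa,
                            uint64_t avail_gpa, uint64_t used_gpa)
{
    assert(cfg != NULL);
    cfg->queue_select = index;          /* choose which queue the fields refer to */
    uint16_t size = cfg->queue_size;    /* number of entries offered by the device */
    if (size == 0)
        return 0;
    cfg->queue_desc = desc_gpa;
    cfg->queue_avail = avail_gpa;
    cfg->queue_used = used_gpa;
    cfg->queue_enable = 1;              /* the device may now use the queue */
    return size;
}
```

On real hardware the 64-bit address fields may need to be written as two 32-bit accesses; the sketch ignores that detail.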

Virtqueue Data Structure
Each virtqueue consists of three parts: the descriptor table, the available ring, and the used ring. A simple virtio-net device has one virtqueue for transmit and one for receive. It can also have multiple transmit/receive virtqueue pairs, depending on the setup[3]. Each queue has a 16-bit queue size parameter, which sets the number of entries and implies the total size of the queue, according to '2.4 Virtqueues' of the virtio 1.0 spec.
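To make "implies the total size" concrete, the sketch below computes the size of each part from the queue size, using the formulas in section 2.4 of the spec (16 bytes per descriptor, 6 + 2 bytes per entry for the available ring, 6 + 8 bytes per entry for the used ring). The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Queue size must be a non-zero power of two (at most 32768). */
static int is_valid_queue_size(uint16_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

static size_t desc_table_bytes(uint16_t queue_size)
{
    assert(is_valid_queue_size(queue_size));
    return 16u * (size_t)queue_size;
}

static size_t avail_ring_bytes(uint16_t queue_size)
{
    assert(is_valid_queue_size(queue_size));
    return 6u + 2u * (size_t)queue_size;
}

static size_t used_ring_bytes(uint16_t queue_size)
{
    assert(is_valid_queue_size(queue_size));
    return 6u + 8u * (size_t)queue_size;
}
```

For a 256-entry queue these come to 4096, 518, and 2054 bytes, so the three parts together no longer fit in one 4 KiB page.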

'The descriptor table refers to the buffers the driver is using for the device. The addresses are physical addresses, and the buffers can be chained via the next field.' (from '2.4.4 The Virtqueue Descriptor Table') The actual buffers are allocated in the guest, and the addr field below is a guest physical address (GPA). KVM can locate a buffer because it maintains a mapping between GPA and host virtual address (HVA). Below is a descriptor table entry.

Index Addr Len Flags Next

The available ring refers to the descriptor chains the driver is offering the device. Each ring entry refers to the head of a descriptor chain, written by the driver and read by the device.

flags idx ring[]

'The used ring is where the device returns buffers once it is done with them. It is only written to by the device, and read by the driver. '(from '2.4.6 The Virtqueue Used Ring')

flags idx ring[] (each entry is a vring_used_elem)
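The three parts map onto C structures roughly as shown below. The layouts follow the definitions in the virtio 1.0 spec, which names them `virtq_*` (the `vring_used_elem` name above comes from the older `vring_*` naming). `chain_length` is a hypothetical helper added here to show how the next field links descriptors; the structs assume a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTQ_DESC_F_NEXT  1 /* the chain continues at the next field */
#define VIRTQ_DESC_F_WRITE 2 /* buffer is device-writable (otherwise device-readable) */

struct virtq_desc {
    uint64_t addr;  /* guest physical address of the buffer */
    uint32_t len;   /* buffer length in bytes */
    uint16_t flags;
    uint16_t next;  /* valid only when VIRTQ_DESC_F_NEXT is set */
};

struct virtq_avail {
    uint16_t flags;
    uint16_t idx;    /* where the driver will put the next entry */
    uint16_t ring[]; /* heads of offered descriptor chains */
};

struct virtq_used_elem {
    uint32_t id;  /* head of the returned descriptor chain */
    uint32_t len; /* bytes the device wrote into the chain */
};

struct virtq_used {
    uint16_t flags;
    uint16_t idx;
    struct virtq_used_elem ring[];
};

/* Follows a chain from head. Returns the number of descriptors and stores
 * the summed buffer length, or returns -1 if the chain leaves the table
 * or is longer than the queue (which indicates a loop). */
static int chain_length(const struct virtq_desc *table, uint16_t queue_size,
                        uint16_t head, uint64_t *total_bytes)
{
    assert(table != NULL && total_bytes != NULL);
    int count = 0;
    uint16_t i = head;

    *total_bytes = 0;
    for (;;) {
        if (i >= queue_size || count >= queue_size)
            return -1;
        *total_bytes += table[i].len;
        count++;
        if (!(table[i].flags & VIRTQ_DESC_F_NEXT))
            return count;
        i = table[i].next;
    }
}
```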


Device Operation
Supplying Buffers to The Device:

  1. The buffers are placed into free descriptors in the descriptor table; multiple buffers can be chained together through the next field of the descriptor.
  2. The index of the first descriptor in the chain is then placed into the next ring entry (pointed to by the idx field) of the available ring, and the driver increments the idx field of the available ring.
  3. If notifications are enabled, the driver notifies the device of the new available buffers by writing to the queue notify entry in the queue notify section.

Below is an example where a chain of two buffers is placed onto the available ring.
Descriptor table:

Index Addr Len Flags Next
0 0x8000000000000000 4096 NEXT 1
1 0x8000000000040000 128 WRITE 0
......        

Available ring:

0     ......

Receiving Used Buffers From The Device:

  1. The device writes the head descriptor index of the chain to the next entry (pointed to by the idx field) of the used ring, and then updates the used ring index (idx field).
  2. The device sends an interrupt to the guest if required, according to feature negotiation and the per-ring flag in the available ring structure.

Below is an example where the chain placed on the available ring above is consumed and returned.
Descriptor table:
Index Addr Len Flags Next
0 0x8000000000000000 4096 NEXT 1
1 0x8000000000040000 128 WRITE 0
......        

Used ring:

0     ......
