How virtio-blk works:
1. There are two paths for handling data requests:
1) request path: virtblk_request
The request_queue of the gendisk embedded in struct virtio_blk receives bio requests from the block layer. Following the queue's default processing, a bio is converted into a request at the I/O scheduler layer and placed on the request_queue; virtblk_request is then called to turn the request into a vbr (virtblk_req) structure.
2) bio path: virtblk_make_request
Bypasses the default path and converts the bio directly into a vbr.
2. The vbr is handed to qemu-kvm through the virtqueue that virtio-blk obtained from virtio_ring at initialization time.
3. After qemu-kvm has processed the vbr it puts it back on the virtio_ring queue and injects an interrupt; the queue's interrupt handler vring_interrupt invokes the queue's callback, virtblk_done.
4. virtblk_done calls virtblk_bio_done if the request was submitted via the bio path, otherwise virtblk_request_done.
The virtio-blk layer has two entry points:
1. request path: virtblk_request
2. bio path: virtblk_make_request
The advantage of the bio path is that it skips the I/O scheduler layer (whose main job is merging multiple bios into a request), which improves performance; on slow devices, however, performance actually drops.
The virtio-blk layer has a single return point: the queue callback virtblk_done.
1. The job of struct bio is to specify which addresses and lengths on the block device to read or write, so its most important member is the bio_vec array: one bio_vec corresponds to one address and length (i.e. one region), and the array as a whole describes the set of device regions to be read or written.
2. When a bio reaches the I/O scheduler layer it is converted into a struct request; one request may contain several bios whose target regions are adjacent, which improves read/write performance.
3. The gendisk structure of a block device contains a request_queue; this queue receives the requests sent down by the I/O scheduler layer.
4. The request_queue of the gendisk carries a set of callbacks that handle the whole life cycle of a request:
The queue's callbacks:
/* processes requests on the queue */
request_fn_proc *request_fn;
/* converts a bio into a request */
make_request_fn *make_request_fn;
/* run when a request is being prepared */
prep_rq_fn *prep_rq_fn;
/* undoes the work of prep_rq_fn */
unprep_rq_fn *unprep_rq_fn;
/* asks the driver whether another bio_vec may be added to a bio */
merge_bvec_fn *merge_bvec_fn;
/* softirq handler, called when a request completes */
softirq_done_fn *softirq_done_fn;
/* timeout handler */
rq_timed_out_fn *rq_timed_out_fn;
dma_drain_needed_fn *dma_drain_needed;
lld_busy_fn *lld_busy_fn;
The entry point into the block layer: generic_make_request
/**
 * generic_make_request - hand a buffer to its device driver for I/O
 * @bio: The bio describing the location in memory and on the device.
 *
 * generic_make_request() is used to make I/O requests of block
 * devices. It is passed a &struct bio, which describes the I/O that
 * needs to be done.
 *
 * generic_make_request() does not return any status. The
 * success/failure status of the request, along with notification of
 * completion, is delivered asynchronously through the bio->bi_end_io
 * function described (one day) else where.
 *
 * The caller of generic_make_request must make sure that bi_io_vec
 * are set to describe the memory buffer, and that bi_dev and bi_sector are
 * set to describe the device address, and the
 * bi_end_io and optionally bi_private are set to describe how
 * completion notification should be signaled.
 *
 * generic_make_request and the drivers it calls may use bi_next if this
 * bio happens to be merged with someone else, and may resubmit the bio to
 * a lower device by calling into generic_make_request recursively, which
 * means the bio should NOT be touched after the call to ->make_request_fn.
 */
void generic_make_request(struct bio *bio)
{
	struct bio_list bio_list_on_stack;

	if (!generic_make_request_checks(bio))
		return;

	/*
	 * We only want one ->make_request_fn to be active at a time, else
	 * stack usage with stacked devices could be a problem. So use
	 * current->bio_list to keep a list of requests submitted by a
	 * make_request_fn function. current->bio_list is also used as a
	 * flag to say if generic_make_request is currently active in this
	 * task or not. If it is NULL, then no make_request is active. If
	 * it is non-NULL, then a make_request is active, and new requests
	 * should be added at the tail
	 */
	if (current->bio_list) {
		bio_list_add(current->bio_list, bio);
		return;
	}

	/* following loop may be a bit non-obvious, and so deserves some
	 * explanation.
	 * Before entering the loop, bio->bi_next is NULL (as all callers
	 * ensure that) so we have a list with a single bio.
	 * We pretend that we have just taken it off a longer list, so
	 * we assign bio_list to a pointer to the bio_list_on_stack,
	 * thus initialising the bio_list of new bios to be
	 * added. ->make_request() may indeed add some more bios
	 * through a recursive call to generic_make_request. If it
	 * did, we find a non-NULL value in bio_list and re-enter the loop
	 * from the top. In this case we really did just take the bio
	 * of the top of the list (no pretending) and so remove it from
	 * bio_list, and call into ->make_request() again.
	 */
	BUG_ON(bio->bi_next);
	bio_list_init(&bio_list_on_stack);
	current->bio_list = &bio_list_on_stack;
	do {
		struct request_queue *q = bdev_get_queue(bio->bi_bdev);

		q->make_request_fn(q, bio);

		bio = bio_list_pop(current->bio_list);
	} while (bio);
	current->bio_list = NULL; /* deactivate */
}
The block layer's function for creating a request from a bio: blk_make_request
/**
 * blk_make_request - given a bio, allocate a corresponding struct request.
 * @q: target request queue
 * @bio: The bio describing the memory mappings that will be submitted for IO.
 *       It may be a chained-bio properly constructed by block/bio layer.
 * @gfp_mask: gfp flags to be used for memory allocation
 *
 * blk_make_request is the parallel of generic_make_request for BLOCK_PC
 * type commands. Where the struct request needs to be farther initialized by
 * the caller. It is passed a &struct bio, which describes the memory info of
 * the I/O transfer.
 *
 * The caller of blk_make_request must make sure that bi_io_vec
 * are set to describe the memory buffers. That bio_data_dir() will return
 * the needed direction of the request. (And all bio's in the passed bio-chain
 * are properly set accordingly)
 *
 * If called under none-sleepable conditions, mapped bio buffers must not
 * need bouncing, by calling the appropriate masked or flagged allocator,
 * suitable for the target device. Otherwise the call to blk_queue_bounce will
 * BUG.
 *
 * WARNING: When allocating/cloning a bio-chain, careful consideration should be
 * given to how you allocate bios. In particular, you cannot use __GFP_WAIT for
 * anything but the first bio in the chain. Otherwise you risk waiting for IO
 * completion of a bio that hasn't been submitted yet, thus resulting in a
 * deadlock. Alternatively bios should be allocated using bio_kmalloc() instead
 * of bio_alloc(), as that avoids the mempool deadlock.
 * If possible a big IO should be split into smaller parts when allocation
 * fails. Partial allocation should not be an error, or you risk a live-lock.
 */
struct request *blk_make_request(struct request_queue *q, struct bio *bio,
				 gfp_t gfp_mask)
{
	struct request *rq = blk_get_request(q, bio_data_dir(bio), gfp_mask);

	if (unlikely(!rq))
		return ERR_PTR(-ENOMEM);

	for_each_bio(bio) {
		struct bio *bounce_bio = bio;
		int ret;

		blk_queue_bounce(q, &bounce_bio);
		ret = blk_rq_append_bio(q, rq, bounce_bio);
		if (unlikely(ret)) {
			blk_put_request(rq);
			return ERR_PTR(ret);
		}
	}

	return rq;
}
The block layer's generic request execution function: blk_execute_rq
/**
 * blk_execute_rq - insert a request into queue for execution
 * @q:		queue to insert the request in
 * @bd_disk:	matching gendisk
 * @rq:		request to insert
 * @at_head:	insert request at head or tail of queue
 *
 * Description:
 *    Insert a fully prepared request at the back of the I/O scheduler queue
 *    for execution and wait for completion.
 */
int blk_execute_rq(struct request_queue *q, struct gendisk *bd_disk,
		   struct request *rq, int at_head)
{
	DECLARE_COMPLETION_ONSTACK(wait);
	char sense[SCSI_SENSE_BUFFERSIZE];
	int err = 0;
	unsigned long hang_check;

	/*
	 * we need an extra reference to the request, so we can look at
	 * it after io completion
	 */
	rq->ref_count++;

	if (!rq->sense) {
		memset(sense, 0, sizeof(sense));
		rq->sense = sense;
		rq->sense_len = 0;
	}

	rq->end_io_data = &wait;
	blk_execute_rq_nowait(q, bd_disk, rq, at_head, blk_end_sync_rq);

	/* Prevent hang_check timer from firing at us during very long I/O */
	hang_check = sysctl_hung_task_timeout_secs;
	if (hang_check)
		while (!wait_for_completion_io_timeout(&wait, hang_check * (HZ/2)));
	else
		wait_for_completion_io(&wait);

	if (rq->errors)
		err = -EIO;

	return err;
}
/* return id (s/n) string for *disk to *id_str
 */
static int virtblk_get_id(struct gendisk *disk, char *id_str)
{
	struct virtio_blk *vblk = disk->private_data;
	struct request *req;
	struct bio *bio;
	int err;

	/* Build a bio, mapping id_str to page addresses the block
	 * device understands; each mapped (address, length) pair
	 * becomes one element of the bio's bio_vec array, i.e. one
	 * storage region. Since id_str may span several pages, the
	 * bio_vec array has one element per page it occupies. */
	bio = bio_map_kern(vblk->disk->queue, id_str, VIRTIO_BLK_ID_BYTES,
			   GFP_KERNEL);
	if (IS_ERR(bio))
		return PTR_ERR(bio);

	req = blk_make_request(vblk->disk->queue, bio, GFP_KERNEL);
	if (IS_ERR(req)) {
		bio_put(bio);
		return PTR_ERR(req);
	}

	req->cmd_type = REQ_TYPE_SPECIAL;
	err = blk_execute_rq(vblk->disk->queue, vblk->disk, req, false);
	blk_put_request(req);

	return err;
}