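Direct I/O Path
The figure below shows the function call graph for reading and writing a file opened with O_DIRECT.
The function generic_file_aio_read determines the type of I/O. If the request is direct I/O:
For a block device file, the request takes the blkdev_direct_IO branch, whose code is as follows: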
static ssize_t
blkdev_direct_IO(int rw, struct kiocb *iocb, const struct iovec *iov,
                 loff_t offset, unsigned long nr_segs)
{
    struct file *file = iocb->ki_filp;
    struct inode *inode = file->f_mapping->host;

    return blockdev_direct_IO_no_locking_newtrunc(rw, iocb, inode,
                I_BDEV(inode), iov, offset, nr_segs, ...);
}
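It simply fetches the file object and the inode (the inode is also the owner, or host, of the file's address space) and then calls blockdev_direct_IO_no_locking_newtrunc. That function is itself only a wrapper: it immediately calls __blockdev_direct_IO_newtrunc.
Listing 2. Function blockdev_direct_IO_no_locking_newtrunc()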
static inline ssize_t blockdev_direct_IO_no_locking_newtrunc(int rw,
    struct kiocb *iocb, struct inode *inode, struct block_device *bdev,
    const struct iovec *iov, loff_t offset, unsigned long nr_segs,
    get_block_t get_block, dio_iodone_t end_io)
{
    /* Last argument (flags) is 0: no locking for block devices. */
    return __blockdev_direct_IO_newtrunc(rw, iocb, inode, bdev, iov, offset,
                nr_segs, get_block, end_io, NULL, 0);
}
For a regular file, the request goes through blockdev_direct_IO_newtrunc, whose code is as follows:
Listing 2. Function blockdev_direct_IO_newtrunc()
static inline ssize_t blockdev_direct_IO_newtrunc(int rw,
    struct kiocb *iocb, struct inode *inode, struct block_device *bdev,
    const struct iovec *iov, loff_t offset, unsigned long nr_segs,
    get_block_t get_block, dio_iodone_t end_io)
{
    /* Last argument (flags) requests locking for regular files. */
    return __blockdev_direct_IO_newtrunc(rw, iocb, inode, bdev, iov, offset,
                nr_segs, get_block, end_io, NULL,
                DIO_LOCKING | DIO_SKIP_HOLES);
}
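At this point it is clear that direct I/O on both regular files and block device files eventually reaches __blockdev_direct_IO_newtrunc. The difference lies in locking, which is visible in the last argument. Regular files pass DIO_LOCKING | DIO_SKIP_HOLES, so DIO on a regular file takes a lock to keep multiple processes from writing to the file at the same moment. Block device files pass 0 and impose no such restriction: their requests are left to the underlying device driver, where the low-level writes are carried out serially.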
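The flag-driven split described above can be modelled in a few lines of user-space C. This is only a sketch of the pattern, not kernel code: the MODEL_ names, their numeric values, and the helper functions are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values for this model (assumed, not taken from kernel headers). */
enum {
    MODEL_DIO_LOCKING    = 0x01,
    MODEL_DIO_SKIP_HOLES = 0x02
};

/* Stand-in for the shared worker: reports whether the caller asked for locking. */
bool model_dio_common(int flags)
{
    /* This model only understands the two flags defined above. */
    assert((flags & ~(MODEL_DIO_LOCKING | MODEL_DIO_SKIP_HOLES)) == 0);
    return (flags & MODEL_DIO_LOCKING) != 0;
}

/* Regular-file wrapper: passes both flags, so the shared code takes the lock. */
bool model_dio_regular_file(void)
{
    return model_dio_common(MODEL_DIO_LOCKING | MODEL_DIO_SKIP_HOLES);
}

/* Block-device wrapper: passes 0, so the shared code skips the locked section. */
bool model_dio_block_device(void)
{
    return model_dio_common(0);
}
```

Both wrappers funnel into one function and differ only in the flags they pass, which mirrors how the two kernel wrappers share __blockdev_direct_IO_newtrunc.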
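The function __blockdev_direct_IO_newtrunc, shown below, mainly performs the following tasks:
1) Check the memory alignment of each segment: its address and length must be block-aligned, so that no block straddles a page boundary.
2) If the locking flag is set, i.e. this is DIO on a regular file, call filemap_write_and_wait_range to write any dirty page cache pages covering the range back to disk and wait for them to complete (avoiding inconsistency between cached and on-disk data). Then call direct_io_worker to handle the request.
Listing 3. Function __blockdev_direct_IO_newtrunc()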
ssize_t
__blockdev_direct_IO_newtrunc(int rw, struct kiocb *iocb, struct inode *inode,
    struct block_device *bdev, const struct iovec *iov, loff_t offset,
    unsigned long nr_segs, get_block_t get_block, dio_iodone_t end_io,
    dio_submit_t submit_io, int flags)
{
    if (dio->flags & DIO_LOCKING) {
        ...
        mutex_lock(&inode->i_mutex);
        /* Flush cached pages in [offset, end - 1] before going direct. */
        retval = filemap_write_and_wait_range(mapping, offset, end - 1);
        ...
    }
    retval = direct_io_worker(rw, iocb, inode, iov, offset, nr_segs,
                blkbits, get_block, ...);
}
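The alignment check in step 1 can be sketched as a small user-space function. This is a minimal sketch under stated assumptions, not the kernel's code: the function name dio_segments_aligned is hypothetical, and it tests against a single block size given as blkbits, whereas the real function may handle more than one block size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>   /* struct iovec */

/*
 * Hypothetical helper: returns true when the file offset and every
 * segment's base address and length are multiples of (1 << blkbits).
 */
bool dio_segments_aligned(const struct iovec *iov, unsigned long nr_segs,
                          long long offset, unsigned blkbits)
{
    assert(blkbits < 8 * sizeof(uintptr_t));
    uintptr_t mask = ((uintptr_t)1 << blkbits) - 1;

    /* The starting file offset must sit on a block boundary. */
    if ((unsigned long long)offset & mask)
        return false;

    for (unsigned long seg = 0; seg < nr_segs; seg++) {
        uintptr_t addr = (uintptr_t)iov[seg].iov_base;
        size_t size = iov[seg].iov_len;

        /* Both the buffer address and its length must be block-aligned. */
        if ((addr & mask) || (size & mask))
            return false;
    }
    return true;
}
```

With blkbits set to 9 (512-byte blocks), a buffer at address 4096 of length 1024 passes, while a buffer at 4097 or a length of 1000 is rejected.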
A single request handled by direct_io_worker may contain several read operations; for each of them, it calls do_direct_IO.
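The per-operation dispatch can be pictured with a small user-space model. It assumes that each operation corresponds to one iovec segment. The names model_dio_worker, segment_fn and model_transfer_all are hypothetical, and the error and short-transfer handling is a simplification chosen for this sketch rather than the kernel's behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>   /* struct iovec */

/* Hypothetical per-segment handler: returns bytes transferred, or < 0 on error. */
typedef long (*segment_fn)(const struct iovec *seg, long long offset);

/*
 * Model of the worker loop: call the handler once per segment, advancing
 * the file offset by the number of bytes already transferred.
 */
long model_dio_worker(const struct iovec *iov, unsigned long nr_segs,
                      long long offset, segment_fn handle)
{
    long total = 0;

    assert(iov != NULL || nr_segs == 0);
    for (unsigned long seg = 0; seg < nr_segs; seg++) {
        long done = handle(&iov[seg], offset + total);

        if (done < 0)
            return done;               /* simplified: report the error as-is */
        total += done;
        if ((size_t)done < iov[seg].iov_len)
            break;                     /* short transfer: stop early */
    }
    return total;
}

/* Trivial handler that pretends the whole segment was transferred. */
long model_transfer_all(const struct iovec *seg, long long offset)
{
    (void)offset;
    return (long)seg->iov_len;
}
```

Passing two segments of 512 and 1024 bytes with model_transfer_all yields a total of 1536 bytes, one handler call per segment.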