Linux Kernel Study: The Read/Write Path of the EXT4 File System in the Linux Kernel

Contents

1 Overview

2 The Virtual File System and the Ext4 File System

2.1 Tracing the sys_write( ) code

2.2 Analysis of the sys_write( ) path

2.3 vfs_write( ), the core of sys_write( )

2.4 ext4_file_write( )

2.4.1 Extents in the ext4 file system

2.4.2 ext4_file_write( )

2.5 generic_file_write_iter( )

2.6 __generic_file_write_iter( )

2.7 generic_perform_write( )

2.7.1 address_space_operations in the ext4 file system

2.7.2 The delayed allocation mechanism in the ext4 file system

2.7.3 After generate_write_back( ) completes


1 Overview

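A user process writes data to disk through the write() system call, but is the data actually on disk as soon as write() returns? When the kernel reads file data it uses "read-ahead"; when it writes data it uses "delayed write". In other words, when write() returns, the request has not yet been placed on the block device driver's request queue and written to the hard disk.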
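Delayed write is visible from user space: write() normally returns once the data is in the page cache, and a separate fsync() is needed to wait for it to reach the device. The following is a minimal user-space sketch of that pattern; the helper name write_and_sync and the path passed to it are hypothetical and chosen only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to path, then wait until the
 * kernel has flushed them to the storage device.
 * Returns the number of bytes written, or -1 on any error.
 */
ssize_t write_and_sync(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Usually returns as soon as the data sits in the page cache. */
    ssize_t n = write(fd, buf, len);

    /* fsync() blocks until the dirty pages of this file are written out. */
    if (n >= 0 && fsync(fd) != 0)
        n = -1;

    if (close(fd) != 0)
        n = -1;
    return n;
}
```

A call such as write_and_sync("/tmp/demo.txt", "hello", 5) returns 5 on success. Without the fsync() step the function would return just as successfully, but the data could still be waiting in memory.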

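To trace the path, dump_stack is added to the kernel code, the Linux kernel is rebuilt, and the function call chain is followed from the resulting stack traces.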

2 The Virtual File System and the Ext4 File System

The write path of a file system inside the kernel starts at sys_write( ).

2.1 Tracing the sys_write( ) code

sys_write( ) is declared in include/linux/syscalls.h:

asmlinkage long sys_write(unsigned int fd, const char __user *buf,
                          size_t count);

The implementation of sys_write( ) is in fs/read_write.c:

SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
         size_t, count)
{
     struct fd f = fdget_pos(fd);
     ssize_t ret = -EBADF; 
     if (f.file) {
         loff_t pos = file_pos_read(f.file);
         ret = vfs_write(f.file, buf, count, &pos);
         if (ret >= 0)
             file_pos_write(f.file, pos);
         fdput_pos(f);
     } 
     return ret;
}

2.2 Analysis of the sys_write( ) path

The implementation of sys_write( ) breaks down into the following steps:

1) Use the file descriptor fd to find the file structure of the open file:

struct fd f = fdget_pos(fd);

2) Read the current read/write position of the file:

loff_t pos = file_pos_read(f.file);

3) Perform the write:

ret = vfs_write(f.file, buf, count, &pos);

4) Update the file's read/write position according to the result of the write:

file_pos_write(f.file, pos);

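Steps 2) and 4) are the matching operations before and after the write: one reads the file's current position, the other updates the position according to the result of the write. Their code is also in fs/read_write.c: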

static inline loff_t file_pos_read(struct file *file)                                                   
{
     return file->f_pos;
}

static inline void file_pos_write(struct file *file, loff_t pos)
{
     file->f_pos = pos;
}
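The four steps can be modelled in user space to make the control flow easier to follow. The sketch below is a toy model only: toy_file, toy_fd_table, toy_open, toy_fdget, toy_vfs_write, toy_sys_write and toy_file_pos are hypothetical names, and the in-memory buffer stands in for the real file. It keeps the one property worth noticing in the kernel code: the position is copied out, passed to the write by pointer, and stored back only when the write did not fail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define TOY_FILE_CAP 64   /* capacity of each toy file, in bytes */
#define TOY_MAX_FDS  4    /* size of the toy descriptor table */

struct toy_file {
    int       in_use;
    long long f_pos;               /* plays the role of file->f_pos */
    char      data[TOY_FILE_CAP];
};

static struct toy_file toy_fd_table[TOY_MAX_FDS];

/* Allocate the lowest free descriptor, like open() does. */
int toy_open(void)
{
    for (int fd = 0; fd < TOY_MAX_FDS; fd++) {
        if (!toy_fd_table[fd].in_use) {
            memset(&toy_fd_table[fd], 0, sizeof(toy_fd_table[fd]));
            toy_fd_table[fd].in_use = 1;
            return fd;
        }
    }
    return -EMFILE;
}

/* Step 1: descriptor -> file structure. */
static struct toy_file *toy_fdget(unsigned int fd)
{
    if (fd >= TOY_MAX_FDS || !toy_fd_table[fd].in_use)
        return NULL;
    return &toy_fd_table[fd];
}

/* Step 3: copy the data and advance *pos, not f->f_pos. */
static ssize_t toy_vfs_write(struct toy_file *f, const char *buf,
                             size_t count, long long *pos)
{
    assert(f != NULL && buf != NULL && pos != NULL);
    if (*pos < 0 || *pos >= TOY_FILE_CAP)
        return -ENOSPC;
    size_t room = TOY_FILE_CAP - (size_t)*pos;
    if (count > room)
        count = room;              /* short write when the file is full */
    memcpy(f->data + *pos, buf, count);
    *pos += (long long)count;
    return (ssize_t)count;
}

ssize_t toy_sys_write(unsigned int fd, const char *buf, size_t count)
{
    struct toy_file *f = toy_fdget(fd);       /* step 1 */
    ssize_t ret = -EBADF;

    if (f) {
        long long pos = f->f_pos;             /* step 2 */
        ret = toy_vfs_write(f, buf, count, &pos);
        if (ret >= 0)
            f->f_pos = pos;                   /* step 4 */
    }
    return ret;
}

/* Report the stored position, or -1 for an invalid descriptor. */
long long toy_file_pos(unsigned int fd)
{
    struct toy_file *f = toy_fdget(fd);
    return f ? f->f_pos : -1;
}
```

Two consecutive toy_sys_write( ) calls on the same descriptor append one after the other, because the second call starts from the position stored by the first.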

Step 3) is the most important part of sys_write( ). The next section looks at this function in detail.

2.3 vfs_write( ), the core of sys_write( )

ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_t *pos){
     ssize_t ret;
 
     if (!(file->f_mode & FMODE_WRITE))
         return -EBADF;
     if (!(file->f_mode & FMODE_CAN_WRITE))
         return -EINVAL;
     if (unlikely(!access_ok(VERIFY_READ, buf, count)))
         return -EFAULT; 
     ret = rw_verify_area(WRITE, file, pos, count);
     if (ret >= 0) {
         count = ret;
         file_start_write(file);
         if (file->f_op->write)
             ret = file->f_op->write(file, buf, count, pos);
         else if (file->f_op->aio_write)
             ret = do_sync_write(file, buf, count, pos);
         else
             ret = new_sync_write(file, buf, count, pos);
         if (ret > 0) {
             fsnotify_modify(file);
             add_wchar(current, ret);
         }
         inc_syscw(current);
         file_end_write(file);
     }
 
     return ret;
}
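The first check in vfs_write( ), the FMODE_WRITE test that returns -EBADF, can be observed from user space: a write() on a descriptor that was opened read-only fails with errno set to EBADF. The sketch below is only a demonstration of that behaviour; the helper name errno_of_write_on_readonly_fd and the path passed to it are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open path read-only (creating it if needed),
 * attempt a one-byte write, and return the resulting errno value
 * (0 if the write unexpectedly succeeded).
 */
int errno_of_write_on_readonly_fd(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY | O_CREAT, 0644);
    if (fd < 0)
        return errno;

    errno = 0;
    ssize_t n = write(fd, "x", 1);   /* rejected before any file system code runs */
    int err = (n < 0) ? errno : 0;

    close(fd);
    return err;
}
```

The file system's own write method is never reached in this case, since the mode check sits in front of the dispatch through file->f_op.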

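After the mode and buffer checks, the function calls rw_verify_area(WRITE, file, pos, count) to check whether a mandatory lock conflicts with a write to the count bytes that start at the current position pos.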

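Once these checks pass, vfs_write( ) calls the write method in the file_operations of the concrete file system. For ext4, file_operations is defined in fs/ext4/file.c. In the definition shown here the write method is new_sync_write( ), which passes the request on to write_iter, that is, ext4_file_write_iter( ). On older kernels, where ext4 still provides aio_write, the write method is do_sync_write( ).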

The file operations structure of the ext4 file system is:

 const struct file_operations ext4_file_operations = {
     .llseek     = ext4_llseek,
     .read       = new_sync_read,
     .write      = new_sync_write,
     .read_iter  = generic_file_read_iter,
     .write_iter = ext4_file_write_iter,
     .unlocked_ioctl = ext4_ioctl,
 #ifdef CONFIG_COMPAT
     .compat_ioctl   = ext4_compat_ioctl,
 #endif
     .mmap       = ext4_file_mmap,
     .open       = ext4_file_open,
     .release    = ext4_release_file,
     .fsync      = ext4_sync_file,
     .splice_read    = generic_file_splice_read,
     .splice_write   = iter_file_splice_write,
     .fallocate  = ext4_fallocate,
 };
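The table above is a set of function pointers, and the VFS reaches ext4 only through them. The following user-space sketch models that dispatch under simplifying assumptions: all toy_ names are hypothetical, and the write_iter method here takes a plain buffer instead of the kernel's kiocb and iov_iter arguments. It mirrors the shape of the ext4 table, where the generic write entry forwards to the file system's write_iter entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct toy_file;

/* Simplified stand-in for struct file_operations. */
struct toy_file_operations {
    ssize_t (*write)(struct toy_file *f, const char *buf,
                     size_t len, long long *pos);
    ssize_t (*write_iter)(struct toy_file *f, const char *buf,
                          size_t len, long long pos);
};

struct toy_file {
    const struct toy_file_operations *f_op;
    char data[32];
};

/* File-system-specific method: plays the role of ext4_file_write_iter. */
static ssize_t toy_fs_write_iter(struct toy_file *f, const char *buf,
                                 size_t len, long long pos)
{
    if (pos < 0 || (size_t)pos > sizeof(f->data))
        return -1;
    if (len > sizeof(f->data) - (size_t)pos)
        len = sizeof(f->data) - (size_t)pos;
    memcpy(f->data + pos, buf, len);
    return (ssize_t)len;
}

/* Generic helper: plays the role of new_sync_write. */
static ssize_t toy_new_sync_write(struct toy_file *f, const char *buf,
                                  size_t len, long long *pos)
{
    ssize_t ret = f->f_op->write_iter(f, buf, len, *pos);
    if (ret > 0)
        *pos += ret;               /* advance the caller's position */
    return ret;
}

static const struct toy_file_operations toy_fs_file_operations = {
    .write      = toy_new_sync_write,
    .write_iter = toy_fs_write_iter,
};

/* What the VFS layer does: call whatever the table provides. */
ssize_t toy_dispatch_write(struct toy_file *f, const char *buf,
                           size_t len, long long *pos)
{
    assert(f != NULL && f->f_op != NULL);
    if (f->f_op->write)
        return f->f_op->write(f, buf, len, pos);
    return -1;
}
```

The design point is that toy_dispatch_write( ) knows nothing about the file system. Swapping in a different table changes the behaviour without touching the caller.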

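Below is the code of do_sync_write( ), which is also in fs/read_write.c. It wraps the user buffer in a single iovec and calls the file's aio_write method: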

 ssize_t do_sync_write(struct file *filp, const char __user *buf, size_t len, loff_t *ppos)
 {
     struct iovec iov = { .iov_base = (void __user *)buf, .iov_len = len };
     struct kiocb kiocb;
     ssize_t ret; 
     init_sync_kiocb(&kiocb, filp);
     kiocb.ki_pos = *ppos;
     kiocb.ki_nbytes = len; 
     ret = filp->f_op->aio_write(&kiocb, &iov, 1, kiocb.ki_pos);
     if (-EIOCBQUEUED == ret)
         ret = wait_on_sync_kiocb(&kiocb);
     *ppos = kiocb.ki_pos;
     return ret;
 }
 EXPORT_SYMBOL(do_sync_write);
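The step in which do_sync_write( ) wraps one buffer in a single-element iovec has a direct user-space counterpart: calling writev() with one iovec is equivalent to a plain write(). The sketch below illustrates only that wrapping, not the kiocb handling or the wait for -EIOCBQUEUED. The helper names write_via_iovec and iovec_roundtrip and the path passed in are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Wrap (buf, len) in a one-element iovec and submit it with writev(). */
ssize_t write_via_iovec(int fd, const void *buf, size_t len)
{
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    return writev(fd, &iov, 1);
}

/*
 * Write msg through the iovec wrapper, read it back, and compare.
 * Returns 0 when the file content matches msg, -1 otherwise.
 */
int iovec_roundtrip(const char *path, const char *msg)
{
    assert(path != NULL && msg != NULL);

    char readback[64] = { 0 };
    size_t len = strlen(msg);
    if (len > sizeof(readback))
        return -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int ok = write_via_iovec(fd, msg, len) == (ssize_t)len &&
             lseek(fd, 0, SEEK_SET) == 0 &&
             read(fd, readback, len) == (ssize_t)len &&
             memcmp(readback, msg, len) == 0;

    close(fd);
    return ok ? 0 : -1;
}
```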

Asynchronous I/O lets user space start operations without waiting for them to complete, so an application can do other work while its I/O is in progress.

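Block and network drivers are fully asynchronous at all times, so only character drivers are candidates for explicit asynchronous I/O support. The file_operations methods that implement asynchronous I/O operations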
