Linux Kernel VFS: What's Next?

2021SC@SDUSC

A Summary

After a few months of analysing the Linux kernel VFS, I wondered what I had really learnt from this remarkable code, written by earlier programmers who are definitely leaders in the field. I opened files full of code whose headers say that some of the original parts were written in the 1990s, when personal computers were not yet common among families in China. Even today it takes courage to hunt for the important methods in this code, yet our predecessors used their brilliant minds to work out these needs and left us such a wonderful operating system. I thought I could never match their talent for the rest of my life.

However, alongside all this 'ancient' code, there are many changed parts that now take up most of the space, and a lot of progress has been made since the operating system was first developed. Linux is open source, and many talented programmers have done their best to optimize and advance the system in waves, making each new version better, right up to today. When I looked back at those lines, I found that the code quoted in other people's blogs often differs from the code I had, even though both come from the Linux kernel: plenty of lines have been rewritten for better efficiency.

It takes time to summarize the past 13 blogs, but it is worth it.

File System and VFS

There are tons of file systems out there. Some are old and so outdated that no one uses them any more, while others may be too new to be widely used. Some are inode-style, using inodes as an index to locate files on disk, and some are FAT-style, using a table to keep track of blocks. It seems hard for an OS to support all of these file systems, and Linux manages it thanks to VFS. You might remember the picture below.

Superblock: Managing File Systems

The superblock is the first structure used to access any file system. Every file system mounted on the system needs a superblock, through which the kernel finds that file system's own file operations, even when there are no regular files behind it but only devices. In Linux, everything is treated as a file and can be operated on in a file-like way.

As we can see, the superblock provides some important information, including its operations table s_op. We can also see that VFS manages files in an inode style, even when the underlying file system does not have inodes.

struct super_operations {
	struct inode *(*alloc_inode)(struct super_block *sb);
	void (*destroy_inode)(struct inode *);
	void (*free_inode)(struct inode *);

	void (*dirty_inode) (struct inode *, int flags);
	int (*write_inode) (struct inode *, struct writeback_control *wbc);
	int (*drop_inode) (struct inode *);
	void (*evict_inode) (struct inode *);
	void (*put_super) (struct super_block *);
	int (*sync_fs)(struct super_block *sb, int wait);
	int (*freeze_super) (struct super_block *);
	int (*freeze_fs) (struct super_block *);
	int (*thaw_super) (struct super_block *);
	int (*unfreeze_fs) (struct super_block *);
	int (*statfs) (struct dentry *, struct kstatfs *);
	int (*remount_fs) (struct super_block *, int *, char *);
	void (*umount_begin) (struct super_block *);

	int (*show_options)(struct seq_file *, struct dentry *);
	int (*show_devname)(struct seq_file *, struct dentry *);
	int (*show_path)(struct seq_file *, struct dentry *);
	int (*show_stats)(struct seq_file *, struct dentry *);
#ifdef CONFIG_QUOTA
	ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
	ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
	struct dquot **(*get_dquots)(struct inode *);
#endif
	long (*nr_cached_objects)(struct super_block *,
				  struct shrink_control *);
	long (*free_cached_objects)(struct super_block *,
				    struct shrink_control *);
};
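
To make the idea of an operations table concrete, here is a minimal user-space sketch of the same pattern. It is not kernel code: every name in it (toy_sb, toy_super_ops, diskfs_ops, memfs_ops, toy_sync) is hypothetical and exists only for illustration. It assumes two toy file systems, one of which leaves an optional method unset, and shows how a generic layer can call through s_op without knowing which file system it is talking to.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; all names here are hypothetical. */
struct toy_sb;

struct toy_super_ops {
	int (*sync_fs)(struct toy_sb *sb, int wait);	/* optional, may be NULL */
	long (*count_objects)(struct toy_sb *sb);	/* optional, may be NULL */
};

struct toy_sb {
	const struct toy_super_ops *s_op;	/* per-file-system method table */
	long nr_objects;
	int synced;
};

/* Method implementations shared by the two toy file systems. */
static int diskfs_sync(struct toy_sb *sb, int wait)
{
	(void)wait;
	sb->synced = 1;		/* pretend dirty data was written out */
	return 0;
}

static long toy_count(struct toy_sb *sb)
{
	return sb->nr_objects;
}

/* "Disk-like" file system: implements both methods. */
const struct toy_super_ops diskfs_ops = {
	.sync_fs = diskfs_sync,
	.count_objects = toy_count,
};

/* "Memory-only" file system: nothing to sync, so sync_fs stays NULL. */
const struct toy_super_ops memfs_ops = {
	.count_objects = toy_count,
};

/* Generic layer: one entry point for every file system. */
int toy_sync(struct toy_sb *sb)
{
	assert(sb && sb->s_op);
	if (sb->s_op->sync_fs)		/* skip methods the file system lacks */
		return sb->s_op->sync_fs(sb, 1);
	return 0;			/* nothing to do */
}
```

Calling toy_sync on a superblock that uses diskfs_ops sets its synced flag, while the same call on one that uses memfs_ops simply returns 0. The NULL check before the call is the same kind of check that vfs_read performs on f_op->read later in this post.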


Inode, Dentry and File Struct

These three structs are of great importance in VFS. The inode is how the kernel identifies a file, so each file has one and only one inode. Almost all basic operations take an inode as a parameter, so the kernel's job when accessing a file is to find its inode somehow. The dentry and the file struct exist for this purpose.

Dentry stands for directory entry, and each dentry is bound to the inode of its file. Not every file has a dentry, only those that have already been looked up. So if a file is never used, no dentry is created for it, but as soon as a file is used, its dentry is created in main memory (not on disk). The dentry cache keeps these structs in a hash table for fast lookup.

The file struct, however, is used on behalf of user processes. It stores the context of one open, such as the current position and the access mode, and through it the kernel reaches the dentry, then the inode, and finally the file itself. The file struct is needed because a process might open the same file multiple times, and each open needs its own context, which also means one file can have several file structs pointing at it. Each process keeps a table of its file structs.
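
The claim that each open gets its own context can be observed from user space. The sketch below is only an illustration under some assumptions: it uses the POSIX calls mkstemp, open, read, write and lseek, the function name two_opens_demo and the temporary file pattern are made up, and /tmp is assumed to be writable. It opens the same file twice, reads a different number of bytes through each descriptor, and reports the two positions.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Opens the same temporary file twice and reads a different number of
 * bytes through each descriptor. Stores the two resulting positions in
 * pos_a and pos_b. Returns 0 on success, -1 on error.
 */
int two_opens_demo(off_t *pos_a, off_t *pos_b)
{
	char path[] = "/tmp/vfs_demo_XXXXXX";	/* hypothetical temp file name */
	char buf[8];
	int fd_a, fd_b = -1;
	int rc = -1;

	assert(pos_a && pos_b);

	fd_a = mkstemp(path);			/* first open, creates the file */
	if (fd_a < 0)
		return -1;
	if (write(fd_a, "0123456789", 10) != 10)
		goto out;
	if (lseek(fd_a, 0, SEEK_SET) != 0)	/* rewind the first descriptor */
		goto out;

	fd_b = open(path, O_RDONLY);		/* second open of the same file */
	if (fd_b < 0)
		goto out;

	/* Each descriptor advances only its own position. */
	if (read(fd_a, buf, 3) != 3 || read(fd_b, buf, 7) != 7)
		goto out;

	*pos_a = lseek(fd_a, 0, SEEK_CUR);
	*pos_b = lseek(fd_b, 0, SEEK_CUR);
	rc = 0;
out:
	if (fd_b >= 0)
		close(fd_b);
	close(fd_a);
	unlink(path);
	return rc;
}
```

Both descriptors refer to the same inode, yet the two positions end up different, because the position lives in the per-open file struct and not in the inode.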

Open

Open is a system call that a user process can invoke directly, and VFS hides the fact that different file systems open files in different ways. For the user process, the returned fd (file descriptor) is the only thing it sees.

A series of functions is called in order to open a file, and the fd as well as the fd table play important roles in it.

[Figure: filegraph1.png]

To open a file, we must know its path, and the kernel uses that path to search for the file. Before that happens, the necessary structs, including the fd and nameidata, are created and initialized. The search is a path walk, during which each component of the pathname is examined: whether it is legal, whether it is dot-dot (..), and so on. Struct nameidata holds the state of the path walk, and the code is mostly in fs/namei.c.
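
The component-by-component nature of a path walk can be sketched in a few lines of user-space C. This is a toy under loud assumptions: toy_path_walk and TOY_MAX_DEPTH are hypothetical names, and the walk is purely lexical, so it knows nothing about dentries, symlinks, mount points or permissions, all of which the real walk in fs/namei.c has to handle. It only shows how ".", ".." and repeated slashes are treated one component at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_MAX_DEPTH = 32 };

/*
 * Lexically walks an absolute path, one component at a time, and writes the
 * normalized result to out. Returns 0 on success, -1 if the path is not
 * absolute, too deep, or does not fit in out.
 */
int toy_path_walk(const char *path, char *out, size_t outsz)
{
	const char *start[TOY_MAX_DEPTH];	/* start of each kept component */
	size_t len[TOY_MAX_DEPTH];		/* length of each kept component */
	int depth = 0;
	size_t used = 0;
	const char *p = path;

	assert(path && out);
	if (path[0] != '/' || outsz < 2)
		return -1;

	while (*p) {
		const char *comp;
		size_t n;

		while (*p == '/')		/* skip repeated slashes */
			p++;
		if (!*p)
			break;
		comp = p;
		while (*p && *p != '/')		/* find the end of the component */
			p++;
		n = (size_t)(p - comp);

		if (n == 1 && comp[0] == '.')
			continue;		/* ".": stay where we are */
		if (n == 2 && comp[0] == '.' && comp[1] == '.') {
			if (depth > 0)
				depth--;	/* "..": go up; root stays root */
			continue;
		}
		if (depth == TOY_MAX_DEPTH)
			return -1;
		start[depth] = comp;
		len[depth] = n;
		depth++;
	}

	if (depth == 0)
		out[used++] = '/';		/* everything cancelled out */
	for (int i = 0; i < depth; i++) {
		if (used + 1 + len[i] >= outsz)	/* keep room for the final NUL */
			return -1;
		out[used++] = '/';
		memcpy(out + used, start[i], len[i]);
		used += len[i];
	}
	out[used] = '\0';
	return 0;
}
```

For example, walking "/usr/./lib/../bin//ls" keeps usr, drops ".", keeps lib and then removes it again at "..", and finally keeps bin and ls.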

Once the dentry is found, a file struct is filled in with the dentry, the inode, and other information the user might need. In the end, the file struct is installed in a free slot of the process's fd table, and the index of that slot, the fd, is returned to user space.
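
The relationship between an fd and the structure behind it can be modelled with a small array. The sketch below is a toy, not the kernel's fd table: toy_fdtable, toy_file, toy_fd_install, toy_fd_lookup, toy_fd_close and the size TOY_NFILES are all hypothetical. It assumes a fixed-size table and mimics the POSIX rule that open returns the lowest-numbered unused descriptor.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_NFILES = 8 };

struct toy_file {
	long pos;			/* stands in for struct file */
};

struct toy_fdtable {
	struct toy_file *slot[TOY_NFILES];	/* NULL means "fd not in use" */
};

/* Installs f in the lowest free slot and returns its index, or -1 if full. */
int toy_fd_install(struct toy_fdtable *t, struct toy_file *f)
{
	assert(t && f);
	for (int fd = 0; fd < TOY_NFILES; fd++) {
		if (!t->slot[fd]) {
			t->slot[fd] = f;
			return fd;
		}
	}
	return -1;
}

/* Maps an fd back to its file struct; NULL for a bad fd (think EBADF). */
struct toy_file *toy_fd_lookup(const struct toy_fdtable *t, int fd)
{
	assert(t);
	if (fd < 0 || fd >= TOY_NFILES)
		return NULL;
	return t->slot[fd];
}

/* Frees the slot so that a later open can reuse the number. */
int toy_fd_close(struct toy_fdtable *t, int fd)
{
	assert(t);
	if (!toy_fd_lookup(t, fd))
		return -1;
	t->slot[fd] = NULL;
	return 0;
}
```

The user process only ever holds the small integer; everything else stays on the kernel side of the table, which is why a closed fd number can be handed out again by a later open.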

Read

Read is a system call that a user process can invoke directly to read data from a file, which of course must already be open. Opening the file gave the process an fd, and the process passes that fd to read. Once the system call is trapped, sys_read handles it by calling ksys_read.

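ssize_t ksys_read(unsigned int fd, char __user *buf, size_t count)
{
	/* Look up the file struct that belongs to this fd. */
	struct fd f = fdget_pos(fd);
	ssize_t ret = -EBADF;	/* returned if fd is not an open file */

	if (f.file) {
		/* ppos is NULL for stream-like files that have no position. */
		loff_t pos, *ppos = file_ppos(f.file);
		if (ppos) {
			/* Work on a local copy of the position. */
			pos = *ppos;
			ppos = &pos;
		}
		ret = vfs_read(f.file, buf, count, ppos);
		/* On success, store the advanced position back. */
		if (ret >= 0 && ppos)
			f.file->f_pos = pos;
		fdput_pos(f);
	}
	return ret;
}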

It is clear that fd is a parameter of ksys_read, and that user space does not have to know about inodes. The next function to look at is vfs_read.

	if (file->f_op->read)
		ret = file->f_op->read(file, buf, count, pos);
	else if (file->f_op->read_iter)
		ret = new_sync_read(file, buf, count, pos);
	else
		ret = -EINVAL;

In this function (some checks before this part are skipped), the read or read_iter method of the real file system (not the virtual one) is called if it exists; most file systems provide read_iter instead of read. Here we jump from VFS into a concrete file system such as ext4.

Finally, ext4, which is a Linux-style file system, first picks the read path: DIO (Direct IO), DAX (Direct Access), or the generic path. We did not talk about DIO. DAX accesses the storage directly without going through the page cache, and it is rarely used. The generic path checks the page cache to see whether the data is already cached; if so, the job finishes right away. If not, the kernel goes to disk to read the data and, if possible, keeps a copy in the cache. This part is just like what we learnt in OS class.
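
The hit-or-miss logic of the generic path is the classic cache pattern from OS class, and a toy version fits in a few lines. The sketch below is not how the kernel's page cache is implemented: toy_cache, toy_disk_read, toy_read_page and the tiny sizes are all hypothetical, and the cache is assumed to be direct-mapped (page index modulo slot count) purely to keep the code short. It counts how often the slow "disk" has to be touched.

```c
#include <assert.h>
#include <string.h>

enum { PAGE_SZ = 4, DISK_PAGES = 8, CACHE_SLOTS = 2 };

struct toy_cache {
	int valid[CACHE_SLOTS];		/* does the slot hold a page? */
	int index[CACHE_SLOTS];		/* which disk page is cached there */
	char data[CACHE_SLOTS][PAGE_SZ];
	int disk_reads;			/* how often we had to go to "disk" */
};

/* The slow backing store: page idx is filled with the letter 'a' + idx. */
static void toy_disk_read(int idx, char *buf)
{
	memset(buf, 'a' + idx, PAGE_SZ);
}

/* Reads one page through the cache. Returns 0 on success, -1 on a bad index. */
int toy_read_page(struct toy_cache *c, int idx, char *buf)
{
	int slot;

	assert(c && buf);
	if (idx < 0 || idx >= DISK_PAGES)
		return -1;

	slot = idx % CACHE_SLOTS;
	if (!(c->valid[slot] && c->index[slot] == idx)) {
		/* Miss: fetch from disk and keep a copy, evicting the old page. */
		toy_disk_read(idx, c->data[slot]);
		c->index[slot] = idx;
		c->valid[slot] = 1;
		c->disk_reads++;
	}
	/* Hit, or freshly filled slot: serve the data from memory. */
	memcpy(buf, c->data[slot], PAGE_SZ);
	return 0;
}
```

Reading the same page twice touches the disk only once, while alternating between two pages that map to the same slot causes a miss every time. That is one reason a real cache uses a smarter replacement policy than this toy.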

End of Line

I don't know why I decided to write these blogs in English in the first place; maybe I just wanted to give it a try. It caused a lot of trouble, since my English is so poor that it was sometimes hard to find the word I was thinking of. The sentences may also be hard for native speakers to follow, but hey, it is progress. Anyway, despite all these difficulties, I managed to finish all the blogs, and I can now explain the theory in my own sentences. Isn't that wonderful?

If any part of my blogs was hard for you, the reader in front of the screen, to understand, forgive me and skip that part; it is not your fault at all. I'll be very happy if these blogs helped you even a little.

See you next year in a country where divorce is illegal and punishable by death!
