The Virtual Filesystem (Linux)

Linux supports multiple filesystem types through an abstraction called the virtual filesystem (VFS).

The key idea behind the VFS is to keep in the kernel a wide range of information that can represent many different types of filesystems.

 

 

The Role of VFS

The Virtual Filesystem, also known as the Virtual Filesystem Switch (VFS), is a kernel software layer that provides a common interface to several kinds of filesystems.

It is an abstraction layer between application programs and the filesystem implementations.

 

Filesystems supported by VFS

- Disk-based filesystems (ext2, vfat, NTFS, JFS, etc.), including those on removable media such as USB flash drives
- Network filesystems (NFS, AFS, CIFS, NCP, etc.)
- Special filesystems (/proc, /sys, etc.)

 

 

 

The Common File Model

Everything is a file.

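This holds even for filesystems that do not treat directories as files. FAT (File Allocation Table) filesystems, for example, do not store directories as files, so Linux constructs the files corresponding to directories on the fly; they exist only as objects in kernel memory.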

 

The kernel does not hardcode file operations. Instead, it uses a pointer for each operation; the pointer is made to point to the proper function for the particular filesystem being accessed.
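The pointer-per-operation idea can be sketched in user space. This is a minimal model under stated assumptions, not kernel code: toy_file_ops, toy_file, toy_read() and the two pretend filesystems are hypothetical names invented for this illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

struct toy_file;

/* Hypothetical operations table: one pointer per operation. */
struct toy_file_ops {
    size_t (*read)(struct toy_file *f, char *buf, size_t len);
};

struct toy_file {
    const struct toy_file_ops *ops; /* chosen when the file is opened */
    const char *data;               /* backing data for this toy model */
};

/* Pretend filesystem A: returns the data unchanged. */
static size_t plain_read(struct toy_file *f, char *buf, size_t len)
{
    size_t n = strlen(f->data);
    if (n > len)
        n = len;
    memcpy(buf, f->data, n);
    return n;
}

/* Pretend filesystem B: returns the data converted to upper case. */
static size_t upper_read(struct toy_file *f, char *buf, size_t len)
{
    size_t n = strlen(f->data);
    if (n > len)
        n = len;
    for (size_t i = 0; i < n; i++)
        buf[i] = (char)toupper((unsigned char)f->data[i]);
    return n;
}

static const struct toy_file_ops plain_ops = { .read = plain_read };
static const struct toy_file_ops upper_ops = { .read = upper_read };

/* Generic layer: it never names a filesystem, it only follows the pointer. */
size_t toy_read(struct toy_file *f, char *buf, size_t len)
{
    return f->ops->read(f, buf, len);
}
```

A caller builds a struct toy_file with either plain_ops or upper_ops and always calls toy_read(); which function actually runs depends only on the table the file points to.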

 

The common file model consists of the following object types:

- The superblock object
- The inode object
- The file object
- The dentry object

 

 

 

VFS

- A common interface
- A disk cache to speed up access to files

 

Caches

Hardware cache: a fast static RAM

Memory cache: a software mechanism to bypass the Kernel Memory Allocator

Disk cache: a software mechanism that speeds up access to data by letting the kernel keep in RAM some information that is normally stored on disk

 

 

VFS Data Structures

Superblock objects

extern struct list_head super_blocks;
extern spinlock_t sb_lock;

struct super_block {
        ….
        void                    *s_fs_info;    /* Filesystem private info */
};

 

Each disk-based filesystem has to access and update its allocation bitmaps in order to allocate and release disk blocks. For efficiency, the VFS keeps a copy of this information in memory (pointed to by the s_fs_info field of the superblock). However, this creates a synchronization problem between the VFS superblock in memory and the actual superblock on disk: if the system goes down before the in-memory copy is written back, the result is the familiar corrupted filesystem. Linux minimizes this problem by periodically writing dirty data back to disk.

 

 

Superblock operations

- alloc_inode(sb)
- destroy_inode(inode)
- read_inode(inode): disk -> inode object
- dirty_inode(inode)
- write_inode(inode, flag): inode object -> disk
- put_inode(inode)
- drop_inode(inode)
- delete_inode(inode)
- put_super(super): release the superblock object passed as parameter (when unmounting a filesystem)
- write_super(super): update a filesystem superblock
- sync_fs(sb, wait): used by journaling filesystems
- statfs(super, buf)
- remount_fs(super, flags, data)
- clear_inode(inode)
- umount_begin(super)
- show_options(seq_file, vfsmount)

 

inode object

The inode object is unique to the file and remains the same as long as the file exists.

struct inode {
        struct hlist_node  i_hash;           /* linked into inode_hashtable */
        struct list_head   i_list;           /* backing dev IO list */
        struct list_head   i_sb_list;        /* a per-filesystem doubly linked list */
        struct list_head   i_dentry;
        ….
};

 

 

Each inode object appears in one of the following lists:

1.      The list of valid unused inodes

2.      The list of in-use inodes

3.      The list of dirty inodes

 

Inode operations

struct inode_operations {
        int (*create) (struct inode *,struct dentry *,int, struct nameidata *);
        struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameidata *);
        int (*link) (struct dentry *,struct inode *,struct dentry *);
        int (*unlink) (struct inode *,struct dentry *);
        int (*symlink) (struct inode *,struct dentry *,const char *);
        int (*mkdir) (struct inode *,struct dentry *,int);
        int (*rmdir) (struct inode *,struct dentry *);
        int (*mknod) (struct inode *,struct dentry *,int,dev_t);
        int (*rename) (struct inode *, struct dentry *,
                       struct inode *, struct dentry *);
        int (*readlink) (struct dentry *, char __user *,int);
        void * (*follow_link) (struct dentry *, struct nameidata *);
        void (*put_link) (struct dentry *, struct nameidata *, void *);
        void (*truncate) (struct inode *);
        int (*permission) (struct inode *, int);
        int (*check_acl)(struct inode *, int);
        int (*setattr) (struct dentry *, struct iattr *);
        int (*getattr) (struct vfsmount *mnt, struct dentry *, struct kstat *);
        int (*setxattr) (struct dentry *, const char *,const void *,size_t,int);
        ssize_t (*getxattr) (struct dentry *, const char *, void *, size_t);
        ssize_t (*listxattr) (struct dentry *, char *, size_t);
        int (*removexattr) (struct dentry *, const char *);
        void (*truncate_range)(struct inode *, loff_t, loff_t);
        long (*fallocate)(struct inode *inode, int mode, loff_t offset,
                          loff_t len);
        int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start,
                      u64 len);
};

 

 

File Objects

A file object describes how a process interacts with an open file.

 

struct file {
        ….
};

The most important field is f_pos, the file pointer, which holds the current file offset. It is stored in the file object rather than in the inode object because several processes may access the same file concurrently, each at its own offset.
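The per-file-object offset can be observed from user space. The sketch below assumes a POSIX system with a writable /tmp; the function name offset_of_second_open() and the file name template are made up for this example. It opens the same file twice, reads through the first descriptor, and reports the offset of the second.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Opens one temporary file twice and reads 4 bytes through the first
 * descriptor. Returns the offset of the second descriptor afterwards,
 * or -1 on error. Each open() creates its own file object, so the
 * second offset is expected to be unaffected by the read.
 */
long offset_of_second_open(void)
{
    char path[] = "/tmp/vfs_demo_XXXXXX"; /* hypothetical name template */
    char buf[4];
    long result = -1;

    int fd1 = mkstemp(path);              /* first file object */
    if (fd1 < 0)
        return -1;
    int fd2 = open(path, O_RDONLY);       /* second, independent file object */

    if (fd2 >= 0 &&
        write(fd1, "abcdefgh", 8) == 8 &&
        lseek(fd1, 0, SEEK_SET) == 0 &&
        read(fd1, buf, sizeof buf) == 4)  /* moves the offset of fd1 only */
        result = (long)lseek(fd2, 0, SEEK_CUR);

    if (fd2 >= 0)
        close(fd2);
    close(fd1);
    unlink(path);
    return result;
}
```

Both descriptors refer to the same inode, yet the function returns 0: the read advanced the offset of the first file object and left the second untouched.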

 

The f_count field is a reference counter: it counts the number of processes that are using the file object. Lightweight processes created with the CLONE_FILES flag share the open file table and therefore use the same file objects. The reference counter is also increased when a dup() call is made.

Consequently, multithreaded programs that share file objects must synchronize their accesses.

 

Dentry Objects

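A dentry object represents a directory entry, that is, one component of a pathname, and links it to the corresponding inode. Dentry objects are built in memory by the VFS and have no corresponding image on disk.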

 

Four possible states of a dentry object:

1.      Free

2.      Unused

3.      In use

4.      Negative (The inode associated with the dentry object does not exist.)

 

Dentry cache: keeps recently used dentries in memory, so that repeated lookups of the same pathname do not have to read the directory entries from disk again

 

Files associated with a process

struct task_struct {
        ….
        /* filesystem information */
        struct fs_struct *fs;
        /* open file information */
        struct files_struct *files;
        ….
};

struct fs_struct {
        int users;
        rwlock_t lock;
        int umask;
        int in_exec;
        struct path root, pwd;
};

struct files_struct {
        struct fdtable fdtab;
        ….
        struct file * fd_array[NR_OPEN_DEFAULT];
};

struct fdtable {
        unsigned int max_fds;
        struct file ** fd;      /* current fd array */
        fd_set *close_on_exec;
        fd_set *open_fds;
        struct rcu_head rcu;
        struct fdtable *next;
};

 

For every file with an entry in the fd array, the array index is the file descriptor.

Note that two elements of the array may point to the same file object, for example after a dup() call.
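That two descriptors can name one file object is visible from user space through the shared offset. The following sketch assumes a POSIX system with a writable /tmp; offset_seen_through_dup() and the file name template are hypothetical names used only here.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

/*
 * Writes 5 bytes through one descriptor and returns the offset seen
 * through a dup()ed descriptor, or -1 on error. Both descriptors are
 * different indexes into the fd array but refer to one file object.
 */
long offset_seen_through_dup(void)
{
    char path[] = "/tmp/vfs_dup_XXXXXX"; /* hypothetical name template */
    long result = -1;

    int fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;
    unlink(path);                        /* file lives until the last close */

    int fd2 = dup(fd1);                  /* new index, same file object */
    if (fd2 >= 0 && write(fd1, "hello", 5) == 5)
        result = (long)lseek(fd2, 0, SEEK_CUR);

    if (fd2 >= 0)
        close(fd2);
    close(fd1);
    return result;
}
```

The function returns 5: the write through fd1 moved the single shared offset, and fd2 sees the change because it points to the same file object.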

 

 

 

Filesystem Types

Special filesystems

 

Special filesystems are not bound to physical block devices. However, the kernel assigns to each mounted special filesystem a fictitious block device with 0 as the major number.

This helps the kernel to handle special filesystems and the regular ones in a uniform way.

 

Filesystem Type Registration

The VFS must keep track of all filesystem types whose code is currently included in the kernel. It does this by performing filesystem type registration.

 

Each registered filesystem is represented by a file_system_type object:

 

struct file_system_type {
        const char *name;
        int fs_flags;
        int (*get_sb) (struct file_system_type *, int,
                       const char *, void *, struct vfsmount *);
        void (*kill_sb) (struct super_block *);
        struct module *owner;
        struct file_system_type * next;
        struct list_head fs_supers;
};
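The next field suggests that the registered types form a singly linked list searched by name. The user-space sketch below models that idea only; it is not the kernel's registration code, and toy_fs_type, toy_register(), toy_unregister() and toy_lookup() are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, stripped-down stand-in for file_system_type. */
struct toy_fs_type {
    const char *name;
    struct toy_fs_type *next;
};

static struct toy_fs_type *toy_fs_list; /* head of the registered types */

/*
 * Returns the address of the link that points to the type called `name`,
 * or the address of the terminating NULL link if no such type exists.
 */
static struct toy_fs_type **toy_find_link(const char *name)
{
    struct toy_fs_type **p;

    for (p = &toy_fs_list; *p; p = &(*p)->next)
        if (strcmp((*p)->name, name) == 0)
            break;
    return p;
}

/* Appends `fs` to the list; fails if the name is already registered. */
int toy_register(struct toy_fs_type *fs)
{
    struct toy_fs_type **p = toy_find_link(fs->name);

    if (*p)
        return -1;
    fs->next = NULL;
    *p = fs;
    return 0;
}

/* Unlinks `fs` from the list; fails if it is not registered. */
int toy_unregister(struct toy_fs_type *fs)
{
    struct toy_fs_type **p = toy_find_link(fs->name);

    if (*p != fs)
        return -1;
    *p = fs->next;
    fs->next = NULL;
    return 0;
}

/* Returns the registered type called `name`, or NULL. */
struct toy_fs_type *toy_lookup(const char *name)
{
    return *toy_find_link(name);
}
```

Working with a pointer to the link rather than a pointer to the node lets registration, removal and lookup share one search loop and avoids a special case for the list head.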

 

 

Filesystem Handling

Three notions of root must be distinguished:
- The root directory of a filesystem
- The root directory of a process
- The system’s root filesystem

Usually, the root directory of a process is the same as the root directory of the system’s root filesystem.

The /proc virtual filesystem is a child of the system’s root filesystem, and thus a subtree of the tree rooted at the system’s root filesystem.

 

 

Namespaces

In Linux 2.6, every process may have its own tree of mounted filesystems, the so-called namespace of the process.

Most processes share the same namespace, which is the tree rooted at the system’s root filesystem. However, a process gets a new namespace if it is created by a clone() call with the CLONE_NEWNS flag set.

 

Filesystem Mounting

On Linux, the same filesystem can be mounted multiple times:

root@localhost :/home/James# mount /dev/sdb2 ./mnt1
root@localhost :/home/James# ls mnt1
lost+found
root@localhost :/home/James# mount /dev/sdb2 ./mnt2
root@localhost :/home/James# ls mnt2
lost+found
root@localhost :/home/James# touch ./mnt1/test
root@localhost :/home/James# ls mnt1
lost+found  test
root@localhost :/home/James# ls mnt2
lost+found  test

 

On Linux, mounting a filesystem on a mount point hides the filesystem previously mounted there, like a stack:

root@localhost :/home/James# mount /dev/sdb2 ./mnt1/
root@localhost :/home/James# ls mnt1
lost+found  test
root@localhost :/home/James# mount /dev/sdb1 ./mnt1/
root@localhost :/home/James# ls mnt1/
00001.vcf  00004.vcf                                        BaiduMap   download              LGCameraPro_6.2android_zol.apk  recording19190.3gpp  u-center.apk
00002.vcf  126681_863ed540-6d14-47c1-b5ee-0e4290ed1a3e.apk  bluetooth  GPS???(GPS Test).apk  LOST.DIR                        recording71086.3gpp  Wildlife.mp4
00003.vcf  Android                                          DCIM       kugou                 moboplayer_1.apk                songs                Wildlife.wmv
root@localhost :/home/James# umount mnt1/
root@localhost :/home/James# ls mnt1/
lost+found  test
root@localhost :/home/James# umount mnt1/
root@localhost :/home/James# ls mnt1/
root@localhost :/home/James#

 

All information related to a filesystem mount is stored in a mounted filesystem descriptor of type vfsmount.

 

 

Mounting a Generic Filesystem

mount() -> sys_mount() -> do_mount() -> do_kern_mount()

 

 

Mounting the Root Filesystem

1.      The kernel mounts the special rootfs filesystem, which simply provides an empty directory that serves as the initial mount point.

2.      The kernel mounts the real root filesystem over the empty directory.

 

The rootfs filesystem allows the kernel to easily change the real root filesystem.

 

 

 

Pathname Lookup

Pathname lookup: how to derive an inode from the corresponding file pathname.

 

Besides parsing the pathname, the lookup procedure must take several Unix and VFS features into account:

1.      The access rights of each directory traversed must be checked.

2.      Symbolic links must be expanded.

3.      Circular references introduced by symbolic links must be detected, so that the lookup breaks out of an otherwise infinite loop.

4.      A filename may be the mount point of a mounted filesystem. This situation must be detected, and the lookup operation must continue into the new filesystem.

5.      Pathname lookup must be performed inside the namespace of the calling process.
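The component-by-component nature of the lookup can be sketched with a toy in-memory tree. The code below is an illustration under heavy simplification: it handles only plain names, "." and "..", and deliberately ignores access rights, symbolic links, mount points and namespaces. toy_node, toy_lookup_one() and toy_walk() are hypothetical names, not kernel interfaces.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CHILDREN 4

/* Hypothetical directory node; the root is its own parent. */
struct toy_node {
    const char *name;
    struct toy_node *parent;
    struct toy_node *child[TOY_MAX_CHILDREN];
};

/* Resolves one pathname component of length `len` inside `dir`. */
static struct toy_node *toy_lookup_one(struct toy_node *dir,
                                       const char *name, size_t len)
{
    if (len == 1 && name[0] == '.')
        return dir;
    if (len == 2 && name[0] == '.' && name[1] == '.')
        return dir->parent;
    for (int i = 0; i < TOY_MAX_CHILDREN; i++) {
        struct toy_node *c = dir->child[i];

        if (c && strlen(c->name) == len && memcmp(c->name, name, len) == 0)
            return c;
    }
    return NULL; /* no such entry */
}

/*
 * Walks `path` one component at a time. Absolute paths start at `root`,
 * relative paths at `cwd`. Returns the final node, or NULL if some
 * component does not exist.
 */
struct toy_node *toy_walk(struct toy_node *root, struct toy_node *cwd,
                          const char *path)
{
    struct toy_node *cur = (*path == '/') ? root : cwd;

    while (cur && *path) {
        while (*path == '/')            /* skip separators */
            path++;
        size_t len = strcspn(path, "/");
        if (len == 0)                   /* trailing slash or end of path */
            break;
        cur = toy_lookup_one(cur, path, len);
        path += len;
    }
    return cur;
}
```

Starting from the root or from the current directory depending on the first character corresponds to the two starting points of a real lookup; each of the five points listed above would add a check inside the loop.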

 

int path_lookup(const char *name, unsigned int flags,
                struct nameidata *nd);

enum { MAX_NESTED_LINKS = 8 };

struct nameidata {
        struct path     path;
        struct qstr     last;
        struct path     root;
        unsigned int    flags;
        int             last_type;
        unsigned        depth;
        char *saved_names[MAX_NESTED_LINKS + 1];

        /* Intent data */
        union {
                struct open_intent open;
        } intent;
};

struct path {
        struct vfsmount *mnt;
        struct dentry *dentry;
};

 

The core of the pathname lookup operation is link_path_walk(). The cases to consider are:

- Standard pathname lookup
- Lookup of a directory
- Lookup with symbolic links

 

 

Implementation of VFS System Calls

VFS system calls are implemented by manipulating the VFS data structures described above.

Examples: open(), read(), write(), close()

Each has a corresponding system service routine named sys_xxx(), for example sys_open() for open().

 

File Locking

When several processes access the same file concurrently, synchronization problems arise.

The POSIX standard requires a file-locking mechanism based on the fcntl() system call.

For more details about how to use file locks, refer to “Beginning Linux Programming”.

Locks of this kind are known as advisory locks, because they work only if the other processes cooperate by checking for the existence of a lock before accessing the file. In this respect they are similar to semaphores.
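A cooperating process checks for a lock simply by trying to acquire one itself. The sketch below assumes a POSIX system with a writable /tmp and working fork(); try_write_lock() and child_lock_is_refused() are hypothetical helper names. The parent locks a temporary file with fcntl(), and a child process, which does not inherit the lock, then attempts the same lock without blocking.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Tries to write-lock the whole file without blocking; 0 on success. */
static int try_write_lock(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                /* 0 means "up to the end of the file" */
    return fcntl(fd, F_SETLK, &fl);
}

/*
 * Returns 1 if the child's lock attempt is refused while the parent
 * holds the lock, 0 if it succeeds, and -1 on error.
 */
int child_lock_is_refused(void)
{
    char path[] = "/tmp/vfs_lock_XXXXXX"; /* hypothetical name template */
    int result = -1;
    int status;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    if (try_write_lock(fd) != 0) {        /* parent takes the lock */
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0)                         /* child: same file, no lock */
        _exit(try_write_lock(fd) == 0 ? 0 : 1);

    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);
    close(fd);                            /* also releases the lock */
    return result;
}
```

The child's attempt is refused, so the function returns 1. Nothing here would stop the child from calling write() on the descriptor anyway, which is exactly what makes the lock advisory.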

 

POSIX: advisory locks

System V: introduced mandatory locks

Linux: supports both advisory and mandatory locks

 

 

System calls: flock(), fcntl()

 

File-locking data structure in the kernel

struct file_lock {
        ….
};

 

 

 

 

 

 

Reposted from: https://my.oschina.net/u/158589/blog/60857
