[INT's Kernel Notes] A Brief Look at File Systems

Rachelint

Category: Linux Learning

Article link: https://blog.csdn.net/rachelint/article/details/106026736

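1. The file system on disk

Idealized layout

| Boot block | Superblock | Free-space management | Inode table | Logical blocks (root directory + files and directories) |

This layout meets the basic needs of a file system: storing and reading data, tracking free space, and keeping file attributes (basic information plus permissions).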

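Division of responsibilities:

Boot block: related to system startup; not discussed here.
Superblock: stores the attributes of the whole file system, much as a file's attributes record its size, timestamps, and so on.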

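Excerpt:

The superblock stores the file system's type and size, the total number of inodes, the space currently in use, the number of free blocks, the block size, the time of the most recent file system check, and similar information.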

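Free-space management: this works much like memory management. Suppose the region | s0 | s1 | s2 | s3 | s4 | s5 | s6 | s7 | s8 | s9 | is entirely in use, and at some point s1 through s5 are released. The fact that s1 through s5 are free again has to be recorded somewhere, otherwise that space is simply lost. A disk is no different.
Inode table: comparable to the table that maps a process's virtual memory to physical memory. An inode records the mapping from a file to the logical blocks on disk that actually hold it.
Logical blocks: the data itself, comparable to physical memory.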
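
The s0..s9 example can be sketched with a bitmap, which is also what ext2 uses for its data blocks. This is only a minimal illustration: the names `block_bitmap`, `set_range`, `alloc_block`, and `count_free` are made up for this note and are not kernel APIs.

```c
#include <assert.h>

#define NBLOCKS 10                 /* s0..s9 from the example above */

/* One bit per block: 1 = in use, 0 = free (hypothetical toy state). */
static unsigned int block_bitmap;

/* Mark blocks first..last (inclusive) as used (used != 0) or free. */
static void set_range(int first, int last, int used)
{
    assert(first >= 0 && last < NBLOCKS && first <= last);
    for (int b = first; b <= last; b++) {
        if (used)
            block_bitmap |= 1u << b;
        else
            block_bitmap &= ~(1u << b);
    }
}

static int is_used(int b)
{
    return (int)((block_bitmap >> b) & 1u);
}

/* Number of blocks whose bit is clear. */
static int count_free(void)
{
    int n = 0;
    for (int b = 0; b < NBLOCKS; b++)
        n += !is_used(b);
    return n;
}

/* Hand out the lowest-numbered free block, or -1 if none is left. */
static int alloc_block(void)
{
    for (int b = 0; b < NBLOCKS; b++) {
        if (!is_used(b)) {
            block_bitmap |= 1u << b;
            return b;
        }
    }
    return -1;
}
```

Releasing s1 through s5 is `set_range(1, 5, 0)`; because the freed bits are recorded, a later `alloc_block()` can hand s1 out again instead of the space being lost.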

Actual layout (using ext2 as an example)

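Differences:

Block groups and group descriptors: these two concepts come as a pair. They exist because the data bitmap used for free-space management has a fixed size, so one bitmap can only cover a limited number of blocks. It is like a school: one teacher cannot manage every student, so the students are split into classes, and a block group plays the role of a class. Set the block-group machinery aside and the layout is essentially the idealized one. In practice only the superblock and group descriptors stored in block group 0 are used during normal operation; the other n-1 copies are backups.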

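Excerpt:

A group descriptor holds summary information about one block group: the block number of the data bitmap, the block number of the inode bitmap, the starting block of the inode table, the counts of free data blocks and free inodes, and so on. Each group descriptor takes 32 bytes, so in an example with 2048 block groups, 2048 * 32 B = 64 KB is needed to store all group descriptors, which is 16 blocks.

In ext2, every block group carries a copy of the superblock and the group descriptors, but the kernel only uses the copy in block group 0. The copies in the other block groups are backups of that data, used to check and recover the disk state after a failure.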

Inode bitmap: analogous to the data bitmap, it tracks which inodes are in use.

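Size of each region (assuming a 256 GB disk):

Once the disk size is fixed, the sizes of the superblock and all the other structures are fixed as well.
Superblock: 1 block (typically 4 KB).
Data bitmap: 1 block, i.e. 32K bits, so each block group covers 32K * 4 KB = 128 MB.
Group descriptors: each descriptor is 32 B. There are 256 GB / 128 MB = 2K block groups, so all descriptors together take 32 B * 2K = 64 KB = 16 blocks.
Inode table: each inode is 128 B. One inode is reserved for every 4 blocks (16 KB), and each block group is 128 MB, so each group has 128 MB / 16 KB = 8K inodes. The inode table of each group therefore takes 128 B * 8K = 1 MB = 256 blocks.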
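
The arithmetic above can be checked with a small helper. This is a sketch under the same assumptions as the text (one-block bitmaps, 32-byte descriptors, 128-byte inodes, one inode per 16 KB); `struct ext2_layout` and `compute_layout` are names invented for this note, not part of ext2 or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Derived sizes for one hypothetical ext2-style layout. */
struct ext2_layout {
    uint64_t blocks_per_group;   /* bits in a one-block data bitmap */
    uint64_t group_bytes;        /* bytes covered by one block group */
    uint64_t group_count;        /* number of block groups on the disk */
    uint64_t gdt_blocks;         /* blocks holding all group descriptors */
    uint64_t inodes_per_group;
    uint64_t inode_table_blocks; /* inode table size per group, in blocks */
};

static struct ext2_layout compute_layout(uint64_t disk_bytes,
                                         uint64_t block_bytes,
                                         uint64_t desc_bytes,
                                         uint64_t inode_bytes,
                                         uint64_t bytes_per_inode)
{
    struct ext2_layout l;

    assert(block_bytes > 0 && bytes_per_inode > 0);

    /* A one-block bitmap has 8 bits per byte, one bit per block. */
    l.blocks_per_group = block_bytes * 8;
    l.group_bytes = l.blocks_per_group * block_bytes;
    l.group_count = disk_bytes / l.group_bytes;

    /* Round up to whole blocks. */
    l.gdt_blocks = (l.group_count * desc_bytes + block_bytes - 1) / block_bytes;
    l.inodes_per_group = l.group_bytes / bytes_per_inode;
    l.inode_table_blocks =
        (l.inodes_per_group * inode_bytes + block_bytes - 1) / block_bytes;
    return l;
}
```

Calling `compute_layout(256ULL << 30, 4096, 32, 128, 16384)` reproduces the figures in the list: 2K groups of 128 MB, 16 blocks of descriptors, and 8K inodes in a 256-block inode table per group.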

Journaling

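2. The file system in memory: VFS

In-memory file system structures: super_block, inode, dentry

Mount-related structure: mnt

Structures for interaction between processes and the file system:

In-memory file system structures

superblock

Purpose: the in-memory proxy for a disk partition. The real file system lives on disk, where files are actually organized as inodes plus logical blocks.

Main fields: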

struct super_block
{
    // Links all superblocks into one global list; has no bearing on what a superblock itself means
    struct list_head        s_list;           /* list of all superblocks */
    // Basic information
    dev_t                   s_dev;            /* identifier */
    unsigned long           s_blocksize;      /* block size in bytes */
    unsigned char           s_blocksize_bits; /* block size in bits */
    unsigned char           s_dirt;           /* dirty flag */
    unsigned long long      s_maxbytes;       /* max file size */
    struct file_system_type *s_type;          /* filesystem type */
    // Proxy for the on-disk file system; converts between persistent and in-memory state
    struct super_operations *s_op;            /* superblock methods */
    /*  Mount-related. The idea behind mount is simple:
        cover one node of the in-memory file system tree and use it as the root of the new file system.
        In other words, tree B is expanded with a node of tree A acting as B's root
    */
    unsigned long           s_flags;          /* mount flags */
    unsigned long           s_magic;          /* filesystem's magic number */
    struct rw_semaphore     s_umount;         /* unmount semaphore */

    /*  The inode list and the root of the dentry tree for this file system
    */
    struct list_head        s_inodes;         /* all inodes */
    struct dentry           *s_root;          /* directory mount point */

    /*  Think of this as delayed deletion, similar to delayed write-back.
        When the reference count of a dentry or inode drops to 0 it is first put on the LRU and only freed for good some time later.
        Until then it can still be found in the inode list and dentry tree, and in the inode and dentry hash tables
    */
    struct list_lru         s_dentry_lru ____cacheline_aligned_in_smp;
    struct list_lru         s_inode_lru ____cacheline_aligned_in_smp;
    // List of dirty inodes, written back at a suitable time
    struct list_head        s_dirty;          /* list of dirty inodes */
};

Main methods:

// Proxy for create/delete/read/update operations on the on-disk file system
struct super_operations {
    struct inode *(*alloc_inode)(struct super_block *sb);
    void (*destroy_inode)(struct inode *);
    void (*dirty_inode) (struct inode *);
    int (*write_inode) (struct inode *, int);
    void (*drop_inode) (struct inode *);
    void (*delete_inode) (struct inode *);
};

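The in-memory super_block carries somewhat more responsibility than the on-disk superblock, mainly through s_op.

s_op mostly covers operations on the inodes of one particular file system (partition): creating them in memory, deleting, reading, and writing them.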
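
How generic code reaches a specific file system through s_op can be sketched in user space with a table of function pointers. Everything here is hypothetical and heavily simplified: `struct toy_sb`, `struct toy_inode`, the `demo_*` functions, and the `next_ino` / `live_inodes` fields do not exist in the kernel.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_sb;

struct toy_inode {
    unsigned long  i_ino;
    struct toy_sb *i_sb;
};

/* Cut-down stand-in for super_operations: only two methods. */
struct toy_super_ops {
    struct toy_inode *(*alloc_inode)(struct toy_sb *sb);
    void (*destroy_inode)(struct toy_inode *inode);
};

struct toy_sb {
    const struct toy_super_ops *s_op;
    unsigned long next_ino;     /* hypothetical: next inode number to hand out */
    int           live_inodes;  /* hypothetical: inodes currently allocated */
};

/* What one concrete "file system" might plug into the table. */
static struct toy_inode *demo_alloc_inode(struct toy_sb *sb)
{
    struct toy_inode *inode = malloc(sizeof(*inode));
    if (!inode)
        return NULL;
    inode->i_ino = sb->next_ino++;
    inode->i_sb = sb;
    sb->live_inodes++;
    return inode;
}

static void demo_destroy_inode(struct toy_inode *inode)
{
    inode->i_sb->live_inodes--;
    free(inode);
}

static const struct toy_super_ops demo_ops = {
    .alloc_inode   = demo_alloc_inode,
    .destroy_inode = demo_destroy_inode,
};

/* Generic layer: knows nothing about the file system except s_op. */
static struct toy_inode *toy_new_inode(struct toy_sb *sb)
{
    assert(sb->s_op && sb->s_op->alloc_inode);
    return sb->s_op->alloc_inode(sb);
}

static void toy_free_inode(struct toy_inode *inode)
{
    inode->i_sb->s_op->destroy_inode(inode);
}
```

The generic functions never name the concrete implementation, so a different file system would only need to supply a different ops table.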

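inode

Purpose: the in-memory form of an inode. The inode is the file's actual entity, while a dentry is the file's logical handle.

By analogy, an inode is like an object, and a dentry is like a pointer to that object.

Main fields: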

struct inode
{
    // Node in the global inode hash table, which indexes inodes by hash to avoid a linear search
    struct hlist_node       i_hash;              /* hash list */
    // List linkage among the three in-memory structures (super_block, inode, dentry)
    struct list_head        i_list;              /* list of inodes */
    struct list_head        i_sb_list;           /* list of superblocks */
    // One inode can have several dentries
    struct list_head        i_dentry;            /* list of dentries */
    /*  i_ino is the inode's number.
        Both on disk and in the kernel, an inode is identified by its ino.
        At the pure file system level it is mainly used to read the inode from disk
    */
    unsigned long           i_ino;               /* inode number */
    // In-memory reference count (users currently holding this inode)
    atomic_t                i_count;             /* reference counter */
    // Hard link count: directory entries that name this inode (not a count of in-memory dentries)
    unsigned int            i_nlink;             /* number of hard links */
    // Owner
    uid_t                   i_uid;               /* user id of owner */
    gid_t                   i_gid;               /* group id of owner */
    // Basic file attributes + permissions
    dev_t                   i_rdev;              /* real device node */
    u64                     i_version;           /* versioning number */
    loff_t                  i_size;              /* file size in bytes */
    seqcount_t              i_size_seqcount;     /* serializer for i_size */
    struct timespec         i_atime;             /* last access time */
    struct timespec         i_mtime;             /* last modify time */
    struct timespec         i_ctime;             /* last change time */
    unsigned int            i_blkbits;           /* block size in bits */
    blkcnt_t                i_blocks;            /* file size in blocks */
    unsigned short          i_bytes;             /* bytes consumed */
    umode_t                 i_mode;              /* access permissions */
    // Method tables; i_op is the main one
    struct inode_operations *i_op;               /* inode ops table */
    struct file_operations  *i_fop;              /* default inode ops */
    struct super_block      *i_sb;               /* associated superblock */
    // inode state and dirty flags (i_state is in-memory state; i_flags holds file system flags)
    unsigned long           i_state;             /* state flags */
    unsigned long           dirtied_when;        /* first dirtying time */
    unsigned int            i_flags;             /* filesystem flags */

    // When the reference count drops to 0 the inode is put here, where it is either reused or, once old enough, freed for good
    struct list_head        i_lru;
};

Main methods

// Look familiar? File commands such as mkdir and link all end up calling inode methods
struct inode_operations
{

    int (*create) (struct inode *,struct dentry *,int, struct nameidata *);
    struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameidata *);
    int (*link) (struct dentry *,struct inode *,struct dentry *);
    int (*unlink) (struct inode *,struct dentry *);
    int (*symlink) (struct inode *,struct dentry *,const char *);
    int (*mkdir) (struct inode *,struct dentry *,int);
    int (*rmdir) (struct inode *,struct dentry *);
    int (*mknod) (struct inode *,struct dentry *,int,dev_t);
    int (*rename) (struct inode *, struct dentry *,
                   struct inode *, struct dentry *);
    int (*readlink) (struct dentry *, char __user *,int);
    void * (*follow_link) (struct dentry *, struct nameidata *);
    void (*put_link) (struct dentry *, struct nameidata *, void *);
    /* ...... */
};
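
The relationship between names and inodes behind link and unlink can be sketched in user space. This is a toy model, not kernel code: `struct mini_inode`, `struct mini_dir`, and the `mini_*` functions are invented for this note, and the `freed` flag merely stands in for releasing the file's blocks.

```c
#include <assert.h>
#include <string.h>

#define MAX_ENTRIES 8

struct mini_inode {
    unsigned long ino;
    unsigned int  nlink;   /* how many names point at this inode */
    int           freed;   /* set once the last name is gone */
};

/* One directory entry: a name bound to an inode. */
struct mini_dirent {
    char               name[16];
    struct mini_inode *inode;   /* NULL means the slot is empty */
};

struct mini_dir {
    struct mini_dirent entries[MAX_ENTRIES];
};

static struct mini_dirent *mini_find(struct mini_dir *dir, const char *name)
{
    for (int i = 0; i < MAX_ENTRIES; i++)
        if (dir->entries[i].inode && strcmp(dir->entries[i].name, name) == 0)
            return &dir->entries[i];
    return NULL;
}

/* Add another name for an existing inode. Returns 0, or -1 on failure. */
static int mini_link(struct mini_dir *dir, const char *name,
                     struct mini_inode *inode)
{
    if (strlen(name) >= sizeof(dir->entries[0].name) || mini_find(dir, name))
        return -1;
    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (!dir->entries[i].inode) {
            strcpy(dir->entries[i].name, name);
            dir->entries[i].inode = inode;
            inode->nlink++;
            return 0;
        }
    }
    return -1;   /* directory full */
}

/* Remove one name; the inode goes away only with its last name. */
static int mini_unlink(struct mini_dir *dir, const char *name)
{
    struct mini_dirent *de = mini_find(dir, name);
    if (!de)
        return -1;
    assert(de->inode->nlink > 0);
    if (--de->inode->nlink == 0)
        de->inode->freed = 1;
    de->inode = NULL;
    return 0;
}
```

Two names created with `mini_link` resolve to the same `mini_inode`, and removing one of them leaves the file intact until the link count reaches zero.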

dentry

Purpose: dentries are organized into the file tree in memory; their main job is to speed up lookups.

Main fields

struct dentry
{
    atomic_t                 d_count;      /* usage count */
    unsigned int             d_flags;      /* dentry flags */
    spinlock_t               d_lock;       /* per-dentry lock */

    // Nonzero when a file system is mounted on this directory; the root of the mounted file system then covers it
    int                      d_mounted;    /* is this a mount point? */
    // The inode this dentry refers to
    struct inode             *d_inode;     /* associated inode */
    // Node in the global dentry hash table
    struct hlist_node        d_hash;       /* list of hash table entries */

    // Used to form the file tree
    struct dentry            *d_parent;    /* dentry object of parent */
    union
    {
        struct list_head     d_child;      /* list of dentries within */
        struct rcu_head      d_rcu;        /* RCU locking */
    } d_u;
    struct list_head         d_subdirs;    /* subdirectories */

    // Mainly the file name + its hash value
    struct qstr              d_name;       /* dentry name */
    // A sufficiently short file name is stored inline here
    unsigned char            d_iname[DNAME_INLINE_LEN_MIN]; /* short name */

    struct list_head         d_lru;        /* unused list */

    struct list_head         d_alias;      /* list of alias inodes */
    unsigned long            d_time;       /* revalidate time */
    struct dentry_operations *d_op;        /* dentry operations table */
    struct super_block       *d_sb;        /* superblock of file */
    void                     *d_fsdata;    /* filesystem-specific data */

};

Main methods
```
/*  The main takeaway: dentry operations are mostly about forming and maintaining the tree;
    almost nothing here touches the file's actual contents
*/
struct dentry_operations
{
    int (*d_revalidate) (struct dentry *, struct nameidata *);
    // Computes the hash value
    int (*d_hash) (struct dentry *, struct qstr *);
    int (*d_compare) (struct dentry *, struct qstr *, struct qstr *);
    int (*d_delete) (struct dentry *);
    void (*d_release) (struct dentry *);
    // Called when a dentry loses its inode, so the file system can release that inode its own way
    void (*d_iput) (struct dentry *, struct inode *);
    char *(*d_dname) (struct dentry *, char *, int);
};
```
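
As an illustration of how the d_parent links form a tree, the sketch below rebuilds an absolute path by walking from a node up to the root. It is a user-space toy: `struct tree_dentry` and `tree_dentry_path` are hypothetical names, and the root is assumed to be its own parent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct tree_dentry {
    const char         *name;
    struct tree_dentry *parent;   /* the root points to itself */
};

/*
 * Write the absolute path of d into buf (len bytes).
 * Returns 0 on success, -1 if the path does not fit.
 */
static int tree_dentry_path(const struct tree_dentry *d, char *buf, size_t len)
{
    char *end = buf + len;
    char *p = end;

    assert(d != NULL && buf != NULL);
    if (len < 2)
        return -1;

    *--p = '\0';
    if (d->parent == d)
        *--p = '/';               /* the root on its own is just "/" */

    /* Fill the buffer from the back: leaf name first, root last. */
    while (d->parent != d) {
        size_t n = strlen(d->name);
        if ((size_t)(p - buf) < n + 1)
            return -1;
        p -= n;
        memcpy(p, d->name, n);
        *--p = '/';
        d = d->parent;
    }

    memmove(buf, p, (size_t)(end - p));
    return 0;
}
```

Building the string backwards avoids a first pass to measure the depth, since each step only needs the current node's name and its parent.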
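3. dentry and inode caching
- The superblock, which stands for one particular file system, manages that file system's own inodes and dentries;
- The main structures are the inode list and the dentry tree;
- Inodes and dentries whose reference count is 0 (unused) are tracked by inode_lru and dentry_lru,
  
  which can be found both in the superblock and in the objects themselves.
  
  The purpose is delayed deletion: an unused object is first recorded on the LRU and only freed for good if it stays unused for a while.
  
  Until it is freed, it is still present in the inode list and dentry tree, and in the inode and dentry hash tables;
- Dirty inodes are tracked on the dirty list, which enables delayed write-back;
- inode_hashtable and dentry_hashtable, which avoid linear searches, are global.
  
  dentry_hashtable is indexed by file name + parent dentry;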
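
The delayed-deletion idea can be sketched as a reference count plus an LRU list. This is a simplified user-space model, not the kernel's list_lru: `struct cached_obj`, `struct obj_lru`, and the functions below are hypothetical, and the `freed` flag stands in for real deallocation.

```c
#include <assert.h>
#include <stddef.h>

struct cached_obj {
    int refcount;
    int on_lru;
    int freed;                       /* stands in for real deallocation */
    struct cached_obj *prev, *next;
};

struct obj_lru {
    struct cached_obj head;          /* sentinel; head.next is the oldest */
    int count;
};

static void lru_init(struct obj_lru *lru)
{
    lru->head.prev = lru->head.next = &lru->head;
    lru->count = 0;
}

static void lru_del(struct obj_lru *lru, struct cached_obj *o)
{
    o->prev->next = o->next;
    o->next->prev = o->prev;
    o->prev = o->next = NULL;
    o->on_lru = 0;
    lru->count--;
}

static void lru_add_tail(struct obj_lru *lru, struct cached_obj *o)
{
    o->prev = lru->head.prev;
    o->next = &lru->head;
    lru->head.prev->next = o;
    lru->head.prev = o;
    o->on_lru = 1;
    lru->count++;
}

/* Take a reference; an unused object is revived from the LRU. */
static void obj_get(struct obj_lru *lru, struct cached_obj *o)
{
    if (o->refcount++ == 0 && o->on_lru)
        lru_del(lru, o);
}

/* Drop a reference; at zero the object is parked on the LRU, not freed. */
static void obj_put(struct obj_lru *lru, struct cached_obj *o)
{
    assert(o->refcount > 0);
    if (--o->refcount == 0)
        lru_add_tail(lru, o);
}

/* Free up to nr of the oldest unused objects; returns how many were freed. */
static int lru_shrink(struct obj_lru *lru, int nr)
{
    int n = 0;
    while (n < nr && lru->head.next != &lru->head) {
        struct cached_obj *o = lru->head.next;
        lru_del(lru, o);
        o->freed = 1;
        n++;
    }
    return n;
}
```

An object that is picked up again with `obj_get` before `lru_shrink` runs is simply taken off the list, which is the "reused before it gets old" case described above.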
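4. Path lookup in the dentry tree
- Call sequence
  
  path_lookup()
  do_path_lookup()
  path_walk()
  link_path_walk()
  __link_path_walk()
  do_lookup()
- The real work is done by __d_lookup() and real_lookup(),
  
  both of which are called from do_lookup().
  
  __d_lookup() is the harder one to understand. At first I assumed that, since it is a hash lookup, it would resolve the whole path in one step.
  
  It does not: even with the dentry hash table, the path is still resolved one level at a time.
  
  For /bin/test, bin is found first, and dentry_hashtable then speeds up finding test inside bin;
- The nameidata structure
  
  During the level-by-level lookup, nameidata records the result of the previous level and prepares the data needed for the next level;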
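
The level-by-level lookup can be sketched with a toy hash table keyed by (parent, name). This is a user-space illustration only: `struct walk_dentry`, `walk_hash`, `walk_insert`, `walk_lookup`, and `walk_path` are hypothetical, the hash function is arbitrary, and the `cur` variable plays the role that nameidata plays in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBUCKETS 16

struct walk_dentry {
    const char         *name;
    struct walk_dentry *parent;
    struct walk_dentry *hash_next;   /* chain within one bucket */
};

static struct walk_dentry *buckets[NBUCKETS];

/* Mix the parent pointer with the name bytes (arbitrary toy hash). */
static unsigned int walk_hash(const struct walk_dentry *parent,
                              const char *name, size_t len)
{
    unsigned int h = (unsigned int)((uintptr_t)parent >> 4);
    for (size_t i = 0; i < len; i++)
        h = h * 31 + (unsigned char)name[i];
    return h % NBUCKETS;
}

static void walk_insert(struct walk_dentry *d)
{
    unsigned int b = walk_hash(d->parent, d->name, strlen(d->name));
    d->hash_next = buckets[b];
    buckets[b] = d;
}

/* One level only: find the child called name[0..len) under parent. */
static struct walk_dentry *walk_lookup(struct walk_dentry *parent,
                                       const char *name, size_t len)
{
    unsigned int b = walk_hash(parent, name, len);
    for (struct walk_dentry *d = buckets[b]; d; d = d->hash_next)
        if (d->parent == parent && strlen(d->name) == len &&
            memcmp(d->name, name, len) == 0)
            return d;
    return NULL;
}

/* Resolve an absolute path one component at a time. */
static struct walk_dentry *walk_path(struct walk_dentry *root, const char *path)
{
    struct walk_dentry *cur = root;  /* result of the previous level */

    assert(root != NULL && path != NULL);
    while (cur) {
        while (*path == '/')
            path++;
        if (*path == '\0')
            break;
        size_t len = strcspn(path, "/");
        cur = walk_lookup(cur, path, len);
        path += len;
    }
    return cur;
}
```

Because the parent is part of the key, two entries named test under different directories do not collide logically, and the hash only ever answers "which child of this directory has this name".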
