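This section describes the bitmap mechanism in MD, whose main purpose is to avoid unnecessary resynchronization. Before a data write is issued, the bit for the target chunk is set to 1 in the in-memory bitmap and written to the bitmap file on disk; once the data write has completed, the bit in the bitmap file is cleared again. Done naively, every data write therefore costs two extra disk writes, which inevitably hurts I/O performance. The Linux kernel optimizes this in two ways: 1) batched writes and 2) delayed clearing. Bitmap updates are made in the in-memory cache first and written to disk only when necessary.
Before getting into the details, here are the main data structures involved:
1) The superblock occupies the first 256 bytes of the bitmap file and records the file's management information. Its key field is chunksize (the size of the data chunk that one bit in the bitmap file represents).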
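The two optimizations can be illustrated with a small userspace model. This is only a sketch under simplifying assumptions, not the kernel's implementation: the names `wi_bitmap`, `wi_flush`, `wi_start_write`, `wi_end_write` and `wi_daemon` are hypothetical, the "disk" is just a second array, and delayed clearing is reduced to a single periodic pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCHUNKS 8 /* hypothetical, tiny array for illustration */

struct wi_bitmap {
    bool mem[NCHUNKS];         /* bits in the in-memory bitmap */
    bool disk[NCHUNKS];        /* bits as stored in the bitmap file */
    unsigned pending[NCHUNKS]; /* data writes in flight per chunk */
    unsigned disk_writes;      /* number of bitmap flushes issued */
};

/* Write all changed bits to "disk" in one go (batched write). */
static void wi_flush(struct wi_bitmap *b)
{
    bool changed = false;
    for (size_t i = 0; i < NCHUNKS; i++) {
        if (b->disk[i] != b->mem[i]) {
            b->disk[i] = b->mem[i];
            changed = true;
        }
    }
    if (changed)
        b->disk_writes++;
}

/* Before a data write: set the bit in memory only. The caller must
 * call wi_flush() before actually submitting the data write. */
static void wi_start_write(struct wi_bitmap *b, size_t chunk)
{
    b->mem[chunk] = true;
    b->pending[chunk]++;
}

/* After a data write completes: do NOT clear the bit yet. */
static void wi_end_write(struct wi_bitmap *b, size_t chunk)
{
    b->pending[chunk]--;
}

/* Periodic pass: clear bits of idle chunks, then flush once
 * (delayed clearing). */
static void wi_daemon(struct wi_bitmap *b)
{
    for (size_t i = 0; i < NCHUNKS; i++)
        if (b->mem[i] && b->pending[i] == 0)
            b->mem[i] = false;
    wi_flush(b);
}
```

In this model, several `wi_start_write()` calls followed by one `wi_flush()` cost a single bitmap write, and a chunk that is rewritten before the next `wi_daemon()` pass never has its bit cleared and set again on disk.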
typedef struct bitmap_super_s {
__le32 magic; /* 0 BITMAP_MAGIC */
__le32 version; /* 4 the bitmap major for now, could change... */
__u8 uuid[16]; /* 8 128 bit uuid - must match md device uuid */
__le64 events; /* 24 event counter for the bitmap (1)*/
__le64 events_cleared;/*32 event counter when last bit cleared (2) */
__le64 sync_size; /* 40 the size of the md device's sync range(3) */
__le32 state; /* 48 bitmap state information */
__le32 chunksize; /* 52 the bitmap chunk size in bytes */
__le32 daemon_sleep; /* 56 seconds between disk flushes */
__le32 write_behind; /* 60 number of outstanding write-behind writes */
__u8 pad[256 - 64]; /* set to zero */
} bitmap_super_t;
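The byte offsets in the comments above can be checked mechanically. The following is a minimal userspace sketch (C11), not kernel code: `bitmap_super_mirror` and `bitmap_bits_needed` are hypothetical names, fixed-width integers stand in for `__le32`/`__le64`/`__u8`, and endianness conversion is ignored. The helper additionally assumes that sync_size is counted in 512-byte sectors, which the listing itself does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of bitmap_super_t (hypothetical name). */
struct bitmap_super_mirror {
    uint32_t magic;
    uint32_t version;
    uint8_t  uuid[16];
    uint64_t events;
    uint64_t events_cleared;
    uint64_t sync_size;
    uint32_t state;
    uint32_t chunksize;
    uint32_t daemon_sleep;
    uint32_t write_behind;
    uint8_t  pad[256 - 64];
};

/* The layout must match the offsets given in the comments. */
static_assert(sizeof(struct bitmap_super_mirror) == 256, "superblock is 256 bytes");
static_assert(offsetof(struct bitmap_super_mirror, events) == 24, "events at 24");
static_assert(offsetof(struct bitmap_super_mirror, sync_size) == 40, "sync_size at 40");
static_assert(offsetof(struct bitmap_super_mirror, chunksize) == 52, "chunksize at 52");
static_assert(offsetof(struct bitmap_super_mirror, pad) == 64, "pad at 64");

/* Number of bits needed to cover the sync range, one bit per chunk.
 * ASSUMPTION: sync_size is given in 512-byte sectors. */
static uint64_t bitmap_bits_needed(uint64_t sync_size_sectors,
                                   uint32_t chunksize_bytes)
{
    uint64_t bytes = sync_size_sectors * 512u;
    return (bytes + chunksize_bytes - 1) / chunksize_bytes; /* round up */
}
```

For example, under the sector assumption a 1 GiB sync range (2097152 sectors) with a 64 KiB chunksize needs 16384 bits, that is, 2 KiB of bitmap data.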
2) As the comment indicates, this structure represents one page of the in-memory bitmap:
/* the in-memory bitmap is represented by bitmap_pages */
struct bitmap_page {
/*
* map points to the actual memory page
*/
char *map;
/*
* in emergencies (when map cannot be alloced), hijack the map
* pointer and use it as two counters itself; a counter is 16 bits
* wide, so the pointer's storage can hold two of them
*/
unsigned int hijacked:1;
/*
* count of dirty bits on the page
*/
unsigned int count:31;
};
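The hijack trick can be sketched as follows. This is an illustration under assumptions rather than the kernel's code: `page_sketch` and `get_counter` are hypothetical names, a 4096-byte page is assumed, a union replaces the kernel's pointer cast to stay within standard C, and the choice of splitting the page's chunks into a lower and an upper half is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096u /* assumed page size */
#define COUNTERS_PER_PAGE (PAGE_BYTES / sizeof(uint16_t)) /* 2048 */

/* Hypothetical stand-in for struct bitmap_page. */
struct page_sketch {
    union {
        uint16_t *counters;  /* normal case: a full page of counters */
        uint16_t  hijack[2]; /* emergency: two counters stored in
                              * place of the pointer itself */
    } u;
    unsigned int hijacked:1;
    unsigned int count:31;
};

/* Return the counter that tracks chunk idx_in_page on this page. */
static uint16_t *get_counter(struct page_sketch *p, size_t idx_in_page)
{
    if (p->hijacked) {
        /* Only two counters exist: one for the lower half of the
         * page's chunks, one for the upper half. */
        size_t hi = idx_in_page >= COUNTERS_PER_PAGE / 2;
        return &p->u.hijack[hi];
    }
    return &p->u.counters[idx_in_page];
}
```

In hijacked mode many chunks share one counter, so tracking becomes coarser: more data than strictly necessary may be considered dirty, but no dirty chunk is lost.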
3) The main bitmap structure, which ties the in-memory pages to the bitmap stored on disk. Each mddev (disk array) has exactly one bitmap.
/* the main bitmap structure - one per mddev */
struct bitmap {
struct bitmap_page *bp; /* array of in-memory pages backing the bitmap file */
unsigned long pages; /* total number of pages the bitmap occupies in memory */
unsigned long missing_pages; /* number of pages not yet allocated */
mddev_t *mddev; /* the md device that the bitmap is for */
int counter_bits; /* how many bits per block counter */
/* bitmap chunksize -- how much data does each bit represent? */
unsigned long chunksize;
unsigned long chunkshift; /* chunksize = 2^chunkshift (for bitops) */
unsigned long chunks; /* total number of data chunks for the array */
/* We hold a count on the chunk currently being synced, and drop
* it when the last block is started. If the resync is aborted
* midway, we need to be able to drop that count, so we remember
* the counted chunk..
*/
unsigned long syncchunk;
__u64 events_cleared;
int need_sync;
/* bitmap spinlock */
spinlock_t lock;
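/* The bitmap can be stored in two ways: 1) outside the MD device, in which case file points to the struct file of the bitmap file; 2) inside the MD device, in which case offset is the bitmap's offset from the superblock */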