A Brief Look at FAT32

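    While working on a network video recorder (NVR) project recently, I had to work with hard disks and, with them, the FAT32 filesystem.

    Overall, for a first encounter with FAT32, what Google turns up is far easier to follow than Microsoft's long and hard-to-read official documentation (even though FAT32 is Microsoft's own patented format). Based on my own needs, and drawing on articles by overseas experts found through Google as well as Microsoft's official specification, this post gives a brief overview of FAT32. It focuses on using an existing FAT partition on a Linux system to verify how to locate the FAT, and how to use the information in the FAT to find the data of any file or directory.

First, borrowing from material already available online: to understand a hard disk in more depth, we need a basic picture of its data structures. By role, the data on a disk can be roughly divided into five parts: the MBR area, the DBR area, the FAT area, the DIR area, and the DATA area.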

1. MBR Area

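      The MBR (Master Boot Record) is located at cylinder 0, head 0, sector 1 of the disk. Of the 512 bytes in this master boot sector, however, the boot code itself takes up only the first 446 bytes. The next 64 bytes hold the DPT (Disk Partition Table), and the last two bytes, "55 AA", are the signature that marks the sector as valid. Together these make up the disk's master boot sector.

    The master boot record contains a set of disk parameters and a piece of boot code. The main job of the boot code is to check that the partition table is valid and, once the hardware has finished its self-test, to boot the operating system on the partition marked active, handing control over to that partition's boot program. The MBR is created by a partitioning tool (such as Fdisk.exe) and does not depend on any operating system. The boot code can also be replaced, which is how multiple operating systems can coexist on one disk.

    Here is a concrete example of a partition table entry from the master boot sector: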

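    Example: 80 01 01 00 0B FE BF FC 3F 00 00 00 7E 86 BB 00. The leading "80" is the partition's active flag, meaning it is bootable. "01 01 00" gives the starting head 1, starting sector 1, and starting cylinder 0. "0B" is the partition type, FAT32; other common values are 04 (FAT16) and 07 (NTFS). "FE BF FC" gives the ending head 254, ending sector 63, and ending cylinder 764 (the sector is the low 6 bits of BF, and the top 2 bits of BF supply the high bits of the cylinder). "3F 00 00 00" means the partition's first sector is at relative sector 63, and "7E 86 BB 00" is the little-endian value 0x00BB867E, a total of 12289662 sectors.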
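The decoding above can be checked mechanically. The following is a minimal sketch, not taken from any of the referenced material: the struct `part_entry` and the function `decode_part_entry` are hypothetical names chosen for illustration, and the only layout assumed is the standard 16-byte entry described in the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of one 16-byte MBR partition entry (illustrative names). */
struct part_entry {
    int      bootable;      /* 1 if the status byte is 0x80 */
    uint8_t  type;          /* type code, e.g. 0x0B for FAT32 */
    uint8_t  start_head, start_sector;
    uint16_t start_cyl;
    uint8_t  end_head, end_sector;
    uint16_t end_cyl;
    uint32_t lba_begin;     /* relative sector where the partition starts */
    uint32_t num_sectors;   /* total sectors in the partition */
};

/* Read a little-endian 32-bit value. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the 16 raw bytes at e into *out. */
void decode_part_entry(const uint8_t *e, struct part_entry *out)
{
    assert(e != NULL && out != NULL);
    out->bootable     = (e[0] == 0x80);
    out->start_head   = e[1];
    out->start_sector = e[2] & 0x3F;                             /* low 6 bits */
    out->start_cyl    = (uint16_t)(((e[2] & 0xC0) << 2) | e[3]); /* 10 bits */
    out->type         = e[4];
    out->end_head     = e[5];
    out->end_sector   = e[6] & 0x3F;
    out->end_cyl      = (uint16_t)(((e[6] & 0xC0) << 2) | e[7]);
    out->lba_begin    = rd32(e + 8);
    out->num_sectors  = rd32(e + 12);
}
```

Feeding it the sixteen example bytes yields head/sector/cylinder 1/1/0 at the start, 254/63/764 at the end, a starting sector of 63, and 12289662 sectors in total.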

2. DBR Area

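    DBR (DOS Boot Record) is the operating system boot record. It is usually located at cylinder 0, head 1, sector 1 of the disk, and is the first sector the operating system can access directly. It contains a boot program and a per-partition parameter table called the BPB (BIOS Parameter Block). When the MBR hands over control, the boot program's main task is to check whether the first two files in the partition's root directory are the operating system's boot files (for DOS, Io.sys and Msdos.sys). If they are, it loads the boot file into memory and passes control to it. The BPB records important parameters of the partition, such as its starting sector, ending sector, storage format, media descriptor, root directory size, number of FATs, and allocation unit size. The DBR is created by a high-level format program (such as Format.com).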

3. FAT Area

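    After the DBR comes the more familiar FAT (File Allocation Table) area. Before explaining the file allocation table, a word about clusters. When a file occupies disk space, the basic unit is not the byte but the cluster. Typically a floppy disk uses 1 sector per cluster, while on a hard disk the number of sectors per cluster depends on the total capacity and may be 4, 8, 16, 32, 64, and so on.

    A file's data is not necessarily stored in one contiguous region of the disk. It is often split into several segments that are linked together like a chain, which is called chained storage. Because the disk keeps the link information between segments (the FAT), the operating system can always locate each segment and read the file back correctly. To make chained storage work, the disk must record exactly which clusters are in use, and for each used cluster it must record the number of the next cluster holding the file's following content. For the last cluster of a file, it must record that there is no successor. All of this is kept in the FAT, which has many entries, one per cluster.

    Because the FAT is so important for file management, a backup copy is kept: an identical second FAT placed right after the first. In a freshly created FAT every entry is marked "free". If part of the disk is damaged, the format program detects the bad clusters and marks their entries as "bad cluster", so that they are never used for files afterwards. The number of FAT entries matches the total number of clusters on the disk, and each entry must be wide enough to hold a cluster number. There are several FAT formats; the most common are FAT16 and FAT32.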
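To make the chain idea concrete, here is a minimal sketch of walking one file's cluster chain through a FAT32 table that has already been read into memory. The function name `fat32_walk_chain` and the macro names are hypothetical. The entry encoding is assumed from Microsoft's FAT32 specification: each entry is 32 bits little-endian, the upper 4 bits are reserved, 0x0FFFFFF7 marks a bad cluster, and values of 0x0FFFFFF8 or above mark the end of a chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAT32_ENTRY_MASK 0x0FFFFFFFu  /* upper 4 bits are reserved */
#define FAT32_BAD        0x0FFFFFF7u  /* bad-cluster marker */
#define FAT32_EOC_MIN    0x0FFFFFF8u  /* values >= this end the chain */

/* Read a little-endian 32-bit value. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Follow the chain that starts at cluster `first`, storing each cluster
 * number in out[]. Returns the number of clusters in the chain, or -1 if
 * the chain runs into a free or bad entry, leaves the table, or is longer
 * than `max` (which also stops a corrupted, looping chain).
 */
int fat32_walk_chain(const uint8_t *fat, size_t fat_bytes,
                     uint32_t first, uint32_t *out, int max)
{
    int n = 0;
    uint32_t c = first;

    assert(fat != NULL && out != NULL);
    while (c < FAT32_EOC_MIN) {
        size_t off = (size_t)c * 4;   /* 4 bytes per FAT32 entry */
        if (c < 2 || c == FAT32_BAD || n == max || off + 4 > fat_bytes)
            return -1;
        out[n++] = c;
        c = rd32(fat + off) & FAT32_ENTRY_MASK;
    }
    return n;
}
```

With a table whose entry 2 holds 3, entry 3 holds 5, and entry 5 holds an end-of-chain value, starting at cluster 2 yields the chain 2, 3, 5: a file that is not stored contiguously.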

 

4. DIR Area

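DIR (Directory) is the root directory area. It immediately follows the second FAT (the backup copy) and records, for each file or directory under the root, its starting cluster, attributes, and so on. To locate a file, the operating system takes the starting cluster from DIR and combines it with the FAT to learn the file's exact position and size on the disk. (This fixed area applies to FAT12 and FAT16; as the quoted material further down explains, FAT32 stores the root directory in ordinary clusters, the same way as files and subdirectories.)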

 

5. Data (DATA) Area

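    The data area is where file contents are actually stored. It follows the DIR area and takes up most of the space on the disk.

    So the key question is: starting from the MBR, how do we find where the FAT32 filesystem is stored, along with the related parameters? Fortunately, the material found through Google explains this in detail.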

Where To Start... How About At The Beginning?

The first sector of the drive is called the Master Boot Record (MBR). You can read it with LBA = 0. For any new projects, you should not even worry about accessing a drive in CHS mode, as LBA just numbers the sectors sequentially starting at zero, which is much simpler. All IDE drives support accessing sectors using LBA. Also, all IDE drives use sectors that are 512 bytes. Recent Microsoft operating systems refer to using larger sectors, but the drives still use 512 bytes per sector and MS is just treating multiple sectors as if they were one sector. The remainder of this page will only refer to LBA address of 512 byte sectors.

The first 446 bytes of the MBR are code that boots the computer. This is followed by a 64 byte partition table, and the last two bytes are always 0x55 and 0xAA. You should always check these last two bytes, as a simple "sanity check" that the MBR is ok.

 

Figure 1: MBR (first sector) layout

The MBR can only represent four partitions. A technique called "extended" partitioning is used to allow more than four, and often times it is used when there are more than two partitions. All we're going to say about extended partitions is that they appear in this table just like a normal partition, and their first sector has another partition table that describes the partitions within its space. But for the sake of simply getting some code to work, we're going to not worry about extended partitions (and repartition and reformat any drive that has them....) The most common scenario is only one partition using the whole drive, with partitions 2, 3 and 4 blank.

Each partition description is just 16 bytes, and the good news is that you can usually just ignore most of them. The fifth byte is a Type Code that tells what type of filesystem is supposed to be contained within the partition, and the ninth through twelfth bytes indicate the LBA Begin address where that partition begins on the disk.

 

Figure 2: 16-byte partition entry

Normally you only need to check the Type Code of each entry, looking for either 0x0B or 0x0C (the two that are used for FAT32), and then read the LBA Begin to learn where the FAT32 filesystem is located on the disk.
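The check described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `find_fat32_partition` and its return codes are hypothetical, the 512-byte MBR is assumed to be already in memory, and extended partitions are ignored, as in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBR_TABLE_OFFSET 446  /* partition table follows the boot code */
#define MBR_ENTRY_SIZE   16

/* Read a little-endian 32-bit value. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Scan the four partition entries for a FAT32 type code (0x0B or 0x0C).
 * On success, stores the partition's LBA Begin in *lba_begin and returns 0.
 * Returns -1 if the 0x55 0xAA signature is missing, -2 if no FAT32
 * partition is listed.
 */
int find_fat32_partition(const uint8_t mbr[512], uint32_t *lba_begin)
{
    assert(mbr != NULL && lba_begin != NULL);
    if (mbr[510] != 0x55 || mbr[511] != 0xAA)
        return -1;                       /* sanity check failed */
    for (int i = 0; i < 4; i++) {
        const uint8_t *e = mbr + MBR_TABLE_OFFSET + i * MBR_ENTRY_SIZE;
        uint8_t type = e[4];             /* fifth byte: Type Code */
        if (type == 0x0B || type == 0x0C) {
            *lba_begin = rd32(e + 8);    /* bytes 9..12: LBA Begin */
            return 0;
        }
    }
    return -2;
}
```

The function returns the first FAT32 entry it finds. A disk with several FAT32 partitions would need the loop index passed back as well.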

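      This passage gives us the structure of the MBR and the LBA Begin address of each partition. That address matters: it serves as the base address in all the address calculations that follow. When I first read the detailed documentation I could not find where LBA Begin was defined. Only after rereading this passage carefully did I realize that it is a fixed value given in the MBR partition table.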

FAT32 Volume ID... Yet Another First Sector

The first step to reading the FAT32 filesystem is to read its first sector, called the Volume ID. The Volume ID is read using the LBA Begin address found from the partition table. From this sector, you will extract information that tells you everything you need to know about the physical layout of the FAT32 filesystem.

Microsoft's specification lists many variables, and the FAT32 Volume ID is slightly different than the older ones used for FAT16 and FAT12. Fortunately, most of the information is not needed for simple code. Only four variables are required, and three others should be checked to make sure they have the expected values.

 

Figure 4: FAT32 Volume ID, critical fields

 

Field                         Microsoft's Name   Offset  Size     Value
Bytes Per Sector              BPB_BytsPerSec     0x0B    16 Bits  Always 512 Bytes
Sectors Per Cluster           BPB_SecPerClus     0x0D    8 Bits   1,2,4,8,16,32,64,128
Number of Reserved Sectors    BPB_RsvdSecCnt     0x0E    16 Bits  Usually 0x20
Number of FATs                BPB_NumFATs        0x10    8 Bits   Always 2
Sectors Per FAT               BPB_FATSz32        0x24    32 Bits  Depends on disk size
Root Directory First Cluster  BPB_RootClus       0x2C    32 Bits  Usually 0x00000002
Signature                     (none)             0x1FE   16 Bits  Always 0xAA55

After checking the three fields to make sure the filesystem is using 512 byte sectors, 2 FATs, and has a correct signature, you may want to "boil down" these variables read from the MBR and Volume ID into just four simple numbers that are needed for accessing the FAT32 filesystem. Here are simple formulas in C syntax:

 

unsigned long fat_begin_lba = Partition_LBA_Begin + Number_of_Reserved_Sectors;
unsigned long cluster_begin_lba = Partition_LBA_Begin + Number_of_Reserved_Sectors + (Number_of_FATs * Sectors_Per_FAT);
unsigned char sectors_per_cluster = BPB_SecPerClus;
unsigned long root_dir_first_cluster = BPB_RootClus;

As you can see, most of the information is needed only to learn the location of the first cluster and the FAT. You will need to remember the size of the clusters and where the root directory is located, but the other information is usually not needed (at least for simply reading files).

If you compare these formulas to the ones in Microsoft's specification, you should notice two differences. They lack "RootDirSectors", because FAT32 stores the root directory the same way as files and subdirectories, so RootDirSectors is always zero with FAT32. For FAT16 and FAT12, this extra step is needed to compute the special space allocated for the root directory.

Microsoft's formulas do not show the "Partition_LBA_Begin" term. Their formulas are all relative to the beginning of the filesystem, which they don't explicitly state very well. You must add the "Partition_LBA_Begin" term found from the MBR to compute correct LBA addresses for the IDE interface, because to the drive the MBR is at zero, not the Volume ID. Not adding Partition_LBA_Begin is one of the most common errors most developers make, so especially if you are using Microsoft's spec, do not forget to add this for correct LBA addressing.

The rest of this page will usually refer to "fat_begin_lba", "cluster_begin_lba", "sectors_per_cluster", and "root_dir_first_cluster", rather than the individual fields from the MBR and Volume ID, because it is easiest to compute these numbers when starting up and then you no longer need all the details from the MBR and Volume ID.
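Putting the field table and the four formulas together, a minimal sketch of the "boil down" step might look like the following. The struct `fat32_layout` and the function `fat32_read_layout` are hypothetical names. The offsets are the ones listed in the table above, and the Volume ID sector is assumed to be in memory already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The four numbers the text boils the Volume ID down to. */
struct fat32_layout {
    uint32_t fat_begin_lba;
    uint32_t cluster_begin_lba;
    uint8_t  sectors_per_cluster;
    uint32_t root_dir_first_cluster;
};

/* Little-endian readers. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Parse the Volume ID sector of a FAT32 partition that starts at
 * partition_lba_begin (taken from the MBR). Returns 0 on success, or -1
 * if one of the three sanity checks fails (512-byte sectors, 2 FATs,
 * 0xAA55 signature).
 */
int fat32_read_layout(const uint8_t vol[512], uint32_t partition_lba_begin,
                      struct fat32_layout *out)
{
    assert(vol != NULL && out != NULL);

    uint16_t bytes_per_sec = rd16(vol + 0x0B);  /* BPB_BytsPerSec */
    uint16_t rsvd_sec_cnt  = rd16(vol + 0x0E);  /* BPB_RsvdSecCnt */
    uint8_t  num_fats      = vol[0x10];         /* BPB_NumFATs */
    uint32_t fat_size      = rd32(vol + 0x24);  /* BPB_FATSz32 */

    if (bytes_per_sec != 512 || num_fats != 2 || rd16(vol + 0x1FE) != 0xAA55)
        return -1;

    out->fat_begin_lba          = partition_lba_begin + rsvd_sec_cnt;
    out->cluster_begin_lba      = out->fat_begin_lba + num_fats * fat_size;
    out->sectors_per_cluster    = vol[0x0D];        /* BPB_SecPerClus */
    out->root_dir_first_cluster = rd32(vol + 0x2C); /* BPB_RootClus */
    return 0;
}
```

For a partition beginning at sector 63 with 0x20 reserved sectors and 1000 sectors per FAT, this gives fat_begin_lba = 95 and cluster_begin_lba = 95 + 2 * 1000 = 2095.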

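    From the MBR we know how many partitions the disk has (extended partitions are not discussed here). On my disk, the network video recorder's drive is split into two partitions: /dev/sda1 holds ext4 and is 9 GB, and /dev/sda2 holds FAT32 and is 923 GB. The Partition_LBA_Begin in the passage above is the same LBA Begin address from the previous passage; for the FAT32 partition it is read from the MBR partition table. We can check that it is correct with dd:

       dd if=/dev/sda of=test1 skip=[Partition_LBA_Begin] bs=512 count=1

    and

       dd if=/dev/sda2 of=test2 bs=512 count=1

       The two commands should produce identical files. If they differ, Partition_LBA_Begin was computed incorrectly.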

      

How The FAT32 Filesystem Is Arranged

The layout of a FAT32 filesystem is simple. The first sector is always the Volume ID, which is followed by some unused space called the reserved sectors. Following the reserved sectors are two copies of the FAT (File Allocation Table). The remainder of the filesystem is data arranged in "clusters", with perhaps a tiny bit of unused space after the last cluster.

 

Figure 5: FAT32 Filesystem Overall Layout

The vast majority of the disk space is the clusters section, which is used to hold all the files and directories. The clusters begin their numbering at 2, so there is no cluster #0 or cluster #1. To access any particular cluster, you need to use this formula to turn the cluster number into the LBA address for the IDE drive:

lba_addr = cluster_begin_lba + (cluster_number - 2) * sectors_per_cluster;

Normally clusters are at least 4k (8 sectors), and sizes of 8k, 16k and 32k are also widely used. Some later versions of Microsoft Windows allow using even larger cluster sizes, by effectively considering the sector size to be some multiple of 512 bytes. The FAT32 specification from Microsoft states that 32k is the maximum cluster size.
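As a small worked example of the cluster formula, here is a sketch of a helper that applies it. The name `cluster_to_lba` is hypothetical, and the sample numbers in the note that follows are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a cluster number into the LBA of its first sector.
 * Cluster numbering starts at 2, so cluster 2 maps to cluster_begin_lba.
 */
uint32_t cluster_to_lba(uint32_t cluster_begin_lba,
                        uint8_t sectors_per_cluster,
                        uint32_t cluster_number)
{
    assert(cluster_number >= 2);  /* clusters 0 and 1 do not exist */
    return cluster_begin_lba + (cluster_number - 2) * sectors_per_cluster;
}
```

With cluster_begin_lba = 2095 and 8 sectors per cluster, cluster 2 starts at sector 2095, cluster 3 at 2103, and cluster 10 at 2095 + 8 * 8 = 2159.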

Now If Only We Knew Where The Files Were....

When you begin, you only know the first cluster of the root directory. Reading the directory will reveal the names and first cluster location of other files and subdirectories. A key point is that directories only tell you how to find the first cluster number of their files and subdirectories. You also obtain a variety of other info from the directory such as the file's length, modification time, attribute bits, etc, but a directory only tells you where the files begin. To access more than the first cluster, you will need to use the FAT. But first we need to be able to find where those files start.

In this section, we'll only briefly look at directories as much as is necessary to learn where the files are, then we'll look at how to access the rest of a file using the FAT, and later we'll revisit directory structure in more detail.

Directory data is organized in 32 byte records. This is nice, because any sector holds exactly 16 records, and no directory record will ever cross a sector boundary. There are four types of 32-byte directory records.

  1. Normal record with short filename - Attrib is normal
  2. Long filename text - Attrib has all four type bits set
  3. Unused - First byte is 0xE5
  4. End of directory - First byte is zero

Unused directory records are a result of deleting files. The first byte is overwritten with 0xE5, and later when a new file is created it can be reused. At the end of the directory is a record that begins with zero. All other records will be non-zero in their first byte, so this is an easy way to determine when you have reached the end of the directory.

Records that do not begin with 0xE5 or zero are actual directory data, and the format can be determined by checking the Attrib byte. For now, we are only going to be concerned with the normal directory records that have the old 8.3 short filename format. In FAT32, all files and subdirectories have short names, even if the user gave the file a longer name, so you can access all files without needing to decode the long filename records (as long as your code simply ignores them). Here is the format of a normal directory record:
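The four record types can be told apart with two byte tests. The sketch below is only an illustration: the enum and function names are hypothetical, and the Attrib byte is assumed to sit at offset 0x0B of the record (Microsoft's DIR_Attr field).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum dir_rec_type {
    DIR_REC_END,     /* first byte is zero: end of directory */
    DIR_REC_UNUSED,  /* first byte is 0xE5: deleted entry */
    DIR_REC_LFN,     /* long filename text */
    DIR_REC_NORMAL   /* short (8.3) filename record */
};

/* Classify one 32-byte directory record. */
enum dir_rec_type classify_dir_record(const uint8_t rec[32])
{
    assert(rec != NULL);
    if (rec[0] == 0x00)
        return DIR_REC_END;
    if (rec[0] == 0xE5)
        return DIR_REC_UNUSED;
    if ((rec[0x0B] & 0x0F) == 0x0F)  /* all four low Attrib bits set */
        return DIR_REC_LFN;
    return DIR_REC_NORMAL;
}

/*
 * Count the short-filename records in one 512-byte directory sector,
 * stopping early at the end-of-directory marker.
 */
int count_short_records(const uint8_t sector[512])
{
    int count = 0;
    for (int i = 0; i < 16; i++) {   /* 16 records per sector */
        enum dir_rec_type t = classify_dir_record(sector + i * 32);
        if (t == DIR_REC_END)
            break;
        if (t == DIR_REC_NORMAL)
            count++;
    }
    return count;
}
```

Code that only needs short names can simply skip `DIR_REC_LFN` and `DIR_REC_UNUSED` records, as the text suggests.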

 

Figure 6: 32 Byte Directory Structure, Short Filename Format

 

Field               Microsoft's Name  Offset  Size
Short Filename      DIR_Name          0x00    11 Bytes
Attrib Byte         DIR_Attr          0x0B    8 Bits
First Cluster High  DIR_FstClusHI     0x14    16 Bits
First Cluster Low   DIR_FstClusLO     0x1A    16 Bits
File Size           DIR_FileSize      0x1C    32 Bits

The Attrib byte has six bits defined: read only (0x01), hidden (0x02), system (0x04), volume ID (0x08), directory (0x10), and archive (0x20). Most simple firmware will check the Attrib byte to determine if the 32 bytes are a normal record or long filename data, and to determine if it is a normal file or a subdirectory. Long filename records have all four of the least significant bits set. Normal files rarely have any of these four bits set.
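Here is a minimal sketch of pulling the useful fields out of a short-filename record, using the offsets from the field table. The struct `short_entry` and the function `parse_short_entry` are hypothetical names. The directory bit is assumed to be 0x10 as in Microsoft's specification, and the name formatting is a simplification that stops at the first space padding byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct short_entry {
    char     name[13];       /* "NAME.EXT" plus terminating NUL */
    uint8_t  attrib;         /* raw Attrib byte */
    int      is_dir;         /* 1 if the directory bit (0x10) is set */
    uint32_t first_cluster;  /* high and low halves combined */
    uint32_t file_size;      /* length in bytes */
};

/* Little-endian readers. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode a normal (short filename) 32-byte directory record. */
void parse_short_entry(const uint8_t rec[32], struct short_entry *out)
{
    int n = 0;

    assert(rec != NULL && out != NULL);
    for (int i = 0; i < 8 && rec[i] != ' '; i++)       /* 8-char base name */
        out->name[n++] = (char)rec[i];
    if (rec[8] != ' ') {                               /* 3-char extension */
        out->name[n++] = '.';
        for (int i = 8; i < 11 && rec[i] != ' '; i++)
            out->name[n++] = (char)rec[i];
    }
    out->name[n] = '\0';

    out->attrib        = rec[0x0B];
    out->is_dir        = (rec[0x0B] & 0x10) != 0;
    out->first_cluster = ((uint32_t)rd16(rec + 0x14) << 16) | rd16(rec + 0x1A);
    out->file_size     = rd32(rec + 0x1C);
}
```

The first cluster is split across two 16-bit fields, so both halves must be combined before the cluster number is used. Reading only the low half works by accident on small volumes and fails on large ones.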

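        From this passage we learn how the data in a FAT32 filesystem is laid out, and how directories and files are recorded and allocated through the FAT. With this, we can locate the data of any file or directory on the disk.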
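When checking a chain by hand with dd, it helps to know which sector of the FAT holds the entry for a given cluster. The sketch below computes that. It assumes 512-byte sectors and 4-byte FAT32 entries (so 128 entries per sector), and the names `fat_entry_loc` and `fat32_entry_location` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FAT32_ENTRIES_PER_SECTOR 128  /* 512 bytes / 4 bytes per entry */

struct fat_entry_loc {
    uint32_t sector_lba;   /* absolute sector holding the entry */
    uint16_t byte_offset;  /* offset of the entry inside that sector */
};

/* Locate the FAT entry for `cluster` in the first copy of the FAT. */
struct fat_entry_loc fat32_entry_location(uint32_t fat_begin_lba,
                                          uint32_t cluster)
{
    struct fat_entry_loc loc;

    assert(cluster >= 2);  /* entries 0 and 1 are not data clusters */
    loc.sector_lba  = fat_begin_lba + cluster / FAT32_ENTRIES_PER_SECTOR;
    loc.byte_offset = (uint16_t)((cluster % FAT32_ENTRIES_PER_SECTOR) * 4);
    return loc;
}
```

For example, with fat_begin_lba = 95, the entry for cluster 300 lives in sector 97 at byte offset 176. That sector number can be passed to dd as the skip= value when reading from /dev/sda with bs=512.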

 Below are the results from my own disk:

MBR:

Volume ID sector (FAT32):

FAT:

DIR (root directory area):

 

 

 

 

 

  

Reposted from: https://www.cnblogs.com/wuhanlin/p/6668537.html
