Redundant Array of Independent Disks: RAID

Aaron185

Category: Operating Systems

Original link: https://blog.csdn.net/fsx2550553488/article/details/79819164

RAID
RAID stands for Redundant Array of Independent Disks. The "I" originally stood for Inexpensive and was later changed to Independent.

Core computer components: CPU, memory, and I/O devices.

Some hard disk interface types:

Disk type	Full name	Transfer rate	Interface type
IDE	Integrated Drive Electronics	133 MB/s	Parallel
SATA1	Serial Advanced Technology Attachment 1.0	1.5 Gbps	Serial
SATA2		3 Gbps	Serial
SATA3		6 Gbps	Serial
USB3.0	Universal Serial Bus	5 Gbps	Serial
SCSI	Small Computer System Interface	320 MB/s (Ultra320)	Parallel
SAS

Driver: a program that translates logical instructions into the control instructions of the corresponding device.

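CPU instructions have to be translated into commands that each device can understand, and a control device sits at the point of translation. When it is integrated on the motherboard it is called a controller; when it is a separate card it is called an adapter. The two are essentially the same thing: the controller (or adapter) is the bridge between the motherboard and an I/O device (an attached device such as a disk or a USB device), and it lets the CPU and the I/O device communicate.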

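RAID levels

A RAID level only denotes a way of organizing disks (each suited to different scenarios); the numbers do not rank one level above another. Factors to weigh include transfer speed, data integrity, reliability, and security.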

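RAID0: striping

Striping (RAID0) splits a piece of data into many chunks and has the controller store them on different disks. Because the disks work in parallel, read and write speed rises substantially, which eases the I/O bottleneck of a single disk. The cost is that if any one disk fails, the files striped across it become unusable, so reliability is much lower than with a single disk.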
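The round-robin placement behind striping can be sketched in a few lines of shell. This is only an illustration under simple assumptions (equal-sized disks, one chunk per disk per stripe); the function name `stripe_location` and the 3-disk array are hypothetical and not part of any RAID tool.

```shell
# Map a logical chunk number to "disk_index chunk_offset" for a simple
# round-robin RAID0 layout. Both values are zero-based.
stripe_location() {
    chunk=$1      # logical chunk number, starting at 0
    ndisks=$2     # number of member disks in the array
    echo "$((chunk % ndisks)) $((chunk / ndisks))"
}

# Walk the first six chunks of a hypothetical 3-disk array.
for c in 0 1 2 3 4 5; do
    set -- $(stripe_location "$c" 3)
    echo "chunk $c -> disk $1, offset $2"
done
```

Consecutive chunks land on different disks, which is why sequential reads and writes can be served by several disks at once.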

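RAID1: disk mirroring

Disk mirroring (RAID1) writes each piece of data, through the controller, to several disks, so every disk holds a complete, identical copy. This improves reliability: even if one disk fails, the data is still available from another. The cost is that write speed does not improve (writes are actually somewhat slower, since every write goes to each mirror), although reads can be served from either copy, and with two disks only half of the raw capacity is usable.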

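RAID4: parity

Parity (RAID4) dedicates one disk in the group to parity, which records a value computed from the data on the other disks. As a simplified illustration, if the data disks hold 1, 2, and 2, the parity disk would hold 5 (1+2+2); real implementations use XOR rather than addition, but the idea is the same. If one data disk fails, its contents can be recovered from the parity disk and the remaining healthy disks. Parity improves read and write speed and offers some reliability (losing one disk does not compromise data integrity).

Even so, recovering from a failed disk requires the parity disk and all the other disks to take part, which raises I/O pressure and the risk of a further failure during recovery. The parity disk also tends to become the bottleneck of the whole array, because every write must update it.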
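The parity idea can be tried out with shell arithmetic. The text describes parity as a sum for simplicity; the sketch below uses XOR, which is what parity RAID generally relies on, with one small integer standing in for each disk's block. The function name `xor_parity` and the sample values are illustrative assumptions.

```shell
# Compute XOR parity over any number of integer data blocks.
xor_parity() {
    p=0
    for block in "$@"; do
        p=$((p ^ block))
    done
    echo "$p"
}

d0=1; d1=2; d2=2
parity=$(xor_parity "$d0" "$d1" "$d2")          # 1 ^ 2 ^ 2 = 1

# Pretend the disk holding d1 failed: XOR the survivors with the parity.
recovered=$(xor_parity "$d0" "$d2" "$parity")   # 1 ^ 2 ^ 1 = 2
echo "parity=$parity recovered_d1=$recovered"
```

Because XOR is its own inverse, the same function serves both to compute parity and to rebuild a missing block.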

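RAID5: rotating parity

RAID5 is very similar to RAID4. The difference is that RAID5 has no fixed parity disk: parity is distributed across all the disks in the group, so each disk holds the parity for some stripes. This relieves the problem exposed by RAID4, where the fixed parity disk easily becomes the bottleneck.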
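One way to picture the rotation is to compute which disk holds parity for each stripe. The scheme below (parity starts on the last disk and moves one disk toward the first with every stripe) is just one possible layout chosen for illustration; actual implementations offer several layouts, and `parity_disk` is a hypothetical helper name.

```shell
# Return the zero-based index of the disk that holds parity for a stripe,
# rotating from the last disk toward the first (one possible layout).
parity_disk() {
    stripe=$1
    ndisks=$2
    echo "$(( (ndisks - 1) - (stripe % ndisks) ))"
}

# Show the rotation on a hypothetical 4-disk array.
for s in 0 1 2 3 4; do
    echo "stripe $s -> parity on disk $(parity_disk "$s" 4)"
done
```

Since every disk takes a turn holding parity, parity updates are spread over the whole group instead of hitting a single disk.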

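RAID level summary

Level	Technique	Disk failures tolerated	Performance
RAID0	Striping	None	Fast transfers (read and write performance scale up to N times); very poor reliability (no redundancy or fault tolerance)
RAID1	Mirroring	One failed disk has no effect	Slower writes, faster reads; high reliability (strong redundancy)
RAID2	Bit-level striping with Hamming code	One failed disk can be recovered	Adds error correction on reads and writes on top of striping; rarely used
RAID3	Byte-level striping with a dedicated parity disk	One failed disk can be recovered	Similar to RAID4 but striped at byte level; rarely used
RAID4	Parity	One failed disk can be recovered	Fast transfers (read and write performance improve); limited redundancy; recovery is risky
RAID5	Rotating parity	One failed disk can be recovered	Same as RAID4, but avoids a single parity disk becoming the bottleneck
RAID6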
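The capacity side of these trade-offs can be sketched as a small lookup. It assumes equal-sized disks and covers only RAID0, RAID1 (treated as one n-way mirror), RAID4, and RAID5; `usable_disks` is a hypothetical helper, not an mdadm feature.

```shell
# Print how many disks' worth of capacity is usable for a given RAID level
# built from n equal-sized disks.
usable_disks() {
    level=$1
    n=$2
    case $level in
        0)   echo "$n" ;;             # striping: all capacity is usable
        1)   echo 1 ;;                # mirroring: every disk holds the same data
        4|5) echo "$((n - 1))" ;;     # one disk's worth goes to parity
        *)   echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

usable_disks 0 4
usable_disks 1 2
usable_disks 5 4
```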

RAID combinations

RAID01

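RAID01 stripes first (RAID0) and then mirrors the striped sets (RAID1), which improves I/O speed while preserving reliability.

However, if one disk fails, the whole disk set may have to take part in recovery.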

RAID10

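RAID10 mirrors first (RAID1) and then stripes across the mirrored pairs (RAID0), which improves I/O speed while preserving reliability. If one disk fails, the whole disk set does not have to be involved; only the mirror of the failed disk is needed to recover the data.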

RAID50

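RAID50 builds RAID5 groups first (rotating parity) and then stripes across them (RAID0). It is not covered in more detail here.

RAID50 greatly improves read and write performance while retaining redundancy. Space utilization is (total disks - parity disks) / total disks.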
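The utilization formula is easy to check with integer arithmetic. The sketch assumes equal-sized disks and that each RAID5 group gives up exactly one disk's worth of space to parity; `raid50_usable` and the 6-disk, 2-group example are illustrative.

```shell
# Print "usable_disks utilization_percent" for a RAID50 array made of
# `groups` RAID5 sets; each set loses one disk's worth of space to parity.
raid50_usable() {
    total=$1
    groups=$2
    usable=$((total - groups))
    echo "$usable $((usable * 100 / total))"
}

# Six disks arranged as two 3-disk RAID5 groups (percentage is truncated).
raid50_usable 6 2
```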

JBOD

JBOD (just a bunch of disks) is a good fit for Hadoop.

It offers no performance gain and no redundancy, and its space utilization is 100%.

Choosing disks for RAID

Earlier:

IDE
SCSI

Today:

SATA
Large capacity, low price
SAS
Small capacity, high price

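Hardware RAID

Model 1

The host has built-in drive slots, and the RAID controller is cabled to those slots (the slots are embedded in the host). Disks are inserted into the slots to form the RAID array.

Model 2

The disk array sits in a separate, large disk enclosure that exposes an external interface.

The host has no drive slots; an external cable connects the host's RAID controller to the disk array enclosure.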

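Software RAID

The kernel must support software RAID. The Linux kernel has a module for this called md (Multiple Devices).

md is used to emulate a RAID (a logical RAID), which appears as /dev/mdX.

mdadm

RAID is only meaningful across different disks, since the point is to get past the I/O bottleneck of a single disk. In the lab setup here, different partitions of one disk are used instead; in a real environment that is pointless, because the single disk's I/O path remains the bottleneck.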

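mdadm is a mode-based command:

Create mode: -C
Mode-specific options: -l (RAID level); -n (number of devices); -a (whether to create the device file automatically: yes or no); -c (chunk size, i.e. the size of each data block)
-x: number of spare disks. A spare is held in reserve; if an active disk fails, the spare takes over immediately to keep the data safe.
Manage mode
--add (-a), --remove (-r), and so on
mdadm /dev/mdX --fail /dev/sdXX
mdadm -S (--stop) /dev/mdX
Monitor mode: -F

Grow mode: -G
Assemble mode: -A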
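To show how the create-mode options fit together, the helper below only prints an mdadm command line instead of running it, so it is safe to try anywhere. The helper name `build_create_cmd`, the array name /dev/md5, and the /dev/sdX1 partitions are hypothetical; the point is that the value given to -n plus the value given to -x must equal the number of devices listed.

```shell
# Print (not run) an mdadm create command.
# Usage: build_create_cmd ARRAY LEVEL SPARES DEVICE...
build_create_cmd() {
    array=$1; level=$2; spares=$3
    shift 3
    active=$(( $# - spares ))   # devices that are not reserved as spares
    echo "mdadm -C $array -a yes -l $level -n $active -x $spares $*"
}

# A RAID5 array with three active members and one spare.
build_create_cmd /dev/md5 5 1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
```

Reviewing the printed command before running it as root is a cheap way to avoid creating an array on the wrong devices.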

RAID0

Split /dev/vdb into four partitions and set the system type of each partition to Linux raid autodetect.

[root@raid ~]# fdisk /dev/vdb

Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel

Building a new DOS disklabel with disk identifier 0x05300968.

Changes will remain in memory only, until you decide to write them.

After that, of course, the previous content won't be recoverable.



Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)



WARNING: DOS-compatible mode is deprecated. It's strongly recommended to

         switch off the mode (command 'c') and change display units to

         sectors (command 'u').



Command (m for help): n

Command action

   e   extended

   p   primary partition (1-4)

p

Partition number (1-4): 1

First cylinder (1-2080, default 1):

Using default value 1

Last cylinder, +cylinders or +size{K,M,G} (1-2080, default 2080): +200M

Command (m for help): n

Command action

   e   extended

   p   primary partition (1-4)

p

Partition number (1-4): 2

First cylinder (408-2080, default 408):

Using default value 408

Last cylinder, +cylinders or +size{K,M,G} (408-2080, default 2080): +200M

Command (m for help): n

Command action

   e   extended

   p   primary partition (1-4)

p

Partition number (1-4): 3

First cylinder (815-2080, default 815):

Using default value 815

Last cylinder, +cylinders or +size{K,M,G} (815-2080, default 2080): +200M



Command (m for help): n

Command action

   e   extended

   p   primary partition (1-4)

p

Selected partition 4

First cylinder (1222-2080, default 1222):

Using default value 1222

Last cylinder, +cylinders or +size{K,M,G} (1222-2080, default 2080):

Using default value 2080

Command (m for help): t

Partition number (1-4): 1

Hex code (type L to list codes): fd

Changed system type of partition 1 to fd (Linux raid autodetect)



Command (m for help): t

Partition number (1-4): 2

Hex code (type L to list codes): fd

Changed system type of partition 2 to fd (Linux raid autodetect)



Command (m for help): t

Partition number (1-4): 3

Hex code (type L to list codes): fd

Changed system type of partition 3 to fd (Linux raid autodetect)



Command (m for help): t

Partition number (1-4): 4

Hex code (type L to list codes): fd

Changed system type of partition 4 to fd (Linux raid autodetect)

Command (m for help): w

Use /dev/vdb1 and /dev/vdb4 as the members of a RAID0 array.





[root@raid ~]# mdadm -C /dev/md0 -a yes -l 0 -n 2 /dev/vdb{1,4}

mdadm: Defaulting to version 1.2 metadata

mdadm: array /dev/md0 started.



[root@raid ~]# cat /proc/mdstat         # view the result

Personalities : [raid0] 

md0 : active raid0 vdb4[1] vdb1[0]      # every active RAID device is listed here

      637440 blocks super 1.2 512k chunks   # chunk size is 512k

unused devices: <none>
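When several arrays exist, the same status text can be summarized with a short awk filter. This sketch assumes the line format shown above (array name, a colon, the word active, then the level); `list_md_arrays` is a hypothetical helper, and the sample text is fed through a here-document so that the example runs without any RAID device present.

```shell
# Print "name level" for every active array in mdstat-formatted text on stdin.
list_md_arrays() {
    awk '$2 == ":" && $3 == "active" { print $1, $4 }'
}

# Sample text copied from the session above. On a real host you would run:
#   list_md_arrays < /proc/mdstat
list_md_arrays <<'EOF'
Personalities : [raid0]
md0 : active raid0 vdb4[1] vdb1[0]
      637440 blocks super 1.2 512k chunks
unused devices: <none>
EOF
```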

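To the operating system, /dev/md0 is just a block device, no different from /dev/sda2; the RAID logic is implemented underneath, in the md layer. Once the operating system recognizes the block device, it needs to be formatted: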



[root@raid ~]# mkfs.ext3 /dev/md0   # format /dev/md0 as ext3

mke2fs 1.41.12 (17-May-2010)

Filesystem label=

OS type: Linux

Block size=4096 (log=2)

Fragment size=4096 (log=2)

Stride=128 blocks, Stripe width=256 blocks

39840 inodes, 159360 blocks

7968 blocks (5.00%) reserved for the super user

First data block=0

Maximum filesystem blocks=163577856

5 block groups

32768 blocks per group, 32768 fragments per group

7968 inodes per group

Superblock backups stored on blocks: 

    32768, 98304



Writing inode tables: done                            

Creating journal (4096 blocks): done

Writing superblocks and filesystem accounting information: done



This filesystem will be automatically checked every 20 mounts or

180 days, whichever comes first.  Use tune2fs -c or -i to override.
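The "Stride=128 blocks, Stripe width=256 blocks" line in the output above follows from the array geometry: stride is the chunk size divided by the file system block size, and stripe width is the stride multiplied by the number of data-bearing disks. The helper below reproduces that arithmetic; `ext_stripe_params` is a hypothetical name, and the inputs are the 512 KiB chunk, 4 KiB block, and two RAID0 members from this session.

```shell
# Print "stride stripe_width" in file system blocks.
#   stride       = chunk size / file system block size
#   stripe width = stride * number of data-bearing disks
ext_stripe_params() {
    chunk_kib=$1; block_kib=$2; data_disks=$3
    stride=$((chunk_kib / block_kib))
    echo "$stride $((stride * data_disks))"
}

ext_stripe_params 512 4 2   # matches the mke2fs output: 128 256
```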

View the /dev/md0 device information



[root@raid ~]# fdisk -l /dev/md0



Disk /dev/md0: 652 MB, 652738560 bytes

2 heads, 4 sectors/track, 159360 cylinders

Units = cylinders of 8 * 512 = 4096 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 524288 bytes / 1048576 bytes

Disk identifier: 0x00000000

Mount the device



[root@raid ~]# mkdir /fsx

[root@raid ~]# mount /dev/md0 /fsx/

[root@raid ~]# cd /fsx/

[root@raid fsx]# ls

lost+found

RAID1

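The preliminary steps are essentially the same as for RAID0: create the partitions, set their system type, and so on. Then create the RAID1 array: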



[root@raid fsx]# mdadm -C /dev/md1 -a yes -l 1 -n 2 /dev/vda1 /dev/vda2

mdadm: Note: this array has metadata at the start and

    may not be suitable as a boot device.  If you plan to

    store '/boot' on this device please ensure that

    your boot-loader understands md/v1.x metadata, or use

    --metadata=0.90

Continue creating array? y   

mdadm: Defaulting to version 1.2 metadata

mdadm: array /dev/md1 started.

View the /dev/md1 block device information



[root@raid fsx]# fdisk -l /dev/md1



Disk /dev/md1: 209 MB, 209846272 bytes   # the size of a single disk: the operating system sees md1 as one disk, and mirroring is handled underneath by md

2 heads, 4 sectors/track, 51232 cylinders

Units = cylinders of 8 * 512 = 4096 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x00000000

Format and mount md1, then write some data



[root@raid fsx]# mkfs.ext4 /dev/md1

mke2fs 1.41.12 (17-May-2010)

warning: 127 blocks unused.



Filesystem label=

OS type: Linux

Block size=1024 (log=0)

Fragment size=1024 (log=0)

Stride=0 blocks, Stripe width=0 blocks

51400 inodes, 204801 blocks

10246 blocks (5.00%) reserved for the super user

First data block=1

Maximum filesystem blocks=67371008

25 block groups

8192 blocks per group, 8192 fragments per group

2056 inodes per group

Superblock backups stored on blocks: 

    8193, 24577, 40961, 57345, 73729



Writing inode tables: done                            

Creating journal (4096 blocks): done

Writing superblocks and filesystem accounting information: done



This filesystem will be automatically checked every 23 mounts or

180 days, whichever comes first.  Use tune2fs -c or -i to override.

[root@raid fsx]# mount /dev/md1 /mnt/

[root@raid fsx]# echo "fsx123" > /mnt/fsx

[root@raid mnt]# cat fsx

fsx123

mdadm -D (--detail) displays detailed information about a RAID device:



[root@raid mnt]# mdadm -D /dev/md1      # -D == --detail

/dev/md1:

        Version : 1.2   # metadata version

  Creation Time : Wed Apr  4 02:34:10 2018

     Raid Level : raid1

     Array Size : 204928 (200.16 MiB 209.85 MB) # array size

  Used Dev Size : 204928 (200.16 MiB 209.85 MB)

   Raid Devices : 2

  Total Devices : 2

    Persistence : Superblock is persistent



    Update Time : Wed Apr  4 02:39:29 2018

          State : clean 

 Active Devices : 2

Working Devices : 2

 Failed Devices : 0

  Spare Devices : 0



           Name : raid:1  (local to host raid)

           UUID : d9cb6898:63c22e52:c3651fed:8650a65c

         Events : 17



    Number   Major   Minor   RaidDevice State

       0     252        1        0      active sync   /dev/vda1

       1     252        2        1      active sync   /dev/vda2

In manage mode, a disk failure can be simulated:



[root@raid mnt]# mdadm /dev/md1 -f /dev/vda1    # mark /dev/vda1 in the /dev/md1 array as faulty

mdadm: set /dev/vda1 faulty in /dev/md1

[root@raid mnt]# ls

fsx  lost+found        # one disk has failed but the file is still readable: this is RAID1's redundancy

[root@raid mnt]# vim fsx 

[root@raid mnt]# cat fsx

fsx123

[root@raid mnt]# mdadm -D /dev/md1

/dev/md1:

        Version : 1.2

  Creation Time : Wed Apr  4 02:34:10 2018

     Raid Level : raid1

     Array Size : 204928 (200.16 MiB 209.85 MB)

  Used Dev Size : 204928 (200.16 MiB 209.85 MB)

   Raid Devices : 2

  Total Devices : 2

    Persistence : Superblock is persistent



    Update Time : Wed Apr  4 02:45:27 2018

          State : clean, degraded 

 Active Devices : 1

Working Devices : 1

 Failed Devices : 1

  Spare Devices : 0



           Name : raid:1  (local to host raid)

           UUID : d9cb6898:63c22e52:c3651fed:8650a65c

         Events : 21



    Number   Major   Minor   RaidDevice State

       0       0        0        0      removed         # slot 0 now shows as removed

       1     252        2        1      active sync   /dev/vda2



       0     252        1        -      faulty   /dev/vda1

Remove the failed device



[root@raid mnt]# mdadm /dev/md1 -r /dev/vda1    # -r == --remove

mdadm: hot removed /dev/vda1 from /dev/md1

[root@raid mnt]# mdadm -D /dev/md1

/dev/md1:

        Version : 1.2

  Creation Time : Wed Apr  4 02:34:10 2018

     Raid Level : raid1

     Array Size : 204928 (200.16 MiB 209.85 MB)

  Used Dev Size : 204928 (200.16 MiB 209.85 MB)

   Raid Devices : 2

  Total Devices : 1

    Persistence : Superblock is persistent



    Update Time : Wed Apr  4 02:47:40 2018

          State : clean, degraded 

 Active Devices : 1

Working Devices : 1

 Failed Devices : 0

  Spare Devices : 0



           Name : raid:1  (local to host raid)

           UUID : d9cb6898:63c22e52:c3651fed:8650a65c

         Events : 26



    Number   Major   Minor   RaidDevice State

       0       0        0        0      removed            # the device has been removed

       1     252        2        1      active sync   /dev/vda2

Add a device back to md1



[root@raid mnt]# mdadm /dev/md1 -a /dev/vda1    # -a == --add

mdadm: added /dev/vda1

[root@raid mnt]# mdadm -D /dev/md1

/dev/md1:

        Version : 1.2

  Creation Time : Wed Apr  4 02:34:10 2018

     Raid Level : raid1

     Array Size : 204928 (200.16 MiB 209.85 MB)

  Used Dev Size : 204928 (200.16 MiB 209.85 MB)

   Raid Devices : 2

  Total Devices : 2

    Persistence : Superblock is persistent



    Update Time : Wed Apr  4 02:50:01 2018

          State : clean, degraded, recovering 

 Active Devices : 1

Working Devices : 2

 Failed Devices : 0

  Spare Devices : 1



 Rebuild Status : 47% complete



           Name : raid:1  (local to host raid)

           UUID : d9cb6898:63c22e52:c3651fed:8650a65c

         Events : 37



    Number   Major   Minor   RaidDevice State

       2     252        1        0      spare rebuilding   /dev/vda1

       1     252        2        1      active sync   /dev/vda2

[root@raid mnt]# cat /proc/mdstat

Personalities : [raid0] [raid1] 

md1 : active raid1 vda1[2] vda2[1]

      204928 blocks super 1.2 [2/2] [UU]

      

md0 : active raid0 vdb4[1] vdb1[0]

      637440 blocks super 1.2 512k chunks

      

unused devices: <none>
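In /proc/mdstat, the bracketed member map uses U for a member that is up and an underscore for one that is missing or failed, so [UU] above means both mirrors are healthy. A small awk filter can flag arrays that are not fully populated. The helper name `degraded_arrays` and the degraded sample text (with a [_U] map) are illustrative assumptions rather than output captured from this session.

```shell
# Print the names of arrays whose member map (e.g. [_U]) contains an underscore.
degraded_arrays() {
    awk '
        $2 == ":" && ($3 == "active" || $3 == "inactive") { name = $1 }
        /\[[U_]*_[U_]*\]/ { print name }
    '
}

# Hypothetical status text for a mirror that has lost one member.
# On a real host you would run: degraded_arrays < /proc/mdstat
degraded_arrays <<'EOF'
Personalities : [raid0] [raid1]
md1 : active raid1 vda2[1]
      204928 blocks super 1.2 [2/1] [_U]
md0 : active raid0 vdb4[1] vdb1[0]
      637440 blocks super 1.2 512k chunks
unused devices: <none>
EOF
```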

Stop the array



[root@raid /]# umount /mnt/         # the array must be unmounted before it is stopped

[root@raid /]# mdadm -S /dev/md1    # -S == --stop

mdadm: stopped /dev/md1

Restart the array



[root@raid /]# mdadm -A /dev/md1 /dev/vda1 /dev/vda2

mdadm: /dev/md1 has been started with 2 drives.

[root@raid /]# mdadm -D /dev/md1

/dev/md1:

        Version : 1.2

  Creation Time : Wed Apr  4 02:34:10 2018

     Raid Level : raid1

     Array Size : 204928 (200.16 MiB 209.85 MB)

  Used Dev Size : 204928 (200.16 MiB 209.85 MB)

   Raid Devices : 2

  Total Devices : 2

    Persistence : Superblock is persistent



    Update Time : Wed Apr  4 02:54:19 2018

          State : clean 

 Active Devices : 2

Working Devices : 2

 Failed Devices : 0

  Spare Devices : 0



           Name : raid:1  (local to host raid)

           UUID : d9cb6898:63c22e52:c3651fed:8650a65c

         Events : 83



    Number   Major   Minor   RaidDevice State

       2     252        1        0      active sync   /dev/vda1

       3     252        2        1      active sync   /dev/vda2