This article is from Graid Technology (www.graidtech.com).
Executive Summary
Graid Technology's SupremeRAID™ offers a higher-performance alternative for protecting data on NVMe SSDs under database workloads such as MySQL, particularly when compared with Linux MD RAID. The advantage is significant: in both the optimal (non-degraded) and recovery (degraded) states, SupremeRAID™ RAID 6 delivers more transactions per second than Linux MD RAID 10.
In addition, SupremeRAID™ RAID 6 rebuilds nearly twice as fast as Linux MD RAID 10. When comparing like RAID levels (e.g., RAID 6 versus MD RAID 6), SupremeRAID™'s performance advantage is even more striking.
About This Test
In this test, we deploy a MySQL 8 server on SupremeRAID™ SR-1010 RAID 6, Linux MD RAID 10, and Linux MD RAID 6. We use sysbench, a popular database benchmarking tool, to run OLTP read/write tests and evaluate RAID performance.
Test Background
Hardware Specifications
• Server: Dell PowerEdge R750 x 1
• Processor: Intel® Xeon® Gold 6338 CPU @ 2.00GHz x 2
• Memory: Samsung M393A4G43BB4-CWE 32GB DDR4 3200MHz x 16
• SupremeRAID™: SR-1010 SR-BUN-1010-FD32 x 1
• SSD: Intel® SSD D7-P5510 SSDPF2KX038TZ 3.84TB x 8
Software Configuration
• OS: Ubuntu 20.04.4 LTS
• Kernel: 5.4.0-131-generic
• SupremeRAID™ driver version: 1.3.0-473.gb5466fc.010
• Linux MD RAID: mdadm v4.1 (2018-10-01)
• Filesystem: xfs 5.3.0-1ubuntu2
• MySQL: 8.0.30-0ubuntu0.20.04.2
• Benchmark tool: sysbench 1.1.0
Hardware Configuration
• MADT core enumeration: Linear
• Logical processors: Enabled
• Device locations:
o 4x Intel® SSD D7-P5510 on CPU0
o 4x Intel® SSD D7-P5510 on CPU1
o 1x SupremeRAID™ SR-1010 on CPU1
Benchmark Scenarios and MySQL Tuning
• Workload: sysbench OLTP_RW uniform
• InnoDB page size: 16K
• Concurrent users: 64, 128, 256, 512, 1024
• Dataset: 8 tables, 50M rows each, ~100GB total
• InnoDB buffer pool (BP): 32GB (caching roughly 32% of the dataset)
• Worker threads:
o innodb_buffer_pool_instances = 48
o innodb_page_cleaners = 48
o innodb_read_io_threads = 32
o innodb_write_io_threads = 16
o innodb_purge_threads = 16
• Test modes, each measured in the optimal state and while rebuilding one SSD:
o SupremeRAID™ RAID 6, 8 SSDs, 4K chunk
o Linux MD RAID 6, 8 SSDs, 4K chunk
o Linux MD RAID 6, 8 SSDs, 16K chunk
o Linux MD RAID 10, 8 SSDs, 4K chunk
o Linux MD RAID 10, 8 SSDs, 16K chunk
Test Results
The results show that SR-1010 RAID 6 performs almost twice as fast as Linux MD RAID 6. SR-1010 RAID 6 is competitive with Linux MD RAID 10 while offering more usable capacity and stronger data protection.
Transactions per Second in the Optimal State
In the lower-concurrency test cases (64 to 256 users), MD RAID 10 performs well because it is not a parity-based RAID. However, as the number of concurrent users grows, the SQL service consumes CPU resources and competes with MD RAID for them. This resource contention degrades performance significantly, to a level below SupremeRAID™ RAID 6.
Compared with MD RAID 6, SupremeRAID™ RAID 6 is faster in every case and delivers more than twice the performance at high user concurrency.
Transactions per Second While Rebuilding
During a rebuild, RAID performance drops because degraded reads and the rebuild task run in the background. The tests show that SupremeRAID™ configured as RAID 10 loses 22% of its performance, while MD RAID 10 loses 87%. For RAID 6, SupremeRAID™ loses 50% and MD RAID loses 95%. Even so, across all test cases SupremeRAID™ configured as RAID 6 still outperforms MD RAID configured as RAID 10 by 9x.
Rebuild Speed
During a rebuild, MD RAID 10 is not affected by degraded reads, but its performance still drops because of rebuild traffic. SupremeRAID™ RAID 10, which benefits from the GPU's compute power, shows a smaller drop.
At 512 users, SupremeRAID™ RAID 6 sustains 12,953 transactions per second while rebuilding at 700MB/s (about 2.5TB per hour). By contrast, MD RAID 6 at 512 users manages only 598 transactions per second and a 340MB/s rebuild. MD RAID 6 degrades severely because degraded reads and the rebuild task consume large amounts of CPU to compute parity.
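The hourly figure follows directly from the sustained rate; a quick arithmetic check:

```shell
# 700 MB/s sustained rebuild rate, converted to TB per hour:
# 700 MB/s x 3600 s = 2,520,000 MB = ~2.5 TB/h.
awk 'BEGIN { printf "%.1f TB/h\n", 700 * 3600 / 1e6 }'
```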
Conclusion
Databases such as MySQL benefit from the fastest storage available, so protecting multiple NVMe SSDs with RAID is standard practice. Choosing SupremeRAID™ makes the more space-efficient and more effective RAID 6 practical for data protection, delivering higher performance than Linux MD RAID 10. Additional benefits include:
• Protection against data loss when two SSDs fail simultaneously.
• 50% more usable capacity with 8 SSDs (75% more with 16 SSDs).
• 85% more transactions per second in the RAID optimal state.
• 28% more transactions per second in the RAID recovery state.
• 945% more transactions per second in the RAID rebuilding state.
• 66% faster rebuilds with a smaller performance impact.
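The usable-capacity claims above follow from the RAID geometry: RAID 6 keeps (n-2)/n of raw capacity, while RAID 10 keeps half. A minimal arithmetic sketch:

```shell
# Relative usable-capacity gain of RAID 6 over RAID 10 for 8 and 16 SSDs.
for n in 8 16; do
  awk -v n="$n" 'BEGIN {
    r6  = (n - 2) / n * 100          # usable % with RAID 6 (two parity drives)
    r10 = 50                         # usable % with RAID 10 (mirrored)
    printf "%d SSDs: RAID6 %.1f%% vs RAID10 %d%% -> +%.0f%%\n", \
           n, r6, r10, (r6 - r10) / r10 * 100
  }'
done
```

With 8 SSDs the gain is +50%, and with 16 SSDs it is +75%, matching the bullets above.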
Testing Flow
SupremeRAID™ RAID 6 with 8 SSDs
1. Compose a RAID 6 group with eight physical drives and create a virtual drive with all available space.
$ sudo graidctl create dg raid6 0-7
$ sudo graidctl create vd 0
2. Format the virtual drive with xfs.
$ sudo mkfs.xfs /dev/gvd0n1
3. Mount the filesystem and copy SQL data to the mount point.
$ sudo mount -o noatime,nodiratime /dev/gvd0n1 /mnt/graid
$ sudo rsync -av /var/lib/mysql /mnt/graid/
4. Start MySQL server.
$ sudo systemctl start mysql
5. Create a database called sbtest.
$ mysql -u root -p -e "create database sbtest;"
6. Prepare a dataset with eight tables, each containing 50M entries for a total size of 100GB.
$ ./sysbench-1.1-new2020 lua/OLTP_RW-trx.lua --db-driver=mysql \
--mysql-storage-engine=InnoDB --tables=8 --table-size=50000000 \
--mysql-user=root --mysql-password=password \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-db=sbtest --events=0 --threads=1 create
$ ./sysbench-1.1-new2020 lua/OLTP_RW-trx.lua --db-driver=mysql \
--mysql-storage-engine=InnoDB --tables=8 --table-size=50000000 \
--mysql-user=root --mysql-password=password \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-db=sbtest --events=0 --threads=32 prepare
7. Launch a 1-hour warm-up task with 256 threads to bring the SSDs to a steady state.
$ ./sysbench-1.1-new2020 lua/OLTP_RW-trx.lua --db-driver=mysql \
--tables=8 --table-size=50000000 --threads=256 --time=3600 \
--thread-init-timeout=0 --rate=0 --rand-type=uniform --rand-seed=0 \
--mysql-user=root --mysql-password=password \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-db=sbtest --events=0 run
8. Launch a 10-minute OLTP_RW uniform test with the following thread counts: 64, 128, 256, 512, 1024.
$ for threads in 64 128 256 512 1024
do
./sysbench-1.1-new2020 lua/OLTP_RW-trx.lua --db-driver=mysql \
--tables=8 --table-size=50000000 --threads=${threads} --time=600 \
--thread-init-timeout=0 --rate=0 --rand-type=uniform --rand-seed=0 \
--mysql-user=root --mysql-password=password \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-db=sbtest --events=0 run
sleep 15
done
9. Mark one physical drive offline to make RAID degraded.
$ sudo graidctl edit pd 0 marker offline
10. Mark the offline physical drive online to enter the rebuilding process.
$ sudo graidctl edit pd 0 marker online
11. Launch OLTP_RW uniform.
Linux MD RAID 6 with 8 SSDs
1. Since the Intel® D7-P5510 supports deterministic read zero after TRIM, use the discard command to reset all SSDs and skip the MD RAID initialization process.
$ for i in {0..7}; do sudo blkdiscard /dev/nvme"$i"n1; done
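Whether a drive really returns zeroes for deallocated blocks can be confirmed before relying on this shortcut. A hedged sketch using nvme-cli (not part of this test's software list); the `dlfeat` field of the namespace identify data reports deallocate read behavior:

```shell
# dlfeat low bits = 001b means deallocated logical blocks read back as all zeroes.
sudo nvme id-ns /dev/nvme0n1 | grep -i dlfeat
```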
2. Compose a RAID 6 group with 8 physical drives and a chunk size of 4KB to improve 16K random-write performance.
$ sudo mdadm --create --assume-clean --verbose /dev/md6 --level=6 \
    --raid-devices=8 --chunk=4K /dev/nvme[0-7]n1
3. Increase MD parity worker threads to improve overall write performance and increase the speed limit to get better rebuild speed.
$ echo 8 | sudo tee /sys/block/md6/md/group_thread_cnt
$ sudo sysctl -w dev.raid.speed_limit_min=600000
$ sudo sysctl -w dev.raid.speed_limit_max=600000
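The values set above can be read back to confirm they took effect (standard md sysfs and procfs paths; the md6 array must already exist):

```shell
cat /sys/block/md6/md/group_thread_cnt   # expect 8
cat /proc/sys/dev/raid/speed_limit_min   # expect 600000
cat /proc/sys/dev/raid/speed_limit_max   # expect 600000
```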
4. Format the virtual drive with xfs.
$ sudo mkfs.xfs /dev/md6
5. Mount the filesystem and copy SQL data to the mount point.
$ sudo mount -o noatime,nodiratime /dev/md6 /mnt/graid
$ sudo rsync -av /var/lib/mysql /mnt/graid/
6. Start MySQL server.
$ sudo systemctl start mysql
7. Create a database called sbtest.
$ mysql -u root -p -e "create database sbtest;"
8. Prepare a dataset with eight tables, each containing 50M entries for a total size of 100GB.
$ ./sysbench-1.1-new2020 lua/OLTP_RW-trx.lua --db-driver=mysql \
--mysql-storage-engine=InnoDB --tables=8 --table-size=50000000 \
--mysql-user=root --mysql-password=password \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-db=sbtest --events=0 --threads=1 create
$ ./sysbench-1.1-new2020 lua/OLTP_RW-trx.lua --db-driver=mysql \
--mysql-storage-engine=InnoDB --tables=8 --table-size=50000000 \
--mysql-user=root --mysql-password=password \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-db=sbtest --events=0 --threads=32 prepare
9. Launch a 1-hour warm-up task with 256 threads to bring the SSDs to a steady state.
$ ./sysbench-1.1-new2020 lua/OLTP_RW-trx.lua --db-driver=mysql \
--tables=8 --table-size=50000000 --threads=256 --time=3600 \
--thread-init-timeout=0 --rate=0 --rand-type=uniform --rand-seed=0 \
--mysql-user=root --mysql-password=password \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-db=sbtest --events=0 run
10. Launch a 10-minute test with the following thread counts: 64, 128, 256, 512, 1024.
$ for threads in 64 128 256 512 1024
do
./sysbench-1.1-new2020 lua/OLTP_RW-trx.lua --db-driver=mysql \
--tables=8 --table-size=50000000 --threads=${threads} --time=600 \
--thread-init-timeout=0 --rate=0 --rand-type=uniform --rand-seed=0 \
--mysql-user=root --mysql-password=password \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-db=sbtest --events=0 run
sleep 15
done
11. Mark one SSD offline to make RAID degraded.
$ sudo mdadm --manage --set-faulty /dev/md6 /dev/nvme0n1
$ sudo mdadm --manage /dev/md6 -r /dev/nvme0n1
$ sudo mdadm --zero-superblock /dev/nvme0n1
12. Add the removed SSD back to enter the rebuilding process.
$ sudo mdadm --manage /dev/md6 -a /dev/nvme0n1
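Rebuild progress and the estimated time to finish can be watched while the workload runs (standard mdadm and procfs tooling):

```shell
# /proc/mdstat shows a "recovery = x% ... finish=...min speed=...K/sec" line during rebuild.
cat /proc/mdstat
sudo mdadm --detail /dev/md6 | grep -iE 'state|rebuild'
```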
13. Launch the OLTP_RW uniform test again.
Linux MD RAID 10 with 8 SSDs
1. Since the Intel® D7-P5510 supports deterministic read zero after TRIM, use the discard command to reset all SSDs and skip the MD RAID initialization process.
$ for i in {0..7}; do sudo blkdiscard /dev/nvme"$i"n1; done
2. Compose a RAID 10 group with 8 physical drives and a chunk size of 16KB.
$ sudo mdadm --create --assume-clean --verbose /dev/md10 --level=10 \
    --raid-devices=8 --chunk=16K /dev/nvme[0-7]n1
3. Increase the speed limit to get better rebuild speed.
$ sudo sysctl -w dev.raid.speed_limit_min=600000
$ sudo sysctl -w dev.raid.speed_limit_max=600000
4. Format the virtual drive with xfs.
$ sudo mkfs.xfs /dev/md10
5. Mount the filesystem and copy SQL data to the mount point.
$ sudo mount -o noatime,nodiratime /dev/md10 /mnt/graid
$ sudo rsync -av /var/lib/mysql /mnt/graid/
6. Start MySQL server.
$ sudo systemctl start mysql
7. Create a database called sbtest.
$ mysql -u root -p -e "create database sbtest;"
8. Prepare a dataset with eight tables, each containing 50M entries for a total size of 100GB.
$ ./sysbench-1.1-new2020 lua/OLTP_RW-trx.lua --db-driver=mysql \
--mysql-storage-engine=InnoDB --tables=8 --table-size=50000000 \
--mysql-user=root --mysql-password=password \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-db=sbtest --events=0 --threads=1 create
$ ./sysbench-1.1-new2020 lua/OLTP_RW-trx.lua --db-driver=mysql \
--mysql-storage-engine=InnoDB --tables=8 --table-size=50000000 \
--mysql-user=root --mysql-password=password \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-db=sbtest --events=0 --threads=32 prepare
9. Launch a 1-hour warm-up task with 256 threads to bring the SSDs to a steady state.
$ ./sysbench-1.1-new2020 lua/OLTP_RW-trx.lua --db-driver=mysql \
--tables=8 --table-size=50000000 --threads=256 --time=3600 \
--thread-init-timeout=0 --rate=0 --rand-type=uniform --rand-seed=0 \
--mysql-user=root --mysql-password=password \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-db=sbtest --events=0 run
10. Launch a 10-minute test with the following thread counts: 64, 128, 256, 512, 1024.
$ for threads in 64 128 256 512 1024
do
./sysbench-1.1-new2020 lua/OLTP_RW-trx.lua --db-driver=mysql \
--tables=8 --table-size=50000000 --threads=${threads} --time=600 \
--thread-init-timeout=0 --rate=0 --rand-type=uniform --rand-seed=0 \
--mysql-user=root --mysql-password=password \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-db=sbtest --events=0 run
sleep 15
done
11. Mark one SSD offline to make RAID degraded.
$ sudo mdadm --manage --set-faulty /dev/md10 /dev/nvme0n1
$ sudo mdadm --manage /dev/md10 -r /dev/nvme0n1
$ sudo mdadm --zero-superblock /dev/nvme0n1
12. Add the removed SSD back to enter the rebuilding process.
$ sudo mdadm --manage /dev/md10 -a /dev/nvme0n1
13. Launch the OLTP_RW uniform test again.
Appendix
MySQL Server Configuration
#
# The MySQL database server configuration file.
#
# One can use all long options that the program supports.
# Run program with --help to get a list of available options and with
# --print-defaults to see which it would actually understand and use.
#
# For explanations see
# http://dev.mysql.com/doc/mysql/en/server-system-variables.html
# Here are entries for some specific programs.
# The following values assume you have at least 32M ram
[mysqld]
#
# * Basic Settings
#
user=mysql
# pid-file=/var/run/mysqld/mysqld.pid
socket=/mnt/graid/mysql/mysqld.sock
port=3306
datadir=/mnt/graid/mysql
# general
max_connections=4000
back_log=4000
ssl=0
table_open_cache=8000
table_open_cache_instances=16
default_authentication_plugin=mysql_native_password
default_password_lifetime=0
max_prepared_stmt_count=512000
skip_log_bin=1
character_set_server=latin1
collation_server=latin1_swedish_ci
skip-character-set-client-handshake
transaction_isolation=REPEATABLE-READ
# files
innodb_file_per_table
innodb_log_file_size=1024M
innodb_log_files_in_group=16
innodb_open_files=4000
# buffers
innodb_buffer_pool_size=32000M
innodb_buffer_pool_instances=48
innodb_log_buffer_size=64M
innodb_numa_interleave=on
# tune
innodb_doublewrite=1
innodb_thread_concurrency=0
innodb_flush_log_at_trx_commit=1
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10
join_buffer_size=32K
sort_buffer_size=32K
innodb_use_native_aio=1
innodb_stats_persistent=1
innodb_spin_wait_delay=6
innodb_max_purge_lag_delay=300000
innodb_max_purge_lag=0
innodb_flush_method=O_DIRECT
innodb_checksum_algorithm=crc32
innodb_io_capacity=20000
innodb_io_capacity_max=40000
innodb_lru_scan_depth=1000
innodb_change_buffering=none
innodb_read_only=0
innodb_page_cleaners=48
innodb_undo_log_truncate=off
# perf special
innodb_adaptive_flushing=1
innodb_flush_neighbors=0
innodb_read_io_threads=32
innodb_write_io_threads=16
innodb_purge_threads=16
innodb_adaptive_hash_index=0
# monitoring
innodb_monitor_enable='%'
performance_schema=ON
# etc.
loose_log_error_verbosity=3
secure_file_priv=
core_file
innodb_buffer_pool_in_core_file=off