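0 Hardware overview
The Orange Pi 5 Ultra is built around the Rockchip RK3588, a new-generation octa-core 64-bit ARM processor with four Cortex-A76 cores and four Cortex-A55 cores, manufactured on Samsung's 8nm LP process. The big cores run at up to 2.4GHz. The SoC integrates an ARM Mali-G610 MP4 GPU with high-performance 3D and 2D graphics acceleration, a built-in NPU (AI accelerator) rated at up to 6 TOPS, and display processing up to 8K.
Board interface description:
GPIO description:
All four mounting holes are 2.7mm in diameter.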
1 Flashing the system image
This is the standard procedure.
Flashing tool, balenaEtcher: https://www.balena.io/etcher/
The TF card must be 16GB or larger, with a speed rating of Class 10 or higher.
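2 Booting
Insert the TF card with the flashed image into the TF card slot on the Orange Pi board.
The board has an HDMI port; use an HDMI-to-HDMI cable to connect it to a TV or an HDMI monitor. Plug in a USB mouse and keyboard to control the board.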
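3 Installing the cooler and an M.2 SSD
- The onboard M.2 slot (PCIe Gen3 x4) accepts M.2 solid-state drives using the NVMe/SATA protocol.
3.1 Configuring the M.2 SSD to mount automatically at boot
# List block devices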
lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
# mtdblock0 31:0 0 16M 0 disk
# mmcblk1 179:0 0 29.7G 0 disk
# ├─mmcblk1p1 179:1 0 1G 0 part /boot
# └─mmcblk1p2 179:2 0 28.4G 0 part /var/log.hdd
# /
# zram0 254:0 0 7.8G 0 disk [SWAP]
# zram1 254:1 0 200M 0 disk /var/log
# nvme0n1 259:0 0 1.9T 0 disk
The 1.9T device nvme0n1 is the newly added SSD.
If the SSD needs to be partitioned, use fdisk:
sudo fdisk /dev/nvme0n1
Follow the prompts to create the partitions you need. Here a single partition was created:
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
mtdblock0 31:0 0 16M 0 disk
mmcblk1 179:0 0 29.7G 0 disk
├─mmcblk1p1 179:1 0 1G 0 part /boot
└─mmcblk1p2 179:2 0 28.4G 0 part /var/log.hdd
/
zram0 254:0 0 7.8G 0 disk [SWAP]
zram1 254:1 0 200M 0 disk /var/log
nvme0n1 259:0 0 1.9T 0 disk
└─nvme0n1p1 259:1 0 1.9T 0 part
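If you do not need to partition the disk, skip that step and go straight to formatting. Type mkfs and press Tab twice to see which filesystem formats are supported: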
mkfs
mkfs mkfs.cramfs mkfs.ext2 mkfs.ext4 mkfs.minix mkfs.ntfs
mkfs.bfs mkfs.exfat mkfs.ext3 mkfs.fat mkfs.msdos mkfs.vfat
The system supports several formats; ext4 is used here:
sudo mkfs.ext4 /dev/nvme0n1p1
mke2fs 1.46.5 (30-Dec-2021)
Discarding device blocks: done
Creating filesystem with 500099414 4k blocks and 125026304 inodes
Filesystem UUID: 8b60d807-81c4-4061-b947-bfea65909117
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848
Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done
The disk is now formatted.
Create a mount point:
sudo mkdir -p /mnt/nvme0
Mount the partition:
# One-off mount (does not persist across reboots)
sudo mount /dev/nvme0n1p1 /mnt/nvme0
Use df -h to check that the mount succeeded.
3.1.1 Mounting automatically at boot
First look up the UUID:
blkid
/dev/nvme0n1p1: UUID="8b60d807-81c4-4061-b947-bfea65909117" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="a5bd4a72-01"
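Copy the UUID of /dev/nvme0n1p1 and append an entry for it to /etc/fstab. The append goes through sudo tee -a because a plain >> redirection would be performed by the unprivileged shell, not by sudo:
echo "UUID=8b60d807-81c4-4061-b947-bfea65909117 /mnt/nvme0 ext4 defaults 0 0" | sudo tee -a /etc/fstab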
Mount every filesystem defined in /etc/fstab to check the new entry:
sudo mount -a
Change the permissions of the mount directory:
sudo chmod 777 -R /mnt/nvme0/
Finally, reboot to confirm that the disk is mounted automatically.
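The fstab entry above can also be generated rather than typed by hand. The following is a minimal sketch, not part of the original procedure: the helper name fstab_line is made up for illustration, and it only prints the line so that it can be reviewed before being appended.

```sh
#!/bin/sh
# Build an /etc/fstab entry from one line of blkid output.
# Arguments: <blkid line> <mount point> <filesystem type>
# fstab_line is a hypothetical helper name, not an existing tool.
fstab_line() {
    # Extract the value of the UUID="..." field (PARTUUID is not matched,
    # because the pattern requires whitespace directly before "UUID").
    uuid=$(printf '%s\n' "$1" | sed -n 's/.*[[:space:]]UUID="\([^"]*\)".*/\1/p')
    if [ -z "$uuid" ]; then
        return 1
    fi
    printf 'UUID=%s %s %s defaults 0 0\n' "$uuid" "$2" "$3"
}

# Sample input copied from the blkid output above.
sample='/dev/nvme0n1p1: UUID="8b60d807-81c4-4061-b947-bfea65909117" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="a5bd4a72-01"'
fstab_line "$sample" /mnt/nvme0 ext4
```

On a live system the UUID alone can be read with sudo blkid -s UUID -o value /dev/nvme0n1p1, and the printed line can be piped into sudo tee -a /etc/fstab once it looks right.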
3.1.2 Disk read/write test
Change to the disk mount directory:
cd /mnt/nvme0
Drop the page cache:
sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
Write test:
sudo dd if=/dev/zero of=./test_write count=2000 bs=1024k
# 2000+0 records in
# 2000+0 records out
# 2097152000 bytes (2.1 GB, 2.0 GiB) copied, 1.44551 s, 1.5 GB/s
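The write speed here is 1.5GB/s. Without conv=fdatasync or oflag=direct, dd can finish before all data has reached the disk, so this figure may partly reflect the page cache.
Read test:
# Drop the page cache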
sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
sudo dd if=./test_write of=/dev/null count=2000 bs=1024k
2000+0 records in
2000+0 records out
2097152000 bytes (2.1 GB, 2.0 GiB) copied, 1.18762 s, 1.8 GB/s
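The read speed here is 1.8GB/s. Be sure to drop the cache first; otherwise the file is served from the page cache and the reported read speed is unrealistically high.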
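The rates that dd prints are rounded to two significant digits. As a small sketch, the figure can be recomputed from the byte count and elapsed time in the summary line; throughput_mb_s is a hypothetical helper name, and it uses decimal megabytes (1 MB = 1,000,000 bytes), the same unit family as dd's GB/s.

```sh
#!/bin/sh
# Convert dd's "bytes copied" and "seconds" into MB/s (decimal megabytes).
# throughput_mb_s is a hypothetical helper name.
throughput_mb_s() {
    LC_ALL=C awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

throughput_mb_s 2097152000 1.44551   # write test above
throughput_mb_s 2097152000 1.18762   # read test above
```

Both results are consistent with the 1.5 GB/s and 1.8 GB/s that dd reported after rounding.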
- The board has a PH1.25 fan header, so an active cooler can be installed.
3.2 Configuring the fan cooling policy
# Show the system temperature sensors
sensors
# npu_thermal-virtual-0
# Adapter: Virtual device
# temp1: +44.4°C (crit = +115.0°C)
# center_thermal-virtual-0
# Adapter: Virtual device
# temp1: +44.4°C (crit = +115.0°C)
# bigcore1_thermal-virtual-0
# Adapter: Virtual device
# temp1: +44.4°C (crit = +115.0°C)
# soc_thermal-virtual-0
# Adapter: Virtual device
# temp1: +44.4°C (crit = +115.0°C)
# gpu_thermal-virtual-0
# Adapter: Virtual device
# temp1: +43.5°C (crit = +115.0°C)
# littlecore_thermal-virtual-0
# Adapter: Virtual device
# temp1: +44.4°C (crit = +115.0°C)
# bigcore0_thermal-virtual-0
# Adapter: Virtual device
# temp1: +44.4°C (crit = +115.0°C)
# Show the current temperature of the NVMe SSD
sudo smartctl -a /dev/nvme0 | grep "Temperature:"
# Temperature: 36 Celsius
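If the sensors command is not installed, the same readings can usually be taken from sysfs. This is a rough sketch that assumes the kernel exposes the standard /sys/class/thermal interface, where temperatures are reported in millidegrees Celsius; millideg_to_c is a hypothetical helper name.

```sh
#!/bin/sh
# Convert a sysfs temperature in millidegrees Celsius to degrees, one decimal.
# millideg_to_c is a hypothetical helper name.
millideg_to_c() {
    LC_ALL=C awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Print every thermal zone the kernel exposes; prints nothing if there are none.
for zone in /sys/class/thermal/thermal_zone*; do
    if [ -r "$zone/temp" ]; then
        printf '%s: %s°C\n' "$(cat "$zone/type")" "$(millideg_to_c "$(cat "$zone/temp")")"
    fi
done
```

The zone names and the number of zones depend on the kernel build, so the output will not necessarily match the sensors listing line for line.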
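It is normal for the fan to stay off after boot: the CPU temperature is usually below 50°C at that point, and by default the fan only starts once the CPU reaches 50°C.
The following command loads every CPU core to 100%, after which the fan should start spinning (stop the load afterwards with pkill yes):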
for i in $(seq 0 $(( $(nproc --all) - 1)) ); do (taskset -c $i yes > /dev/null &); done
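The onboard fan's speed and on/off state are controlled through PWM, using pin PWM3_IR_M1. By default, Linux drives the fan with the pwm-fan driver, using the device tree configuration shown below:
Device tree file: kernel/arch/arm64/boot/dts/rockchip/rk3588-orangepi-5-plus.dts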
fan: pwm-fan {
compatible = "pwm-fan";
#cooling-cells = <2>;
pwms = <&pwm3 0 50000 0>;
cooling-levels = <0 50 100 150 200 255>;
rockchip,temp-trips = <
42000 1
48000 2
55000 3
62000 4
68000 5
>;
status = "okay";
};
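To make the table above easier to read, here is a small sketch that maps a temperature to a cooling level and PWM duty value. It assumes that each rockchip,temp-trips pair means "at or above this temperature (in millidegrees Celsius), use this index into cooling-levels"; the real driver may behave differently, for example by applying hysteresis. fan_level is a hypothetical helper name.

```sh
#!/bin/sh
# Values copied from the pwm-fan node above.
COOLING_LEVELS="0 50 100 150 200 255"                  # PWM duty per level (0-255)
TEMP_TRIPS="42000:1 48000:2 55000:3 62000:4 68000:5"   # millidegrees:level

# Print "<level> <pwm>" for a temperature given in millidegrees Celsius.
# Assumption: a trip applies once the temperature is at or above it.
fan_level() {
    level=0
    for trip in $TEMP_TRIPS; do
        # Keep the highest level whose trip temperature has been reached.
        if [ "$1" -ge "${trip%%:*}" ]; then
            level=${trip##*:}
        fi
    done
    # Select entry number <level> (counting from 0) of the cooling-levels list.
    set -- $COOLING_LEVELS
    shift "$level"
    echo "$level $1"
}

fan_level 41000
fan_level 50000
fan_level 70000
```

Under this reading the fan is off below 42°C and runs at full duty (255) from 68°C upward.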
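4 Basic tests
4.1 Power consumption
- System idle with the SSD installed: 6W
- Big cores at full load: 6W
- Little cores at full load: 1W
- NVMe at full load: 7W
- One network card at full load: 0.25W (AX210)
- Cooling fan: 0.5W
4.2 Selected test records
Basic information: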
neofetch
.-/+oossssoo+/-. orangepi@orangepi5ultra
`:+ssssssssssssssssss+:` -----------------------
-+ssssssssssssssssssyyssss+- OS: Ubuntu 22.04.5 LTS aarch64
.ossssssssssssssssssdMMMNysssso. Host: RK3588 OPi 5 Ultra
/ssssssssssshdmmNNmmyNMMMMhssssss/ Kernel: 6.1.43-rockchip-rk3588
+ssssssssshmydMMMMMMMNddddyssssssss+ Uptime: 14 hours, 27 mins
/sssssssshNMMMyhhyyyyhmNMMMNhssssssss/ Packages: 1746 (dpkg)
.ssssssssdMMMNhsssssssssshNMMMdssssssss. Shell: bash 5.1.16
+sssshhhyNMMNyssssssssssssyNMMMysssssss+ Resolution: 3520x2557
ossyNMMMNyMMhsssssssssssssshmmmhssssssso Theme: Adwaita [GTK3]
ossyNMMMNyMMhsssssssssssssshmmmhssssssso Icons: Adwaita [GTK3]
+sssshhhyNMMNyssssssssssssyNMMMysssssss+ Terminal: /dev/pts/0
.ssssssssdMMMNhsssssssssshNMMMdssssssss. CPU: (8) @ 1.800GHz
/sssssssshNMMMyhhyyyyhdNMMMNhssssssss/ Memory: 796MiB / 15964MiB
+sssssssssdmydMMMMMMMMddddyssssssss+
/ssssssssssshdmNNNNmyNMMMMhssssss/
.ossssssssssssssssssdMMMNysssso.
-+sssssssssssssssssyyyssss+-
`:+ssssssssssssssssss+:`
.-/+oossssoo+/-.
4.2.1 sbc-bench
git clone https://github.com/ThomasKaiser/sbc-bench.git
cd sbc-bench
sudo ./sbc-bench.sh
Main results:
# Current system state
Status of performance related governors found below /sys (w/o cpufreq):
dmc: dmc_ondemand / 2400 MHz (rknpu_ondemand dmc_ondemand userspace powersave performance simple_ondemand / 534 1320 1968 2400)
fb000000.gpu: simple_ondemand / 300 MHz (rknpu_ondemand dmc_ondemand userspace powersave performance simple_ondemand / 300 400 500 600 700 800 900 1000)
fdab0000.npu: rknpu_ondemand / 1000 MHz (rknpu_ondemand dmc_ondemand userspace powersave performance simple_ondemand / 300 400 500 600 700 800 900 1000)
sbc-bench v0.9.71
# Results validation
Results validation:
* Measured clockspeed not lower than advertised max CPU clockspeed
* No swapping
* Background activity (%system) OK
* Too much other background activity: 0% avg, 4% max -> https://tinyurl.com/mr2wy5uv
* No throttling
# Memory performance, measured separately on each of the three CPU clusters (little and big cores)
Memory performance (all 3 CPU clusters measured individually):
memcpy: 6599.2 MB/s (Cortex-A55)
memset: 21712.5 MB/s (Cortex-A55)
memcpy: 12925.7 MB/s (Cortex-A76)
memset: 27424.8 MB/s (Cortex-A76)
memcpy: 12844.3 MB/s (Cortex-A76)
memset: 27290.9 MB/s (Cortex-A76)
# 7-zip scores
7-zip total scores (3 consecutive runs): 15639,15694,15626, single-threaded: 2888
# OpenSSL scores
OpenSSL results (all 3 CPU clusters measured individually):
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-128-cbc 157907.60k 466907.84k 913189.29k 1211602.94k 1335563.61k 1345901.91k (Cortex-A55)
aes-128-cbc 637695.14k 1269587.78k 1627369.30k 1741073.41k 1785804.12k 1790377.98k (Cortex-A76)
aes-128-cbc 636491.22k 1270735.13k 1630370.90k 1745556.48k 1790883.16k 1795375.10k (Cortex-A76)
aes-192-cbc 150707.76k 417121.37k 746944.17k 933191.00k 1005458.77k 1011433.47k (Cortex-A55)
aes-192-cbc 593616.55k 1116556.37k 1378340.10k 1448504.66k 1489854.46k 1492833.62k (Cortex-A76)
aes-192-cbc 595237.65k 1113902.61k 1381619.88k 1452472.66k 1493748.39k 1496771.24k (Cortex-A76)
aes-256-cbc 146073.90k 382410.11k 645469.95k 780004.01k 829835.95k 833869.14k (Cortex-A55)
aes-256-cbc 575753.51k 986583.40k 1192946.60k 1254034.43k 1277599.74k 1279923.54k (Cortex-A76)
aes-256-cbc 574328.99k 989858.52k 1196424.45k 1257626.28k 1281086.81k 1283407.87k (Cortex-A76)
# System information
Unable to upload full test results. Please copy&paste the below stuff to pastebin.com and
provide the URL. Check the output for throttling and swapping please.
sbc-bench v0.9.71 RK3588 OPi 5 Ultra (Fri, 07 Mar 2025 23:43:38 +0800)
Distributor ID: Ubuntu
Description: Ubuntu 22.04.5 LTS
Release: 22.04
Codename: jammy
Build system: https://github.com/orangepi-xunlong/orangepi-build, 1.0.0, Orange Pi 5 Ultra, rockchip-rk3588, rockchip-rk3588
/usr/bin/gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Uptime: 23:43:38 up 14:24, 6 users, load average: 1.38, 1.21, 1.10, 42.5°C, 105341910
Linux 6.1.43-rockchip-rk3588 (orangepi5ultra) 03/07/25 _aarch64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.22 0.00 0.43 0.01 0.00 99.34
Device tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd
mmcblk1 0.35 12.28 4.81 0.00 637150 249217 0
nvme0n1 0.04 0.19 2.42 0.00 9717 125508 0
zram0 0.01 0.04 0.00 0.00 2248 4 0
zram1 0.07 0.02 0.55 0.00 1124 28508 0
total used free shared buff/cache available
Mem: 15Gi 709Mi 14Gi 13Mi 341Mi 14Gi
Swap: 7.8Gi 0B 7.8Gi
# Detailed results
# CPU frequency test
##########################################################################
Checking cpufreq OPP for cpu0-cpu3 (Cortex-A55):
Cpufreq OPP: 1800 Measured: 1787 (1787.703/1787.681/1787.614)
Cpufreq OPP: 1608 Measured: 1600 (1600.550/1600.490/1600.370)
Cpufreq OPP: 1416 Measured: 1402 (1402.854/1401.707/1401.690)
Cpufreq OPP: 1200 Measured: 1190 (1190.608/1190.608/1190.489)
Cpufreq OPP: 1008 Measured: 944 (944.281/944.210/944.175) (-6.3%)
Cpufreq OPP: 816 Measured: 755 (755.260/755.260/755.063) (-7.5%)
Cpufreq OPP: 600 Measured: 591 (591.393/591.319/591.282) (-1.5%)
Cpufreq OPP: 408 Measured: 393 (393.473/393.414/393.272) (-3.7%)
Checking cpufreq OPP for cpu4-cpu5 (Cortex-A76):
Cpufreq OPP: 2256 Measured: 2254 (2254.173/2254.117/2254.060)
Cpufreq OPP: 2208 Measured: 2230 (2230.562/2230.507/2230.423)
Cpufreq OPP: 2016 Measured: 2021 (2021.858/2021.808/2021.757)
Cpufreq OPP: 1800 Measured: 1853 (1853.985/1853.985/1853.870) (+2.9%)
Cpufreq OPP: 1608 Measured: 1623 (1623.488/1623.406/1623.305)
Cpufreq OPP: 1416 Measured: 1408 (1408.440/1408.440/1408.282)
Cpufreq OPP: 1200 Measured: 1101 (1101.257/1101.175/1101.037) (-8.3%)
Cpufreq OPP: 1008 Measured: 925 (925.898/925.851/925.840) (-8.2%)
Cpufreq OPP: 816 Measured: 742 (742.202/742.174/742.156) (-9.1%)
Cpufreq OPP: 600 Measured: 593 (593.208/593.208/593.186) (-1.2%)
Cpufreq OPP: 408 Measured: 395 (395.238/395.238/395.199) (-3.2%)
Checking cpufreq OPP for cpu6-cpu7 (Cortex-A76):
Cpufreq OPP: 2256 Measured: 2261 (2261.224/2261.196/2261.167)
Cpufreq OPP: 2208 Measured: 2237 (2237.741/2237.685/2237.685) (+1.3%)
Cpufreq OPP: 2016 Measured: 2001 (2001.445/2001.370/2001.270)
Cpufreq OPP: 1800 Measured: 1830 (1830.725/1830.564/1830.519) (+1.7%)
Cpufreq OPP: 1608 Measured: 1632 (1632.412/1632.249/1632.228) (+1.5%)
Cpufreq OPP: 1416 Measured: 1418 (1418.126/1418.108/1418.055)
Cpufreq OPP: 1200 Measured: 1109 (1110.045/1109.975/1109.892) (-7.6%)
Cpufreq OPP: 1008 Measured: 933 (933.992/933.992/933.969) (-7.4%)
Cpufreq OPP: 816 Measured: 749 (749.262/749.177/749.140) (-8.2%)
Cpufreq OPP: 600 Measured: 593 (593.140/593.140/593.103) (-1.2%)
Cpufreq OPP: 408 Measured: 395 (395.189/395.189/395.179) (-3.2%)
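The bracketed percentages in the listing above appear to be the relative deviation of the measured clock from the nominal OPP. A minimal sketch for recomputing them follows; opp_deviation is a hypothetical helper name, and rounding in borderline cases may differ from what sbc-bench prints.

```sh
#!/bin/sh
# Percentage deviation of the measured clock from the nominal OPP (both in MHz).
# opp_deviation is a hypothetical helper name.
opp_deviation() {
    LC_ALL=C awk -v opp="$1" -v meas="$2" \
        'BEGIN { printf "%+.1f%%\n", (meas - opp) / opp * 100 }'
}

opp_deviation 1008 944    # Cortex-A55 row above
opp_deviation 1800 1853   # Cortex-A76 (cpu4-cpu5) row above
```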
# benchmark
##########################################################################
Executing benchmark on cpu0 (Cortex-A55):
tinymembench v0.4.9-nuumio (simple benchmark for memory throughput and latency)
CFLAGS:
bandwidth test min repeats (-b): 2
bandwidth test max repeats (-B): 3
bandwidth test mem realloc (-M): no (-m for realloc)
latency test repeats (-l): 3
latency test count (-c): 1000000
==========================================================================
== Memory bandwidth tests ==
== ==
== Note 1: 1MB = 1000000 bytes ==
== Note 2: Test result is the best of repeated runs. Number of repeats ==
== is shown in brackets ==
== Note 3: Results for 'copy' tests show how many bytes can be ==
== copied per second (adding together read and writen ==
== bytes would have provided twice higher numbers) ==
== Note 4: 2-pass copy means that we are using a small temporary buffer ==
== to first fetch data into it, and only then write it to the ==
== destination (source -> L1 cache, L1 cache -> destination) ==
== Note 5: If sample standard deviation exceeds 0.1%, it is shown in ==
== brackets ==
==========================================================================
C copy backwards : 2720.7 MB/s (3, 5.5%)
C copy backwards (32 byte blocks) : 2692.4 MB/s (3, 0.3%)
C copy backwards (64 byte blocks) : 2723.4 MB/s (3, 0.4%)
C copy : 6053.8 MB/s (2)
C copy prefetched (32 bytes step) : 2242.4 MB/s (2)
C copy prefetched (64 bytes step) : 6268.1 MB/s (2)
C 2-pass copy : 2645.6 MB/s (3, 0.3%)
C 2-pass copy prefetched (32 bytes step) : 1440.8 MB/s (3, 0.3%)
C 2-pass copy prefetched (64 bytes step) : 2910.4 MB/s (2)
C scan 8 : 441.8 MB/s (2)
C scan 16 : 876.5 MB/s (2)
C scan 32 : 1735.8 MB/s (2)
C scan 64 : 3411.4 MB/s (2)
C fill : 12336.9 MB/s (2)
C fill (shuffle within 16 byte blocks) : 12340.4 MB/s (2)
C fill (shuffle within 32 byte blocks) : 12339.9 MB/s (2)
C fill (shuffle within 64 byte blocks) : 12046.4 MB/s (2)
---
libc memcpy copy : 6599.2 MB/s (3, 0.4%)
libc memchr scan : 2734.5 MB/s (2)
libc memset fill : 21712.5 MB/s (2)
---
NEON LDP/STP copy : 5677.2 MB/s (2)
NEON LDP/STP copy pldl2strm (32 bytes step) : 1731.4 MB/s (2)
NEON LDP/STP copy pldl2strm (64 bytes step) : 3573.7 MB/s (3, 0.1%)
NEON LDP/STP copy pldl1keep (32 bytes step) : 2586.6 MB/s (3)
NEON LDP/STP copy pldl1keep (64 bytes step) : 5437.1 MB/s (2)
NEON LD1/ST1 copy : 5454.8 MB/s (2)
NEON LDP load : 6844.2 MB/s (3, 0.2%)
NEON LDNP load : 7050.1 MB/s (2)
NEON STP fill : 21630.1 MB/s (3, 0.2%)
NEON STNP fill : 14517.3 MB/s (3, 2.2%)
ARM LDP/STP copy : 5456.9 MB/s (3, 3.2%)
ARM LDP load : 6610.1 MB/s (3, 1.4%)
ARM LDNP load : 6570.1 MB/s (3, 2.6%)
ARM STP fill : 21617.3 MB/s (3, 0.7%)
ARM STNP fill : 15243.5 MB/s (3, 1.2%)
==========================================================================
== Memory latency test ==
== ==
== Average time is measured for random memory accesses in the buffers ==
== of different sizes. The larger is the buffer, the more significant ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM ==
== accesses. For extremely large buffer sizes we are expecting to see ==
== page table walk with several requests to SDRAM for almost every ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest). ==
== ==
== Note 1: All the numbers are representing extra time, which needs to ==
== be added to L1 cache latency. The cycle timings for L1 cache ==
== latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
== two independent memory accesses at a time. In the case if ==
== the memory subsystem can't handle multiple outstanding ==
== requests, dual random read has the same timings as two ==
== single reads performed one after another. ==
==========================================================================
block size : single random read / dual random read
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.1 ns / 0.0 ns
32768 : 0.5 ns / 0.9 ns
65536 : 1.5 ns / 2.6 ns
131072 : 4.0 ns / 6.3 ns
262144 : 9.0 ns / 12.0 ns
524288 : 13.2 ns / 15.3 ns
1048576 : 16.0 ns / 16.6 ns
2097152 : 25.7 ns / 25.5 ns
4194304 : 55.5 ns / 78.5 ns
8388608 : 102.6 ns / 141.1 ns
16777216 : 130.5 ns / 162.7 ns
33554432 : 144.9 ns / 171.7 ns
67108864 : 154.3 ns / 179.4 ns
Executing benchmark on cpu4 (Cortex-A76):
tinymembench v0.4.9-nuumio (simple benchmark for memory throughput and latency)
CFLAGS:
bandwidth test min repeats (-b): 2
bandwidth test max repeats (-B): 3
bandwidth test mem realloc (-M): no (-m for realloc)
latency test repeats (-l): 3
latency test count (-c): 1000000
==========================================================================
== Memory bandwidth tests ==
== ==
== Note 1: 1MB = 1000000 bytes ==
== Note 2: Test result is the best of repeated runs. Number of repeats ==
== is shown in brackets ==
== Note 3: Results for 'copy' tests show how many bytes can be ==
== copied per second (adding together read and writen ==
== bytes would have provided twice higher numbers) ==
== Note 4: 2-pass copy means that we are using a small temporary buffer ==
== to first fetch data into it, and only then write it to the ==
== destination (source -> L1 cache, L1 cache -> destination) ==
== Note 5: If sample standard deviation exceeds 0.1%, it is shown in ==
== brackets ==
==========================================================================
C copy backwards : 11985.9 MB/s (3, 0.2%)
C copy backwards (32 byte blocks) : 11998.8 MB/s (3, 0.1%)
C copy backwards (64 byte blocks) : 11978.0 MB/s (2)
C copy : 12503.9 MB/s (2)
C copy prefetched (32 bytes step) : 12969.3 MB/s (3, 0.1%)
C copy prefetched (64 bytes step) : 12996.2 MB/s (3, 0.2%)
C 2-pass copy : 4821.6 MB/s (3, 0.5%)
C 2-pass copy prefetched (32 bytes step) : 7304.9 MB/s (2)
C 2-pass copy prefetched (64 bytes step) : 6461.2 MB/s (2)
C scan 8 : 1116.4 MB/s (2)
C scan 16 : 2233.6 MB/s (2)
C scan 32 : 4468.5 MB/s (2)
C scan 64 : 8917.2 MB/s (2)
C fill : 27435.4 MB/s (3, 0.8%)
C fill (shuffle within 16 byte blocks) : 27462.1 MB/s (3, 0.3%)
C fill (shuffle within 32 byte blocks) : 27503.2 MB/s (3, 0.4%)
C fill (shuffle within 64 byte blocks) : 27267.6 MB/s (3, 0.2%)
---
libc memcpy copy : 12925.7 MB/s (3, 0.1%)
libc memchr scan : 14988.3 MB/s (2)
libc memset fill : 27424.8 MB/s (3, 0.4%)
---
NEON LDP/STP copy : 12900.6 MB/s (3, 0.1%)
NEON LDP/STP copy pldl2strm (32 bytes step) : 13142.8 MB/s (3)
NEON LDP/STP copy pldl2strm (64 bytes step) : 13114.4 MB/s (2)
NEON LDP/STP copy pldl1keep (32 bytes step) : 12900.2 MB/s (3, 0.1%)
NEON LDP/STP copy pldl1keep (64 bytes step) : 12893.7 MB/s (3, 0.1%)
NEON LD1/ST1 copy : 12771.5 MB/s (3, 0.2%)
NEON LDP load : 16842.6 MB/s (2)
NEON LDNP load : 15980.6 MB/s (2)
NEON STP fill : 27327.4 MB/s (3, 0.3%)
NEON STNP fill : 27432.6 MB/s (3, 0.8%)
ARM LDP/STP copy : 12847.1 MB/s (2)
ARM LDP load : 16339.8 MB/s (2)
ARM LDNP load : 15212.7 MB/s (3)
ARM STP fill : 27405.7 MB/s (3, 1.0%)
ARM STNP fill : 27433.5 MB/s (3, 0.3%)
==========================================================================
== Memory latency test ==
== ==
== Average time is measured for random memory accesses in the buffers ==
== of different sizes. The larger is the buffer, the more significant ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM ==
== accesses. For extremely large buffer sizes we are expecting to see ==
== page table walk with several requests to SDRAM for almost every ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest). ==
== ==
== Note 1: All the numbers are representing extra time, which needs to ==
== be added to L1 cache latency. The cycle timings for L1 cache ==
== latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
== two independent memory accesses at a time. In the case if ==
== the memory subsystem can't handle multiple outstanding ==
== requests, dual random read has the same timings as two ==
== single reads performed one after another. ==
==========================================================================
block size : single random read / dual random read
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 0.1 ns / 0.0 ns
131072 : 1.3 ns / 1.6 ns
262144 : 2.8 ns / 3.0 ns
524288 : 5.7 ns / 6.5 ns
1048576 : 12.1 ns / 13.4 ns
2097152 : 25.3 ns / 18.7 ns
4194304 : 51.2 ns / 67.6 ns
8388608 : 100.9 ns / 131.4 ns
16777216 : 126.4 ns / 152.5 ns
33554432 : 139.4 ns / 164.2 ns
67108864 : 147.8 ns / 165.2 ns
Executing benchmark on cpu6 (Cortex-A76):
tinymembench v0.4.9-nuumio (simple benchmark for memory throughput and latency)
CFLAGS:
bandwidth test min repeats (-b): 2
bandwidth test max repeats (-B): 3
bandwidth test mem realloc (-M): no (-m for realloc)
latency test repeats (-l): 3
latency test count (-c): 1000000
==========================================================================
== Memory bandwidth tests ==
== ==
== Note 1: 1MB = 1000000 bytes ==
== Note 2: Test result is the best of repeated runs. Number of repeats ==
== is shown in brackets ==
== Note 3: Results for 'copy' tests show how many bytes can be ==
== copied per second (adding together read and writen ==
== bytes would have provided twice higher numbers) ==
== Note 4: 2-pass copy means that we are using a small temporary buffer ==
== to first fetch data into it, and only then write it to the ==
== destination (source -> L1 cache, L1 cache -> destination) ==
== Note 5: If sample standard deviation exceeds 0.1%, it is shown in ==
== brackets ==
==========================================================================
C copy backwards : 11985.9 MB/s (3, 0.2%)
C copy backwards (32 byte blocks) : 11929.1 MB/s (2)
C copy backwards (64 byte blocks) : 11923.8 MB/s (2)
C copy : 12425.0 MB/s (3, 0.1%)
C copy prefetched (32 bytes step) : 12865.5 MB/s (3, 0.1%)
C copy prefetched (64 bytes step) : 12859.6 MB/s (3, 0.1%)
C 2-pass copy : 4426.1 MB/s (3, 1.0%)
C 2-pass copy prefetched (32 bytes step) : 6922.1 MB/s (3, 0.3%)
C 2-pass copy prefetched (64 bytes step) : 6330.9 MB/s (2)
C scan 8 : 1116.8 MB/s (2)
C scan 16 : 2233.0 MB/s (2)
C scan 32 : 4469.8 MB/s (2)
C scan 64 : 8922.5 MB/s (2)
C fill : 27412.3 MB/s (3, 0.7%)
C fill (shuffle within 16 byte blocks) : 27450.6 MB/s (3, 0.4%)
C fill (shuffle within 32 byte blocks) : 27455.9 MB/s (3, 0.4%)
C fill (shuffle within 64 byte blocks) : 27316.2 MB/s (3, 0.2%)
---
libc memcpy copy : 12844.3 MB/s (2)
libc memchr scan : 14974.8 MB/s (3)
libc memset fill : 27290.9 MB/s (3, 0.3%)
---
NEON LDP/STP copy : 12847.3 MB/s (2)
NEON LDP/STP copy pldl2strm (32 bytes step) : 13136.4 MB/s (3, 0.1%)
NEON LDP/STP copy pldl2strm (64 bytes step) : 13131.1 MB/s (3)
NEON LDP/STP copy pldl1keep (32 bytes step) : 12904.4 MB/s (3, 0.1%)
NEON LDP/STP copy pldl1keep (64 bytes step) : 12898.5 MB/s (3, 0.4%)
NEON LD1/ST1 copy : 12755.7 MB/s (2)
NEON LDP load : 16794.4 MB/s (3)
NEON LDNP load : 15786.2 MB/s (2)
NEON STP fill : 27268.1 MB/s (3, 0.3%)
NEON STNP fill : 27426.0 MB/s (3, 0.3%)
ARM LDP/STP copy : 12894.1 MB/s (3, 0.2%)
ARM LDP load : 16378.3 MB/s (2)
ARM LDNP load : 15103.7 MB/s (2)
ARM STP fill : 27357.4 MB/s (3, 0.5%)
ARM STNP fill : 27449.6 MB/s (3, 0.3%)
==========================================================================
== Memory latency test ==
== ==
== Average time is measured for random memory accesses in the buffers ==
== of different sizes. The larger is the buffer, the more significant ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM ==
== accesses. For extremely large buffer sizes we are expecting to see ==
== page table walk with several requests to SDRAM for almost every ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest). ==
== ==
== Note 1: All the numbers are representing extra time, which needs to ==
== be added to L1 cache latency. The cycle timings for L1 cache ==
== latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
== two independent memory accesses at a time. In the case if ==
== the memory subsystem can't handle multiple outstanding ==
== requests, dual random read has the same timings as two ==
== single reads performed one after another. ==
==========================================================================
block size : single random read / dual random read
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 0.1 ns / 0.0 ns
131072 : 1.3 ns / 1.6 ns
262144 : 2.8 ns / 2.9 ns
524288 : 5.8 ns / 6.6 ns
1048576 : 12.1 ns / 13.4 ns
2097152 : 21.1 ns / 16.5 ns
4194304 : 50.6 ns / 69.8 ns
8388608 : 96.2 ns / 132.4 ns
16777216 : 125.9 ns / 152.7 ns
33554432 : 143.6 ns / 160.2 ns
67108864 : 147.2 ns / 168.5 ns
##########################################################################
Executing ramlat on cpu0 (Cortex-A55), results in ns:
size: 1x32 2x32 1x64 2x64 1xPTR 2xPTR 4xPTR 8xPTR
4k: 1.700 1.692 1.689 1.688 1.125 1.687 2.285 4.604
8k: 1.687 1.687 1.687 1.687 1.124 1.687 2.285 4.605
16k: 1.695 1.687 1.698 1.687 1.131 1.687 2.285 4.604
32k: 1.716 1.690 1.713 1.689 1.143 1.690 2.290 4.611
64k: 10.53 11.65 10.53 11.65 10.67 11.65 16.32 29.59
128k: 13.38 14.83 13.38 14.82 14.04 14.83 21.84 40.83
256k: 16.17 16.57 16.18 16.54 15.65 16.76 25.85 50.49
512k: 17.06 17.24 17.07 17.19 16.34 17.40 27.11 54.07
1024k: 17.17 17.33 17.17 17.36 16.59 17.56 28.43 54.18
2048k: 19.53 19.23 19.58 29.47 20.56 22.66 37.45 76.78
4096k: 152.7 151.5 150.7 143.2 143.0 117.3 153.6 280.1
8192k: 130.5 137.8 123.5 134.2 132.2 138.5 208.6 366.3
16384k: 145.2 146.6 141.6 145.8 142.4 147.5 228.3 399.2
32768k: 153.1 154.9 149.7 151.1 154.0 155.5 227.7 408.5
65536k: 159.3 163.1 160.2 159.4 160.8 159.6 232.7 417.2
131072k: 167.7 166.1 164.1 167.5 165.6 167.2 239.2 418.6
Executing ramlat on cpu4 (Cortex-A76), results in ns:
size: 1x32 2x32 1x64 2x64 1xPTR 2xPTR 4xPTR 8xPTR
4k: 1.783 1.783 1.783 1.783 1.783 1.783 1.783 3.396
8k: 1.783 1.783 1.783 1.783 1.783 1.783 1.783 3.474
16k: 1.783 1.783 1.782 1.783 1.782 1.783 1.783 3.474
32k: 1.783 1.783 1.782 1.783 1.782 1.783 1.783 3.476
64k: 1.784 1.783 1.784 1.783 1.783 1.784 1.784 3.477
128k: 5.348 5.351 5.348 5.351 5.348 6.128 7.566 13.51
256k: 6.341 6.412 6.364 6.424 6.354 6.363 7.844 13.52
512k: 9.567 9.254 9.597 9.246 9.532 9.793 11.27 17.56
1024k: 18.50 18.84 18.41 18.82 18.47 18.64 20.55 31.15
2048k: 19.57 20.11 19.86 20.10 19.48 20.33 22.54 32.88
4096k: 70.00 55.27 67.82 55.24 80.24 54.03 54.66 68.09
8192k: 136.0 111.1 121.7 106.9 131.8 100.7 107.1 109.1
16384k: 146.2 137.2 148.5 137.7 145.6 131.8 132.5 131.5
32768k: 153.5 153.7 157.3 153.8 152.9 149.7 144.7 137.1
65536k: 160.8 156.6 158.3 154.8 160.5 152.8 150.0 147.6
131072k: 160.7 155.0 158.3 154.6 159.7 156.7 151.5 157.4
Executing ramlat on cpu6 (Cortex-A76), results in ns:
size: 1x32 2x32 1x64 2x64 1xPTR 2xPTR 4xPTR 8xPTR
4k: 1.778 1.777 1.777 1.777 1.777 1.777 1.777 3.383
8k: 1.777 1.777 1.777 1.777 1.777 1.777 1.777 3.463
16k: 1.777 1.777 1.777 1.777 1.777 1.777 1.777 3.462
32k: 1.777 1.777 1.777 1.777 1.777 1.777 1.778 3.465
64k: 1.778 1.777 1.778 1.777 1.778 1.778 1.778 3.466
128k: 5.330 5.332 5.330 5.332 5.330 6.108 7.550 13.47
256k: 6.800 6.810 6.847 6.804 6.843 6.783 8.103 14.24
512k: 10.52 10.15 10.31 10.15 10.40 10.69 12.32 18.76
1024k: 17.99 17.97 17.98 17.97 17.98 18.10 20.05 29.84
2048k: 22.03 20.17 20.96 20.30 20.97 20.47 22.68 34.07
4096k: 81.57 60.44 66.90 52.04 78.90 52.95 53.36 69.04
8192k: 131.7 110.8 121.7 109.4 128.1 102.9 100.6 106.5
16384k: 147.4 137.0 143.6 133.5 148.3 130.9 125.4 128.1
32768k: 152.3 150.2 153.2 152.8 150.2 146.5 145.2 135.2
65536k: 159.6 156.9 158.3 154.8 159.7 147.9 147.3 146.7
131072k: 160.6 155.9 158.5 148.5 158.6 147.3 153.0 154.7
##########################################################################
Executing benchmark on each cluster individually
OpenSSL 3.0.2, built on 15 Mar 2022 (Library: OpenSSL 3.0.2 15 Mar 2022)
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-128-cbc 157907.60k 466907.84k 913189.29k 1211602.94k 1335563.61k 1345901.91k (Cortex-A55)
aes-128-cbc 637695.14k 1269587.78k 1627369.30k 1741073.41k 1785804.12k 1790377.98k (Cortex-A76)
aes-128-cbc 636491.22k 1270735.13k 1630370.90k 1745556.48k 1790883.16k 1795375.10k (Cortex-A76)
aes-192-cbc 150707.76k 417121.37k 746944.17k 933191.00k 1005458.77k 1011433.47k (Cortex-A55)
aes-192-cbc 593616.55k 1116556.37k 1378340.10k 1448504.66k 1489854.46k 1492833.62k (Cortex-A76)
aes-192-cbc 595237.65k 1113902.61k 1381619.88k 1452472.66k 1493748.39k 1496771.24k (Cortex-A76)
aes-256-cbc 146073.90k 382410.11k 645469.95k 780004.01k 829835.95k 833869.14k (Cortex-A55)
aes-256-cbc 575753.51k 986583.40k 1192946.60k 1254034.43k 1277599.74k 1279923.54k (Cortex-A76)
aes-256-cbc 574328.99k 989858.52k 1196424.45k 1257626.28k 1281086.81k 1283407.87k (Cortex-A76)
##########################################################################
Executing benchmark single-threaded on cpu0 (Cortex-A55)
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,8 CPUs LE)
LE
CPU Freq: - - - - 128000000 256000000 - - -
RAM size: 15964 MB, # CPU hardware threads: 8
RAM usage: 435 MB, # Benchmark threads: 1
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 1262 100 1228 1228 | 21180 100 1808 1808
23: 1185 100 1208 1208 | 21006 100 1818 1818
24: 1162 100 1250 1250 | 20577 100 1807 1807
25: 1116 100 1274 1274 | 20023 100 1782 1782
---------------------------------- | ------------------------------
Avr: 100 1240 1240 | 100 1804 1804
Tot: 100 1522 1522
Executing benchmark single-threaded on cpu4 (Cortex-A76)
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,8 CPUs LE)
LE
CPU Freq: - - - - - - - - -
RAM size: 15964 MB, # CPU hardware threads: 8
RAM usage: 435 MB, # Benchmark threads: 1
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 2773 100 2698 2698 | 36511 100 3117 3117
23: 2589 100 2638 2638 | 36036 100 3119 3119
24: 2459 100 2644 2644 | 35162 100 3087 3087
25: 2359 100 2694 2694 | 34188 100 3043 3043
---------------------------------- | ------------------------------
Avr: 100 2669 2669 | 100 3092 3092
Tot: 100 2880 2880
Executing benchmark single-threaded on cpu6 (Cortex-A76)
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,8 CPUs LE)
LE
CPU Freq: - - - - - - - - 2048000000
RAM size: 15964 MB, # CPU hardware threads: 8
RAM usage: 435 MB, # Benchmark threads: 1
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 2705 100 2633 2632 | 36740 100 3137 3137
23: 2604 100 2654 2653 | 36133 100 3128 3128
24: 2495 100 2683 2683 | 35325 100 3101 3101
25: 2375 100 2712 2712 | 34377 100 3060 3060
---------------------------------- | ------------------------------
Avr: 100 2670 2670 | 100 3107 3106
Tot: 100 2888 2888
##########################################################################
Executing benchmark 3 times multi-threaded on CPUs 0-7
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,8 CPUs LE)
LE
CPU Freq: - 64000000 - - - - - - -
RAM size: 15964 MB, # CPU hardware threads: 8
RAM usage: 1765 MB, # Benchmark threads: 8
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 15096 751 1955 14686 | 198449 676 2505 16927
23: 14164 728 1983 14431 | 193292 675 2479 16727
24: 13587 748 1953 14609 | 188499 676 2449 16544
25: 13055 783 1904 14907 | 182935 674 2414 16281
---------------------------------- | ------------------------------
Avr: 753 1949 14658 | 675 2462 16620
Tot: 714 2205 15639
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,8 CPUs LE)
LE
CPU Freq: - - - - - - - - -
RAM size: 15964 MB, # CPU hardware threads: 8
RAM usage: 1765 MB, # Benchmark threads: 8
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 15367 766 1953 14950 | 197978 675 2503 16887
23: 14125 714 2016 14392 | 193111 674 2478 16711
24: 13829 770 1931 14870 | 188669 676 2449 16559
25: 13023 772 1927 14870 | 183297 676 2413 16313
---------------------------------- | ------------------------------
Avr: 755 1957 14770 | 675 2461 16617
Tot: 715 2209 15694
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,8 CPUs LE)
LE
CPU Freq: - - - - - - - - -
RAM size: 15964 MB, # CPU hardware threads: 8
RAM usage: 1765 MB, # Benchmark threads: 8
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 15088 745 1969 14678 | 198129 677 2495 16900
23: 14147 731 1971 14415 | 192542 673 2475 16662
24: 13825 767 1937 14865 | 188253 675 2448 16523
25: 12849 760 1932 14672 | 183105 676 2410 16296
---------------------------------- | ------------------------------
Avr: 751 1952 14657 | 675 2457 16595
Tot: 713 2205 15626
Compression: 14658,14770,14657
Decompression: 16620,16617,16595
Total: 15639,15694,15626
# Verify the real clock frequencies
##########################################################################
Testing maximum cpufreq again, still under full load. System health now:
Time cpu0/cpu4/cpu6 load %cpu %sys %usr %nice %io %irq Temp
23:59:32: 1800/2256/2256MHz 8.45 88% 1% 87% 0% 0% 0% 61.0°C
Checking cpufreq OPP for cpu0-cpu3 (Cortex-A55):
Cpufreq OPP: 1800 Measured: 1777 (1777.984/1777.940/1776.963) (-1.3%)
Checking cpufreq OPP for cpu4-cpu5 (Cortex-A76):
Cpufreq OPP: 2256 Measured: 2240 (2240.683/2240.683/2240.375)
Checking cpufreq OPP for cpu6-cpu7 (Cortex-A76):
Cpufreq OPP: 2256 Measured: 2247 (2247.757/2247.673/2247.616)
# Memory (DRAM) clock test
##########################################################################
DRAM clock transitions since last boot (52833430 ms ago):
/sys/devices/platform/dmc/devfreq/dmc:
From : To
: 534000000 1320000000 1968000000 2400000000 time(ms)
534000000: 0 0 0 796212 7785276
1320000000: 183 0 0 1352 11516
1968000000: 5 59 0 302 2696
*2400000000: 796024 1476 366 0 45032793
Total transition : 1595979
# Thermal behavior is fine: the SoC peaks at 63.8°C during the multi-core 7-zip run
##########################################################################
Thermal source: /sys/devices/virtual/thermal/thermal_zone0/ (soc-thermal)
System health while running tinymembench:
Time cpu0/cpu4/cpu6 load %cpu %sys %usr %nice %io %irq Temp
23:45:55: 1800/2256/2256MHz 1.94 0% 0% 0% 0% 0% 0% 44.4°C
23:46:25: 1800/2256/2256MHz 1.96 12% 0% 12% 0% 0% 0% 47.2°C
23:46:55: 1800/2256/2256MHz 2.28 16% 0% 13% 1% 0% 0% 49.0°C
23:47:25: 1800/2256/2256MHz 2.17 13% 0% 12% 0% 0% 0% 50.8°C
23:47:55: 1800/2256/2256MHz 2.34 13% 0% 12% 0% 0% 0% 54.5°C
23:48:26: 1800/2256/2256MHz 2.21 13% 0% 12% 0% 0% 0% 54.5°C
23:48:56: 1800/2256/2256MHz 2.12 13% 0% 12% 0% 0% 0% 57.3°C
System health while running ramlat:
Time cpu0/cpu4/cpu6 load %cpu %sys %usr %nice %io %irq Temp
23:49:17: 1800/2256/2256MHz 2.09 0% 0% 0% 0% 0% 0% 54.5°C
23:49:26: 1800/2256/2256MHz 2.07 12% 0% 12% 0% 0% 0% 51.8°C
23:49:35: 1800/2256/2256MHz 2.14 16% 0% 12% 3% 0% 0% 52.7°C
23:49:44: 1800/2256/2256MHz 2.12 13% 0% 12% 0% 0% 0% 51.8°C
23:49:53: 1800/2256/2256MHz 2.11 12% 0% 12% 0% 0% 0% 50.8°C
23:50:02: 1800/2256/2256MHz 2.09 13% 0% 12% 0% 0% 0% 50.8°C
23:50:11: 1800/2256/2256MHz 2.08 13% 0% 12% 0% 0% 0% 50.8°C
23:50:20: 1800/2256/2256MHz 2.07 13% 0% 12% 0% 0% 0% 51.8°C
23:50:29: 1800/2256/2256MHz 2.06 13% 0% 12% 0% 0% 0% 50.8°C
23:50:38: 1800/2256/2256MHz 2.05 13% 0% 12% 0% 0% 0% 51.8°C
23:50:47: 1800/2256/2256MHz 2.19 13% 0% 12% 0% 0% 0% 50.8°C
System health while running OpenSSL benchmark:
Time cpu0/cpu4/cpu6 load %cpu %sys %usr %nice %io %irq Temp
23:50:53: 1800/2256/2256MHz 2.17 0% 0% 0% 0% 0% 0% 51.8°C
23:51:09: 1800/2256/2256MHz 2.12 12% 0% 12% 0% 0% 0% 49.9°C
23:51:25: 1800/2256/2256MHz 2.10 12% 0% 12% 0% 0% 0% 50.8°C
23:51:41: 1800/2256/2256MHz 2.07 13% 0% 12% 0% 0% 0% 50.8°C
23:51:57: 1800/2256/2256MHz 2.06 12% 0% 12% 0% 0% 0% 49.9°C
23:52:13: 1800/2256/2256MHz 2.04 12% 0% 12% 0% 0% 0% 50.8°C
23:52:29: 1800/2256/2256MHz 2.03 12% 0% 12% 0% 0% 0% 50.8°C
23:52:45: 1800/2256/2256MHz 2.02 12% 0% 12% 0% 0% 0% 49.9°C
23:53:01: 1800/2256/2256MHz 2.02 12% 0% 12% 0% 0% 0% 50.8°C
23:53:17: 1800/2256/2256MHz 2.01 12% 0% 12% 0% 0% 0% 50.8°C
23:53:33: 1800/2256/2256MHz 2.01 12% 0% 12% 0% 0% 0% 49.9°C
System health while running 7-zip single core benchmark:
Time cpu0/cpu4/cpu6 load %cpu %sys %usr %nice %io %irq Temp
23:53:35: 1800/2256/2256MHz 2.01 0% 0% 0% 0% 0% 0% 51.8°C
23:53:45: 1800/2256/2256MHz 2.01 13% 0% 12% 0% 0% 0% 49.9°C
23:53:55: 1800/2256/2256MHz 2.00 13% 0% 12% 0% 0% 0% 49.9°C
23:54:05: 1800/2256/2256MHz 2.00 13% 0% 12% 0% 0% 0% 49.9°C
23:54:15: 1800/2256/2256MHz 2.00 13% 0% 12% 0% 0% 0% 49.0°C
23:54:25: 1800/2256/2256MHz 2.00 13% 0% 12% 0% 0% 0% 49.0°C
23:54:35: 1800/2256/2256MHz 2.00 13% 0% 12% 0% 0% 0% 49.0°C
23:54:45: 1800/2256/2256MHz 2.00 13% 0% 12% 0% 0% 0% 49.0°C
23:54:55: 1800/2256/2256MHz 2.00 13% 0% 12% 0% 0% 0% 49.0°C
23:55:05: 1800/2256/2256MHz 2.00 12% 0% 12% 0% 0% 0% 49.0°C
23:55:15: 1800/2256/2256MHz 2.00 13% 0% 12% 0% 0% 0% 49.0°C
23:55:25: 1800/2256/2256MHz 2.00 12% 0% 12% 0% 0% 0% 50.8°C
23:55:35: 1800/2256/2256MHz 2.00 12% 0% 12% 0% 0% 0% 50.8°C
23:55:45: 1800/2256/2256MHz 2.00 13% 0% 12% 0% 0% 0% 50.8°C
23:55:55: 1800/2256/2256MHz 2.00 12% 0% 12% 0% 0% 0% 50.8°C
23:56:05: 1800/2256/2256MHz 2.00 12% 0% 12% 0% 0% 0% 50.8°C
23:56:15: 1800/2256/2256MHz 2.00 12% 0% 12% 0% 0% 0% 50.8°C
23:56:26: 1800/2256/2256MHz 2.00 12% 0% 12% 0% 0% 0% 50.8°C
23:56:36: 1800/2256/2256MHz 2.00 13% 0% 12% 0% 0% 0% 50.8°C
23:56:46: 1800/2256/2256MHz 2.00 13% 0% 12% 0% 0% 0% 50.8°C
System health while running 7-zip multi core benchmark:
Time cpu0/cpu4/cpu6 load %cpu %sys %usr %nice %io %irq Temp
23:56:54: 1800/2256/2256MHz 2.00 0% 0% 0% 0% 0% 0% 56.4°C
23:57:04: 1800/2256/2256MHz 3.08 87% 0% 87% 0% 0% 0% 57.3°C
23:57:14: 1800/2256/2256MHz 3.55 87% 0% 86% 0% 0% 0% 59.2°C
23:57:25: 1800/2256/2256MHz 4.48 89% 1% 88% 0% 0% 0% 62.8°C
23:57:35: 1800/2256/2256MHz 5.56 79% 1% 78% 0% 0% 0% 61.9°C
23:57:45: 1800/2256/2256MHz 5.85 90% 1% 89% 0% 0% 0% 60.1°C
23:57:55: 1800/2256/2256MHz 6.26 92% 0% 91% 0% 0% 0% 60.1°C
23:58:05: 1800/2256/2256MHz 6.83 86% 0% 85% 0% 0% 0% 63.8°C
23:58:19: 1800/2256/2256MHz 7.24 83% 1% 82% 0% 0% 0% 63.8°C
23:58:29: 1800/2256/2256MHz 7.59 82% 1% 80% 0% 0% 0% 61.9°C
23:58:39: 1800/2256/2256MHz 7.96 89% 0% 88% 0% 0% 0% 61.0°C
23:58:49: 1800/2256/2256MHz 7.80 92% 0% 91% 0% 0% 0% 61.9°C
23:58:59: 1800/2256/2256MHz 7.98 86% 0% 85% 0% 0% 0% 63.8°C
23:59:12: 1800/2256/2256MHz 8.29 83% 1% 82% 0% 0% 0% 63.8°C
23:59:22: 1800/2256/2256MHz 8.27 82% 1% 80% 0% 0% 0% 62.8°C
23:59:32: 1800/2256/2256MHz 8.45 88% 1% 87% 0% 0% 0% 61.0°C
##########################################################################
SoC guess: Rockchip RK3588 (35881000)
DMC gov: dmc_ondemand (upthreshold: 40, downdifferential: 20)
DT compat: rockchip,rk3588-orangepi-5-ultra
rockchip,rk3588
Compiler: /usr/bin/gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 / aarch64-linux-gnu
Userland: arm64
Kernel: 6.1.43-rockchip-rk3588/aarch64
CONFIG_HZ=300
CONFIG_HZ_300=y
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_PREEMPT_VOLUNTARY=y
CONFIG_PREEMPT_VOLUNTARY_BUILD=y
rockchip-vop2 fdd90000.vop: leakage=23
rockchip-vop2 fdd90000.vop: leakage-volt-sel=0
cpu cpu0: leakage=8
cpu cpu0: pvtm=1372
cpu cpu0: pvtm-volt-sel=0
cpu cpu4: leakage=7
cpu cpu4: pvtm=1586
cpu cpu4: pvtm-volt-sel=0
cpu cpu6: leakage=7
cpu cpu6: pvtm=1600
cpu cpu6: pvtm-volt-sel=1
mpp_rkvenc2 fdbd0000.rkvenc-core: leakage=8
mpp_rkvenc2 fdbd0000.rkvenc-core: leakage-volt-sel=0
mpp_rkvenc2 fdbe0000.rkvenc-core: leakage=8
mpp_rkvenc2 fdbe0000.rkvenc-core: leakage-volt-sel=0
mali fb000000.gpu: leakage=11
rockchip-dmc dmc: leakage=23
rockchip-dmc dmc: leakage-volt-sel=0
RKNPU fdab0000.npu: leakage=5
##########################################################################
vdd_cpu_big0_s0: 1000 mV (1050 mV max)
vdd_cpu_big1_s0: 1000 mV (1050 mV max)
vdd_npu_s0: 700 mV (950 mV max)
cluster0-opp-table:
408 MHz 675.0 mV (00f9 ffff)
408 MHz 750.0 mV (0006 ffff)
600 MHz 675.0 mV (00f9 ffff)
600 MHz 750.0 mV (0006 ffff)
816 MHz 675.0 mV (00f9 ffff)
816 MHz 750.0 mV (0006 ffff)
1008 MHz 675.0 mV (00f9 ffff)
1008 MHz 750.0 mV (0006 ffff)
1200 MHz 712.5 mV (00f9 ffff)
1200 MHz 750.0 mV (0006 ffff)
1296 MHz 750.0 mV (0004 ffff)
1416 MHz 750.0 mV (0006 ffff)
1416 MHz 762.5 mV (00f9 ffff)
1608 MHz 850.0 mV (00f9 ffff)
1608 MHz 887.5 mV (0006 ffff)
1704 MHz 937.5 mV (0006 ffff)
1800 MHz 950.0 mV (00f9 ffff)
cluster1-opp-table:
408 MHz 675.0 mV (00f9 ffff)
408 MHz 750.0 mV (0006 ffff)
600 MHz 675.0 mV (00f9 ffff)
600 MHz 750.0 mV (0006 ffff)
816 MHz 675.0 mV (00f9 ffff)
816 MHz 750.0 mV (0006 ffff)
1008 MHz 675.0 mV (00f9 ffff)
1008 MHz 750.0 mV (0006 ffff)
1200 MHz 675.0 mV (00f9 ffff)
1200 MHz 750.0 mV (0006 ffff)
1416 MHz 725.0 mV (00f9 ffff)
1416 MHz 750.0 mV (0006 ffff)
1608 MHz 762.5 mV (00f9 ffff)
1608 MHz 787.5 mV (0006 ffff)
1800 MHz 850.0 mV (00f9 ffff)
1800 MHz 875.0 mV (0006 ffff)
2016 MHz 925.0 mV (00f9 ffff)
2016 MHz 950.0 mV (0006 ffff)
2208 MHz 987.5 mV (00f9 ffff)
2256 MHz 1000.0 mV (00f9 0013)
2304 MHz 1000.0 mV (00f9 0024)
2352 MHz 1000.0 mV (00f9 0048)
2400 MHz 1000.0 mV (00f9 0080)
cluster2-opp-table:
408 MHz 675.0 mV (00f9 ffff)
408 MHz 750.0 mV (0006 ffff)
600 MHz 675.0 mV (00f9 ffff)
600 MHz 750.0 mV (0006 ffff)
816 MHz 675.0 mV (00f9 ffff)
816 MHz 750.0 mV (0006 ffff)
1008 MHz 675.0 mV (00f9 ffff)
1008 MHz 750.0 mV (0006 ffff)
1200 MHz 675.0 mV (00f9 ffff)
1200 MHz 750.0 mV (0006 ffff)
1416 MHz 725.0 mV (00f9 ffff)
1416 MHz 750.0 mV (0006 ffff)
1608 MHz 762.5 mV (00f9 ffff)
1608 MHz 787.5 mV (0006 ffff)
1800 MHz 850.0 mV (00f9 ffff)
1800 MHz 875.0 mV (0006 ffff)
2016 MHz 925.0 mV (00f9 ffff)
2016 MHz 950.0 mV (0006 ffff)
2208 MHz 987.5 mV (00f9 ffff)
2256 MHz 1000.0 mV (00f9 0013)
2304 MHz 1000.0 mV (00f9 0024)
2352 MHz 1000.0 mV (00f9 0048)
2400 MHz 1000.0 mV (00f9 0080)
dmc-opp-table:
528 MHz 675.0 mV (00f9 ffff)
528 MHz 750.0 mV (0006 ffff)
1068 MHz 725.0 mV (00f9 ffff)
1068 MHz 750.0 mV (0006 ffff)
1560 MHz 800.0 mV (0006 ffff)
1560 MHz 800.0 mV (00f9 ffff)
2750 MHz 875.0 mV (0006 ffff)
2750 MHz 875.0 mV (00f9 ffff)
gpu-opp-table:
300 MHz 675.0 mV (00f9 ffff)
300 MHz 750.0 mV (0006 ffff)
400 MHz 675.0 mV (00f9 ffff)
400 MHz 750.0 mV (0006 ffff)
500 MHz 675.0 mV (00f9 ffff)
500 MHz 750.0 mV (0006 ffff)
600 MHz 675.0 mV (00f9 ffff)
600 MHz 750.0 mV (0006 ffff)
700 MHz 700.0 mV (00f9 ffff)
700 MHz 750.0 mV (0006 ffff)
800 MHz 750.0 mV (0002 ffff)
800 MHz 750.0 mV (00f9 ffff)
850 MHz 787.5 mV (0004 ffff)
900 MHz 800.0 mV (0002 ffff)
900 MHz 800.0 mV (00f9 ffff)
1000 MHz 850.0 mV (0002 ffff)
1000 MHz 850.0 mV (00f9 ffff)
npu-opp-table:
300 MHz 700.0 mV (00f9 ffff)
300 MHz 750.0 mV (0006 ffff)
400 MHz 700.0 mV (00f9 ffff)
400 MHz 750.0 mV (0006 ffff)
500 MHz 700.0 mV (00f9 ffff)
500 MHz 750.0 mV (0006 ffff)
600 MHz 700.0 mV (00f9 ffff)
600 MHz 750.0 mV (0006 ffff)
700 MHz 700.0 mV (00f9 ffff)
700 MHz 750.0 mV (0006 ffff)
800 MHz 750.0 mV (0006 ffff)
800 MHz 750.0 mV (00f9 ffff)
900 MHz 800.0 mV (00f9 ffff)
950 MHz 837.5 mV (0006 ffff)
1000 MHz 850.0 mV (00f9 ffff)
venc-opp-table:
800 MHz 750.0 mV
vop-opp-table:
500 MHz 725.0 mV
750 MHz 725.0 mV
850 MHz 800.0 mV
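The three multi-threaded 7-zip runs above report nearly identical scores, and the measured clocks sit slightly below the nominal OPP values. A minimal shell sketch for averaging the scores and expressing the clock deviation as a percentage could look like this. The helper names `avg_csv` and `opp_dev` are hypothetical and are not part of sbc-bench; the numbers are copied from the log above.

```shell
# Average a comma-separated list of scores, e.g. the "Compression:" line.
avg_csv() {
  echo "$1" | awk -F, '{ s = 0; for (i = 1; i <= NF; i++) s += $i; printf "%.1f\n", s / NF }'
}

# Deviation of a measured clock from its nominal OPP, in percent.
opp_dev() {
  awk -v opp="$1" -v measured="$2" 'BEGIN { printf "%.1f\n", (measured - opp) / opp * 100 }'
}

avg_csv 14658,14770,14657   # compression   -> 14695.0
avg_csv 16620,16617,16595   # decompression -> 16610.7
avg_csv 15639,15694,15626   # total         -> 15653.0
opp_dev 1800 1777           # Cortex-A55    -> -1.3
opp_dev 2256 2240           # Cortex-A76    -> -0.7
```

The -1.3% figure for the Cortex-A55 cluster matches what sbc-bench itself prints next to the measurement.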
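4.2.2 PCIe devices
lspci shows only a few PCIe devices, which is a pity. The GPU and NPU are platform devices inside the SoC rather than PCIe functions, so they do not appear in the list. It looks like ESXi could only pass through the SSD, the two onboard NICs, and the PCIe 2.0 x1 link at the Wi-Fi card position; the RK3588's powerful GPU and NPU and its other I/O would be completely wasted.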
lspci -nnk
0000:00:00.0 PCI bridge [0604]: Rockchip Electronics Co., Ltd Device [1d87:3588] (rev 01)
Kernel driver in use: pcieport
0000:01:00.0 Non-Volatile memory controller [0108]: Shenzhen Longsys Electronics Co., Ltd. Device [1d97:5236] (rev 01)
Subsystem: Shenzhen Longsys Electronics Co., Ltd. Device [1d97:5236]
Kernel driver in use: nvme
0003:30:00.0 PCI bridge [0604]: Rockchip Electronics Co., Ltd Device [1d87:3588] (rev 01)
Kernel driver in use: pcieport
0003:31:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller [10ec:8125] (rev 05)
Subsystem: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller [10ec:8125]
Kernel driver in use: r8169
Kernel modules: r8169
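To see at a glance which driver is bound to each PCIe function, the `lspci -nnk` output can be reduced to address/driver pairs. The following is a minimal sketch, assuming `awk` is available; `lspci_drivers` is a hypothetical helper name, and functions without a "Kernel driver in use" line are simply not printed.

```shell
# Read `lspci -nnk` output on stdin and print "<address> <driver>" pairs.
lspci_drivers() {
  awk '
    # A device line starts with [domain:]bus:device.function
    /^([0-9a-f]+:)?[0-9a-f]+:[0-9a-f]+\.[0-9a-f] / { addr = $1; next }
    # The bound driver is the last field of this line
    /Kernel driver in use:/ { print addr, $NF }
  '
}

# Usage on the board (not executed here):
#   lspci -nnk | lspci_drivers
```

On the output shown above this would list pcieport for the two bridges, nvme for the SSD, and r8169 for the RTL8125 controller.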
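4.2.3 Sysbench test
Sysbench is an open-source, modular, cross-platform, multi-threaded benchmarking tool that can test CPU, memory, disk I/O, thread, and database performance.
Install it with: sudo apt-get install sysbench
Run the CPU test with: sysbench cpu run
In the output, CPU speed: events per second is the metric for CPU speed.
With a single thread, the result is 2512.46 events per second: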
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 2512.46
General statistics:
total time: 10.0004s
total number of events: 25131
Latency (ms):
min: 0.39
avg: 0.40
max: 2.43
95th percentile: 0.42
sum: 9995.00
Threads fairness:
events (avg/stddev): 25131.0000/0.00
execution time (avg/stddev): 9.9950/0.00
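When comparing several runs, only the events-per-second figure is usually needed. A minimal sketch for pulling it out of the sysbench output follows; `sysbench_eps` is a hypothetical helper name, and the usage lines assume sysbench 1.0.x with its `--threads` option.

```shell
# Read sysbench output on stdin and print the "events per second" value.
sysbench_eps() {
  awk -F: '/events per second/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Usage on the board (not executed here):
#   sysbench cpu --threads=1 run | sysbench_eps
#   sysbench cpu --threads=8 run | sysbench_eps
```

Applied to the single-threaded output above, this prints 2512.46.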