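Background
DOCA (Data Center Infrastructure-on-a-Chip Architecture) is NVIDIA's software framework for the DPU. DOCA is to the DPU what CUDA is to the GPU.
NVIDIA released DOCA 2.9, an LTS release, in October 2024. This post is a quick record of upgrading the OS on the DPU to the DOCA 2.9 image. It does not cover upgrading the NIC firmware or other firmware; only the OS (including its libraries) is updated, mainly to get the newer DOCA libraries.
Hands-on
Download the DOCA 2.9 BFB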
https://developer.nvidia.com/doca-downloads
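Identify the DPU device (if there are multiple BlueField devices)
My target is the device in slot 5, so I first look up its BDF from the slot number.
Then I use the BDF to find the matching rshim device; here it is rshim1.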
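The lookup from BDF to rshim device can be sketched as a small shell helper. This is only a sketch: the function name `find_rshim_by_bdf` and its optional base-directory argument are hypothetical, and it assumes each rshim device directory exposes a `misc` file with a `DEV_NAME pcie-<BDF>` line, as in the misc output shown later in this post. The PCIe function number reported by rshim may differ from the one of the NIC port, so matching on the `domain:bus:device` prefix is assumed to be enough.

```shell
# Hypothetical helper: print the rshim device directory whose misc file
# reports the given PCIe BDF (or BDF prefix) in its DEV_NAME line.
#   $1 = BDF or prefix, e.g. 0000:05:00
#   $2 = base directory to scan (assumption: defaults to /dev)
find_rshim_by_bdf() {
    bdf="$1"
    base="${2:-/dev}"
    for dev in "$base"/rshim*; do
        # Skip entries without a readable misc file
        [ -r "$dev/misc" ] || continue
        # Fixed-string match so the dots and colons in the BDF are literal
        if grep '^DEV_NAME' "$dev/misc" | grep -qF "pcie-$bdf"; then
            echo "$dev"
            return 0
        fi
    done
    return 1
}
```

With the values from this post, `find_rshim_by_bdf 0000:05:00` would be expected to print `/dev/rshim1`.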
Enable the system log and open the serial console
- Set the log display level
echo "DISPLAY_LEVEL 2" > /dev/rshim1/misc
- Open the serial console
screen /dev/rshim1/console
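Set the bfb installation parameters:
This run does not upgrade the NIC firmware, the BMC, the CEC, the DPU golden image, or the NIC firmware golden image; only ATF and UEFI are upgraded (some BIOS parameters may change as a result, and upgrading the BMC at the same time is generally recommended).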
[root@localhost ~]# cat bf.cfg-bak
WITH_NIC_FW_UPDATE="no"
UPDATE_ATF_UEFI="yes"
UPDATE_BMC_FW="no"
UPDATE_CEC_FW="no"
UPDATE_DPU_GOLDEN_IMAGE="no"
UPDATE_NIC_FW_GOLDEN_IMAGE="no"
bfb_modify_os()
{
log ===================== bfb_modify_os =====================
log "Disable OVS bridges creation upon boot"
}
bfb_pre_install()
{
log ===================== bfb_pre_install =====================
}
bfb_post_install()
{
log ===================== bfb_post_install =====================
}
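Run bfb-install to upgrade
Watch the serial console log at the same time:
Wait for the upgrade to succeed
Set the username and password after a successful upgrade
Default username: ubuntu
Default password: ubuntu
On first login you enter the default password and are then forced to change it immediately. The new password cannot be the same as ubuntu.
I suggest setting a temporary password first, switching to root, and then setting the passwords of root and ubuntu back to ubuntu.
Switch back to the ubuntu password
Set the passwords of the root and ubuntu users back to ubuntu, then test the login.
How to confirm success
The misc log shows INFO[MISC]: DPU is ready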
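Waiting for that marker can be automated with a simple polling loop. The sketch below is an assumption-laden illustration, not part of the official tooling: the function name `wait_dpu_ready`, the retry count, and the delay are all hypothetical, and it assumes the log messages are visible in the misc file (which requires `DISPLAY_LEVEL 2` to have been set as above).

```shell
# Hypothetical helper: succeed once the rshim misc log contains the
# "DPU is ready" marker, or fail after a bounded number of attempts.
#   $1 = path to the misc file, e.g. /dev/rshim1/misc
#   $2 = maximum number of attempts (assumption: default 60)
#   $3 = seconds to sleep between attempts (assumption: default 10)
wait_dpu_ready() {
    misc="$1"
    tries="${2:-60}"
    delay="${3:-10}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # The marker text is taken from the misc log shown in this post
        if grep -q 'DPU is ready' "$misc" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_dpu_ready /dev/rshim1/misc && echo "upgrade finished"` would poll for up to ten minutes with the defaults assumed here.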
Miscellaneous
Error: cat: 写入错误: 连接超时 (cat: write error: Connection timed out)
[root@localhost ~]# sudo bfb-install --rshim rshim1 --bfb bf-bundle-2.9.0-90_24.10_ubuntu-22.04_prod.bfb -c bf.cfg
Warn: 'pv' command not found. Continue without showing BFB progress.
Pushing bfb + cfg
cat: 写入错误: 连接超时
With the log level raised, a panic shows up alongside the timeout
[root@localhost rshim1]# echo "DISPLAY_LEVEL 2" > /dev/rshim1/misc
[root@localhost rshim1]# cat misc
DISPLAY_LEVEL 2 (0:basic, 1:advanced, 2:log)
BOOT_MODE 1 (0:rshim, 1:emmc, 2:emmc-boot-swap)
BOOT_TIMEOUT 150 (seconds)
DROP_MODE 0 (0:normal, 1:drop)
SW_RESET 0 (1: reset)
DEV_NAME pcie-0000:05:00.1
DEV_INFO BlueField-2(Rev 1)
OPN_STR MBF2M345A-VENOT_
---------------------------------------
Log Messages
---------------------------------------
INFO[BL2]: start
INFO[BL2]: boot mode (rshim)
INFO[BL2]: DDR POST passed
PANIC(BL2): PC = 0x4018bc
elr_el1 0x401000
esr_el1 0x0
far_el1 0x0
This happened because the ATF/UEFI upgrade was not enabled. Keeping the default configuration is generally recommended.
In bf.cfg, change UPDATE_ATF_UEFI="no"
to UPDATE_ATF_UEFI="yes"
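A pre-flight check along these lines could catch the setting before the BFB is pushed. This is a minimal sketch: `cfg_value` and `check_atf_uefi` are hypothetical helper names, and the parsing assumes simple `KEY="value"` lines at the start of a line, as in the bf.cfg shown earlier.

```shell
# Hypothetical helper: print the value of the last KEY="value" line in a cfg file.
#   $1 = key name, $2 = path to the cfg file
cfg_value() {
    key="$1"
    file="$2"
    # Take the last assignment, keep everything after '=', drop the quotes
    grep "^$key=" "$file" | tail -n 1 | cut -d= -f2- | tr -d '"'
}

# Hypothetical helper: succeed only if UPDATE_ATF_UEFI is set to "yes".
#   $1 = path to the cfg file
check_atf_uefi() {
    [ "$(cfg_value UPDATE_ATF_UEFI "$1")" = "yes" ]
}
```

One possible use is `check_atf_uefi bf.cfg || echo "UPDATE_ATF_UEFI is not yes" >&2` just before calling bfb-install.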
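Error: mlx5_cmd_out_err:835:(pid 1007): CREATE_FLOW_GROUP(0x933) op_mod(0x0) failed
The likely cause is a mismatch between the firmware and the driver, or a feature the DPU hardware does not support. For example, the CREATE_FLOW_GROUP failure here may mean the BF2 hardware does not support this flow group (I did not analyze the problem further).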
[ 15.578751] mlx5_core 0000:03:00.0: mlx5_cmd_out_err:835:(pid 1007): CREATE_FLOW_GROUP(0x933) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x201c1c), err(-22)
[ 15.594023] mlx5_core 0000:03:00.0: mlx5_rdma_enable_roce_steering:71:(pid 1007): Failed to create RDMA RX flow group err(-22)
Full misc log during the upgrade
[root@localhost ~]# sudo bfb-install --rshim rshim1 --bfb bf-bundle-2.9.0-90_24.10_ubuntu-22.04_prod.bfb -c bf.cfg-bak
Warn: 'pv' command not found. Continue without showing BFB progress.
Pushing bfb + cfg
Collecting BlueField booting status. Press Ctrl+C to stop…
INFO[BL2]: start
INFO[BL2]: boot mode (rshim)
INFO[BL2]: DDR POST passed
INFO[BL2]: UEFI loaded
INFO[BL31]: start
INFO[BL31]: lifecycle GA Non-Secured
INFO[BL31]: runtime
INFO[UEFI]: UPVS valid
INFO[UEFI]: eMMC init
INFO[UEFI]: eMMC probed
INFO[UEFI]: PMI: updates started
INFO[UEFI]: PMI: total updates: 1
INFO[UEFI]: PMI: updates completed, status 0
INFO[UEFI]: PCIe enum start
INFO[UEFI]: PCIe enum end
INFO[UEFI]: UEFI Secure Boot (disabled)
INFO[UEFI]: Redfish enabled
WARN[UEFI]: UPVS reclaim start
WARN[UEFI]: UPVS reclaim done
INFO[UEFI]: exit Boot Service
INFO[MISC]: Found bf.cfg
INFO[MISC]: Erasing eMMC drive: /dev/mmcblk0
INFO[MISC]: Ubuntu installation started
INFO[MISC]: Running bfb_pre_install from bf.cfg
INFO[MISC]: ===================== bfb_pre_install =====================
INFO[MISC]: Installing OS image
INFO[MISC]: Running bfb_modify_os from bf.cfg
INFO[MISC]: ===================== bfb_modify_os =====================
INFO[MISC]: Disable OVS bridges creation upon boot
INFO[MISC]: Ubuntu installation completed
INFO[BL2]: start
INFO[BL2]: boot mode (emmc)
INFO[BL2]: DDR POST passed
INFO[BL2]: UEFI loaded
INFO[BL31]: start
INFO[BL31]: lifecycle GA Non-Secured
INFO[BL31]: runtime
INFO[UEFI]: UPVS valid
INFO[UEFI]: eMMC init
INFO[UEFI]: eMMC probed
INFO[UEFI]: PCIe enum start
INFO[UEFI]: PCIe enum end
INFO[UEFI]: PMI: updates started
INFO[UEFI]: PMI: total updates: 1
INFO[UEFI]: PMI: updates completed, status 0
INFO[UEFI]: PMI: updates started
INFO[UEFI]: PMI: total updates: 6
INFO[UEFI]: PMI: updates completed, status 0
INFO[UEFI]: UEFI Secure Boot (disabled)
INFO[UEFI]: Redfish enabled
INFO[UEFI]: DPU-BMC RF credentials not found
WARN[UEFI]: UPVS reclaim start
WARN[UEFI]: UPVS reclaim done
INFO[UEFI]: exit Boot Service
INFO[MISC]: Linux up
INFO[MISC]: DPU is ready
[root@localhost ~]#
Confirm the upgrade succeeded and check the system and DOCA versions
bfver
bfvcheck
bf-info
All of them now report DOCA 2.9.
Full serial console log during the upgrade
[root@localhost rshim1]# screen /dev/rshim1/console
screen: /lib64/libtinfo.so.6: no version information available (required by screen)
Mellanox BlueField-2 A1 BL1 V1.1
NOTICE: No CDI passed to Riot core!
NOTICE: BL2R: v2.2(release):4.9.0-25-g0ce57e322
NOTICE: BL2R: Built : 22:23:05, Oct 30 2024
NOTICE: BL2R built for hw (ver 1)
NOTICE: BL2R: Booting BL2
NOTICE: BL2: v2.2(release):4.9.0-25-g0ce57e322
NOTICE: BL2: Built : 22:23:05, Oct 30 2024
NOTICE: BL2 built for hw (ver 1)
NOTICE: Running as MBF2M345A-VENOT_ system
NOTICE: No SPD detected on MSS0 DIMM0
NOTICE: No SPD detected on MSS0 DIMM1
NOTICE: Finished initializing DDR
NOTICE: DDR POST passed.
NOTICE: BL31: v2.2(release):4.9.0-25-g0ce57e322
NOTICE: BL31: Built : 22:23:05, Oct 30 2024
NOTICE: BL31 built for hw (ver 1), lifecycle GA Non-Secured
UEFI firmware (version BlueField:4.9.0-46-g7e3911bd4d-BId13378 built at 22:54:51 on Oct 30 2024)
FmpDxe: EFI Capsule Authentication Successful, Status: Success.
[PMI] Boot image update started.
Note: Installed image will be filtered from 7022816 bytes to 3191888 bytes
Size check good, 3191888 <= 4194304
Check boot image, Status: Success
...Filtering unneeded executables...
...Preparing boot image...
...Generating boot image...
...Writing the boot image, Size: 3192320 bytes...
...Verify boot image...
Write boot image to partition 1, Status: Success
...Filtering unneeded executables...
...Preparing boot image...
...Generating boot image...
...Writing the boot image, Size: 3192320 bytes...
...Verify boot image...
Write boot image to partition 2, Status: Success
[PMI] Boot Image update completed, Status: Success
[PMI] Total number of updates: 1
[PMI] Errors during updates : 0
CapsuleRuntimeDxe: ProcessCapsuleImage 0, Status: Success
Current Secure Boot State: disabled
Secure Boot Mode : Setup Mode
PK is not configured
Redfish enabled
Press ESC/F2/DEL twice to enter UEFI Menu.
Press ENTER to skip countdown.
3 seconds remain...
2 seconds remain...
1 seconds remain...
0 seconds remain...
EFI stub: Booting Linux Kernel...
EFI stub: Generating empty DTB
EFI stub: Loaded initrd from command line option
EFI stub: Exiting boot services...
[ 9.982407] mlxbf2_gpio MLNXBF22:01: IRQ index 0 not found
[ OK ] Finished Coldplug All udev Devices.
[ OK ] Reached target System Initialization.
[ OK ] Reached target Basic System.
[ 10.005158] mlxbf2_gpio MLNXBF22:02: IRQ index 0 not found
Starting dracut initqueue hook...
[ OK ] Finished dracut initqueue hook.
[ OK ] Reached target Preparation for Remote File Systems.
[ OK ] Reached target Remote File Systems.
[ OK ] Reached target Initrd Root File System.
Starting Reload Configuration from the Real Root...
Starting Network Configuration...
[ OK ] Started Network Configuration.
[ OK ] Reached target Network.
[ OK ] Finished Reload Configuration from the Real Root.
[ OK ] Reached target Initrd File Systems.
Starting Install ubuntu Linux...
[ 10.413396] ipmb-host 1-1011: ERROR: ipmb_send_request failed during ipmb detection
[ 10.428849] ipmb-host 1-1011: Unable to get response from slave device at this time
[15:43:53] INFO: Found bf.cfg
[15:43:53] Starting mst:
[15:43:53] Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
Unloading MST PCI module (unused) - Success
[15:44:05] INFO: Erasing eMMC drive: /dev/mmcblk0
[ 25.460807] initrd-install[854]: /bin/ls: cannot access 'nvme*': No such file or directory
[15:44:05] INFO: Ubuntu installation started
[15:44:05] OS installation target: /dev/mmcblk0
[ 25.543533] initrd-install[883]: 1+0 records in
[ 25.543884] initrd-install[883]: 1+0 records out
[ 25.544048] initrd-install[883]: 512 bytes copied, 0.0008016 s, 639 kB/s
[15:44:05] Creating partitions on /dev/mmcblk0 using sfdisk:
[15:44:07] INFO: Running bfb_pre_install from bf.cfg
[15:44:07] ===================== bfb_pre_install =====================
[15:44:07] Installing OS image
[15:44:07] Creating file systems:
[15:45:50] Extracting /...
[ 273.410892] initrd-install[972]: WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
[ 274.090975] initrd-install[972]: E: Can not write log (Is /dev/pts mounted?) - posix_openpt (19: No such device)
[ 274.540071] initrd-install[990]: Running in chroot, ignoring command 'stop'
[15:48:16] Removing packages: bf3-bmc-gi-signed bf3-bmc-nic-fw-sn37b36732-ax bf3-bmc-nic-fw-900-9d3c6-00cv-da0-ax bf3-bmc-nic-fw-900-9d3b6-00cv-a-ax bf3-bmc-nic-fw-900-9d3b6-00cv-a-alt-il1-ax bf3-bmc-nic-fw-900-9d3c6-00sv-da-ax bf3-bmc-fw-signed bf3-bmc-nic-fw-900-9d3c6-00cv-ga0-ax bf3-bmc-nic-fw-8217991-ax bf3-bmc-fw-signed bf3-bmc-nic-fw-900-9d3c6-00sv-ga-ax bf3-bmc-nic-fw-900-9d3b6-00sv-a-ax bf3-cec-fw-signed bf3-cec-fw-signed
[ 276.505125] initrd-install[1068]: WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
[ 277.113501] initrd-install[1068]: E: Can not write log (Is /dev/pts mounted?) - posix_openpt (19: No such device)
[15:48:19] Configure dhcp:
[15:48:19]
[15:48:19]
[15:48:19]
[15:48:19] Updating Ubuntu initramfs
[15:48:36] dracut: Executing: /usr/bin/dracut --force --add-drivers "mlxbf-bootctl sdhci-of-dwcmshc mlxbf-tmfifo dw_mmc-bluefield mlx5_core mlx5_ib mlxfw ib_umad nvme sbsa_gwdt gpio-mlxbf2 gpio-mlxbf3 mlxbf-gige pinctrl-mlxbf3 8021q" --gzip /boot/initrd.img-5.15.0-1056-bluefield 5.15.0-1056-bluefield
dracut: dracut module 'bootchart' will not be installed, because command '/sbin/bootchartd' could not be found!
dracut: dracut module 'rngd' will not be installed, because command 'rngd' could not be found!
dracut: dracut module 'network' will not be installed, because command 'wicked' could not be found!
dracut: dracut module 'dmraid' will not be installed, because command 'dmraid' could not be found!
dracut: dracut module 'cifs' will not be installed, because command 'mount.cifs' could not be found!
cat: /sys/power/resume: No such file or directory
dracut: dracut module 'biosdevname' will not be installed, because command 'biosdevname' could not be found!
dracut: dracut module 'rngd' will not be installed, because command 'rngd' could not be found!
dracut: dracut module 'dmraid' will not be installed, because command 'dmraid' could not be found!
dracut: dracut module 'cifs' will not be installed, because command 'mount.cifs' could not be found!
dracut: *** Including module: bash ***
dracut: *** Including module: dash ***
dracut: *** Including module: systemd ***
dracut: *** Including module: systemd-initrd ***
dracut: *** Including module: console-setup ***
dracut: *** Including module: network-legacy ***
dracut: *** Including module: network ***
dracut: *** Including module: ifcfg ***
dracut: *** Including module: kernel-modules ***
dracut: *** Including module: kernel-modules-extra ***
dracut: *** Including module: kernel-network-modules ***
dracut: *** Including module: overlay-root ***
dracut: *** Including module: resume ***
dracut: *** Including module: rootfs-block ***
dracut: *** Including module: terminfo ***
dracut: *** Including module: udev-rules ***
dracut: Skipping udev rule: 40-redhat.rules
dracut: Skipping udev rule: 91-permissions.rules
dracut: Skipping udev rule: 80-drivers-modprobe.rules
dracut: Skipping udev rule: 70-persistent-net.rules
dracut: *** Including module: dracut-systemd ***
dracut: *** Including module: usrmount ***
dracut: *** Including module: base ***
dracut: *** Including module: fs-lib ***
dracut: *** Including module: shutdown ***
dracut: *** Including modules done ***
dracut: *** Installing kernel module dependencies ***
dracut: *** Installing kernel module dependencies done ***
dracut: *** Resolving executable dependencies ***
dracut: *** Resolving executable dependencies done ***
dracut: *** Hardlinking files ***
Mode: real
Files: 666
Linked: 1 files
Compared: 0 xattrs
Compared: 13 files
Saved: 692 B
Duration: 0.013946 seconds
dracut: *** Hardlinking files done ***
dracut: *** Store current command line parameters ***
dracut: Stored kernel commandline:
dracut: root=UUID=f5cf8fb2-29e1-4878-8d62-08b506cc6e67 rootfstype=ext4 rootflags=rw,relatime
dracut: *** Stripping files ***
dracut: *** Stripping files done ***
dracut: *** Creating image file '/boot/initrd.img-5.15.0-1056-bluefield' ***
dracut: *** Creating initramfs image file '/boot/initrd.img-5.15.0-1056-bluefield' done ***
[15:48:37] Configure grub:
[15:48:37] Creating GRUB configuration
[ 297.149010] initrd-install[4933]: Installing for arm64-efi platform.
[ 299.333027] initrd-install[4933]: Installation finished. No error reported.
[15:48:39]
[ 299.768655] initrd-install[4951]: Sourcing file `/etc/default/grub'
[ 299.881894] initrd-install[4951]: Sourcing file `/etc/default/grub.d/10-disableos-prober.cfg'
[ 299.883524] initrd-install[4951]: Sourcing file `/etc/default/grub.d/init-select.cfg'
[ 299.888269] initrd-install[4993]: Generating grub configuration file ...
[ 300.215117] initrd-install[5034]: Found linux image: /boot/vmlinuz-5.15.0-1056-bluefield
[ 300.233037] initrd-install[5034]: Found initrd image: /boot/initrd.img-5.15.0-1056-bluefield
[ 301.042922] initrd-install[5298]: Warning: os-prober will not be executed to detect other bootable partitions.
[ 301.043291] initrd-install[5298]: Systems on them will not be added to the GRUB boot configuration.
[ 301.043432] initrd-install[5298]: Check GRUB_DISABLE_OS_PROBER documentation entry.
[ 301.050420] initrd-install[5302]: Adding boot menu entry for UEFI Firmware Settings ...
[ 301.072965] initrd-install[5317]: done
[15:48:41]
[15:48:41]
[15:48:41] Updating EFI boot entries:
[15:48:41] Remove old boot entries
[15:48:43] BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0000,0001,0002,0004,0005,0006,0007,0008,0009,000A,000B,000C,000D,000E,000F,0010,0011,0012
Boot0000* focal
Boot0001* Linux from mmc0
Boot0002* EFI Internal Shell
Boot0004* UiApp
Boot0005* UEFI Misc Device
Boot0006* UEFI Non-Block Boot Device
Boot0007* UEFI PXEv4 (MAC:B8CEF6FD8462)
Boot0008* UEFI PXEv6 (MAC:B8CEF6FD8462)
Boot0009* UEFI HTTPv4 (MAC:B8CEF6FD8462)
Boot000A* UEFI HTTPv6 (MAC:B8CEF6FD8462)
Boot000B* UEFI PXEv4 (MAC:001ACAFFFF01)
Boot000C* UEFI PXEv6 (MAC:001ACAFFFF01)
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0001,0002,0004,0005,0006,0007,0008,0009,000A,000B,000C,000D,000E,000F,0010,0011,0012
Boot0001* Linux from mmc0
Boot0002* EFI Internal Shell
Boot0004* UiApp
Boot0005* UEFI Misc Device
Boot0006* UEFI Non-Block Boot Device
Boot0007* UEFI PXEv4 (MAC:B8CEF6FD8462)
Boot0008* UEFI PXEv6 (MAC:B8CEF6FD8462)
Boot0009* UEFI HTTPv4 (MAC:B8CEF6FD8462)
Boot000A* UEFI HTTPv6 (MAC:B8CEF6FD8462)
Boot000B* UEFI PXEv4 (MAC:001ACAFFFF01)
Boot000C* UEFI PXEv6 (MAC:001ACAFFFF01)
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0002,0004,0005,0006,0007,0008,0009,000A,000B,000C,000D,000E,000F,0010,0011,0012
Boot0002* EFI Internal Shell
Boot0004* UiApp
Boot0005* UEFI Misc Device
Boot0006* UEFI Non-Block Boot Device
Boot0007* UEFI PXEv4 (MAC:B8CEF6FD8462)
Boot0008* UEFI PXEv6 (MAC:B8CEF6FD8462)
Boot0009* UEFI HTTPv4 (MAC:B8CEF6FD8462)
Boot000A* UEFI HTTPv6 (MAC:B8CEF6FD8462)
Boot000B* UEFI PXEv4 (MAC:001ACAFFFF01)
Boot000C* UEFI PXEv6 (MAC:001ACAFFFF01)
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0004,0005,0006,0007,0008,0009,000A,000B,000C,000D,000E,000F,0010,0011,0012
Boot0004* UiApp
Boot0005* UEFI Misc Device
Boot0006* UEFI Non-Block Boot Device
Boot0007* UEFI PXEv4 (MAC:B8CEF6FD8462)
Boot0008* UEFI PXEv6 (MAC:B8CEF6FD8462)
Boot0009* UEFI HTTPv4 (MAC:B8CEF6FD8462)
Boot000A* UEFI HTTPv6 (MAC:B8CEF6FD8462)
Boot000B* UEFI PXEv4 (MAC:001ACAFFFF01)
Boot000C* UEFI PXEv6 (MAC:001ACAFFFF01)
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0005,0006,0007,0008,0009,000A,000B,000C,000D,000E,000F,0010,0011,0012
Boot0005* UEFI Misc Device
Boot0006* UEFI Non-Block Boot Device
Boot0007* UEFI PXEv4 (MAC:B8CEF6FD8462)
Boot0008* UEFI PXEv6 (MAC:B8CEF6FD8462)
Boot0009* UEFI HTTPv4 (MAC:B8CEF6FD8462)
Boot000A* UEFI HTTPv6 (MAC:B8CEF6FD8462)
Boot000B* UEFI PXEv4 (MAC:001ACAFFFF01)
Boot000C* UEFI PXEv6 (MAC:001ACAFFFF01)
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0006,0007,0008,0009,000A,000B,000C,000D,000E,000F,0010,0011,0012
Boot0006* UEFI Non-Block Boot Device
Boot0007* UEFI PXEv4 (MAC:B8CEF6FD8462)
Boot0008* UEFI PXEv6 (MAC:B8CEF6FD8462)
Boot0009* UEFI HTTPv4 (MAC:B8CEF6FD8462)
Boot000A* UEFI HTTPv6 (MAC:B8CEF6FD8462)
Boot000B* UEFI PXEv4 (MAC:001ACAFFFF01)
Boot000C* UEFI PXEv6 (MAC:001ACAFFFF01)
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0007,0008,0009,000A,000B,000C,000D,000E,000F,0010,0011,0012
Boot0007* UEFI PXEv4 (MAC:B8CEF6FD8462)
Boot0008* UEFI PXEv6 (MAC:B8CEF6FD8462)
Boot0009* UEFI HTTPv4 (MAC:B8CEF6FD8462)
Boot000A* UEFI HTTPv6 (MAC:B8CEF6FD8462)
Boot000B* UEFI PXEv4 (MAC:001ACAFFFF01)
Boot000C* UEFI PXEv6 (MAC:001ACAFFFF01)
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0008,0009,000A,000B,000C,000D,000E,000F,0010,0011,0012
Boot0008* UEFI PXEv6 (MAC:B8CEF6FD8462)
Boot0009* UEFI HTTPv4 (MAC:B8CEF6FD8462)
Boot000A* UEFI HTTPv6 (MAC:B8CEF6FD8462)
Boot000B* UEFI PXEv4 (MAC:001ACAFFFF01)
Boot000C* UEFI PXEv6 (MAC:001ACAFFFF01)
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0009,000A,000B,000C,000D,000E,000F,0010,0011,0012
Boot0009* UEFI HTTPv4 (MAC:B8CEF6FD8462)
Boot000A* UEFI HTTPv6 (MAC:B8CEF6FD8462)
Boot000B* UEFI PXEv4 (MAC:001ACAFFFF01)
Boot000C* UEFI PXEv6 (MAC:001ACAFFFF01)
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 000A,000B,000C,000D,000E,000F,0010,0011,0012
Boot000A* UEFI HTTPv6 (MAC:B8CEF6FD8462)
Boot000B* UEFI PXEv4 (MAC:001ACAFFFF01)
Boot000C* UEFI PXEv6 (MAC:001ACAFFFF01)
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 000B,000C,000D,000E,000F,0010,0011,0012
Boot000B* UEFI PXEv4 (MAC:001ACAFFFF01)
Boot000C* UEFI PXEv6 (MAC:001ACAFFFF01)
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 000C,000D,000E,000F,0010,0011,0012
Boot000C* UEFI PXEv6 (MAC:001ACAFFFF01)
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 000D,000E,000F,0010,0011,0012
Boot000D* UEFI HTTPv4 (MAC:001ACAFFFF01)
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 000E,000F,0010,0011,0012
Boot000E* UEFI HTTPv6 (MAC:001ACAFFFF01)
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 000F,0010,0011,0012
Boot000F* UEFI PXEv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0010,0011,0012
Boot0010* UEFI PXEv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0011,0012
Boot0011* UEFI HTTPv4 (MAC:B8CEF6FD8466 VLAN4040)
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0012
Boot0012* UEFI HTTPv6 (MAC:B8CEF6FD8466 VLAN4040)
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
No BootOrder is set; firmware will attempt recovery
Boot0013* Linux from rshim
BootCurrent: 0013
Timeout: 3 seconds
No BootOrder is set; firmware will attempt recovery
[ 303.852781] initrd-install[5392]: yes: standard output: Broken pipe
[ 303.860933] initrd-install[5395]: yes: standard output: Broken pipe
[ 303.867048] initrd-install[5398]: 16+0 records in
[ 303.867242] initrd-install[5398]: 16+0 records out
[ 303.867396] initrd-install[5398]: 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000155685 s, 26.3 MB/s
[ 307.886360] initrd-install[5887]: yes: standard output: Broken pipe
[ 307.895111] initrd-install[5890]: yes: standard output: Broken pipe
[ 307.901233] initrd-install[5893]: 16+0 records in
[ 307.901508] initrd-install[5893]: 16+0 records out
[ 307.901658] initrd-install[5893]: 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000156465 s, 26.2 MB/s
[15:48:48] Adding Ubuntu boot entry:
[15:48:48] BootCurrent: 0013
Timeout: 3 seconds
BootOrder: 0006,0000,0001,0002,0003,0004,0005
Boot0000* NET-NIC_P0-IPV4
Boot0001* NET-NIC_P0-IPV6
Boot0002* NET-OOB-IPV4
Boot0003* NET-OOB-IPV6
Boot0004* NET-NIC_P0-IPV4-HTTP
Boot0005* NET-OOB-IPV4-HTTP
Boot0006* ubuntu0
[15:48:48] INFO: Running bfb_modify_os from bf.cfg
[15:48:48] ===================== bfb_modify_os =====================
[15:48:48] Disable OVS bridges creation upon boot
[15:48:49] INFO: Ubuntu installation completed
[15:48:49] Updating ATF/UEFI:
[15:48:53] Arm life cycle state "GANon-Secured"
skip BFB signature verification
***********************************************************************
*** ***
*** Platform firmware updates complete. ***
*** ***
***********************************************************************
[15:48:54]
***********************************************************************
*** ***
*** Reboot the system to process the platform firmware updates ***
*** ***
***********************************************************************
primary: /dev/mmcblk0boot1
backup: /dev/mmcblk0boot0
boot-bus-width: x8
reset to x1 after reboot: FALSE
watchdog-swap: disabled
lifecycle state: GA Non-Secured
secure boot key free slots: 4
[15:48:54]
***********************************************************************
*** ***
*** Reboot the system to process the platform firmware updates ***
*** ***
***********************************************************************
primary: /dev/mmcblk0boot1
backup: /dev/mmcblk0boot0
boot-bus-width: x8
reset to x1 after reboot: FALSE
watchdog-swap: disabled
lifecycle state: GA Non-Secured
secure boot key free slots: 4
[15:48:54] Unmount partitions
[15:48:54] Unmounting /mnt/boot/efi
[15:48:54]
[15:48:54] Unmounting /mnt/sys/firmware/efi/efivars
[15:48:54]
[15:48:54] Unmounting /mnt/sys
[15:48:54]
[15:48:54] Unmounting /mnt/dev/pts
[15:48:54]
[15:48:54] Unmounting /mnt/dev
[15:48:54]
[15:48:54] Unmounting /mnt/proc
[15:48:55]
[15:48:55] Unmounting /mnt
[ 314.812454] initrd-install[6168]: umount: /mnt: target is busy.
[15:48:55]
[15:48:55] Unmounting /mnt/sys/firmware/efi/efivars
[15:48:55]
[15:48:55] Unmounting /mnt/sys
[15:48:55]
[15:48:55] Unmounting /mnt/proc
[15:48:55]
[15:48:55] Unmounting /mnt/dev
[15:48:55]
[15:48:55] Unmounting /mnt
[15:48:58]
[15:48:58] INFO: Running bfb_post_install from bf.cfg
[15:48:58] ===================== bfb_post_install =====================
[15:48:58] INFO: Installation finished
[15:49:01] INFO: Rebooting...
[ 326.968413] initrd-install[6239]: Rebooting.
Mellanox BlueField-2 A1 BL1 V1.1
NOTICE: No CDI passed to Riot core!
NOTICE: BL2R: v2.2(release):4.9.0-25-g0ce57e322
NOTICE: BL2R: Built : 22:23:05, Oct 30 2024
NOTICE: BL2R built for hw (ver 1)
NOTICE: BL2R: Booting BL2
NOTICE: BL2: v2.2(release):4.9.0-25-g0ce57e322
NOTICE: BL2: Built : 22:23:05, Oct 30 2024
NOTICE: BL2 built for hw (ver 1)
NOTICE: Running as MBF2M345A-VENOT_ system
NOTICE: No SPD detected on MSS0 DIMM0
NOTICE: No SPD detected on MSS0 DIMM1
NOTICE: Finished initializing DDR
NOTICE: DDR POST passed.
NOTICE: BL31: v2.2(release):4.9.0-25-g0ce57e322
NOTICE: BL31: Built : 22:23:05, Oct 30 2024
NOTICE: BL31 built for hw (ver 1), lifecycle GA Non-Secured
UEFI firmware (version BlueField:4.9.0-46-g7e3911bd4d-BId13378 built at 22:54:51 on Oct 30 2024)
FmpDxe: EFI Capsule Authentication Successful, Status: Success.
[PMI] Boot image update started.
Note: Installed image will be filtered from 7022816 bytes to 3191888 bytes
Size check good, 3191888 <= 4194304
Check boot image, Status: Success
...Filtering unneeded executables...
...Preparing boot image...
...Generating boot image...
...Writing the boot image, Size: 3192320 bytes...
...Verify boot image...
Write boot image to partition 1, Status: Success
...Filtering unneeded executables...
...Preparing boot image...
...Generating boot image...
...Writing the boot image, Size: 3192320 bytes...
...Verify boot image...
Write boot image to partition 2, Status: Success
[PMI] Boot Image update completed, Status: Success
[PMI] Total number of updates: 1
[PMI] Errors during updates : 0
CapsuleRuntimeDxe: ProcessCapsuleImage 0, Status: Success
FmpDxe: EFI Capsule Authentication Successful, Status: Success.
[PMI] DB update started.
Enable Custom Mode, Status: Success
Enroll key, Status: Success
[PMI] DB update completed, Status: Success
[PMI] DB update started.
Enroll key, Status: Success
[PMI] DB update completed, Status: Success
[PMI] DB update started.
Enroll key, Status: Success
[PMI] DB update completed, Status: Success
[PMI] DB update started.
Enroll key, Status: Success
[PMI] DB update completed, Status: Success
[PMI] DB update started.
Enroll key, Status: Success
[PMI] DB update completed, Status: Success
[PMI] DB update started.
Delete single key, Status: Not Found
[PMI] DB update completed, Status: Success
[PMI] Total number of updates: 6
[PMI] Errors during updates : 0
CapsuleRuntimeDxe: ProcessCapsuleImage 1, Status: Success
Current Secure Boot State: disabled
Secure Boot Mode : Setup Mode
PK is not configured
Redfish enabled
DHCP Session Start
0
Press ESC/F2/DEL twice to enter UEFI Menu.
Press ENTER to skip countdown.
3 seconds remain...
2 seconds remain...
1 seconds remain...
0 seconds remain...
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services...
HW watchdog disab[ 3.511272] mlxbf2_gpio MLNXBF22:01: IRQ index 0 not found
[ 3.517276] mlxbf2_gpio MLNXBF22:02: IRQ index 0 not found
Ubuntu 22.04.5 LTS localhost.localdomain hvc0
localhost login: ubuntu
Log in /dev/rshim1/misc after the system has booted normally
[root@localhost ~]# cat /dev/rshim1/misc
DISPLAY_LEVEL 2 (0:basic, 1:advanced, 2:log)
BOOT_MODE 1 (0:rshim, 1:emmc, 2:emmc-boot-swap)
BOOT_TIMEOUT 150 (seconds)
DROP_MODE 0 (0:normal, 1:drop)
SW_RESET 0 (1: reset)
DEV_NAME pcie-0000:05:00.1
DEV_INFO BlueField-2(Rev 1)
OPN_STR MBF2M345A-VENOT_
---------------------------------------
Log Messages
---------------------------------------
INFO[BL2]: start
INFO[BL2]: boot mode (emmc)
INFO[BL2]: DDR POST passed
INFO[BL2]: UEFI loaded
INFO[BL31]: start
INFO[BL31]: lifecycle GA Non-Secured
INFO[BL31]: runtime
INFO[UEFI]: UPVS valid
INFO[UEFI]: eMMC init
INFO[UEFI]: eMMC probed
INFO[UEFI]: PCIe enum start
INFO[UEFI]: PCIe enum end
INFO[UEFI]: PMI: updates started
INFO[UEFI]: PMI: total updates: 1
INFO[UEFI]: PMI: updates completed, status 0
INFO[UEFI]: PMI: updates started
INFO[UEFI]: PMI: total updates: 6
INFO[UEFI]: PMI: updates completed, status 0
INFO[UEFI]: UEFI Secure Boot (disabled)
INFO[UEFI]: Redfish enabled
INFO[UEFI]: DPU-BMC RF credentials not found
WARN[UEFI]: UPVS reclaim start
WARN[UEFI]: UPVS reclaim done
INFO[UEFI]: exit Boot Service
INFO[MISC]: Linux up
INFO[MISC]: DPU is ready
INFO[MISC]: : DPU is ready
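Summary
This post records the detailed process of upgrading from DOCA 2.8 to DOCA 2.9, along with a few problems encountered on the way. The full logs contain much more that is worth analyzing to understand the internals of the DPU; I will write that up in a later post when time permits.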