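Introduction to RDMA
RDMA (Remote Direct Memory Access) was developed to reduce the latency that server-side data processing adds to network transfers. RDMA moves data over the network directly into the memory of a remote system without involving the operating system on either side, so the transfer consumes very little CPU. It eliminates extra memory copies and context switches, which frees memory bandwidth and CPU cycles for the application.
The purpose of DMA is to reduce CPU overhead during bulk data transfers. A dedicated DMA controller (DMAC) generates the memory addresses and controls the memory accesses. Its advantages: the operations are implemented entirely in hardware, so transfers are fast; the CPU barely intervenes, taking part only at initialization and completion; and the CPU and the peripheral work in parallel, which improves efficiency.
Installing the driver
Check whether the adapter is recognized by the motherboard and by Ubuntu. If the output below contains "Mellanox Technologies MT27500 Family", the adapter has been recognized by both.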
# lspci |grep Mel
03:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
# lspci |grep Net
03:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
0b:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
0d:00.0 Network controller: Broadcom Corporation BCM4360 802.11ac Wireless Network Adapter (rev 03)
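The check above can be scripted. The following is a minimal sketch, assuming the adapter appears in lspci output with the vendor string "Mellanox". The function name has_mellanox is a hypothetical helper, not part of any driver package, and it reads lspci-style text from standard input so it can be tried without the hardware.

```shell
# has_mellanox: hypothetical helper. Succeeds if any lspci-style line read
# from stdin mentions a Mellanox device.
has_mellanox() {
    grep -qi 'mellanox'
}

# Demo on a line copied from the output above. On a real host one would
# run: lspci | has_mellanox
sample='03:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]'
if printf '%s\n' "$sample" | has_mellanox; then
    echo "Mellanox adapter detected"
else
    echo "no Mellanox adapter found"
fi
```

Reading from stdin rather than calling lspci inside the function keeps the helper usable on saved output as well.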
# tar zxvf MLNX_OFED_LINUX-4.1-1.0.2.0-ubuntu14.04-x86_64.tgz
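Change into the extracted driver directory and run mlnxofedinstall to install. The --all option installs all available packages; the --force option forces the installation. Use one of:
# ./mlnxofedinstall --all
# ./mlnxofedinstall --force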
Check port connectivity (list the nodes visible on the fabric)
# ibnodes
Ca : 0xe41d2d0300b47d00 ports 2 "test213 HCA-1"
Ca : 0xe41d2d0300b47e90 ports 2 "test214 HCA-1"
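To confirm that both hosts are visible, the node descriptions can be pulled out of the ibnodes output. This is a minimal sketch, assuming the output keeps the layout shown above (a line starting with "Ca" and the description in double quotes). The name list_ca_names is a hypothetical helper.

```shell
# list_ca_names: hypothetical helper. Prints the quoted description of every
# channel adapter ("Ca") line in ibnodes-style output read from stdin.
list_ca_names() {
    awk -F'"' '/^[ \t]*Ca[ \t]*:/ { print $2 }'
}

# Demo on the two lines shown above. On a real host: ibnodes | list_ca_names
list_ca_names <<'EOF'
Ca : 0xe41d2d0300b47d00 ports 2 "test213 HCA-1"
Ca : 0xe41d2d0300b47e90 ports 2 "test214 HCA-1"
EOF
```

If only the local host is listed, the cable or the subnet manager is the first thing to check.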
Check the driver status
# lshw -numeric -class network
*-network
description: interface
product: MT27500 Family [ConnectX-3] [15B3:1003]
vendor: Mellanox Technologies [15B3]
physical id: 0
bus info: pci@0000:03:00.0
logical name: ib1
version: 00
serial: a0:00:03:00:fe:80
width: 64 bits
clock: 33MHz
capabilities: pm vpd msix pciexpress bus_master cap_list rom physical
configuration: autonegotiation=off broadcast=yes driver=ib_ipoib driverversion=4.1-1.0.2 duplex=full firmware=2.40.7000 ip=192.168.0.2 latency=0 link=yes multicast=yes
resources: irq:26 memory:fb900000-fb9fffff memory:d2800000-d2ffffff memory:fb800000-fb8fffff
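The fields of interest in this output are on the "configuration:" line (driver, driverversion, firmware, link). The following is a minimal sketch for extracting one of them, assuming the key=value layout shown above. The name config_value is a hypothetical helper.

```shell
# config_value KEY: hypothetical helper. Prints the value of KEY from the
# space-separated key=value pairs in lshw-style text read from stdin.
config_value() {
    tr ' ' '\n' | sed -n "s/^$1=//p"
}

# Demo on the configuration line shown above. On a real host:
#   lshw -numeric -class network | config_value driver
line='configuration: autonegotiation=off broadcast=yes driver=ib_ipoib driverversion=4.1-1.0.2 duplex=full firmware=2.40.7000 ip=192.168.0.2 latency=0 link=yes multicast=yes'
echo "driver:   $(printf '%s\n' "$line" | config_value driver)"
echo "firmware: $(printf '%s\n' "$line" | config_value firmware)"
echo "link:     $(printf '%s\n' "$line" | config_value link)"
```

Because the match is anchored at the start of each token, asking for driver does not also return driverversion.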
Start the driver
# /etc/init.d/openibd restart
Loading HCA driver and Access Layer: [ OK ]
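【Note】: Before the driver is installed, the LEDs on the RDMA adapter stay off. They only blink normally after the network configuration below has been completed.
Network configuration
# vi /etc/network/interfaces <add the IB configuration>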
auto ib0
iface ib0 inet static
address 192.168.0.2
netmask 255.255.255.0
gateway 192.168.0.1
auto ib1
iface ib1 inet static
address 192.168.0.3
netmask 255.255.255.0
gateway 192.168.0.1
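When several hosts need the same kind of entry, the stanza can be generated instead of typed. This is a minimal sketch that prints a stanza in the same layout as the entries above. The name ib_stanza and its argument order are assumptions made for illustration.

```shell
# ib_stanza IFACE ADDRESS NETMASK GATEWAY: hypothetical helper. Prints a
# static ifupdown stanza in the same layout as the entries above.
ib_stanza() {
    printf 'auto %s\n' "$1"
    printf 'iface %s inet static\n' "$1"
    printf 'address %s\n' "$2"
    printf 'netmask %s\n' "$3"
    printf 'gateway %s\n' "$4"
}

# Demo: reproduce the ib0 entry shown above.
ib_stanza ib0 192.168.0.2 255.255.255.0 192.168.0.1
```

The function only prints text. Review the output before appending it to /etc/network/interfaces.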
Check the IB adapter status
# ibstat
CA 'mlx4_0'
CA type: MT4099
Number of ports: 2
Firmware version: 2.40.7000
Hardware version: 1
Node GUID: 0xe41d2d0300b47e90
System image GUID: 0xe41d2d0300b47e93
Port 1:
State: DOWN
Physical state: LinkUp
Rate: 40
Base lid: 4
LMC: 0
SM lid: 4
Capability mask: 0x0251486a
Port GUID: 0xe41d2d0300b47e91
Link layer: InfiniBand
Port 2:
State: DOWN
Physical state: LinkUp
Rate: 40
Base lid: 3
LMC: 0
SM lid: 3
Capability mask: 0x0251486a
Port GUID: 0xe41d2d0300b47e92
Link layer: InfiniBand
Reboot the system
# reboot
After the reboot, check the IB adapter status again
# ibstat
CA 'mlx4_0'
CA type: MT4099
Number of ports: 2
Firmware version: 2.40.7000
Hardware version: 1
Node GUID: 0xe41d2d0300b47e90
System image GUID: 0xe41d2d0300b47e93
Port 1:
State: Initializing
Physical state: LinkUp
Rate: 40
Base lid: 4
LMC: 0
SM lid: 4
Capability mask: 0x0251486a
Port GUID: 0xe41d2d0300b47e91
Link layer: InfiniBand
Port 2:
State: Initializing
Physical state: LinkUp
Rate: 40
Base lid: 3
LMC: 0
SM lid: 3
Capability mask: 0x0251486a
Port GUID: 0xe41d2d0300b47e92
Link layer: InfiniBand
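The only lines that change between the ibstat listings are the per-port "State:" lines. A port in the Initializing state has a physical link but has not yet been configured by a subnet manager. The following is a minimal sketch for printing just the port states, assuming the ibstat layout shown above. The name port_states is a hypothetical helper.

```shell
# port_states: hypothetical helper. Prints "<port> <state>" for every port in
# ibstat-style output read from stdin.
port_states() {
    awk '
        /^[ \t]*Port [0-9]+:/ { port = $2; sub(/:/, "", port) }
        /^[ \t]*State:/       { print port, $2 }
    '
}

# Demo on an excerpt of the output above. On a real host: ibstat | port_states
port_states <<'EOF'
CA 'mlx4_0'
Port 1:
State: Initializing
Physical state: LinkUp
Port GUID: 0xe41d2d0300b47e91
Port 2:
State: Initializing
Physical state: LinkUp
Port GUID: 0xe41d2d0300b47e92
EOF
```

The "Physical state:" and "Port GUID:" lines are skipped on purpose, since neither matches the two patterns.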
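Direct-connection network configuration
When two hosts are cabled directly to each other without a managed switch, no subnet manager is running on the fabric, so the ports stay in the Initializing state until opensm is started on one of the hosts.
【Start opensm】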
# opensm &
-------------------------------------------------
OpenSM 4.9.0.MLNX20170607.280b8f7
Command Line Arguments:
Log File: /var/log/opensm.log
-------------------------------------------------
OpenSM 4.9.0.MLNX20170607.280b8f7
Using default GUID 0xe41d2d0300b47e91
Entering MASTER state
【GUID information of the IB adapter】
# ibstat | grep GUID
Node GUID: 0xe41d2d0300b47e90
System image GUID: 0xe41d2d0300b47e93
Port GUID: 0xe41d2d0300b47e91
Port GUID: 0xe41d2d0300b47e92
【Bind opensm to each port GUID】
# opensm -g 0xe41d2d0300b47e91 &
# opensm -g 0xe41d2d0300b47e92 &
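The two commands above use the Port GUID values from ibstat. This is a minimal sketch that derives them from the ibstat output instead of copying them by hand, assuming the "Port GUID:" line layout shown above. The name port_guids is a hypothetical helper, and the loop only prints the commands rather than running them.

```shell
# port_guids: hypothetical helper. Prints the value of every "Port GUID:"
# line in ibstat-style output read from stdin, ignoring node and system GUIDs.
port_guids() {
    awk '/^[ \t]*Port GUID:/ { print $3 }'
}

# Demo on the GUID listing shown above. On a real host: ibstat | port_guids
sample='Node GUID: 0xe41d2d0300b47e90
System image GUID: 0xe41d2d0300b47e93
Port GUID: 0xe41d2d0300b47e91
Port GUID: 0xe41d2d0300b47e92'

printf '%s\n' "$sample" | port_guids | while read -r guid; do
    # Print the command for review; it is not executed here.
    echo "opensm -g $guid &"
done
```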
【Check the IB adapter status】
# ibstat
CA 'mlx4_0'
CA type: MT4099
Number of ports: 2
Firmware version: 2.40.7000
Hardware version: 1
Node GUID: 0xe41d2d0300b47e90
System image GUID: 0xe41d2d0300b47e93
Port 1:
State: Active
Physical state: LinkUp
Rate: 40
Base lid: 4
LMC: 0
SM lid: 4
Capability mask: 0x0251486a
Port GUID: 0xe41d2d0300b47e91
Link layer: InfiniBand
Port 2:
State: Active
Physical state: LinkUp
Rate: 40
Base lid: 3
LMC: 0
SM lid: 3
Capability mask: 0x0251486a
Port GUID: 0xe41d2d0300b47e92
Link layer: InfiniBand
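Both ports now report Active, which is the state needed before testing connectivity. The following is a minimal sketch of a pass/fail check, assuming the ibstat layout shown above. The name all_ports_active is a hypothetical helper.

```shell
# all_ports_active: hypothetical helper. Succeeds only if ibstat-style output
# on stdin has at least one "State:" line and every one of them is "Active".
all_ports_active() {
    awk '
        /^[ \t]*State:/ { seen = 1; if ($2 != "Active") bad = 1 }
        END { if (seen && !bad) exit 0; exit 1 }
    '
}

# Demo on an excerpt of the output above. On a real host:
#   ibstat | all_ports_active
sample='Port 1:
State: Active
Physical state: LinkUp
Port 2:
State: Active
Physical state: LinkUp'

if printf '%s\n' "$sample" | all_ports_active; then
    echo "all ports active"
else
    echo "at least one port is not active"
fi
```

Empty input is treated as a failure so that a missing adapter does not look like a healthy one.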
【Check network connectivity】
# ping <peer IB address>
Bandwidth test
【Start opensm】
# opensm
-------------------------------------------------
OpenSM 4.9.0.MLNX20170607.280b8f7
Command Line Arguments:
Log File: /var/log/opensm.log
-------------------------------------------------
OpenSM 4.9.0.MLNX20170607.280b8f7
Using default GUID 0xe41d2d0300b47e91
Entering MASTER state
【Get the device name】
# ibstat
CA 'mlx4_0'
【Set up the server】
Host 1 IP: 192.168.0.2
Host 2 IP: 192.168.0.4
Start the server on host 1:
# ib_send_bw -a -c UD -d mlx4_0 -i 2
************************************
* Waiting for client to connect... *
************************************
Start the test on host 2:
# ib_send_bw -a -c UD -d mlx4_0 -i 2 192.168.0.2
Max msg size in UD is MTU 2048
Changing to this MTU
---------------------------------------------------------------------------------------
Send BW Test
Dual-port : OFF Device : mlx4_0
Number of qps : 1 Transport type : IB
Connection type : UD Using SRQ : OFF
TX depth : 128
CQ Moderation : 100
Mtu : 2048[B]
Link type : IB
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x01 QPN 0x0257 PSN 0x7d3837
remote address: LID 0x03 QPN 0x0259 PSN 0x86634b
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
2 1000 16.63 15.72 8.239520
Host 1 prints the test results:
Max msg size in UD is MTU 2048
Changing to this MTU
---------------------------------------------------------------------------------------
Send BW Test
Dual-port : OFF Device : mlx4_0
Number of qps : 1 Transport type : IB
Connection type : UD Using SRQ : OFF
RX depth : 1000
CQ Moderation : 100
Mtu : 2048[B]
Link type : IB
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x03 QPN 0x0259 PSN 0x86634b
remote address: LID 0x01 QPN 0x0257 PSN 0x7d3837
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
Did not get Message for 120 Seconds, exiting..
Total Received=0, Total Iters Required=1000
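In the run above the client printed one result row, while the server printed no rows and ended with a receive timeout. The following is a minimal sketch for telling these two cases apart in saved output, assuming the five-column result layout and the "Did not get Message" text shown above. The name bw_summary is a hypothetical helper and is not part of the perftest tools.

```shell
# bw_summary: hypothetical helper. Prints message size and average bandwidth
# for each result row of ib_send_bw-style output read from stdin. Fails if
# the output contains the receive-timeout message or has no result rows.
bw_summary() {
    awk '
        /Did not get Message/      { timeout = 1 }
        NF == 5 && $1 ~ /^[0-9]+$/ { print $1 " bytes: " $4 " MB/sec average"; rows++ }
        END { if (timeout || rows == 0) exit 1 }
    '
}

# Demo on excerpts of the client and server output shown above.
client_log='#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
2 1000 16.63 15.72 8.239520'
server_log='#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
Did not get Message for 120 Seconds, exiting..
Total Received=0, Total Iters Required=1000'

if printf '%s\n' "$client_log" | bw_summary; then
    echo "client: results present"
fi
if ! printf '%s\n' "$server_log" | bw_summary; then
    echo "server: no results (receive timeout)"
fi
```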
【Warning messages】
mlnx_tune