I have a NetApp as my NFS server and two Linux servers as NFS clients. The problem is that the newer of the two servers sees wildly different read and write speeds whenever it reads and writes to the NFS server at the same time. Run separately, reads and writes both look great on this new server. The old server does not have this problem.
Old host: carp
Sun Fire X4150 with 8 cores, 32 GB RAM
SLES 9 SP4
Network driver: e1000
me@carp:~> uname -a
Linux carp 2.6.5-7.308-smp #1 SMP Mon Dec 10 11:36:40 UTC 2007 x86_64 x86_64 x86_64 GNU/Linux
New host: pepper
HP ProLiant DL360p Gen8 with 8 cores, 64 GB RAM
CentOS 6.3
Network driver: tg3
me@pepper:~> uname -a
Linux pepper 2.6.32-279.el6.x86_64 #1 SMP Fri Jun 22 12:19:21 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
I'll jump right to some graphs illustrating the read/write tests. Here's pepper and its unbalanced read/write:
And here's carp, looking just fine:
Tests
Here are the read/write tests I'm running. I run these separately, and they look great on pepper; but when run together (using &), write performance holds steady while read performance suffers badly. Each test file is twice the size of the host's RAM (128 GB for pepper, 64 GB for carp).
# write
time dd if=/dev/zero of=/mnt/peppershare/testfile bs=65536 count=2100000 &
# read
time dd if=/mnt/peppershare/testfile2 of=/dev/null bs=65536 &
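For reference, the combined run can be sketched as one script. The paths and sizes below are small local stand-ins (not the real 128 GB files on /mnt/peppershare), just to show the shape of the concurrent test:

```shell
# Concurrent read/write test: launch both dd jobs in the background,
# then wait for both. On the real host, WRITE_TARGET/READ_SOURCE point
# at files on the NFS mount and count is sized to ~2x RAM.
WRITE_TARGET=/tmp/testfile      # stand-in for /mnt/peppershare/testfile
READ_SOURCE=/tmp/testfile2      # stand-in for /mnt/peppershare/testfile2

# Pre-create the read source so the read job has data to stream.
dd if=/dev/zero of="$READ_SOURCE" bs=65536 count=16 2>/dev/null

dd if=/dev/zero of="$WRITE_TARGET" bs=65536 count=16 2>/dev/null &
dd if="$READ_SOURCE" of=/dev/null bs=65536 2>/dev/null &
wait

echo "write size: $(stat -c %s "$WRITE_TARGET")"
```

Wrapping each dd in `time` (as above) then lets you compare elapsed times for the isolated versus concurrent runs.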
The NFS server hostname is nfsc. The Linux clients have a dedicated NIC on a subnet separate from everything else (i.e., a different subnet than the primary IP). Each Linux client mounts an NFS share from server nfsc at /mnt/hostnameshare.
nfsiostat
A 1-minute sample taken during the concurrent test on pepper:
me@pepper:~> nfsiostat 60
nfsc:/vol/pg003 mounted on /mnt/peppershare:
op/s rpc bklog
1742.37 0.00
read: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
49.750 3196.632 64.254 0 (0.0%) 9.304 26.406
write: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
1642.933 105628.395 64.293 0 (0.0%) 3.189 86559.380
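What jumps out of that sample is the write "avg exe" of ~86,559 ms against an RTT of only ~3 ms, which points at client-side queuing rather than server latency. A small sketch of parsing these stat lines and flagging that gap (the 100x-RTT threshold is my own arbitrary heuristic, not anything nfsiostat defines):

```python
# Parse nfsiostat-style read/write stat lines and flag ops whose
# "avg exe" time is far above the RTT, suggesting time spent queued
# on the client. Sample lines are copied from the output above.
def parse_stats(line):
    # fields: ops/s, kB/s, kB/op, retrans, (pct), avg RTT (ms), avg exe (ms)
    parts = line.split()
    return {"ops_s": float(parts[0]), "kb_s": float(parts[1]),
            "avg_rtt_ms": float(parts[5]), "avg_exe_ms": float(parts[6])}

read_line  = "49.750 3196.632 64.254 0 (0.0%) 9.304 26.406"
write_line = "1642.933 105628.395 64.293 0 (0.0%) 3.189 86559.380"

for name, line in (("read", read_line), ("write", write_line)):
    s = parse_stats(line)
    queued = s["avg_exe_ms"] > 100 * s["avg_rtt_ms"]  # crude heuristic
    print(f"{name}: {s['kb_s']/1024:.1f} MB/s, "
          f"avg exe {s['avg_exe_ms']:.0f} ms, queuing suspected: {queued}")
```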
I don't have nfsiostat on the old host carp yet, but I'm working on it.
/proc/mounts
me@pepper:~> cat /proc/mounts | grep peppershare
nfsc:/vol/pg003 /mnt/peppershare nfs rw,noatime,nodiratime,vers=3,rsize=65536,wsize=65536,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.x.x.x,mountvers=3,mountport=4046,mountproto=tcp,local_lock=none,addr=172.x.x.x 0 0
me@carp:~> cat /proc/mounts | grep carpshare
nfsc:/vol/pg008 /mnt/carpshare nfs rw,v3,rsize=32768,wsize=32768,timeo=60000,retrans=3,tcp,lock,addr=nfsc 0 0
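One concrete difference between the two mounts is the transfer sizes: pepper negotiates rsize/wsize of 65536 while carp uses 32768. A self-contained sketch of pulling those values out of /proc/mounts-style lines (the two sample lines are abbreviated copies of the ones above):

```shell
# Extract rsize/wsize from /proc/mounts-style lines to compare the
# two clients' negotiated NFS transfer sizes.
pepper='nfsc:/vol/pg003 /mnt/peppershare nfs rw,noatime,vers=3,rsize=65536,wsize=65536 0 0'
carp='nfsc:/vol/pg008 /mnt/carpshare nfs rw,v3,rsize=32768,wsize=32768 0 0'

summary=""
for line in "$pepper" "$carp"; do
    opts=$(echo "$line" | grep -o 'rsize=[0-9]*\|wsize=[0-9]*' | tr '\n' ' ')
    echo "$opts"
    summary="$summary$opts"
done
```

One experiment this suggests (untested on my part): remount peppershare with `rsize=32768,wsize=32768` to match carp and see whether the behavior changes.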
NIC settings
me@pepper:~> sudo ethtool eth3
Settings for eth3:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 4
Transceiver: internal
Auto-negotiation: on
MDI-X: off
Supports Wake-on: g
Wake-on: g
Current message level: 0x000000ff (255)
Link detected: yes
me@carp:~> sudo ethtool eth1
Settings for eth1:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: umbg
Wake-on: g
Current message level: 0x00000007 (7)
Link detected: yes
Offload settings:
me@pepper:~> sudo ethtool -k eth3
Offload parameters for eth3:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
me@carp:~> sudo ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
It's all on a LAN, with a gigabit switch at full duplex between the NFS clients and the NFS server. One other observation: I see considerably more CPU iowait on pepper than on carp, which is expected, since I suspect it's waiting on NFS operations.
I captured packets with Wireshark/Ethereal, but I'm not strong in that area, so I'm not sure what to look for. I don't see a bunch of packets highlighted in red/black in Wireshark, which is about all I know to look for :). This poor NFS performance is showing up in our Postgres environment.
Any further ideas or troubleshooting tips? Let me know if I can provide more information.
UPDATE
Per @ewwhite's comment, I tried two different tuned-adm profiles, but there was no change.
To the right of my red marker are two more test runs. The first is with throughput-performance and the second is with enterprise-storage.
nfsiostat 60 with the enterprise-storage profile
nfsc:/vol/pg003 mounted on /mnt/peppershare:
op/s rpc bklog
1758.65 0.00
read: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
51.750 3325.140 64.254 0 (0.0%) 8.645 24.816
write: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
1655.183 106416.517 64.293 0 (0.0%) 3.141 159500.441
UPDATE 2