linux mount nfs timeout: I/O errors, (occasionally) unable to write to an NFS mount, server timed out

Latest recommended article published on 2023-03-24 13:02:18

weixin_39716105


I have a Linux-based file server (ark) that exports a RAID volume over nfs4.

Sometimes, during large copy operations, it times out.

[nathan@ebisu /mnt/extra/disk] rsync -a --progress . /mnt/raid/backup/backup.extra/disk

sending incremental file list

BSD.0/

BSD.0/BSD.0.vdi

411336704 12% 48.60MB/s 0:00:56

rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)

rsync: write failed on "/mnt/raid/backup/backup.extra/disk/BSD.0/BSD.0.vdi": Input/output error (5)

rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]

rsync: connection unexpectedly closed (32 bytes received so far) [sender]

rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]

I know it is a timeout because dmesg tells me so:

[nathan@ebisu ~] dmesg | tail

[52722.138132] nfs: server ark not responding, timed out

[52722.138137] nfs: server ark not responding, timed out

[52722.138145] nfs: server ark not responding, timed out

[52722.138150] nfs: server ark not responding, timed out

[52722.138154] nfs: server ark not responding, timed out
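
To see how often this happens and against which server, the kernel log can be summarized rather than read by eye. The following is only a sketch: the helper name `count_nfs_timeouts` is hypothetical, and it assumes the message format shown in the dmesg output above.

```shell
# Hypothetical helper: reads kernel log text on stdin and prints
# "<server> <count>" for every NFS server that logged a timeout.
# Assumes the "nfs: server NAME not responding, timed out" format shown above.
count_nfs_timeouts() {
    grep -E 'nfs: server [^ ]+ not responding, timed out' \
        | sed -E 's/.*nfs: server ([^ ]+) not responding.*/\1/' \
        | sort | uniq -c \
        | awk '{ print $2, $1 }'
}

# Usage on the client:
#   dmesg | count_nfs_timeouts
```

Comparing the count before and after a failed copy shows whether the timeouts cluster around the large transfers.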

In case you think this might be an rsync-specific bug, I also tried a plain copy:

[nathan@ebisu /mnt/extra/disk] cp BSD.0/BSD.0.vdi /mnt/raid/backup/backup.extra/disk

cp: error writing ‘/mnt/raid/backup/backup.extra/disk/BSD.0.vdi’: Input/output error

cp: failed to extend ‘/mnt/raid/backup/backup.extra/disk/BSD.0.vdi’: Input/output error

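I don't even know where to start looking for a fix. Both machines are connected by Gigabit Ethernet through a gigabit switch, and I have used ethtool to verify that both are running at gigabit speed. Most operations between the host and the server work fine; it only dies during large transfers.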
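
For reference, the link-speed check can be scripted so it is easy to repeat on both machines. This is a sketch, not the exact command I ran: the helper name `link_speed` and the interface name `eth0` are assumptions, and it relies only on the `Speed:` line that `ethtool <interface>` prints.

```shell
# Hypothetical helper: reads `ethtool <iface>` output on stdin and prints
# the negotiated speed, i.e. the value on the "Speed:" line.
link_speed() {
    awk -F': ' '/^[[:space:]]*Speed:/ { print $2 }'
}

# Usage (eth0 is an assumed interface name; substitute your own):
#   ethtool eth0 | link_speed
```

On a gigabit link this should print `1000Mb/s` on both the client and the server.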

Nothing in the file server's dmesg stands out as unusual.

[root@ark ~]# dmesg | tail

[ 7.088959] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory

[ 7.266363] NFSD: starting 90-second grace period (net ffffffff81880e80)

[ 8492.222871] type=1326 audit(1365926452.334:2): auid=4294967295 uid=99 gid=99 ses=4294967295 pid=336 comm="sshd" sig=31 syscall=48 compat=0 ip=0x7fe1be17edc7 code=0x0

[ 8492.314714] type=1326 audit(1365926452.424:3): auid=4294967295 uid=99 gid=99 ses=4294967295 pid=338 comm="sshd" sig=31 syscall=48 compat=0 ip=0x7fe30fd9ddc7 code=0x0

[ 8492.405336] type=1326 audit(1365926452.514:4): auid=4294967295 uid=99 gid=99 ses=4294967295 pid=340 comm="sshd" sig=31 syscall=48 compat=0 ip=0x7f6bb032ddc7 code=0x0

[ 8492.501048] type=1326 audit(1365926452.611:5): auid=4294967295 uid=99 gid=99 ses=4294967295 pid=342 comm="sshd" sig=31 syscall=48 compat=0 ip=0x7f81d7c2fdc7 code=0x0

[ 8492.603056] type=1326 audit(1365926452.714:6): auid=4294967295 uid=99 gid=99 ses=4294967295 pid=344 comm="sshd" sig=31 syscall=48 compat=0 ip=0x7f97c8bc9dc7 code=0x0

[ 8492.703732] type=1326 audit(1365926452.814:7): auid=4294967295 uid=99 gid=99 ses=4294967295 pid=346 comm="sshd" sig=31 syscall=48 compat=0 ip=0x7f0661b2fdc7 code=0x0

[ 8492.837977] type=1326 audit(1365926452.947:8): auid=4294967295 uid=99 gid=99 ses=4294967295 pid=348 comm="sshd" sig=31 syscall=48 compat=0 ip=0x7fd024f8cdc7 code=0x0

[54125.173195] type=1326 audit(1365972085.286:9): auid=4294967295 uid=99 gid=99 ses=4294967295 pid=353 comm="sshd" sig=31 syscall=48 compat=0 ip=0x7f390a6b9dc7 code=0x0

The syslog likewise shows nothing wrong.

I gathered some more miscellaneous diagnostic information:

[root@ebisu etc]# nfsstat -rc

Client rpc stats:

calls retrans authrefrsh

1057273 34163 1050608

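That is a lot of retransmissions: 34163 out of 1057273 calls, or roughly 3%.

I checked whether I was saturating my nfsd threads, but no, they are mostly idle.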
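
The retransmission ratio can be computed directly from the `nfsstat -rc` output instead of by hand. This is a minimal sketch; the helper name `nfsstat_retrans_pct` is hypothetical, and it assumes the `calls retrans authrefrsh` header is followed by one line of counters, as in the output above.

```shell
# Hypothetical helper: reads `nfsstat -rc` output on stdin and prints
# retransmissions as a percentage of all RPC calls.
nfsstat_retrans_pct() {
    awk '
        # First non-empty line after the header holds the counters.
        hdr && NF >= 2 {
            if ($1 > 0) printf "%.2f\n", 100 * $2 / $1
            else        print "n/a"
            exit
        }
        $1 == "calls" && $2 == "retrans" { hdr = 1 }
    '
}

# Usage on the client:
#   nfsstat -rc | nfsstat_retrans_pct
```

For the counters shown above (1057273 calls, 34163 retransmissions) this prints 3.23.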
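
One way to look at nfsd thread usage is the `th` line in `/proc/net/rpc/nfsd` on the server. The sketch below is an assumption about how such a check could be done, not a record of what I ran. The helper name `nfsd_thread_summary` is hypothetical, and the meaning of the second field (how many times all threads were in use at once) may not be maintained on newer kernels.

```shell
# Hypothetical helper: reads /proc/net/rpc/nfsd content on stdin and prints
# the configured thread count and the "all threads busy" counter from the
# "th" line. Field meanings are assumed; newer kernels may report 0 here.
nfsd_thread_summary() {
    awk '$1 == "th" { printf "threads=%s all_busy=%s\n", $2, $3 }'
}

# Usage on the server:
#   nfsd_thread_summary < /proc/net/rpc/nfsd
```

If the second number stays at zero while a transfer fails, thread starvation is an unlikely cause.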

Just for fun, I ran a similar transfer entirely locally to see whether I was hitting disk errors or slowness:

[root@ark ~]# rsync --progress test.img /mnt/bigraid/backup/backup.ark/

test.img

8589934592 100% 48.38MB/s 0:02:49 (xfer#1, to-check=0/1)

sent 8590983238 bytes received 31 bytes 50386998.65 bytes/sec

total size is 8589934592 speedup is 1.00

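It appears to run at just under 50MB/s, which is roughly the speed I was getting with the remote rsync.

While running htop on the server during a transfer, I did notice that after a while nfsd seemed to have requested more memory buffers. It may be memory-related, since by modern standards the server is not a high-memory system. But it seems to me that this should only make the transfer slower, not cause it to time out entirely.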
