Symptom
When the NFS server becomes unreachable from the client (for example because of a network problem), running ls on the NFS client hangs.
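Cause
From the NFS side alone, the client hangs because NFS clients mount in hard mode by default.
With a soft mount, the request fails with an I/O error once it times out instead of blocking.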
See the following excerpt from the Red Hat NFS documentation:
hard or soft — Specifies whether the program using a file via an NFS connection should stop and wait (hard) for the server to come back online, if the host serving the exported file system is unavailable, or if it should report an error (soft).
If hard is specified, the user cannot terminate the process waiting for the NFS communication to resume unless the intr option is also specified.
If soft is specified, the user can set an additional timeo=<value> option, where <value> specifies the number of seconds to pass before the error is reported.
Using soft mounts is not recommended as they can generate I/O errors in very congested networks or when using a very busy server.
Another option worth noting: intr
— Allows NFS requests to be interrupted if the server goes down or cannot be reached.
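Before changing anything, it can help to confirm which mode each existing NFS mount actually uses. The following is a minimal sketch, not part of the original script: the function name nfs_mount_modes is hypothetical, and it assumes a Linux /proc/mounts-style listing in which the mount options are the fourth field.

```shell
#!/bin/bash
# Print "<mount point> <hard|soft>" for every NFS mount found in a
# /proc/mounts-style file (defaults to /proc/mounts).
nfs_mount_modes() {
    local mounts_file="${1:-/proc/mounts}"
    awk '$3 ~ /^nfs4?$/ {
        mode = "hard"                 # hard applies when soft is not listed
        n = split($4, opts, ",")      # field 4 holds the comma-separated options
        for (i = 1; i <= n; i++)
            if (opts[i] == "soft") mode = "soft"
        print $2, mode
    }' "${mounts_file}"
}
```

Calling nfs_mount_modes with no argument reads /proc/mounts; a mount reported as hard is one where ls can block indefinitely while the server is down.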
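Solution
Because the NFS server may be unstable (although NFS is not recommended in the first place when the network or server is unreliable), a script probes the server periodically: if the server is healthy, it mounts the share; if the server is unreachable, it unmounts it.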
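The main commands used are:
showmount -e ${server_ip}: lists the NFS exports offered by the server
mount -t nfs -o rw,intr,soft,timeo=30,retry=3 ${remote_path} ${local_path}: mounts the remote directory in soft, intr mode with explicit timeout and retry settings
umount -f -l -t nfs ${remote_path}: lazy, forced unmount
Note that, according to nfs(5) on Linux, timeo is measured in tenths of a second (so timeo=30 is 3 seconds), retry is the number of minutes mount keeps retrying a failed mount rather than a retry count (the per-request retry count is retrans), and intr is ignored by kernels newer than 2.6.25.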
#!/bin/bash
# config
NFS_SERVER_IP="192.168.38.201"
NFS_SHARE_DIR="/root/share"
NFS_SHARE_FULL_PATH="${NFS_SERVER_IP}:${NFS_SHARE_DIR}"
# Log file path
LOG_PATH="/root/nfs.log"
# Local mount directory
LOCAL_MOUNT_PATH="/root/client-share/"
# Unmount (forced and lazy)
doUmount() {
    umount -f -l -t nfs "${NFS_SHARE_FULL_PATH}" >>"${LOG_PATH}" 2>&1
    echo "umount exit status: $?" >>"${LOG_PATH}"
}
# Check the NFS server status
checkServerStatus() {
    exportList=$(showmount -e "${NFS_SERVER_IP}" 2>>"${LOG_PATH}")
    # -F treats the path as a literal string rather than a regular expression
    exportTargetCount=$(echo "${exportList}" | grep -cF -- "${NFS_SHARE_DIR}")
    if [ "${exportTargetCount}" -ge 1 ]; then
        return 0
    else
        echo "Target directory not found in showmount output, exportList: ${exportList}" >>"${LOG_PATH}"
        return 1
    fi
}
# Mount
doMount() {
    if mount -t nfs -o rw,intr,soft,timeo=30,retry=3 "${NFS_SHARE_FULL_PATH}" "${LOCAL_MOUNT_PATH}" >>"${LOG_PATH}" 2>&1; then
        echo "NFS [${NFS_SHARE_FULL_PATH}] mounted automatically" >>"${LOG_PATH}"
    else
        echo "NFS [${NFS_SHARE_FULL_PATH}] mount failed" >>"${LOG_PATH}"
    fi
}
# Check and mount
checkAndMount() {
    # Is the share already mounted?
    shareCount=$(df -h | grep -cF -- "${NFS_SHARE_FULL_PATH}")
    if [ "${shareCount}" -ge 1 ]; then
        echo "NFS [${NFS_SHARE_FULL_PATH}] is already mounted, nothing to do" >>"${LOG_PATH}"
    else
        doMount
    fi
}
main() {
    echo "NFS check started: $(date +"%F %T")" >>"${LOG_PATH}"
    # Server is healthy: make sure the share is mounted
    if checkServerStatus; then
        checkAndMount
    else
        echo "Please check whether the server is running" >>"${LOG_PATH}"
        doUmount
    fi
    echo "NFS check finished: $(date +"%F %T")" >>"${LOG_PATH}"
    exit 0
}
main
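One caveat with counting substring matches in the showmount output is that a share such as /root/share would also match an export named /root/share2. A stricter check could compare the first column of each line exactly. The sketch below is one possible way to do that; the helper name has_export is hypothetical, and it assumes the export path is the first whitespace-separated field of each line, as in the usual showmount -e output.

```shell
#!/bin/bash
# Read an export list on stdin and succeed only if one line's first field
# is exactly the wanted path.
has_export() {
    local wanted="$1"
    awk -v wanted="${wanted}" '
        $1 == wanted { found = 1 }
        END { exit !found }           # exit 0 when found, 1 otherwise
    '
}
```

It could be used as showmount -e "${NFS_SERVER_IP}" | has_export "${NFS_SHARE_DIR}", with the exit status taking the place of the match count.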
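Note that once the mount succeeds, any files that already existed in the local directory are hidden and only become visible again after umount. To keep them available, move the existing files to another directory before mounting, then copy them into the mounted NFS directory.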
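The move-aside step described above might look like the following sketch. The function name stash_local_files and both of its arguments are hypothetical, the stash directory is assumed to live outside the mount point, and find is assumed to support -mindepth and -maxdepth (GNU and BSD find do).

```shell
#!/bin/bash
# Move every direct child of dir into stash so that a later mount on dir
# does not hide it.
stash_local_files() {
    local dir="$1" stash="$2"
    mkdir -p "${stash}" || return 1
    # Select only the direct children of dir, including hidden files.
    find "${dir}" -mindepth 1 -maxdepth 1 -exec mv {} "${stash}/" \;
}
```

After the share is mounted, the stashed files can be copied into the mount directory, for example with cp -a.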