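Background:
Mounting a disk into a pod takes longer than expected: about 8 seconds in most cases, and occasionally about 15 seconds.
kubelet log:
Disk attach finished in about 7s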
Dec 29 15:06:45 VM-232-125-centos kubelet[1271354]: I1229 15:06:45.080872 1271354 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc---" (UniqueName: "kubernetes.io/qcloud-cbs/---") pod "---" (UID: "---")
Dec 29 15:06:52 VM-232-125-centos kubelet[1271354]: I1229 15:06:52.709972 1271354 operation_generator.go:1332] Controller attach succeeded for volume "pvc---" (UniqueName: "kubernetes.io/qcloud-cbs/---") pod "---" (UID: "---") device path: "/dev/disk/by-id/virtio-disk---"
Disk attach finished in about 15s
Dec 29 15:09:09 VM-232-125-centos kubelet[1271354]: I1229 15:09:09.020204 1271354 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc----" (UniqueName: "kubernetes.io/qcloud-cbs/---") pod "---" (UID: "---")
Dec 29 15:09:24 VM-232-125-centos kubelet[1271354]: I1229 15:09:24.578750 1271354 operation_generator.go:1332] Controller attach succeeded for volume "pvc---" (UniqueName: "kubernetes.io/qcloud-cbs/---") pod "---" (UID: "---") device path: "/dev/disk/by-id/virtio-disk----"
Cause:
https://github.com/kubernetes/kubernetes/issues/28141
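VerifyControllerAttachedVolume retries with backoff; judging by the logs, the retry intervals appear to be 1s, 2s, 4s, 8s.
In the 7s log above, VerifyControllerAttachedVolume had backed off as far as the 4s interval.
According to the code, the interval doubles after each failed attempt, up to a maximum of 2 minutes 2 seconds.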
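The doubling rule can be sketched in a few lines of Go. This is only an illustration, not the kubelet's own code: the function name backoffSchedule and the constants are hypothetical, the 1s starting value is taken from the intervals observed in this note, and the 2m2s cap is the maximum quoted above.

```go
package main

import (
	"fmt"
	"time"
)

// Assumed values for illustration only: the 1s start follows the intervals
// observed in this note, and 2m2s is the cap the note reads from the code.
const (
	initialDelay = 1 * time.Second
	maxDelay     = 2*time.Minute + 2*time.Second
)

// backoffSchedule returns the first n wait intervals of a backoff that
// starts at initial, doubles after every failure, and never exceeds limit.
func backoffSchedule(initial, limit time.Duration, n int) []time.Duration {
	delays := make([]time.Duration, 0, n)
	d := initial
	if d > limit {
		d = limit
	}
	for i := 0; i < n; i++ {
		delays = append(delays, d)
		d *= 2
		if d > limit {
			d = limit
		}
	}
	return delays
}

func main() {
	for i, d := range backoffSchedule(initialDelay, maxDelay, 9) {
		fmt.Printf("retry %d waits %v\n", i+1, d)
	}
}
```

With these assumed values the first four waits are 1s, 2s, 4s and 8s, and from the eighth retry onward the wait stays at the 2m2s cap.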
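The kubelet parameter node-status-update-frequency controls how often the node status is updated. Its current default is 10 seconds.
If a retry just misses that update, the backoff means the next VerifyControllerAttachedVolume attempt may come only after the 8s interval. Adding up the intervals in that case (1s + 2s + 4s + 8s) gives roughly 15s in total.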
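The "just missed" effect can be checked with a small simulation. It is a rough model under stated assumptions, not the kubelet's actual logic: the first check happens at time 0, each failed check is followed by a wait that doubles from 1s (the value observed in this note) up to the 2m2s cap, and visibleAt is a hypothetical moment at which the attachment becomes visible to the kubelet. The function name firstSuccessfulCheck is also made up for this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// Assumed values for illustration only, taken from the intervals and the
// maximum mentioned in this note.
const (
	assumedInitial = 1 * time.Second
	assumedLimit   = 2*time.Minute + 2*time.Second
)

// firstSuccessfulCheck returns the elapsed time of the first check that runs
// at or after visibleAt. Checks run at time 0 and then after each backoff
// wait; the wait starts at initial and doubles up to limit.
func firstSuccessfulCheck(visibleAt, initial, limit time.Duration) time.Duration {
	if initial <= 0 {
		// Without a positive wait the loop below would never advance.
		return visibleAt
	}
	elapsed := time.Duration(0)
	delay := initial
	for elapsed < visibleAt {
		elapsed += delay
		delay *= 2
		if delay > limit {
			delay = limit
		}
	}
	return elapsed
}

func main() {
	for _, visibleAt := range []time.Duration{6500 * time.Millisecond, 7500 * time.Millisecond} {
		done := firstSuccessfulCheck(visibleAt, assumedInitial, assumedLimit)
		fmt.Printf("visible at %v, seen by a check at %v\n", visibleAt, done)
	}
}
```

In this model the checks fall at 0s, 1s, 3s, 7s and 15s, so an attachment that becomes visible at 6.5s is picked up at 7s, while one that becomes visible at 7.5s has to wait for the check at 15s.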