k8s storage plugins: Issue 3) cbs-csi storage flow analysis

Background

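The cbs-csi architecture is widely used for storage on the public cloud today. The typical workloads are native Kubernetes resources or third-party extensions such as StatefulSet, StatefulSetPlus, and TAPP, and all of them are used in the StatefulSet (sts) pattern.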

Multiple replicas are configured. Each pod replica gets a pod name derived from its ordinal, and a new PVC and PV are created for each pod.
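
As a small illustration of that naming scheme, here is a minimal sketch that derives pod and PVC names from an ordinal. It assumes the standard StatefulSet convention (`<set name>-<ordinal>` for pods, `<claim template name>-<pod name>` for PVCs); the names `web` and `data` are made up for the example, and StatefulSetPlus or TAPP may differ in the details.

```python
def pod_name(set_name: str, ordinal: int) -> str:
    """Pod name for one replica: <set name>-<ordinal>."""
    return f"{set_name}-{ordinal}"


def pvc_name(claim_template: str, set_name: str, ordinal: int) -> str:
    """PVC name for one replica: <claim template name>-<pod name>."""
    return f"{claim_template}-{pod_name(set_name, ordinal)}"


def replica_claims(set_name: str, claim_template: str, replicas: int) -> dict[str, str]:
    """Map every pod name to the PVC that belongs to it."""
    return {
        pod_name(set_name, i): pvc_name(claim_template, set_name, i)
        for i in range(replicas)
    }


if __name__ == "__main__":
    # "web" and "data" are hypothetical names used only for illustration.
    for pod, pvc in replica_claims("web", "data", 3).items():
        print(pod, "->", pvc)
```

Because the PVC name is tied to the pod name, a rebuilt pod with the same ordinal picks up the same PVC and PV, which is why the disk has to move between nodes in the scenario below.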

A pod being rebuilt is a common scenario, so it is described in detail below.

Suppose pod XXX is destroyed on node 11.142.61.164 and recreated on node 11.142.60.212. The full sequence is as follows:

Old pod deletion:

1. kubelet receives the pod deletion event and starts the unmount volume and unmount device flows. While unmounting, kubelet does not wait for the disk to be detached.

On the old node, check the kubelet log: [search by PV name; the original VA name shows up in it]

Jan 11 14:58:08 VM-61-164-centos kubelet[21944]: I0111 14:58:08.573794   21944 operation_generator.go:782] UnmountVolume.TearDown succeeded for volume "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-XXX" (OuterVolumeSpecName: "XXX-pvc") pod "5ca209fd-23d0-4f60-a56b-96a4f9f47c1e" (UID: "5ca209fd-23d0-4f60-a56b-96a4f9f47c1e"). InnerVolumeSpecName "pvc-XXX". PluginName "kubernetes.io/csi", VolumeGidValue ""  【unmount volume succeeded】
Jan 11 14:58:08 VM-61-164-centos kubelet[21944]: I0111 14:58:08.637453   21944 reconciler.go:312] operationExecutor.UnmountDevice started for volume "pvc-XXX" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-XXX") on node "11.142.61.164"  【unmount device task started】
Jan 11 14:58:09 VM-61-164-centos kubelet[21944]: I0111 14:58:09.183342   21944 operation_generator.go:869] UnmountDevice succeeded for volume "pvc-XXX" %!(EXTRA string=UnmountDevice succeeded for volume "pvc-XXX" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc") on node "11.142.61.164" ) 【unmount device succeeded】
Jan 11 14:58:09 VM-61-164-centos kubelet[21944]: I0111 14:58:09.239894   21944 reconciler.go:319] Volume detached for volume "pvc-XXX" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc") on node "11.142.61.164" DevicePath "csi-3a3640a23315b682265e8a33ea96b28fa5166767781a3c1f181c6df9e93fd7df" 【kubelet is not responsible for the detach, so it only prints this log line】
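
When lining up logs from several components like this, it helps to pull out the timestamp and source location of each entry. The sketch below is one way to do that, assuming the usual klog header layout (severity letter, MMDD, time, PID, file:line). The header carries no year, so the year is a parameter; 2023 is taken from the retry timestamps that appear later in the kubelet log. The sample lines are shortened copies of the first and last entries above.

```python
import re
from datetime import datetime

# klog header: <severity><MMDD> <HH:MM:SS.micros> <pid> <file>:<line>] <message>
KLOG_RE = re.compile(
    r"(?P<sev>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<pid>\d+) "
    r"(?P<file>[\w.]+):(?P<line>\d+)\] (?P<msg>.*)"
)


def parse_klog(line: str, year: int = 2023):
    """Return severity, timestamp, source and message, or None if no klog header is found."""
    m = KLOG_RE.search(line)  # search() skips any journald prefix before the header
    if m is None:
        return None
    ts = datetime.strptime(
        f"{year}-{m['month']}-{m['day']} {m['time']}", "%Y-%m-%d %H:%M:%S.%f"
    )
    return {
        "severity": m["sev"],
        "time": ts,
        "source": f"{m['file']}:{m['line']}",
        "message": m["msg"],
    }


if __name__ == "__main__":
    # Message bodies are shortened here; the headers are copied from the log above.
    first = parse_klog(
        "Jan 11 14:58:08 VM-61-164-centos kubelet[21944]: I0111 14:58:08.573794   "
        "21944 operation_generator.go:782] UnmountVolume.TearDown succeeded"
    )
    last = parse_klog(
        "Jan 11 14:58:09 VM-61-164-centos kubelet[21944]: I0111 14:58:09.239894   "
        "21944 reconciler.go:319] Volume detached"
    )
    print(first["source"], "->", last["source"])
    print("elapsed seconds:", (last["time"] - first["time"]).total_seconds())
```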

2. KCM (kube-controller-manager) receives the pod deletion event and triggers the detach of the disk.

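KCM log: [search by PV name] Before triggering the detach, KCM first verifies that the unmount succeeded, then sets the deletion time on the VA (VolumeAttachment), and keeps polling until the VA's status is detached.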

I0111 14:58:14.245000       1 reconciler.go:204] attacherDetacher.DetachVolume started for volume "pvc-XXX" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-XXX") on node "11.142.61.164"  【detach task started】
I0111 14:58:14.249092       1 operation_generator.go:1384] Verified volume is safe to detach for volume "pvc-XXX" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-XXX") on node "11.142.61.164"  【first verifies that the unmount succeeded, judged from node.status.VolumesInUse】
I0111 14:58:20.868329       1 operation_generator.go:472] DetachVolume.Detach succeeded for volume "pvc-XXX" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-XXX") on node "11.142.61.164" 【the VA is polled here】
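
The "safe to detach" check in the second log line can be pictured roughly as follows. This is a simplified model of what the annotation describes, not the KCM source: the node object is represented as a plain dict shaped like the Node API object, and the function name is made up for the example.

```python
def is_safe_to_detach(node: dict, unique_name: str) -> bool:
    """A volume is treated as safe to detach once the node no longer
    reports it under status.volumesInUse (i.e. kubelet has unmounted it)."""
    in_use = node.get("status", {}).get("volumesInUse") or []
    return unique_name not in in_use


if __name__ == "__main__":
    volume = "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-XXX"
    still_mounted = {"status": {"volumesInUse": [volume]}}
    unmounted = {"status": {"volumesInUse": []}}
    print(is_safe_to_detach(still_mounted, volume))
    print(is_safe_to_detach(unmounted, volume))
```

This is also why step 1 matters for step 2: until kubelet finishes the unmount and the node status is updated, the detach is held back.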

3. Because the VA has a finalizer set, the deletion event is received by external-attacher, which sends a detach request to cbs-csi 【search by VA name】

I0111 06:58:14.257759       1 controller.go:198] Started VA processing "csi-3a3640a23315b682265e8a33ea96b28fa5166767781a3c1f181c6df9e93fd7df"  【VA processing starts】
I0111 06:58:14.257779       1 csi_handler.go:216] CSIHandler: processing VA "csi-3a3640a23315b682265e8a33ea96b28fa5166767781a3c1f181c6df9e93fd7df" 
I0111 06:58:14.257783       1 csi_handler.go:267] Starting detach operation for "csi-3a3640a23315b682265e8a33ea96b28fa5166767781a3c1f181c6df9e93fd7df" 【detach starts】
I0111 06:58:14.257807       1 csi_handler.go:274] Detaching "csi-3a3640a23315b682265e8a33ea96b28fa5166767781a3c1f181c6df9e93fd7df" 【just before the request is sent】
I0111 06:58:20.641295       1 csi_handler.go:575] Detached "csi-3a3640a23315b682265e8a33ea96b28fa5166767781a3c1f181c6df9e93fd7df"  【after the detach succeeded】
I0111 06:58:20.641323       1 util.go:79] Marking as detached "csi-3a3640a23315b682265e8a33ea96b28fa5166767781a3c1f181c6df9e93fd7df" 【updates va.status so that KCM can proceed】
I0111 06:58:20.846082       1 request.go:581] Throttling request took 189.597303ms, request: PATCH:https://9.165.248.1:443/apis/storage.k8s.io/v1/volumeattachments/csi-3a3640a23315b682265e8a33ea96b28fa5166767781a3c1f181c6df9e93fd7df
I0111 06:58:20.869350       1 util.go:105] Finalizer removed from "csi-3a3640a23315b682265e8a33ea96b28fa5166767781a3c1f181c6df9e93fd7df"  【finalizer removed; the VA will now be deleted】
I0111 06:58:20.869370       1 csi_handler.go:287] Fully detached "csi-3a3640a23315b682265e8a33ea96b28fa5166767781a3c1f181c6df9e93fd7df"  【detach fully completed】
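
The order of operations in this log (detach call, status update, finalizer removal) can be sketched as a tiny state machine. This is only a toy model under stated assumptions: the `VolumeAttachment` class, the finalizer string, the function name, and the deletion timestamp are all made up for the illustration and are not the external-attacher's real types or values.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical finalizer name, used only for this illustration.
FINALIZER = "example.com/attacher"


@dataclass
class VolumeAttachment:
    name: str
    volume_id: str
    node_id: str
    finalizers: List[str] = field(default_factory=lambda: [FINALIZER])
    deletion_timestamp: Optional[str] = None
    attached: bool = True


def handle_va(va: VolumeAttachment, unpublish: Callable[[str, str], None]) -> List[str]:
    """Run the detach sequence for a VA that is being deleted and
    return the steps taken, in order."""
    steps: List[str] = []
    # Only a VA that is marked for deletion and still holds our finalizer needs work.
    if va.deletion_timestamp is None or FINALIZER not in va.finalizers:
        return steps
    steps.append("starting detach")
    unpublish(va.volume_id, va.node_id)  # stands in for ControllerUnpublishVolume
    steps.append("detached")
    va.attached = False                  # status update that lets KCM proceed
    steps.append("marked as detached")
    va.finalizers.remove(FINALIZER)      # with no finalizer left, the VA can be deleted
    steps.append("finalizer removed")
    return steps


if __name__ == "__main__":
    calls = []
    va = VolumeAttachment(
        name="csi-demo",
        volume_id="disk-XXX",
        node_id="ins-ngtfnf7y",
        deletion_timestamp="2023-01-11T06:58:14Z",  # illustrative value
    )
    print(handle_va(va, lambda vol, node: calls.append((vol, node))))
    print(calls, va.finalizers)
```

The point of the finalizer is visible in the ordering: the object cannot disappear until the driver has actually released the disk.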

4. cbs-csi log 【search by disk ID】

I0111 06:58:14.258022       1 utils.go:97] GRPC call: /csi.v1.Controller/ControllerUnpublishVolume
I0111 06:58:14.258038       1 utils.go:98] GRPC request: {"node_id":"ins-ngtfnf7y","volume_id":"disk-XXX"}   【calls the CBS detach API to detach the disk】

New pod creation:


KCM triggers the attach 【search by PV name】

I0111 14:58:20.912219       1 reconciler.go:282] attacherDetacher.AttachVolume started for volume "pvc-XXX" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-XXX") from node "11.142.60.212" 【attach task started; in practice this just creates a new VA object】
I0111 14:58:32.943615       1 operation_generator.go:361] AttachVolume.Attach succeeded for volume "pvc-XXX" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-XXX") from node "11.142.60.212" 【attach finished; KCM keeps polling the VA object until va.status is attached】
I0111 14:58:32.943741       1 event.go:278] Event(v1.ObjectReference{Kind:"Pod", Namespace:"XXX", Name:"XXX", UID:"e0d0fa80-cc4a-4d44-8fae-154aa8142e3a", APIVersion:"v1", ResourceVersion:"17401825151", FieldPath:""}): type: 'Normal' reason: 'SuccessfulAttachVolume' AttachVolume.Attach succeeded for volume "pvc-XXX"
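
Note that the VA on the new node has a different name (csi-ef62...) from the one on the old node (csi-3a36...). As far as I know, the in-tree CSI attacher derives the VA name as `csi-` plus the SHA-256 hex digest of the volume handle, driver name, and node name concatenated; treat that as an assumption and check it against your Kubernetes version. The sketch below only shows the shape of such a derivation and makes no claim to reproduce the exact names in these logs.

```python
import hashlib


def attachment_name(volume_handle: str, driver: str, node_name: str) -> str:
    """Derive a VA name as "csi-" + sha256(volume handle + driver + node name).
    The formula is an assumption about the in-tree CSI attacher."""
    digest = hashlib.sha256(f"{volume_handle}{driver}{node_name}".encode()).hexdigest()
    return f"csi-{digest}"


if __name__ == "__main__":
    driver = "com.tencent.cloud.csi.cbs"
    old = attachment_name("disk-adge0gvc", driver, "11.142.61.164")
    new = attachment_name("disk-adge0gvc", driver, "11.142.60.212")
    print(old)
    print(new)
    print("same name on both nodes:", old == new)
```

Since the node name is part of the input, the same disk gets one VA per node, which is why the detach and attach logs have to be searched with two different VA names.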

csi-attacher log 【search by the csi-prefixed VA name】
I0111 06:58:20.918846       1 controller.go:198] Started VA processing "csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd"
I0111 06:58:20.918861       1 csi_handler.go:216] CSIHandler: processing VA "csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd"
I0111 06:58:20.918865       1 csi_handler.go:243] Attaching "csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd"
I0111 06:58:20.918870       1 csi_handler.go:422] Starting attach operation for "csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd"  【starts calling cbs-csi attach】
I0111 06:58:20.918912       1 csi_handler.go:304] VA finalizer added to "csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd" 【adds a finalizer to the VA】
I0111 06:58:20.918923       1 csi_handler.go:318] NodeID annotation added to "csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd"  【adds the csi.alpha.kubernetes.io/node-id annotation to the VA】
I0111 06:58:21.446073       1 request.go:581] Throttling request took 527.032614ms, request: PATCH:https://9.165.248.1:443/apis/storage.k8s.io/v1/volumeattachments/csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd
I0111 06:58:32.933934       1 csi_handler.go:256] Attached "csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd" 【call finished】
I0111 06:58:32.933952       1 util.go:37] Marking as attached "csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd"
I0111 06:58:32.947473       1 util.go:51] Marked as attached "csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd" 【updates the VA status so that KCM can proceed】
I0111 06:58:32.947492       1 csi_handler.go:262] Fully attached "csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd"  【whole operation finished】
I0111 06:58:32.947499       1 csi_handler.go:232] CSIHandler: finished processing "csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd"

CBS-CSI log
I0111 06:58:21.456998       1 utils.go:97] GRPC call: /csi.v1.Controller/ControllerPublishVolume  【calls the CBS attach disk API to attach the disk】
I0111 06:58:21.457014       1 utils.go:98] GRPC request: {"node_id":"ins-hrlnsway","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_context":{"diskType":"CLOUD_PREMIUM","storage.kubernetes.io/csiProvisionerIdentity":"1672314114387-8081-com.tencent.cloud.csi.cbs"},"volume_id":"disk-adge0gvc"}
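
The request body is plain JSON, so it is easy to pull the interesting fields out of a logged line. The sketch below decodes the request shown above; the function name is made up for the example, and the access-mode table lists only the values 0 to 5 of the CSI `VolumeCapability.AccessMode.Mode` enum (newer spec versions define more).

```python
import json

# Values 0-5 of the CSI VolumeCapability.AccessMode.Mode enum.
ACCESS_MODES = {
    0: "UNKNOWN",
    1: "SINGLE_NODE_WRITER",
    2: "SINGLE_NODE_READER_ONLY",
    3: "MULTI_NODE_READER_ONLY",
    4: "MULTI_NODE_SINGLE_WRITER",
    5: "MULTI_NODE_MULTI_WRITER",
}

# Request body copied from the CBS-CSI log line above.
RAW_REQUEST = (
    '{"node_id":"ins-hrlnsway","volume_capability":{"AccessType":{"Mount":'
    '{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_context":'
    '{"diskType":"CLOUD_PREMIUM","storage.kubernetes.io/csiProvisionerIdentity":'
    '"1672314114387-8081-com.tencent.cloud.csi.cbs"},"volume_id":"disk-adge0gvc"}'
)


def summarize_publish_request(raw: str) -> dict:
    """Extract the fields that matter when debugging an attach."""
    req = json.loads(raw)
    cap = req["volume_capability"]
    return {
        "volume_id": req["volume_id"],
        "node_id": req["node_id"],
        "fs_type": cap["AccessType"]["Mount"]["fs_type"],
        "access_mode": ACCESS_MODES.get(cap["access_mode"]["mode"], "UNRECOGNIZED"),
        "disk_type": req["volume_context"].get("diskType"),
    }


if __name__ == "__main__":
    for key, value in summarize_publish_request(RAW_REQUEST).items():
        print(f"{key}: {value}")
```

Here `mode: 1` decodes to a single-node writer, which fits a cloud disk that can only be attached to one instance at a time and explains why the detach from the old node has to finish before this call is made.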

kubelet log 【search by PV name】
【waiting for the attach to succeed】
Jan 11 14:58:18 VM-60-212-centos kubelet[20721]: I0111 14:58:18.880300   20721 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-4cde4524-6ba4-449c-891b-fe9ce8138e33" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-XXX") pod "XXX" (UID: "e0d0fa80-cc4a-4d44-8fae-154aa8142e3a") 【starts the attached-verification logic】
Jan 11 14:58:18 VM-60-212-centos kubelet[20721]: E0111 14:58:18.880393   20721 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-XXX podName: nodeName:}" failed. No retries permitted until 2023-01-11 14:58:19.380344104 +0800 CST m=+132526.556747154 (durationBeforeRetry 500ms). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-4cde4524-6ba4-449c-891b-fe9ce8138e33\" (UniqueName: \"kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc\") pod \"weixin-zixun-l3-vob-rbu-online-6\" (UID: \"e0d0fa80-cc4a-4d44-8fae-154aa8142e3a\") "  【first checks whether the volume has been added to volumesInUse】

Jan 11 14:58:22 VM-60-212-centos kubelet[20721]: I0111 14:58:22.393827   20721 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-4cde4524-6ba4-449c-891b-fe9ce8138e33" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc") pod "XXX" (UID: "e0d0fa80-cc4a-4d44-8fae-154aa8142e3a")
Jan 11 14:58:22 VM-60-212-centos kubelet[20721]: E0111 14:58:22.403648   20721 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc podName: nodeName:}" failed. No retries permitted until 2023-01-11 14:58:26.403619698 +0800 CST m=+132533.580022731 (durationBeforeRetry 4s). Error: "Volume not attached according to node status for volume \"pvc-4cde4524-6ba4-449c-891b-fe9ce8138e33\" (UniqueName: \"kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc\") pod \"weixin-zixun-l3-vob-rbu-online-6\" (UID: \"e0d0fa80-cc4a-4d44-8fae-154aa8142e3a\") "   【checks whether the volume is attached, judged from node.Status.VolumesAttached】
Jan 11 14:58:34 VM-60-212-centos kubelet[20721]: I0111 14:58:34.447367   20721 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-4cde4524-6ba4-449c-891b-fe9ce8138e33" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc") pod "XXX" (UID: "e0d0fa80-cc4a-4d44-8fae-154aa8142e3a")
Jan 11 14:58:34 VM-60-212-centos kubelet[20721]: I0111 14:58:34.457104   20721 operation_generator.go:1332] Controller attach succeeded for volume "pvc-4cde4524-6ba4-449c-891b-fe9ce8138e33" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc") pod "XXX" (UID: "e0d0fa80-cc4a-4d44-8fae-154aa8142e3a") device path: "" 【attach verification passed】
Jan 11 14:58:34 VM-60-212-centos kubelet[20721]: I0111 14:58:34.547876   20721 reconciler.go:269] operationExecutor.MountVolume started for volume "pvc-4cde4524-6ba4-449c-891b-fe9ce8138e33" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc") pod "XXX" (UID: "e0d0fa80-cc4a-4d44-8fae-154aa8142e3a")  【mount task started】
Jan 11 14:58:34 VM-60-212-centos kubelet[20721]: I0111 14:58:34.548274   20721 operation_generator.go:558] MountVolume.WaitForAttach entering for volume "pvc-4cde4524-6ba4-449c-891b-fe9ce8138e33" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc") pod "XXX" (UID: "e0d0fa80-cc4a-4d44-8fae-154aa8142e3a") DevicePath ""
Jan 11 14:58:34 VM-60-212-centos kubelet[20721]: I0111 14:58:34.561687   20721 operation_generator.go:567] MountVolume.WaitForAttach succeeded for volume "pvc-4cde4524-6ba4-449c-891b-fe9ce8138e33" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc") pod "XXX" (UID: "e0d0fa80-cc4a-4d44-8fae-154aa8142e3a") DevicePath "csi-ef62fd617656080205af907030d51be08f25b43341a98b53fd436db24689accd"  【queries the VA status to confirm the attach finally succeeded】
【mount】
Jan 11 14:58:34 VM-60-212-centos kubelet[20721]: I0111 14:58:34.646113   20721 operation_generator.go:596] MountVolume.MountDevice succeeded for volume "pvc-4cde4524-6ba4-449c-891b-fe9ce8138e33" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc") pod "XXX" (UID: "e0d0fa80-cc4a-4d44-8fae-154aa8142e3a") device mount path "/data_online/1/kubelet/plugins/kubernetes.io/csi/pv/pvc-4cde4524-6ba4-449c-891b-fe9ce8138e33/globalmount" 【mount device succeeded; this calls csi-node】
Jan 11 14:58:34 VM-60-212-centos kubelet[20721]: I0111 14:58:34.676102   20721 operation_generator.go:657] MountVolume.SetUp succeeded for volume "pvc-4cde4524-6ba4-449c-891b-fe9ce8138e33" (UniqueName: "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc") pod "XXX" (UID: "e0d0fa80-cc4a-4d44-8fae-154aa8142e3a")  【mount volume succeeded; this calls csi-node】
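
The two error messages in the kubelet log above correspond to two successive checks against the node status, and the growing `durationBeforeRetry` values (500ms, later 4s) are consistent with a delay that doubles after each failure. The sketch below models both under those assumptions; the function names and return strings are invented for the example, and the node status is a plain dict shaped like `node.status`.

```python
def verify_controller_attached(node_status: dict, unique_name: str) -> str:
    """Mimic the two-stage check seen in the log:
    1) the volume must be listed in volumesInUse,
    2) the volume must be listed in volumesAttached."""
    if unique_name not in (node_status.get("volumesInUse") or []):
        return "not in volumesInUse yet"
    attached = {v["name"] for v in node_status.get("volumesAttached") or []}
    if unique_name not in attached:
        return "not attached according to node status"
    return "attached"


def backoff_schedule(initial_ms: int = 500, factor: int = 2, attempts: int = 4) -> list:
    """Delay before each retry, assuming the delay doubles after every failure."""
    return [initial_ms * factor**i for i in range(attempts)]


if __name__ == "__main__":
    vol = "kubernetes.io/csi/com.tencent.cloud.csi.cbs^disk-adge0gvc"
    stages = [
        {},                                                   # volume not reported yet
        {"volumesInUse": [vol]},                              # in use, attach still pending
        {"volumesInUse": [vol],
         "volumesAttached": [{"name": vol, "devicePath": ""}]},
    ]
    for status in stages:
        print(verify_controller_attached(status, vol))
    print(backoff_schedule())
```

Only after the last stage does kubelet move on to WaitForAttach, MountDevice, and SetUp, which matches the order of the final log lines.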
