Problem description
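After coming back from the Dragon Boat Festival holiday, I created an image from a cloud host (VM) and then tried to create a new cloud host from that image template, which failed with an error. Rebuilding the image server and re-creating the image did not help; the error was the same every time.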
Error description
Request: {
"type": "post",
"path": "longjobs",
"body": {
"params": {
"jobName": "APICreateRootVolumeTemplateFromRootVolumeMsg",
"jobData": "{\"name\":\"Oracle11gClientDemo\",\"description\":\"\",\"platform\":\"Linux\",\"backupStorageUuids\":[\"7b76ec960040427eb2d4673e9dace169\"],\"resourceUuid\":\"9ec8b2dad9623fe0b50d7bd458549ce7\",\"rootVolumeUuid\":\"7ae63e52f3af45c3909d225387e99a6a\"}"
}
},
"sessionUuid": "99c3730f5e7c47789b6c1d85debb1f74",
"jobUuid": "9ec8b2dad9623fe0b50d7bd458549ce7"
}
Response: {
"inventory": {
"uuid": "5d31f73c55964a61a09f02cc13557553",
"name": "APICreateRootVolumeTemplateFromRootVolumeMsg",
"apiId": "9ec8b2dad9623fe0b50d7bd458549ce7",
"jobName": "APICreateRootVolumeTemplateFromRootVolumeMsg",
"jobData": "{\"name\":\"Oracle11gClientDemo\",\"description\":\"\",\"backupStorageUuids\":[\"7b76ec960040427eb2d4673e9dace169\"],\"rootVolumeUuid\":\"7ae63e52f3af45c3909d225387e99a6a\",\"platform\":\"Linux\",\"system\":false,\"resourceUuid\":\"9ec8b2dad9623fe0b50d7bd458549ce7\",\"session\":{\"uuid\":\"99c3730f5e7c47789b6c1d85debb1f74\",\"accountUuid\":\"36c27e8ff05c4780bf6d2fa65700f22e\",\"userUuid\":\"36c27e8ff05c4780bf6d2fa65700f22e\",\"expiredDate\":\"Jun 20, 2018 12:27:06 PM\",\"createDate\":\"Jun 20, 2018 10:27:06 AM\"},\"timeout\":-1,\"headers\":{},\"id\":\"5c7036c2bbd948e9b64c33c1950a013b\",\"serviceId\":\"storage.backup.imagestore.774bbff9b1504cbcad616ca4282519cf\",\"createdTime\":1529465672420}",
"jobResult": "Failed : {\"causes\":[],\"code\":\"SYS.1006\",\"description\":\"An operation failed\",\"details\":\"在所有镜像服务器上从根云盘[uuid:7ae63e52f3af45c3909d225387e99a6a]创建镜像失败,查看错误原因\",\"cause\":{\"causes\":[],\"code\":\"SYS.1006\",\"description\":\"An operation failed\",\"details\":\"failed to create template from root volume[uuid:7ae63e52f3af45c3909d225387e99a6a] on primary storage[uuid:f703c13570354025875bb9c07dbd4471]\",\"cause\":{\"causes\":[],\"code\":\"SYS.1006\",\"description\":\"An operation failed\",\"details\":\"无法从本地存储[uuid:f703c13570354025875bb9c07dbd4471, path:/zstack_ps2/templateWorkspace/image-9ec8b2dad9623fe0b50d7bd458549ce7/9ec8b2dad9623fe0b50d7bd458549ce7.qcow2]上传数据到镜像仓库[主机名:172.16.23.5],因为failed to execute shell command: /usr/local/zstack/imagestore/bin/zstcli -rootca /var/lib/zstack/imagestorebackupstorage/package/certs/ca.pem -json -callbackurl http://172.16.23.6:8080/zstack/asyncrest/callback -taskid b91e06fdcb884fdaa666a27e16520309 -imageUuid 9ec8b2dad9623fe0b50d7bd458549ce7 add -desc '{\\\"name\\\":\\\"Oracle11gClientDemo\\\",\\\"desc\\\":\\\"\\\",\\\"mediaType\\\":\\\"RootVolumeTemplate\\\",\\\"platform\\\":\\\"Linux\\\",\\\"format\\\":\\\"qcow2\\\",\\\"actualSize\\\":19362873344}' -file /zstack_ps2/templateWorkspace/image-9ec8b2dad9623fe0b50d7bd458549ce7/9ec8b2dad9623fe0b50d7bd458549ce7.qcow2\\nreturn code: 127\\nstdout: \\nstderr: /bin/bash: /usr/local/zstack/imagestore/bin/zstcli: No such file or directory\\n\"}}}",
"state": "Failed",
"managementNodeUuid": "774bbff9b1504cbcad616ca4282519cf",
"createDate": "Jun 20, 2018 11:34:32 AM",
"lastOpDate": "Jun 20, 2018 11:34:32 AM"
},
"jobUuid": "9ec8b2dad9623fe0b50d7bd458549ce7",
"success": false,
"sessionUuid": "99c3730f5e7c47789b6c1d85debb1f74"
}
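The "return code: 127" and the "No such file or directory" stderr in jobResult are what bash reports when the command path it is asked to run does not exist. A minimal sketch that reproduces the same symptom; the path /nonexistent/zstcli-demo is a hypothetical placeholder, not a real ZStack path:

```shell
# Hypothetical path that does not exist, used only to reproduce the symptom.
missing=/nonexistent/zstcli-demo

# Run it through bash the same way the failing job did, and capture the exit status.
rc=0
/bin/bash -c "$missing" 2>/dev/null || rc=$?

echo "return code: $rc"
```

So the failure points at a missing binary on the node that ran the command, not at a problem with the image itself.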
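Checking on the management node .6 (172.16.23.6 in the log above) confirmed that the file really is missing there. It exists only on the image server .5 (172.16.23.5); none of the other 6 nodes have it.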
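The per-node check can be sketched as a small function to run locally on each node. This is only an illustration: the function name check_zstcli is hypothetical, and the default path is the one from the error message.

```shell
# Report whether the zstcli binary exists and is executable on this node.
# $1 (optional): path to test; defaults to the path from the error message.
check_zstcli() {
    local path="${1:-/usr/local/zstack/imagestore/bin/zstcli}"
    if [ -x "$path" ]; then
        echo "present: $path"
        return 0
    fi
    echo "missing: $path"
    return 1
}
```

Calling check_zstcli with no argument on a node tests the default path; a non-zero exit status means the binary is missing or not executable there.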
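Further investigation showed that all 7 nodes were management nodes. At installation time, only two nodes had been installed from the installation image with the management node option selected; the rest were installed as compute nodes. After asking around, it turned out that a colleague had run the following command on every node:
bash zstack-installer.bin
This manually turned every node into a management node.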
As for why having every node be a management node causes the file
/usr/local/zstack/imagestore/bin/zstcli
to go missing, I have not figured out the logic.
Solution:
1. Clean up the management-node files:
zstack-ctl stop
rm -rf /usr/local/zstack
Run the two commands above on every node except .6 (.6 is the node I plan to keep as the management node).
2. Then run the command below to regenerate the /usr/local/zstack/imagestore directory and the files inside it.
bash /var/lib/zstack/imagestorebackupstorage/package/zstack-store.bin
After that, everything worked.
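Because step 1 deletes /usr/local/zstack, it is worth guarding against running it on the node that should stay a management node. A cautious sketch under stated assumptions: plan_cleanup and KEEP_NODE are hypothetical names, the default address 172.16.23.6 is taken from the callback URL in the log and assumed to be node .6, and the function only prints the step 1 commands rather than executing them.

```shell
# Node to keep as the management node (assumed to be .6 from the log above).
KEEP_NODE="${KEEP_NODE:-172.16.23.6}"

# Print the step 1 cleanup commands for the given node IP.
# Nothing is executed here; review the output before running it by hand.
plan_cleanup() {
    local node_ip="$1"
    if [ "$node_ip" = "$KEEP_NODE" ]; then
        echo "skip $node_ip: kept as management node"
        return 0
    fi
    echo "zstack-ctl stop"
    echo "rm -rf /usr/local/zstack"
}
```

Printing instead of executing keeps the destructive rm -rf a deliberate manual step.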