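Docker Engine fails to start after a power loss: common problems and solutions
Common troubleshooting commands
- Check the status of containerd
systemctl status containerd
- Check the status of the Docker engine
systemctl status docker
- Show the Docker engine's last 100 lines in the system journal and keep following new entries
journalctl -u docker.service -f -n 100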
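When the journal is noisy, it can help to keep only the lines that matter for the two problems described in this note. The following is a minimal sketch, not part of Docker: filter_docker_errors is a hypothetical helper name, and the pattern only covers the level=error and panic: markers that appear in the log excerpts in this note.

```shell
#!/usr/bin/env bash
# Sketch: keep only error and panic lines from dockerd's journal output.
# filter_docker_errors is a hypothetical helper, not a Docker or systemd command.
filter_docker_errors() {
  # Reads log lines on stdin and prints those containing level=error or panic:
  grep -E 'level=error|panic:'
}

# Example usage (not run here):
#   journalctl -u docker.service -n 100 --no-pager | filter_docker_errors
```

The function reads from standard input so it can be used with a saved log file as well as with a live journalctl pipe.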
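Common problems and fixes
Container fails to load
Checking the docker logs with journalctl shows that a container failed to load. This usually happens when a power loss corrupts the container's file system.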
dockerd[26166]: time="xxx" level=error msg="failed to load container" container=xxxxx error="invalid character '\\x00' looking for beginning of value"
Resolution steps
- Take the container ID that follows container= in the log and delete that container's directory
rm -rf /var/lib/docker/containers/<container-id>
- Restart docker
systemctl restart docker
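Copying the container ID out of the log by hand is error-prone, so the lookup can be sketched as a small filter. This is only an illustration: extract_container_id is a hypothetical helper name, and it assumes the log line has the same shape as the excerpt above, with container= immediately after the "failed to load container" message.

```shell
#!/usr/bin/env bash
# Sketch: pull the container ID out of "failed to load container" log lines.
# extract_container_id is a hypothetical helper, not a Docker command.
extract_container_id() {
  # Reads log lines on stdin; prints the value after container= for matching lines only.
  sed -n 's/.*failed to load container" container=\([^[:space:]]*\).*/\1/p'
}

# Example usage (not run here; assumes the default data root /var/lib/docker).
# It only prints the candidate directories so they can be reviewed before deleting:
#   journalctl -u docker.service -n 100 --no-pager | extract_container_id | sort -u |
#     while read -r id; do echo "/var/lib/docker/containers/$id"; done
```

The sketch deliberately prints paths instead of deleting them, so the rm -rf step stays a manual, reviewed action.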
Engine exits abnormally with a "Page expected" panic
Checking the docker logs with journalctl shows a panic raised from Docker's Go code:
dockerd[26166]: panic: assertion failed: Page expected to be: 34, but self identifies as xxx
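Resolution steps
There is no principled fix here; the workaround is trial and error: remove or rename the data files that are likely corrupted.
Reference: containerd/issues/3347, "Containerd is crashing with panic"
Reference: Hope will help someone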
- Stop Docker and containerd:
systemctl stop docker containerd
- Cleanup containerd data directory (Docker will regenerate it at startup if needed):
rm -rf /var/lib/containerd/
- Find Docker’s database files; one of them (most often local-kv.db) is corrupted on your system:
find /var/lib/docker -type f -size -5M -name '*.db' | grep -v overlay2
This will output something like:
/var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db
/var/lib/docker/volumes/metadata.db
/var/lib/docker/network/files/local-kv.db
/var/lib/docker/builder/fscache.db
/var/lib/docker/buildkit/snapshots.db
/var/lib/docker/buildkit/metadata.db
/var/lib/docker/buildkit/cache.db
- Simply rename this file to .bak:
mv /var/lib/docker/network/files/local-kv.db{,.bak}
- Start Docker:
systemctl start docker
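Because finding the corrupted file may take more than one attempt, it can be safer to set each suspect file aside under a unique name instead of reusing a single .bak suffix. The following is a sketch under that assumption: backup_db is a hypothetical helper name, and the timestamped suffix is a variation on the plain .bak rename shown in the steps above.

```shell
#!/usr/bin/env bash
# Sketch: set a suspect database file aside instead of deleting it.
# backup_db is a hypothetical helper, not a Docker command.
backup_db() {
  local db="$1"
  if [ ! -f "$db" ]; then
    echo "not found: $db" >&2
    return 1
  fi
  # A timestamp in the name keeps a later run from overwriting an earlier backup.
  local bak
  bak="${db}.$(date +%Y%m%d%H%M%S).bak"
  # Print the backup path on success so it can be restored later if needed.
  mv -- "$db" "$bak" && echo "$bak"
}

# Example usage (not run here; stop docker and containerd first):
#   backup_db /var/lib/docker/network/files/local-kv.db
```

If Docker still fails to start after renaming one file, the printed backup path makes it easy to move that file back before trying the next candidate.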
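Once the engine is running again, restart the containers to verify. Because the engine's .db data files were removed, restarting a container directly with docker restart xx fails because its associated data can no longer be found. Remove the old containers first with docker rm or docker compose down, then recreate them.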
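For containers managed by Docker Compose, the remove-then-recreate step can be sketched as below. This is an illustration only: recreate_project and the DOCKER override variable are hypothetical names, and it assumes the project directory contains a compose file that docker compose can find on its own.

```shell
#!/usr/bin/env bash
# Sketch: remove a Compose project's old containers, then create them again.
# recreate_project and the DOCKER override are hypothetical, for illustration only.
DOCKER="${DOCKER:-docker}"

recreate_project() {
  local dir="$1"
  # Run in a subshell so the caller's working directory is left unchanged.
  (
    cd "$dir" || exit 1
    # Remove the project's old containers and networks first.
    "$DOCKER" compose down &&
    # Then create fresh containers from the compose file.
    "$DOCKER" compose up -d
  )
}

# Example usage (not run here; the path is a placeholder):
#   recreate_project /path/to/compose/project
```

Setting DOCKER=echo before calling the function prints the commands instead of running them, which gives a quick dry run.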