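Symptom:
The Docker daemon hung and docker commands stopped responding.
Troubleshooting:
Checking the Docker service status showed that the daemon had run out of file handles ("too many open files"), so Docker was restarted.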
[root@data02 opt]# systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2022-02-08 08:41:24 CST; 1 years 5 months ago
Docs: https://docs.docker.com
Main PID: 821 (dockerd)
CGroup: /system.slice/docker.service
├─ 821 /usr/bin/dockerd
├─ 1258 docker-containerd --config /var/run/docker/containerd/containerd.toml
Jul 27 10:28:49 data02 dockerd[821]: http: Accept error: accept unix /var/run/docker.sock: accept4: too many open files; retrying in 1s
Jul 27 10:28:50 data02 dockerd[821]: http: Accept error: accept unix /var/run/docker.sock: accept4: too many open files; retrying in 1s
Jul 27 10:28:51 data02 dockerd[821]: http: Accept error: accept unix /var/run/docker.sock: accept4: too many open files; retrying in 1s
Jul 27 10:28:52 data02 dockerd[821]: http: Accept error: accept unix /var/run/docker.sock: accept4: too many open files; retrying in 1s
Jul 27 10:28:53 data02 dockerd[821]: http: Accept error: accept unix /var/run/docker.sock: accept4: too many open files; retrying in 1s
Jul 27 10:28:54 data02 dockerd[821]: http: Accept error: accept unix /var/run/docker.sock: accept4: too many open files; retrying in 1s
Jul 27 10:28:55 data02 dockerd[821]: http: Accept error: accept unix /var/run/docker.sock: accept4: too many open files; retrying in 1s
Jul 27 10:28:56 data02 dockerd[821]: http: Accept error: accept unix /var/run/docker.sock: accept4: too many open files; retrying in 1s
Jul 27 10:28:57 data02 dockerd[821]: http: Accept error: accept unix /var/run/docker.sock: accept4: too many open files; retrying in 1s
Jul 27 10:28:58 data02 dockerd[821]: http: Accept error: accept unix /var/run/docker.sock: accept4: too many open files; retrying in 1s
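Before restarting, it can help to confirm how close dockerd is to its file descriptor limit. The following is a minimal sketch, not part of the original procedure: the helper name fd_usage and its optional second argument (an alternative proc root, useful for dry runs) are assumptions made for illustration. It counts the entries in /proc/<pid>/fd and reads the soft limit from /proc/<pid>/limits.

```shell
#!/bin/sh
# fd_usage PID [PROC_ROOT]
# Hypothetical helper: prints the number of open file descriptors and the
# soft "Max open files" limit for PID. PROC_ROOT defaults to /proc.
fd_usage() {
    pid="$1"
    proc_root="${2:-/proc}"
    # Each entry under <proc_root>/<pid>/fd is one open descriptor.
    open=$(ls "$proc_root/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
    # In the limits file, field 4 of the "Max open files" row is the soft limit.
    limit=$(awk '/^Max open files/ {print $4}' "$proc_root/$pid/limits")
    echo "pid=$pid open_fds=$open soft_limit=$limit"
}

# Example (run as root, using the Main PID shown by systemctl status docker):
#   fd_usage 821
```

If open_fds is at or near soft_limit, the "too many open files" errors above are expected. Whether the descriptors were leaked by the daemon or the limit was simply too low for the workload is not something this check can tell.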
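After running systemctl restart docker, the command hung at the prompt and never returned.
Neither the Docker service status nor journalctl -xe showed any obvious error; the unit was stuck in the deactivating (stop-sigterm) state.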
[root@data02 ~]# systemctl status docker -l
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: deactivating (stop-sigterm) since Thu 2023-07-27 10:38:18 CST; 34min ago
Docs: https://docs.docker.com
Main PID: 22539 (dockerd)
Memory: 160.3M
CGroup: /system.slice/docker.service
├─22539 /usr/bin/dockerd
└─22546 docker-containerd --config /var/run/docker/containerd/containerd.toml
Jul 27 11:02:15 data02 dockerd[22539]: time="2023-07-27T11:02:15+08:00" level=info msg="loading plugin "io.containerd.grpc.v1.tasks"..." module=containerd type=io.containerd.grpc.v1
Jul 27 11:02:15 data02 dockerd[22539]: time="2023-07-27T11:02:15+08:00" level=info msg="loading plugin "io.containerd.grpc.v1.version"..." module=containerd type=io.containerd.grpc.v1
Jul 27 11:02:15 data02 dockerd[22539]: time="2023-07-27T11:02:15+08:00" level=info msg="loading plugin "io.containerd.grpc.v1.introspection"..." module=containerd type=io.containerd.grpc.v1
Jul 27 11:02:15 data02 dockerd[22539]: time="2023-07-27T11:02:15+08:00" level=info msg=serving... address="/var/run/docker/containerd/docker-containerd-debug.sock" module="containerd/debug"
Jul 27 11:02:15 data02 dockerd[22539]: time="2023-07-27T11:02:15+08:00" level=info msg=serving... address="/var/run/docker/containerd/docker-containerd.sock" module="containerd/grpc"
Jul 27 11:02:15 data02 dockerd[22539]: time="2023-07-27T11:02:15+08:00" level=info msg="containerd successfully booted in 0.013187s" module=containerd
Jul 27 11:02:15 data02 dockerd[22539]: time="2023-07-27T11:02:15.810108948+08:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Jul 27 11:02:15 data02 dockerd[22539]: time="2023-07-27T11:02:15.835765459+08:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Jul 27 11:02:15 data02 dockerd[22539]: time="2023-07-27T11:02:15.836814381+08:00" level=info msg="Loading containers: start."
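After stopping Docker first, ps showed that several processes were still running.
After killing those processes manually, Docker was started again and came up successfully.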
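The stop, check, kill, start sequence above can be sketched roughly as follows. The note does not say which processes were left over, so the process names matched here (dockerd and docker-containerd, taken from the systemctl status output) are an assumption, and docker_pids is a hypothetical helper name.

```shell
#!/bin/sh
# docker_pids: reads "PID COMMAND" lines on stdin (as printed by
# `ps -eo pid=,comm=`) and prints the PIDs of Docker daemon processes.
# Note: comm is truncated to 15 characters, so docker-containerd shows up as
# "docker-containe"; this pattern therefore also matches docker-containerd-shim.
docker_pids() {
    awk '$2 ~ /^(dockerd|docker-containe)/ {print $1}'
}

# Example sequence (run as root):
#   systemctl stop docker
#   ps -eo pid=,comm= | docker_pids                     # list what is still running
#   ps -eo pid=,comm= | docker_pids | xargs -r kill     # SIGTERM first
#   systemctl start docker
```

Sending a plain kill (SIGTERM) first and reserving kill -9 for processes that still refuse to exit gives the daemon a chance to clean up its state.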
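Docker service stopped abnormally; containers fail to start after Docker is restarted
When Docker starts a container, it creates an entry named after the container ID under the runtime directory (/var/run/docker/runtime-runc/moby; the path may differ between environments and can be located with find / -name '<container-ID>'). Because Docker stopped abnormally, that container's entry was never deleted, so starting the container fails with an error saying the container already exists.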
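To locate the leftover entry for a given container, something like the following can be used. This is a sketch under assumptions: the helper name find_container_state is made up for illustration, the default directory is the one quoted above, and the note itself does not spell out the cleanup step, so the removal shown in the comments is the presumed fix rather than a documented one.

```shell
#!/bin/sh
# find_container_state CONTAINER_ID [STATE_ROOT]
# Hypothetical helper: prints entries directly under STATE_ROOT whose name
# starts with CONTAINER_ID (a full or abbreviated ID).
find_container_state() {
    id="$1"
    root="${2:-/var/run/docker/runtime-runc/moby}"
    find "$root" -mindepth 1 -maxdepth 1 -name "${id}*" 2>/dev/null
}

# Example (run as root, <container-ID> taken from `docker ps -a`):
#   find_container_state <container-ID>
# Presumed cleanup, only after confirming the container is not running:
#   rm -rf "$(find_container_state <container-ID>)"
#   docker start <container-ID>
```

Printing the path before deleting anything keeps the destructive step explicit, which matters because the directory location varies between environments.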