ThingsBoard: fixing "Connection to localhost:5432 refused" when a Docker restart fails

1. Problem

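I've been writing custom rule nodes recently. Since I hadn't yet built the ThingsBoard source code, I was working in an environment set up with Docker. Each time a rule node is finished, I package it, drop it into the container, and restart the container. The restart often failed, and the error log always looked like this: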

2022-03-05 08:53:23,164 [main] ERROR com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Exception during pool initialization.
org.postgresql.util.PSQLException: Connection to localhost:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
	at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:303)
...

2022-03-05 08:53:25,187 [main] ERROR o.s.boot.SpringApplication - Application run failed
org.springframework.context.ApplicationContextException: Unable to start web server; nested exception is org.springframework.boot.web.server.WebServerException: Unable to start embedded Tomcat
	at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.onRefresh(ServletWebServerApplicationContext.java:161)
...

pg_ctl: could not open PID file "/data/db/postmaster.pid": Operation not permitted

2. Cause

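I tried deleting the container and running it again, tried re-granting permissions (which seemed to work the first time), and tried docker-compose; the result was always the same. I finally found the answer in the project's issues. Thanks to alyf80 for the idea: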

I think this is due to postgres not being properly shut down when the container is stopped; this leaves a stale postmaster.pid file in the data/db directory, which on the next run causes pg_ctl to try talking to a non-existing daemon and ultimately failing to start the db server. Deleting that file prior to starting the container reliably fixes the problem in my case.

I have no idea why this is happening. It looks to me that the TB process is ignoring (or simply not getting) the SIGTERM that Docker sends to initiate a graceful shutdown of the container, so after the default 10s shutdown timeout expires everything gets killed without stop-db.sh having had a chance to run.

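In short: when the container is stopped, PostgreSQL may never receive the shutdown signal, so it is not shut down cleanly and leaves a stale postmaster.pid behind. On the next start the database server fails to come up, and ThingsBoard cannot connect to it.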

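Addendum: the database apparently fails to shut down cleanly most of the time and only occasionally does so, which is why a restart eventually succeeds after a few retries.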

3. Solution

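Since the cause is a database that was not shut down properly, the fix is simple: stop the db yourself before restarting. Here are several ways to do it (listed to show the options; I tried them all and they work).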

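# [Recommended] Method 1: shut down the database inside the container cleanly, as alyf80 also mentions
$ docker exec -it mytb stop-db.sh

# Method 2: force-kill it
$ docker exec -it mytb killall postgres

# Method 3: force-kill it from a shell inside the container
$ docker exec -it mytb bash
# or find the postgres PID with ps aux first, then kill -9 [PID]
$ killall postgres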
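
To avoid forgetting the extra step, Method 1 and the restart can be wrapped in one small helper. This is only a sketch under a few assumptions: the function name restart_tb is made up for illustration, the container is called mytb as in the commands above, and stop-db.sh is on the PATH inside the image. It drops the -it flags so it can also be called from a non-interactive script.

```shell
#!/bin/sh
# Sketch: stop PostgreSQL inside the container cleanly, then restart the container.

# restart_tb [CONTAINER]   (hypothetical helper, not part of ThingsBoard)
#   CONTAINER defaults to "mytb", the container name used in this post.
restart_tb() {
    container="${1:-mytb}"
    # Method 1: ask the container to shut its database down cleanly.
    if ! docker exec "$container" stop-db.sh; then
        echo "stop-db.sh failed; not restarting $container" >&2
        return 1
    fi
    # Only restart once the database has been stopped.
    docker restart "$container"
}
```

After sourcing the file, calling restart_tb (or restart_tb some-other-name) replaces the plain docker restart. If stop-db.sh fails, the helper stops there so the problem is visible instead of being hidden by another failed start.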

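This analysis also suggests why the first restart after changing the permissions succeeded: changing the permissions stopped the db, so the restart that followed went through normally.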
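
If the container has already been stopped uncleanly, the quoted issue describes another way out: delete the stale postmaster.pid before starting the container. The sketch below assumes the container's /data directory is mounted from a host directory; the function name remove_stale_pid and the example path $HOME/.mytb-data are hypothetical and must be adapted to your own volume setup. I have not verified this variant myself, and the file may be owned by the container's user, so elevated permissions might be needed.

```shell
#!/bin/sh
# Sketch: remove a stale postmaster.pid left behind by an unclean shutdown.
# Run this only while the container is stopped; never delete the file
# while PostgreSQL is actually running.

# remove_stale_pid DATA_DIR   (hypothetical helper)
#   DATA_DIR is the host directory mounted at /data in the container,
#   so the file from the log, /data/db/postmaster.pid, is DATA_DIR/db/postmaster.pid.
remove_stale_pid() {
    pid_file="$1/db/postmaster.pid"
    if [ -f "$pid_file" ]; then
        rm -f "$pid_file" && echo "removed $pid_file"
    else
        echo "no stale pid file at $pid_file"
    fi
}
```

A possible use, with the assumed path: remove_stale_pid "$HOME/.mytb-data" followed by docker start mytb.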
