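Pulsar's AutoRecovery Feature

Pulsar supports scaling and migration that are transparent to applications.
For brokers, both upgrading and scaling are straightforward, so they are not covered here. Bookies, however, need some care.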

autorecovery

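Disable
bookkeeper shell autorecovery -disable
Enable
bookkeeper shell autorecovery -enable
When migrating bookies, enable AutoRecovery: it automatically copies the data held by the bookie being shut down onto the newly added bookies.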

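How to check which ledgers are being copied

List the under-replicated ledgers in BookKeeper (this shows whether all data from the decommissioned bookies has been fully copied)
bookkeeper shell listunderreplicated
List the under-replicated ledgers that are missing a replica on one particular bookie
bookkeeper shell listunderreplicated -missingreplica 172.16.4.224:3181
Show the metadata of a given ledger ID
bookkeeper shell ledgermetadata -ledgerid 89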
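During a migration it can be handy to poll the command above until nothing is left to copy for the bookie being removed. The following is only a sketch under stated assumptions: the function names `count_ledger_ids` and `wait_for_replication` are made up for illustration, `bookkeeper` is assumed to be on the PATH and to exit non-zero on failure, and each under-replicated ledger is assumed to be printed as a bare number on its own line (any other lines, such as log output, are ignored).

```shell
#!/bin/sh
# Count ledger IDs in `listunderreplicated` output read from stdin.
# Assumption: each under-replicated ledger appears as a bare number on its own line.
count_ledger_ids() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c -E '^[0-9]+$' || true
}

# Poll until no ledger reports the given bookie as a missing replica.
# $1: bookie address, e.g. 172.16.4.224:3181
# $2: poll interval in seconds (default 30)
wait_for_replication() {
    bookie="$1"
    interval="${2:-30}"
    while :; do
        # Stop instead of treating a failed command as "nothing left to copy".
        if ! output=$(bookkeeper shell listunderreplicated -missingreplica "$bookie"); then
            echo "listunderreplicated failed" >&2
            return 1
        fi
        remaining=$(printf '%s\n' "$output" | count_ledger_ids)
        echo "under-replicated ledgers for $bookie: $remaining"
        if [ "$remaining" -eq 0 ]; then
            return 0
        fi
        sleep "$interval"
    done
}
```

A call such as `wait_for_replication 172.16.4.224:3181 60` would then return once the count reaches zero. It is worth checking the real output format of your BookKeeper version before relying on the count.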

Problem 1

https://github.com/apache/bookkeeper/issues/2001
I ran into this bug.
The symptom is:
13:34:36.437 [db-storage-cleanup-16-1] WARN org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage - Failed to cleanup db indexes
org.apache.bookkeeper.bookie.Bookie$NoEntryException: Entry -1 not found in 630856964063500820
at org.apache.bookkeeper.bookie.storage.ldb.EntryLocationIndex.getLastEntryInLedgerInternal(EntryLocationIndex.java:123) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.bookie.storage.ldb.EntryLocationIndex.removeOffsetFromDeletedLedgers(EntryLocationIndex.java:219) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage.lambda$checkpoint$7(SingleDirectoryDbLedgerStorage.java:624) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_181]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_181]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
13:35:36.359 [db-storage-cleanup-16-1] INFO org.apache.bookkeeper.bookie.storage.ldb.EntryLocationIndex - Deleting indexes for ledgers: [32768, 32771, 32774, 32777, 32780, 32783, 32786, 32789, 32792, 32795, 32798, 32801, 32804, 32807, 32810, 32813, 32816, 32819, 32822, 32825, 32828, 32831, 32834, 32837, 32840, 32843, 32846, 32849, 32852, 32855, 32858, 32861, 32864, 32867, 32870, 32873, 32876, 32879

Not yet resolved.

Problem 2

There was also an error about not having enough available bookies:
12:19:53.378 [ReplicationWorker] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [Bookie:172.16.4.229:3181, Bookie:172.16.4.230:3181, Bookie:172.16.4.222:3181], allBookies [Bookie:172.16.4.222:3181, Bookie:172.16.4.229:3181, Bookie:172.16.4.230:3181].
12:19:53.378 [ReplicationWorker] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to choose a bookie: excluded [Bookie:172.16.4.229:3181, Bookie:172.16.4.230:3181, Bookie:172.16.4.222:3181], fallback to choose bookie randomly from the cluster.
12:19:53.378 [ReplicationWorker] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [Bookie:172.16.4.229:3181, Bookie:172.16.4.230:3181, Bookie:172.16.4.222:3181], allBookies [Bookie:172.16.4.229:3181, Bookie:172.16.4.230:3181, Bookie:172.16.4.222:3181].
12:19:53.378 [ReplicationWorker] WARN org.apache.bookkeeper.replication.ReplicationWorker - BKNotEnoughBookiesException while replicating the fragment
org.apache.bookkeeper.client.BKException$BKNotEnoughBookiesException: Not enough non-faulty bookies available
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandomInternal(RackawareEnsemblePlacementPolicyImpl.java:989) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandom(RackawareEnsemblePlacementPolicyImpl.java:907) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:797) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:200) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:757) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:221) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.replaceBookie(RackawareEnsemblePlacementPolicyImpl.java:659) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.replaceBookie(RackawareEnsemblePlacementPolicy.java:114) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.client.BookKeeperAdmin.getReplacementBookiesByIndexes(BookKeeperAdmin.java:997) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.client.BookKeeperAdmin.replicateLedgerFragment(BookKeeperAdmin.java:1045) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.replication.ReplicationWorker.rereplicate(ReplicationWorker.java:296) [org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.replication.ReplicationWorker.rereplicate(ReplicationWorker.java:249) [org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at org.apache.bookkeeper.replication.ReplicationWorker.run(ReplicationWorker.java:210) [org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
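After asking sijie, a Pulsar expert, I disabled the AutoRecovery feature on the bookies and restarted all of them, after which the error was no longer thrown. If you run into the same problem, this approach is worth trying. (A note on shutting down a bookie: it is best to stop the producers first, otherwise messages may be sent more than once. Version 2.4 is said to support message transactions, which should resolve this problem.)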
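One detail worth noticing in the warnings above is that excludeBookies and allBookies list the same three addresses, so the placement policy has no bookie left to choose from. The sketch below makes that comparison explicit for a single "Failed to find N bookies" log line. The helper names `bookie_list` and `candidate_bookies` are hypothetical, and the sed pattern assumes exactly the log layout shown in this post.

```shell
#!/bin/sh
# Extract the bookie addresses from one bracketed list in a log line.
# $1: list name (excludeBookies or allBookies); the log line is read from stdin.
bookie_list() {
    sed -n "s/.*$1 \[\([^]]*\)\].*/\1/p" \
        | tr ',' '\n' \
        | sed 's/^ *Bookie://; s/ *$//; /^$/d' \
        | sort -u
}

# Print every bookie that appears in allBookies but not in excludeBookies.
# Reads one "Failed to find N bookies" log line from stdin.
candidate_bookies() {
    line=$(cat)
    excluded=$(printf '%s\n' "$line" | bookie_list excludeBookies)
    printf '%s\n' "$line" | bookie_list allBookies | while IFS= read -r bookie; do
        # Keep the bookie only if it is not an exact line of the excluded list.
        if ! printf '%s\n' "$excluded" | grep -qxF -- "$bookie"; then
            printf '%s\n' "$bookie"
        fi
    done
}
```

Piping one of the "Failed to find 1 bookies" lines from the log above into `candidate_bookies` prints nothing, which matches the BKNotEnoughBookiesException that follows.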
