Topology submission problem after setting up a Heron cluster, and how it was resolved

Problem description

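After the earlier steps of the Heron cluster setup were completed, Heron's example topology WordCountTopology was submitted, and the submission did not complete. The submit command and its output were as follows: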

yitian@heron01:~$ heron submit aurora/yitian/devel --config-path ~/.heron/conf ~/.heron/examples/heron-api-examples.jar com.twitter.heron.examples.api.WordCountTopology WordCountTopology
[2018-02-25 01:57:00 +0000] [INFO]: Using cluster definition in /home/yitian/.heron/conf/aurora
[2018-02-25 01:57:00 +0000] [INFO]: Launching topology: 'WordCountTopology'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/yitian/.heron/lib/uploader/heron-dlog-uploader.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/yitian/.heron/lib/statemgr/heron-zookeeper-statemgr.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]
[2018-02-25 01:57:01 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Starting Curator client connecting to: heron01:2181 
[2018-02-25 01:57:01 -0800] [INFO] org.apache.curator.framework.imps.CuratorFrameworkImpl: Starting 
[2018-02-25 01:57:01 -0800] [INFO] org.apache.curator.framework.state.ConnectionStateManager: State change: CONNECTED 
[2018-02-25 01:57:02 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Directory tree initialized. 
[2018-02-25 01:57:02 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Checking existence of path: /heron/topologies/WordCountTopology 
[2018-02-25 01:57:05 -0800] [INFO] com.twitter.heron.uploader.hdfs.HdfsUploader: Target topology file already exists at '/heron/topologies/aurora/WordCountTopology-yitian-tag-0--8168281366065059093.tar.gz'. Overwriting it now 
[2018-02-25 01:57:05 -0800] [INFO] com.twitter.heron.uploader.hdfs.HdfsUploader: Uploading topology package at '/tmp/tmphINMFM/topology.tar.gz' to target HDFS at '/heron/topologies/aurora/WordCountTopology-yitian-tag-0--8168281366065059093.tar.gz' 
[2018-02-25 01:57:09 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /heron/topologies/WordCountTopology 
[2018-02-25 01:57:09 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /heron/packingplans/WordCountTopology 
[2018-02-25 01:57:09 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /heron/executionstate/WordCountTopology 
[2018-02-25 01:57:09 -0800] [INFO] com.twitter.heron.scheduler.aurora.AuroraLauncher: Launching topology in aurora 
[2018-02-25 01:57:09 -0800] [INFO] com.twitter.heron.scheduler.utils.SchedulerUtils: Updating scheduled-resource in packing plan: WordCountTopology 
[2018-02-25 01:57:09 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Deleted node for path: /heron/packingplans/WordCountTopology 
[2018-02-25 01:57:09 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /heron/packingplans/WordCountTopology  
  INFO] Creating job WordCountTopology 

The heron submit command hung at this point. The state of the related cluster components was then checked:

1. The Mesos cluster was running normally


2. The HDFS cluster was running normally


3. The Aurora scheduler showed an abnormal state


Starting heron-tracker and heron-ui and inspecting them showed:

(1) heron-tracker was running normally


(2) heron-ui showed an error


Cause of the problem:

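The hosts in the cluster did not have enough resources (RAM, CPU, disk) for WordCountTopology. When the Aurora scheduler tried to place the topology's instances, it could not find a suitable host, so the instances stayed in the PENDING state. This can be seen in the Aurora UI.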
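The placement failure can be illustrated with a minimal sketch. This is not Aurora's actual scheduling code: the `Resources` type, the `shortfalls` and `first_fit` helpers, the host name `heron02`, and every number are hypothetical, chosen only to show how an instance remains PENDING when no single host covers its CPU, RAM, and disk request at the same time.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Resources:
    cpu: float    # cores
    ram_mb: int   # megabytes
    disk_mb: int  # megabytes


def shortfalls(request: Resources, free: Resources) -> List[str]:
    """Return the resource dimensions in which `free` cannot cover `request`."""
    short = []
    if free.cpu < request.cpu:
        short.append("cpu")
    if free.ram_mb < request.ram_mb:
        short.append("ram")
    if free.disk_mb < request.disk_mb:
        short.append("disk")
    return short


def first_fit(request: Resources, hosts: Dict[str, Resources]) -> Optional[str]:
    """Return the first host that covers the whole request, or None (stays PENDING)."""
    for name, free in hosts.items():
        if not shortfalls(request, free):
            return name
    return None


# Hypothetical values: one instance request and two undersized hosts.
request = Resources(cpu=2.0, ram_mb=3072, disk_mb=10240)
small_hosts = {
    "heron01": Resources(cpu=1.0, ram_mb=2048, disk_mb=20480),
    "heron02": Resources(cpu=1.0, ram_mb=2048, disk_mb=20480),
}

host = first_fit(request, small_hosts)
print("PENDING" if host is None else f"ASSIGNED to {host}")
for name, free in small_hosts.items():
    print(name, "is short on:", shortfalls(request, free))
```

With these made-up numbers every host is short on both CPU and RAM, so no placement is found, even though disk space is plentiful. A request must fit on one host in all dimensions; spare capacity spread across several hosts does not help.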

Solution

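Raise the RAM and CPU of the hosts in the cluster to 5 GB and 4 cores so that they can satisfy WordCountTopology. Also check the Mesos stderr logs for errors. In this case, once the issue covered in "Successfully starting the cluster: resolving the 'Regular plan unhealthy!' problem" had been fixed as well, the example topology could be submitted successfully.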
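After resizing the hosts, it can be worth confirming that the Mesos agents actually advertise the new capacity. The sketch below assumes that the Mesos master's HTTP `/slaves` endpoint is reachable at `http://heron01:5050` (the host name appears in the log above; port 5050 is the Mesos default and is only an assumption about this cluster), and that each agent entry carries `resources` and `used_resources` maps with `cpus`, `mem` (MB), and `disk` (MB). The helper names `fetch_state` and `free_resources` are made up for this example.

```python
import json
from urllib.request import urlopen

# Assumed master address; adjust to the actual cluster.
MASTER_SLAVES_URL = "http://heron01:5050/slaves"


def fetch_state(url: str = MASTER_SLAVES_URL, timeout: float = 5.0) -> dict:
    """Download the agent list from the Mesos master as a dict."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def free_resources(state: dict) -> dict:
    """Map each agent hostname to its unallocated cpus, mem (MB) and disk (MB)."""
    free = {}
    for agent in state.get("slaves", []):
        total = agent.get("resources", {})
        used = agent.get("used_resources", {})
        free[agent["hostname"]] = {
            key: total.get(key, 0) - used.get(key, 0)
            for key in ("cpus", "mem", "disk")
        }
    return free
```

On a machine that can reach the master, `free_resources(fetch_state())` returns one entry per agent. If no agent shows enough free `cpus` and `mem` for a single instance, the instances can be expected to stay pending.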


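Supplementary note: the cause, as provisionally determined, is that the virtual machines' resources (CPU, RAM, disk) cannot meet the requirements of the tasks that the Aurora scheduler creates for the submitted topology. The instances therefore remain pending in Aurora: no host satisfies their resource requirements, so the tasks cannot be assigned to any host. This is why the topology's physical plan could not be created.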

For details, see: Aurora Instances are always pending status after submitted Heron Topology
