Answers to Some Questions About Storm

Q1: How do I fix the following error?

2011-12-26 11:44:21 worker [ERROR] Error on initialization of server mk-worker
java.lang.UnsatisfiedLinkError: /usr/local/lib/libjzmq.so.0.0.0: libzmq.so.1: cannot open shared object file: No such file or directory
        at java.lang.ClassLoader$NativeLibrary.load(Native Method)
        at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1767)
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1692)
        at java.lang.Runtime.loadLibrary0(Runtime.java:840)
        at java.lang.System.loadLibrary(System.java:1047)
        at org.zeromq.ZMQ.<clinit>(ZMQ.java:34)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:186)
        at zilch.mq__init.load(Unknown Source)
        at zilch.mq__init.<clinit>(Unknown Source)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at clojure.lang.RT.loadClassForName(RT.java:1578)
        at clojure.lang.RT.load(RT.java:399)
        at clojure.lang.RT.load(RT.java:381)
        at clojure.core$load$fn__4511.invoke(core.clj:4905)
        at clojure.core$load.doInvoke(core.clj:4904)
        at clojure.lang.RestFn.invoke(RestFn.java:409)
        at clojure.core$load_one.invoke(core.clj:4729)
        at clojure.core$load_lib.doInvoke(core.clj:4766)
        at clojure.lang.RestFn.applyTo(RestFn.java:143)
        at clojure.core$apply.invoke(core.clj:542)
        at clojure.core$load_libs.doInvoke(core.clj:4800)
        at clojure.lang.RestFn.applyTo(RestFn.java:138)
        at clojure.core$apply.invoke(core.clj:542)
        at clojure.core$require.doInvoke(core.clj:4869)
        at clojure.lang.RestFn.invoke(RestFn.java:422)
        at backtype.storm.messaging.zmq$loading__4410__auto__.invoke(zmq.clj:1)
        at backtype.storm.messaging.zmq__init.load(Unknown Source)
        at backtype.storm.messaging.zmq__init.<clinit>(Unknown Source)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at clojure.lang.RT.loadClassForName(RT.java:1578)
        at clojure.lang.RT.load(RT.java:399)
        at clojure.lang.RT.load(RT.java:381)
        at clojure.core$load$fn__4511.invoke(core.clj:4905)
        at clojure.core$load.doInvoke(core.clj:4904)
        at clojure.lang.RestFn.invoke(RestFn.java:409)
        at clojure.core$load_one.invoke(core.clj:4729)
        at clojure.core$load_lib.doInvoke(core.clj:4766)
        at clojure.lang.RestFn.applyTo(RestFn.java:143)
        at clojure.core$apply.invoke(core.clj:542)
        at clojure.core$load_libs.doInvoke(core.clj:4800)
        at clojure.lang.RestFn.applyTo(RestFn.java:138)
        at clojure.core$apply.invoke(core.clj:542)
        at clojure.core$require.doInvoke(core.clj:4869)
        at clojure.lang.RestFn.invoke(RestFn.java:409)
        at backtype.storm.messaging.loader$mk_zmq_context.doInvoke(loader.clj:8)
        at clojure.lang.RestFn.invoke(RestFn.java:437)
        at backtype.storm.daemon.worker$fn__3102$exec_fn__858__auto____3103.invoke(worker.clj:111)
        at clojure.lang.AFn.applyToHelper(AFn.java:187)
        at clojure.lang.AFn.applyTo(AFn.java:151)
        at clojure.core$apply.invoke(core.clj:540)
        at backtype.storm.daemon.worker$fn__3102$mk_worker__3244.doInvoke(worker.clj:80)
        at clojure.lang.RestFn.invoke(RestFn.java:513)
        at backtype.storm.daemon.worker$_main.invoke(worker.clj:257)
        at clojure.lang.AFn.applyToHelper(AFn.java:174)
        at clojure.lang.AFn.applyTo(AFn.java:151)
        at backtype.storm.daemon.worker.main(Unknown Source)
2011-12-26 11:44:21 util [INFO] Halting process: ("Error on initialization")

A1: Run "sudo vi /etc/ld.so.conf", add the line "/usr/local/lib" to that file, save it, and then run "sudo /sbin/ldconfig". That fixes the problem.
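A minimal sketch for verifying the fix without restarting a worker: it loads the same "jzmq" native library that org.zeromq.ZMQ loads in the stack trace above, and fails with the same UnsatisfiedLinkError if libzmq.so.1 still cannot be resolved.

// ZmqLinkCheck.java: compile and run on the worker host after running ldconfig.
public class ZmqLinkCheck {
    public static void main(String[] args) {
        // The stack trace above shows org.zeromq.ZMQ.<clinit> calling
        // System.loadLibrary; "jzmq" is the library behind libjzmq.so.
        // Loading it here reproduces the dependency on libzmq.so.1 being
        // resolvable by the dynamic linker (hence the ldconfig step).
        System.loadLibrary("jzmq");
        System.out.println("libjzmq loaded and libzmq resolved");
    }
}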

Q2: Why does Storm use Kestrel rather than ZeroMQ for passing data between topologies?

A2: ZeroMQ is basically a socket library. If you want to pass data between topologies, you have two things you need to solve:

1. Discovering where to send the tuples (since Storm can launch the spout tasks anywhere on the cluster, and their locations can change over time).
2. Guaranteeing message processing on the second topology.

If you use a Kestrel queue (or queues) as the intermediate broker, that solves #1 because you're using a fixed set of machines that both topologies know about. It solves #2 because Kestrel persists the messages and is capable of replaying the messages.

You want to make sure that your spout source can support the out of order acking that Storm requires for guaranteed message processing. We use Kestrel because it has this property and is the simplest.

Q3: How do you avoid congestion in Storm?

A3: For now Storm has no good built-in mechanism for this. For a topology that guarantees message processing, you can throttle the spouts with TOPOLOGY_MAX_SPOUT_PENDING (see the sketch below);

beyond that, the only way to relieve congestion is to increase the parallelism of the slow bolts or to decrease the parallelism of the spouts.
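A minimal sketch of throttling a spout this way, assuming the backtype.storm-era API; MySpout and MyBolt are placeholders for your own components:

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;

public class ThrottledTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new MySpout(), 2);   // placeholder spout
        builder.setBolt("bolt", new MyBolt(), 8)       // placeholder slow bolt, given more parallelism
               .shuffleGrouping("spout");

        Config conf = new Config();
        // TOPOLOGY_MAX_SPOUT_PENDING: allow at most 1000 unacked tuples in
        // flight per spout task; nextTuple() is not called while the limit
        // is reached, which throttles the spout and relieves congestion.
        conf.setMaxSpoutPending(1000);

        StormSubmitter.submitTopology("throttled-topology", conf, builder.createTopology());
    }
}

Note that this setting only has an effect on reliable topologies: if the spout emits tuples without message ids, there is nothing for Storm to count as "pending".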

Q4: Some questions about the spout "complete latency" shown in the Storm UI

1. How is the complete latency calculated?

2. For a given topology, what is the ideal complete latency?

3. When implementing a topology, how can you keep the latency as small as possible?

A4:

1. The timer is started when the tuple is emitted from the spout and it is stopped when the tuple is acked. So it measures the time for the entire tuple tree to be completed.

2. There's no "ideal" time. Every topology has different characteristics. For example, in our stream processing topologies we care more about throughput than latency and make use of batching techniques that increase throughput at the cost of worsened latency. In our distributed RPC topologies, however, we optimize for latency.

3. The completion time will be affected by: a) the processing latency of the bolts, which you can improve by doing optimization; b) the amount of congestion in the topology, which you can reduce by increasing parallelism. So if you add up the processing latencies of the bolts and they don't come close to the complete latency, you should try adding more parallelism.

 

Q5: To guarantee data reliability, what does Storm's acker need to know?

A5:

First, you need to tell Storm whenever you're creating a new link in the tree of tuples. 
Second, you need to tell Storm when you have finished processing an individual tuple. 

If your bolt implements the IRichBolt interface, you have to do both of these things yourself; if you extend BasicBolt instead, it quietly does them for you. So when you need unanchored tuples, do not use BasicBolt (see the sketch after this answer).

Tuples emitted to BasicOutputCollector are automatically anchored to the input tuple, and 
the input tuple is acked for you automatically when the execute method completes.
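A minimal sketch of the two styles, assuming the backtype.storm-era API with the base helper classes; the field name and the "!" transformation are placeholders:

// ManualAckBolt.java: a rich bolt, where you anchor and ack explicitly.
import java.util.Map;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class ManualAckBolt extends BaseRichBolt {
    private OutputCollector collector;

    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    public void execute(Tuple input) {
        // Anchoring: passing the input tuple as the first argument makes the
        // new tuple a link in the input's tuple tree. Omit it to emit an
        // unanchored tuple.
        collector.emit(input, new Values(input.getString(0) + "!"));
        // Acking: tell Storm this input tuple has been fully processed.
        collector.ack(input);
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}

// AutoAckBolt.java: a basic bolt, where anchoring and acking are done for you.
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class AutoAckBolt extends BaseBasicBolt {
    public void execute(Tuple input, BasicOutputCollector collector) {
        // Emitted tuples are anchored to 'input' automatically, and 'input'
        // is acked automatically when execute() returns.
        collector.emit(new Values(input.getString(0) + "!"));
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}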

Q6: A few questions about Storm DRPC

1) If drpc.servers is configured as follows, do I need to run "./bin/storm drpc" on all of these servers?

> drpc.servers: 
>    - "10.13.41.104" 
>    - "10.13.41.64" 
>    - "10.13.41.98" 

2) Do I need to submit the topology to Storm with "storm jar" first, and then write an application that uses DRPCClient to call execute()?

3) Can I give more than one host to DRPCClient?

DRPCClient client = new DRPCClient("drpc-host", 3772); 

A6:

1) Yes, you do.

2) Yes.

3) No, you can't. DRPCClient can definitely be improved (or a higher level one can be written on top of it).
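A minimal sketch of that flow, assuming the backtype.storm-era DRPC API; the function name "exclamation", ExclaimBolt, and the host address are placeholders:

// ExclamationDrpcTopology.java: submitted once with "storm jar" (question 2).
import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.drpc.LinearDRPCTopologyBuilder;

public class ExclamationDrpcTopology {
    public static void main(String[] args) throws Exception {
        // "exclamation" is the DRPC function name that clients will call.
        LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("exclamation");
        builder.addBolt(new ExclaimBolt(), 3);   // placeholder bolt
        StormSubmitter.submitTopology("drpc-demo", new Config(), builder.createRemoteTopology());
    }
}

// ExclamationCaller.java: a separate client application (question 2); it
// connects to a single DRPC host (question 3) and blocks until the result returns.
import backtype.storm.utils.DRPCClient;

public class ExclamationCaller {
    public static void main(String[] args) throws Exception {
        DRPCClient client = new DRPCClient("10.13.41.104", 3772);
        System.out.println(client.execute("exclamation", "hello"));
    }
}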

Q7: A few questions about the Storm UI (perftest) (based on http://groups.google.com/group/storm-user/browse_thread/thread/822be41e58fed298)


1) The topology defines only one bolt, but under Bolts in the Storm UI there is another entry with Id -1. What does it mean?

2) PrinterBolt never calls emit, so why does it show an Emitted count?

3) The spout's Emitted is 1985940 while bolt Id 1's Emitted is 992980. Is that because the bolt cannot keep up with the spout's emit rate?


A7:

1) That's the "acker bolt" which takes care of tracking tuple trees. It's an integral part of guaranteeing message processing, and is described in the wiki here: https://github.com/nathanmarz/storm/wiki/Guaranteeing-message-processing. In 0.6.0 that bolt was renamed to "__acker".

2) It's emitting ack tuples on its "ack" stream that go to the acker bolt.

3) Notice that the emitted value of spout 1 is just about double the emitted value of bolt 2. This is because for every tuple spout 1 emits, it sends a tuple to bolt 2 as well as a tuple to the acker bolt (to track the tuple tree). Bolt 2 only acks tuples so it sends a single ack tuple per received tuple.


The Storm Google Group is a great place to ask questions, and the Storm author is very nice, so spend more time over there!
