Problem:
After Storm stopped abnormally, I restarted it, but the nimbus process disappeared after a few seconds. The log showed this error:
nimbus [ERROR] Error when processing event
java.io.FileNotFoundException: File '/data/storm/nimbus/stormdist/risk_topo-1-1574555563/stormconf.ser' does not exist
at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:299) ~[analyzer-storm-dependency.jar:na]
at org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1763) ~[analyzer-storm-dependency.jar:na]
at backtype.storm.daemon.nimbus$read_storm_conf.invoke(nimbus.clj:89) ~[storm-core-0.9.5.jar:0.9.5]
at backtype.storm.daemon.nimbus$compute_executors.invoke(nimbus.clj:419) ~[storm-core-0.9.5.jar:0.9.5]
at backtype.storm.daemon.nimbus$compute_topology__GT_executors$iter__3324__3328$fn__3329.invoke(ni…
…$seq.invoke(core.clj:133) ~[clojure-1.5.1.jar:na]
at clojure.core.protocols$seq_reduce.invoke(protocols.clj:30) ~[clojure-1.5.1.jar:na]
at clojure.core.protocols$fn__6026.invoke(protocols.clj:54) ~[clojure-1.5.1.jar:na]
at clojure.core.protocols$fn__5979$G__5974__5992.invoke(protocols.clj:13) ~[clojure-1.5.1.jar:na]
at clojure.core$reduce.invoke(core.clj:6177) ~[clojure-1.5.1.jar:na]
at clojure.core$into.invoke(core.clj:6229) ~[clojure-1.5.1.jar:na]
at backtype.storm.daemon.nimbus$compute_topology__GT_executors.i…
…$compute_new_topology__GT_executor__GT_node_PLUS_port.invoke(nimbus.clj:550) ~[storm-core-0.9.5.jar:0.9.5]
at backtype.storm.daemon.nimbus$mk_assignments.doInvoke(nimbus.clj:662) ~[storm-core-0.9.5.jar:0.9.5]
at clojure.lang.RestFn.invoke(RestFn.java:410) ~[clojure-1.5.1.jar:na]
at backtype.storm.daemon.nimbus$fn__3724$exec_fn__1103__auto____…$fn__3730$fn__3731.invoke(ni…
…$fn__3724$exec_fn__1103__auto____…$fn__3730.invoke(nimbus.clj:908) ~[storm-core-0.9.5.jar:0.9.5]
at backtype.storm.timer$schedule_recurring$this__1807.invoke(timer.clj:99) ~[storm-core-0.9.5.jar:0.9.5]
at backtype.storm.timer$mk_timer$fn__1790$fn__1791.invoke(ti…
…mk_timer$fn__1790.invoke(timer.clj:42) ~[storm-core-0.9.5.jar:0.9.5]
at clojure.lang.AFn.run(AFn.java:24) ~[clojure-1.5.1.jar:na]
at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]
2019-11-24T10:43:46.368+0800 b.s.util [ERROR] Halting process: ("Error when processing an event")
java.lang.RuntimeException: ("Error when processing an event")
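Analysis:
nimbus goes down right after restarting because ZooKeeper still holds the state of the topology that was running when Storm crashed. On every restart, nimbus reads that topology's information from ZooKeeper and tries to load its files from {storm.local.dir}. If those files have already been deleted, it fails with the file-not-found error above. The fix is to delete Storm's data in ZooKeeper as well, so that the two are back in sync.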
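To confirm this diagnosis before deleting anything, it can help to check which of the topology IDs that ZooKeeper still lists are actually missing on disk. The following is only a minimal sketch, not part of the original fix. The function name find_missing_topologies is hypothetical, and it assumes you pass in the topology IDs by hand, for example copied from `ls /storm/storms` in zkCli.sh (as far as I know, that is where Storm keeps its topology records under the default root).

```shell
#!/bin/sh
# Hypothetical helper: print every topology ID whose stormconf.ser is
# missing from nimbus's local stormdist directory.
# Usage: find_missing_topologies <stormdist_dir> <topology_id>...
find_missing_topologies() {
    stormdist_dir=$1
    shift
    for topo_id in "$@"; do
        # nimbus throws FileNotFoundException when this file is absent
        if [ ! -f "$stormdist_dir/$topo_id/stormconf.ser" ]; then
            printf '%s\n' "$topo_id"
        fi
    done
}
```

With the paths from the log above, the call would be `find_missing_topologies /data/storm/nimbus/stormdist risk_topo-1-1574555563`. Any ID it prints is one that nimbus will fail to load on startup.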
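Solution:
If you don't know where ZooKeeper is installed, search for it:
find / -name zkCli.sh
On my machine ZooKeeper is installed at:
/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/zookeeper/bin
The repair steps are then:
1. Change to that directory:
cd /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/zookeeper/bin
2. Start the client: ./zkCli.sh
3. List the top-level znodes: ls /
4. Delete Storm's znode: rmr /storm (on ZooKeeper 3.5 and later, rmr is deprecated; use deleteall /storm instead)
5. Restart Storm.
Before restarting, also remember to delete the supervisor and nimbus directories under the path configured as {storm.local.dir}.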
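The steps above can also be sketched as a small script. This is an illustration under assumptions, not a tested tool: all variable and function names are made up for the example, /data/storm is inferred as {storm.local.dir} from the path in the log, and it relies on zkCli.sh accepting a single command after the -server option. It defaults to a dry run that only prints what it would do.

```shell
#!/bin/sh
# Sketch of the cleanup steps. All names are illustrative; adjust the
# defaults to your own installation before using any of this.
ZK_CLI=${ZK_CLI:-./zkCli.sh}                     # path to zkCli.sh (assumption)
ZK_SERVER=${ZK_SERVER:-localhost:2181}           # ZooKeeper address (assumption)
STORM_LOCAL_DIR=${STORM_LOCAL_DIR:-/data/storm}  # value of storm.local.dir
DRY_RUN=${DRY_RUN:-1}                            # 1 = only print, 0 = execute

# Print the command; run it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reset_storm_state() {
    # Remove Storm's znode tree so nimbus no longer sees the old topology.
    run "$ZK_CLI" -server "$ZK_SERVER" rmr /storm
    # Remove the local state that has to stay in sync with ZooKeeper.
    run rm -rf "$STORM_LOCAL_DIR/nimbus" "$STORM_LOCAL_DIR/supervisor"
}
```

Call reset_storm_state once to review the printed commands, then set DRY_RUN=0 and call it again with Storm stopped. Keeping the dry run as the default is deliberate, since both commands are destructive.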
PS: command to kill the Storm processes:
ps -ef | grep storm | grep -v 'grep' | awk '{print $2}' | xargs kill -9
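Since kill -9 gives the daemons no chance to shut down cleanly, a gentler variant is to send SIGTERM first and use SIGKILL only for processes that are still alive after a grace period. Here is a rough sketch of that idea. The function name stop_pids is hypothetical, and it takes the PIDs as arguments rather than finding them itself.

```shell
#!/bin/sh
# Hypothetical helper: stop the given PIDs with SIGTERM first and fall
# back to SIGKILL only if some are still alive after the grace period.
# Usage: stop_pids <grace_seconds> <pid>...
stop_pids() {
    grace=$1
    shift
    [ "$#" -gt 0 ] || return 0                # nothing to stop
    kill -TERM "$@" 2>/dev/null || true       # polite request to exit
    while [ "$grace" -gt 0 ]; do
        alive=0
        for pid in "$@"; do
            kill -0 "$pid" 2>/dev/null && alive=1
        done
        [ "$alive" -eq 0 ] && return 0        # everything exited in time
        sleep 1
        grace=$((grace - 1))
    done
    kill -KILL "$@" 2>/dev/null || true       # force whatever is left
    return 0
}
```

One way to use it would be `stop_pids 10 $(pgrep -f backtype.storm.daemon)`, which takes the class prefix from the stack trace above. That assumes the daemon JVMs carry this prefix on their command line, so check the output of pgrep first.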