Flume error when delivering data to HDFS

1. The main error log is as follows:

2019-05-19 08:38:58,582 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:178)] Shutdown Metric for type: CHANNEL, name: c1. channel.event.take.success == 0
2019-05-19 08:38:58,582 (agent-shutdown-hook) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.stop(PollingPropertiesFileConfigurationProvider.java:83)] Configuration provider stopping
2019-05-19 08:38:58,584 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:459)] process failed
java.lang.InterruptedException: Timed out before HDFS call was made. Your hdfs.callTimeout might be set too low or HDFS calls are taking too long.
	at org.apache.flume.sink.hdfs.BucketWriter.checkAndThrowInterruptedException(BucketWriter.java:660)
	at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:419)
	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:442)
	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
	at java.lang.Thread.run(Thread.java:745)
2019-05-19 08:38:58,586 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:160)] Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.InterruptedException: Timed out before HDFS call was made. Your hdfs.callTimeout might be set too low or HDFS calls are taking too long.
	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:463)
	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException: Timed out before HDFS call was made. Your hdfs.callTimeout might be set too low or HDFS calls are taking too long.
	at org.apache.flume.sink.hdfs.BucketWriter.checkAndThrowInterruptedException(BucketWriter.java:660)
	at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:419)
	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:442)
	... 3 more
2019-05-19 08:39:03,587 (agent-shutdown-hook) [INFO - org.apache.flume.sink.hdfs.HDFSEventSink.stop(HDFSEventSink.java:492)] Closing hdfs:/Initial_Data/20190519/20190519-
2019-05-19 08:39:03,642 (agent-shutdown-hook) [INFO - org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:363)] Closing hdfs:/Initial_Data/20190519/20190519-.1558269477524.tmp
2019-05-19 08:39:03,668 (hdfs-k1-call-runner-8) [INFO - org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:629)] Renaming hdfs:/Initial_Data/20190519/20190519-.1558269477524.tmp to hdfs:/Initial_Data/20190519/20190519-.1558269477524
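
The exception message itself points at hdfs.callTimeout: either the timeout is set too low or the HDFS calls are genuinely taking too long. If HDFS were reachable but merely slow, one mitigation would be to raise the sink's call timeout; the value below is only an illustration of that knob, not part of my configuration (and not the fix that ultimately worked here):

# Hypothetical tuning only: allow HDFS calls up to 60 s (hdfs.callTimeout is in milliseconds)
f1.sinks.k1.hdfs.callTimeout = 60000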

2. Data was being generated, but uploading it to HDFS failed. As I recall, the error log also mentioned a master node, yet the hostnames of my three machines are node1, node2, and node3; there is no master host at all, and my cluster is a Hadoop HA deployment. The line above the error mentioned hdfs-site.xml, which reminded me that Flume reads the configuration files in its conf directory at startup.
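
In an HA deployment, the HDFS sink path normally points at the logical nameservice (the name defined by dfs.nameservices in hdfs-site.xml) rather than at a single NameNode host, and Flume can only resolve that nameservice if Hadoop's client configuration is visible to it. A minimal sketch, assuming the nameservice is called mycluster (the real name depends on your hdfs-site.xml):

# Hypothetical HA-style path; "mycluster" must match the nameservice in hdfs-site.xml
f1.sinks.k1.hdfs.path = hdfs://mycluster/Initial_Data/%Y%m%d
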
3. Solution: copy Hadoop's hdfs-site.xml and core-site.xml into Flume's conf directory. After doing this, uploading data to HDFS succeeded.
4. Here is my Flume test configuration:

# Define the agent's source, channel, and sink names
f1.sources=r1
f1.channels=c1
f1.sinks=k1

# Source definition
f1.sources.r1.type=netcat
f1.sources.r1.bind=192.168.137.183
f1.sources.r1.port=55555
f1.sources.r1.max-line-length=1000000
f1.sources.r1.channels=c1

# Channel configuration
f1.channels.c1.type=memory
f1.channels.c1.capacity=1000
f1.channels.c1.transactionCapacity=1000
f1.channels.c1.keep-alive=30
# Sink definition
f1.sinks.k1.type = hdfs
f1.sinks.k1.channel=c1
f1.sinks.k1.hdfs.path =hdfs:/Initial_Data/%Y%m%d
# File name prefix
f1.sinks.k1.hdfs.filePrefix=%Y%m%d-
# File type
f1.sinks.k1.hdfs.fileType=DataStream
f1.sinks.k1.hdfs.useLocalTimeStamp = true
# Do not roll files based on event count
f1.sinks.k1.hdfs.rollCount=0
# Roll a new file when it reaches 10 KB (rollSize is in bytes)
f1.sinks.k1.hdfs.rollSize=10240
# Roll a new file every 6 minutes (rollInterval is in seconds)
f1.sinks.k1.hdfs.rollInterval=360
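
Since rollSize is in bytes and rollInterval in seconds, the settings above roll at roughly 10 KB and every 6 minutes. If rolling at 10 MB or every 60 minutes were intended instead, the values would look like this (an assumption about intent, not the configuration I actually tested with):

# Assumed intent: roll at 10 MB (bytes) or after 60 minutes (seconds)
f1.sinks.k1.hdfs.rollSize = 10485760
f1.sinks.k1.hdfs.rollInterval = 3600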

5. Test program:

import socket
import datetime
import time


class Flume_test():

    def __init__(self):
        # Address and port of the Flume netcat source (f1.sources.r1 above)
        self.flume_host = '192.168.137.183'
        self.flume_port = 55555

    def get_conn(self):
        # Plain TCP client socket
        tcp_cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        return tcp_cli

    def get_date(self):
        # One line per event; the netcat source splits events on '\n'
        return "flume_test_datetime:%s\n" % datetime.datetime.now()

    def main(self):
        cli = self.get_conn()
        cli.connect((self.flume_host, self.flume_port))
        try:
            while True:
                data = self.get_date()
                print(data)
                cli.sendall(data.encode('utf-8'))
                # The netcat source acknowledges each event ("OK") by default
                recv = cli.recv(1024)
                print(recv)
                time.sleep(1)
        finally:
            cli.close()


if __name__ == '__main__':
    df = Flume_test()
    df.main()