Bugs encountered while testing Flume 1.6.0 after installation

Case 1: Avro

Avro can send a given file to Flume; the Avro source receives it over the Avro RPC mechanism.
      a) Create the agent configuration file

vi /usr/local/flume-1.6.0/conf/avro.conf


a1.sources = r1
a1.sinks = k1
a1.channels = c1
  
# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
  
# Describe the sink
a1.sinks.k1.type = logger
  
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
  
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1


Start Flume

./flume-ng agent -c . -f /usr/local/flume-1.6.0/conf/avro.conf -n a1 -Dflume.root.logger=INFO,console

It prints the following log:

Info: Including Hive libraries found via () for Hive access
+ exec /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.9.x86_64/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp '/usr/local/flume-1.6.0/bin:/usr/local/flume-1.6.0/lib/*:/lib/*' -Djava.library.path= org.apache.flume.node.Application -f /usr/local/flume-1.6.0/conf/avro.conf -n a1
log4j:WARN No appenders could be found for logger (org.apache.flume.lifecycle.LifecycleSupervisor).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.


Create the test file

echo "hello world" > /usr/local/flume-1.6.0/test.log

Send the file with avro-client

./flume-ng avro-client -c /usr/local/flume-1.6.0/conf -H m1 -p 4141 -F /usr/local/flume-1.6.0/test.log

Nothing appears on the console; the test fails!

After some research, the cause turned out to be the conf directory passed with -c: the agent was started with "-c .", which is not Flume's conf directory, so log4j.properties was never loaded (hence the "No appenders" warnings above) and the logger sink could not print anything.
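A quick way to confirm the real conf directory has what the agent needs (the listing below assumes the default layout of the 1.6.0 binary tarball, plus the avro.conf created above):

ls /usr/local/flume-1.6.0/conf
# should list something like: avro.conf  flume-conf.properties.template  flume-env.sh.template  log4j.properties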

Corrected command:   ./flume-ng agent -c /usr/local/flume-1.6.0/conf -f /usr/local/flume-1.6.0/conf/avro.conf -n a1 -Dflume.root.logger=INFO,console

The console now logs:

 ......

2015-10-21 15:45:13,739 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:173)] Starting Sink k1
2015-10-21 15:45:13,744 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:184)] Starting Source r1
2015-10-21 15:45:13,745 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.source.AvroSource.start(AvroSource.java:228)] Starting Avro source r1: { bindAddress: 0.0.0.0, port: 4141 }...
2015-10-21 15:45:14,133 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: SOURCE, name: r1: Successfully registered new MBean.
2015-10-21 15:45:14,133 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: SOURCE, name: r1 started
2015-10-21 15:45:14,134 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.source.AvroSource.start(AvroSource.java:253)] Avro source r1 started.

Started successfully!
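With the agent now listening on port 4141, resending the test file should make the logger sink print the event on the agent's console. This is just a re-run of the earlier avro-client command (assuming the hostname m1 resolves to this machine):

./flume-ng avro-client -c /usr/local/flume-1.6.0/conf -H m1 -p 4141 -F /usr/local/flume-1.6.0/test.log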


Case 2:

 Use Flume to compress files and upload them directly to HDFS.

   Define the conf file:  vi hdfs.conf

# Define the names of the agent's source, channel, and sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /home/hadoop/logs

# Configure the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 100

# Define an interceptor that adds a timestamp header to each event;
# the %Y%m%d%H escapes in hdfs.path below rely on this header
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
a1.sources.r1.ignorePattern = ^(.)*\\.tmp$

# Configure the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://nn1.hadoop:9000/user/flume/syslogtcp/%Y%m%d%H
a1.sinks.k1.hdfs.filePrefix = events-
#a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.fileType = CompressedStream
a1.sinks.k1.hdfs.codeC = bzip2
# Do not roll files based on event count
a1.sinks.k1.hdfs.rollCount = 0
# Roll a new file on HDFS when it reaches 2 MB (2097152 bytes); use 134217728 for 128 MB
a1.sinks.k1.hdfs.rollSize = 2097152
# Roll a new file on HDFS every 60 seconds
a1.sinks.k1.hdfs.rollInterval = 60


The a1.sinks.k1.hdfs.fileType and a1.sinks.k1.hdfs.codeC lines in hdfs.conf set the output format and the compression format. hdfs.fileType accepts only SequenceFile, DataStream, or CompressedStream; the default is SequenceFile, which is binary and cannot be viewed directly. DataStream writes uncompressed output and must not have codeC set; CompressedStream writes compressed output and requires hdfs.codeC to be set (it must not be empty):

The compression types supported by hdfs.codeC, found by reading the source code, are gzip, bzip2, lzo, and snappy.

a1.sinks.k1.hdfs.fileType = CompressedStream
a1.sinks.k1.hdfs.codeC = bzip2

-------------------------------------------------

a1.sources.r1.ignorePattern = ^(.)*\\.tmp$

   This setting makes the source ignore temporary files in the spooled directory, so Flume does not read a file that is still being written; otherwise the source can crash with an error such as:

  ERROR org.apache.flume.source.SpoolDirectorySource: FATAL: Spool Directory source source1: { spoolDir: /home/hadoop/logs}: Uncaught exception in SpoolDirectorySource thread. Restart or reconfigure Flume to continue processing. java.nio.charset.MalformedInputException

This exception can be resolved by setting the parameter above: a1.sources.r1.ignorePattern = ^(.)*\\.tmp$
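To exercise this case end to end, start the agent with hdfs.conf, drop a file into the spooled directory, and check the result on HDFS. A minimal sketch, assuming hdfs.conf was saved under /usr/local/flume-1.6.0/conf like avro.conf, /home/hadoop/logs exists, and the cluster's default FS is hdfs://nn1.hadoop:9000:

./flume-ng agent -c /usr/local/flume-1.6.0/conf -f /usr/local/flume-1.6.0/conf/hdfs.conf -n a1 -Dflume.root.logger=INFO,console

# in another terminal: drop a file into the spooled directory
cp /usr/local/flume-1.6.0/test.log /home/hadoop/logs/

# list the rolled files (bucketed by %Y%m%d%H) and read one back;
# 'hdfs dfs -text' decompresses known codecs such as bzip2
hdfs dfs -ls -R /user/flume/syslogtcp
hdfs dfs -text /user/flume/syslogtcp/*/events-*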




