【flume】5. Collecting logs into HBase

Set up our Flume configuration:

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'agent'
agent1.sources = r1
agent1.channels = c1
agent1.sinks = s1

# For each one of the sources, the type is defined
agent1.sources.r1.type = exec
# tail -F /home/oss/cloud_iom/ktpt/iom-cloud-service/logs/iom-app-debug.log
agent1.sources.r1.command = tail -F /home/oss/cloud_iom/ktpt/iom-cloud-service/logs/iom-app-debug.log

# The channel can be defined as follows.
#agent.sources.seqGenSrc.channels = memoryChannel
agent1.sources.r1.channels = c1

# Each sink's type must be defined
agent1.sinks.s1.type = hbase2
agent1.sinks.s1.table = iom_app_debug
agent1.sinks.s1.columnFamily = log
agent1.sinks.s1.serializer = org.apache.flume.sink.hbase2.RegexHBase2EventSerializer
#agent1.sinks.s1.serializer.regex = \\[(.*?)\\]\\ \\[(.*?)\\]\\ \\[(.*?)\\]\\ \\[(.*?)\\]

# Specify the channel the sink should use
agent1.sinks.s1.channel = c1

# Each channel's type is defined.
agent1.channels.c1.type = memory

# Other config values specific to each type of channel (sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent1.channels.c1.capacity = 100
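For reference, if the commented-out regex above were enabled, each capture group would become one HBase column under the 'log' column family. A minimal sketch, assuming log lines of the form [time] [level] [thread] [message]; the column names are my own illustration, and the property names are those documented for the RegexHBase2EventSerializer that ships with Flume's hbase2 sink:

agent1.sinks.s1.serializer.regex = \\[(.*?)\\] \\[(.*?)\\] \\[(.*?)\\] \\[(.*?)\\]
# one column qualifier per capture group
agent1.sinks.s1.serializer.colNameStr = time,level,thread,msg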

With this configuration in place, set up the startup command. nohup is used so that the collector keeps running on its own in the background afterwards:

nohup flume-ng agent --conf hadoop/flume/conf -f hadoop/flume/conf/flume-conf.properties -n agent1 -Dflume.root.logger=DEBUG,console &
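Note that the hbase2 sink writes to an existing table; it does not create one. Assuming the table and column family from the config above, they can be created in the HBase shell before starting the agent:

# run once in the HBase shell
create 'iom_app_debug', 'log'
# once the agent is running, spot-check that rows are arriving
scan 'iom_app_debug', {LIMIT => 5}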


My Flume directory: (screenshot)

Collection in progress: (screenshot)

The log file: (screenshot)

Of course, collection here happens line by line (using tail -F), but the shell command can be anything you define, as long as the source type is exec; the sink can likewise be configured to suit your needs.
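For instance, the exec command could point at a wrapper script of your own instead of a bare tail. A hypothetical collect.sh (the script path and the ERROR filter are illustrations, not part of the original setup) that pre-filters lines before Flume sees them:

#!/bin/bash
# collect.sh: follow the log but only forward ERROR lines to Flume
tail -F /home/oss/cloud_iom/ktpt/iom-cloud-service/logs/iom-app-debug.log \
  | grep --line-buffered "ERROR"

# and in flume-conf.properties:
# agent1.sources.r1.command = /bin/bash /path/to/collect.sh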

Step one is data collection; step two is to think about how to analyze it. Data collected this way can rarely be analyzed directly, though: it is messy and unordered, so we first need to define the appropriate logic to clean it, and only then load it.
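Some of that cleanup can even happen at collection time. A minimal sketch using Flume's built-in regex_filter interceptor to keep only lines that start with a bracketed timestamp (the regex itself is just an illustration for the log format above):

agent1.sources.r1.interceptors = i1
agent1.sources.r1.interceptors.i1.type = regex_filter
# keep only events whose body starts with '['; with excludeEvents = false,
# non-matching events are dropped
agent1.sources.r1.interceptors.i1.regex = ^\\[.*
agent1.sources.r1.interceptors.i1.excludeEvents = false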

Collecting directly like this is problematic. The correct approach would be (sketches for steps 1 and 4 follow the list):

1. Flume collects the data into HDFS.

2. MapReduce cleans and organizes the collected data.

3. MapReduce analyzes the data and loads the results into HBase, or saves them directly to HDFS.

4. Sqoop migrates the data to the target database (MySQL, Oracle).

5. From the parsed data, query Oracle to build report charts, analyze trends, make predictions, or trace the relationships behind problems.
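A hedged sketch of steps 1 and 4. For step 1, the hbase2 sink above would be swapped for an HDFS sink (the namenode address and path here are hypothetical):

agent1.sinks.s1.type = hdfs
agent1.sinks.s1.hdfs.path = hdfs://namenode:8020/flume/iom/%Y%m%d
agent1.sinks.s1.hdfs.fileType = DataStream
# roll a new file every 5 minutes
agent1.sinks.s1.hdfs.rollInterval = 300
# needed so the %Y%m%d escapes work without a timestamp interceptor
agent1.sinks.s1.hdfs.useLocalTimeStamp = true

For step 4, a Sqoop export of the cleaned results into MySQL (connection string, table, and directory are likewise hypothetical):

sqoop export \
  --connect jdbc:mysql://dbhost:3306/iom \
  --username iom --password '******' \
  --table iom_app_report \
  --export-dir /flume/iom/cleaned \
  --input-fields-terminated-by '\t'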
