Data flow
Flume collects the Nginx logs and writes them into a Kafka queue; Storm reads the log messages from Kafka, processes them, and stores the results in HBase and MySQL.
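For context, here is a minimal sketch of the Storm side of this pipeline, assuming the storm-kafka KafkaSpout (Storm 0.9.x API) and the ZooKeeper address and topic name used in this walkthrough. The class name and the stand-in bolt are hypothetical; a real bolt would write to HBase and MySQL instead of printing, and a production topology would be submitted with StormSubmitter rather than LocalCluster.

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Tuple;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.StringScheme;
import storm.kafka.ZkHosts;

public class NginxLogTopology {

    // Stand-in bolt: a real implementation would parse each Nginx log line
    // and write the results to HBase and MySQL.
    public static class LogHandlerBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            String line = input.getString(0);   // one Nginx log line from Kafka
            System.out.println(line);           // replace with HBase/MySQL writes
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // terminal bolt, no downstream fields
        }
    }

    public static void main(String[] args) {
        // KafkaSpout reads the "nginxlog" topic via the same ZooKeeper the broker uses
        SpoutConfig spoutConfig = new SpoutConfig(
                new ZkHosts("bigdata01.com:2181"), "nginxlog", "/kafka-spout", "nginxlog-reader");
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 1);
        builder.setBolt("log-handler", new LogHandlerBolt(), 2).shuffleGrouping("kafka-spout");

        // local test run; use StormSubmitter.submitTopology(...) on a real cluster
        new LocalCluster().submitTopology("nginxlog-topology", new Config(), builder.createTopology());
    }
}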
Install Kafka
Download the release tarball from the official site and extract it to the installation directory.
Download it from the Kafka downloads page at http://kafka.apache.org/downloads, version kafka_2.10-0.8.2.1.tgz:
$ tar -zxvf kafka_2.10-0.8.2.1.tgz -C /work/opt/modules/
Edit the configuration file
/work/opt/modules/kafka_2.10-0.8.2.1/config/server.properties:

broker.id=0
port=9092
host.name=bigdata01.com
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/work/opt/modules/kafka_2.10-0.8.2.1/log-data
num.partitions=1
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=false
zookeeper.connect=bigdata01.com:2181
zookeeper.connection.timeout.ms=6000
Start the broker
Before starting, make sure ZooKeeper is up and running. Then start the broker:
$ nohup bin/kafka-server-start.sh config/server.properties > logs/server-start.log 2>&1 &
Check that the process is running:
$ ps -ef | grep kafka
Check that port 9092 is listening:
$ netstat -tlnup | grep 9092
Create a topic
Once Kafka is up and running, execute the following command from the Kafka installation directory:
$ bin/kafka-topics.sh --create --topic nginxlog --partitions 1 --replication-factor 1 --zookeeper bigdata01.com:2181
View the topic details:
$ bin/kafka-topics.sh --describe --topic nginxlog --zookeeper bigdata01.com:2181
Start a console producer and send messages to the Kafka topic:
$ bin/kafka-console-producer.sh --broker-list bigdata01.com:9092 --topic nginxlog
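The console producer is enough for a quick test; to send messages from code instead, here is a minimal sketch using the Kafka 0.8 Java producer API. The class name is hypothetical; the broker address and topic match the configuration above.

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class NginxLogProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "bigdata01.com:9092");    // broker from server.properties
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1");                    // wait for the leader's ack

        Producer<String, String> producer = new Producer<>(new ProducerConfig(props));
        producer.send(new KeyedMessage<>("nginxlog", "test message from the java producer"));
        producer.close();
    }
}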
Start a console consumer and read the messages on the topic:
$ bin/kafka-console-consumer.sh --zookeeper bigdata01.com:2181 --topic nginxlog --from-beginning
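Likewise, a minimal sketch of a programmatic consumer using the 0.8 high-level consumer API. The class name and group id are arbitrary; auto.offset.reset=smallest plays the role of --from-beginning.

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class NginxLogConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "bigdata01.com:2181");
        props.put("group.id", "nginxlog-test");     // arbitrary consumer group
        props.put("auto.offset.reset", "smallest"); // read from the earliest offset

        ConsumerConnector consumer = Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                consumer.createMessageStreams(Collections.singletonMap("nginxlog", 1));

        // block and print each message as it arrives
        for (MessageAndMetadata<byte[], byte[]> mm : streams.get("nginxlog").get(0)) {
            System.out.println(new String(mm.message()));
        }
    }
}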
Generate sample Nginx logs
Create a working directory on the server:
$ mkdir -p /home/beifeng/project_workspace
Upload data-generate-1.0-SNAPSHOT-jar-with-dependencies.jar (download address) to the working directory you just created, then run:
$ java -jar data-generate-1.0-SNAPSHOT-jar-with-dependencies.jar 100 >> nginx.log
Watch the log being generated with:
$ tail -f nginx.log
To stop generating logs, find the process PID with jps, then kill it.
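The jar's source is not shown here; purely as an illustration, a generator with the same command-line behavior (first argument = lines per second, log lines written to stdout) might look like this hypothetical sketch:

import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.Random;

// Hypothetical stand-in for data-generate-1.0-SNAPSHOT: emits fake Nginx
// access-log lines to stdout at the rate given by the first argument.
public class FakeNginxLogGenerator {
    private static final String[] IPS  = {"10.0.0.1", "10.0.0.2", "192.168.1.10"};
    private static final String[] URLS = {"/index.html", "/cart", "/item/42"};

    public static void main(String[] args) throws InterruptedException {
        int perSecond = args.length > 0 ? Integer.parseInt(args[0]) : 100;
        SimpleDateFormat fmt = new SimpleDateFormat("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);
        Random rnd = new Random();
        while (true) {   // runs until killed, matching the step above
            for (int i = 0; i < perSecond; i++) {
                System.out.printf("%s - - [%s] \"GET %s HTTP/1.1\" 200 %d%n",
                        IPS[rnd.nextInt(IPS.length)], fmt.format(new Date()),
                        URLS[rnd.nextInt(URLS.length)], 200 + rnd.nextInt(4000));
            }
            Thread.sleep(1000);
        }
    }
}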
Configure Flume
Write the Flume agent configuration file flume-kafka-storm.properties.
The contents are as follows:

# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'a1'
a1.sources = s1
a1.channels = c1
a1.sinks = kafka_sink

# define sources
a1.sources.s1.type = exec
a1.sources.s1.command = tail -F /home/beifeng/project_workspace/nginx.log

# define channels
a1.channels.c1.type = memory
a1.channels.c1.capacity = 100
a1.channels.c1.transactionCapacity = 100

# define kafka sink
a1.sinks.kafka_sink.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.kafka_sink.topic = nginxlog
a1.sinks.kafka_sink.brokerList = bigdata01.com:9092
a1.sinks.kafka_sink.requiredAcks = 1
a1.sinks.kafka_sink.batchSize = 20

# Bind the source and sink to the channel
a1.sources.s1.channels = c1
a1.sinks.kafka_sink.channel = c1
Start the Flume agent
$ bin/flume-ng agent -n a1 -c conf/ --conf-file conf/flume-kafka-storm.properties -Dflume.root.logger=INFO,console
Start the Kafka console consumer (as above) to verify that log messages are flowing through.