Run a shell script (nginx.sh) to simulate nginx generating a log:
for ((i = 0; i <= 500000; i++))
do
    # shell does not concatenate with "+"; interpolate $i inside the string,
    # and write to /root/1.log, the path the Flume exec source tails below
    echo "i am lilei $i" >> /root/1.log
done
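Because the Flume exec source below uses `tail -F`, a steady stream of appends demonstrates the pipeline better than one initial burst. A sketch of such a variant (the script name, delay, and defaults are illustrative, not part of the original setup):

```shell
#!/bin/bash
# nginx_sim.sh -- illustrative variant of nginx.sh that appends log lines
# one at a time with a short pause, so a `tail -F` consumer (the Flume
# exec source below) sees a live stream rather than a single burst.
# Usage in the tutorial setup would be: ./nginx_sim.sh /root/1.log 500000
LOG=${1:-./1.log}   # output file; the tutorial tails /root/1.log
COUNT=${2:-10}      # number of lines to append
for ((i = 0; i < COUNT; i++)); do
    echo "i am lilei $i" >> "$LOG"
    sleep 0.1       # throttle to mimic steady traffic
done
```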
Run Flume to collect the data. Flume job file: exec.conf
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /root/1.log
a1.sources.r1.channels = c1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 100
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = all1
a1.sinks.k1.brokerList = mini1:9092
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize = 20
a1.sinks.k1.channel = c1
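The exec source simply runs `tail -F /root/1.log` and turns each new line into an event. That behavior can be sanity-checked locally without Flume (the temporary file names here are placeholders, not part of the setup):

```shell
#!/bin/bash
# tail_check.sh -- verify the behavior the exec source relies on:
# `tail -F` keeps emitting lines as they are appended to the file.
LOG=$(mktemp)
OUT=$(mktemp)
tail -n 0 -F "$LOG" > "$OUT" &   # same idea as a1.sources.r1.command
TAIL_PID=$!
sleep 1                          # let tail attach before writing
echo "i am lilei 42" >> "$LOG"   # simulate the log generator
sleep 1                          # give tail time to forward the line
kill "$TAIL_PID" 2>/dev/null
grep -q "lilei 42" "$OUT" && echo "tail -F picked up the appended line"
```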
Start the Flume agent:
bin/flume-ng agent -c conf -f conf/exec.conf -n a1 -Dflume.root.logger=INFO,console
Collect the data with the Kafka cluster.
Start Kafka:
bin/kafka-server-start.sh config/server.properties
Create the topic (the replication factor and partition count below are illustrative; adjust them for your cluster):
bin/kafka-topics.sh --create --zookeeper mini1:2181 --replication-factor 1 --partitions 1 --topic all1
Consume the topic to verify that data is arriving:
sh bin/kafka-console-consumer.sh --zookeeper mini1:2181 --from-beginning --topic all1
Run the Storm topology to do the word count; source code:
https://github.com/JiyangM/stom/tree/master/src/main/java/cn/itcast/storm/kafkastormredis
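The topology linked above aggregates word counts from the Kafka stream. As a quick way to preview the expected result, the same aggregation can be prototyped in plain shell (this awk sketch is an illustration, not the Storm code):

```shell
# wordcount.sh -- shell prototype of the aggregation the Storm topology
# performs: count the occurrences of every whitespace-separated word.
# Usage: sh bin/kafka-console-consumer.sh ... | sh wordcount.sh
wordcount() {
    awk '{ for (i = 1; i <= NF; i++) n[$i]++ }   # tally each word
         END { for (w in n) print w, n[w] }'     # emit "word count" pairs
}
# example run with two generated lines instead of a live Kafka stream
# (output order is unspecified, since awk array iteration is unordered):
printf 'i am lilei 0\ni am lilei 1\n' | wordcount
```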