Integration
1. Flume + Kafka Integration
1.1. Write the log
1.1.1. Create a Java project
Create a Java project.
1.1.2. Add a log4j configuration file to the project
Configure log4j to write the log output to a file.
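A minimal log4j.properties sketch; the appender name, rolling policy, and pattern are assumptions, but the output path matches the file the Flume agent tails in section 1.2.2:

log4j.rootLogger=INFO,file
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=/home/hadoop/flume/logs/test.log
log4j.appender.file.MaxFileSize=10MB
log4j.appender.file.MaxBackupIndex=5
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %p %c - %m%n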
1.1.3. Write log messages in the project
In the code, record log entries via log.info("some message"); a small driver class like the sketch below gives Flume a steady stream of lines to tail.
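A minimal sketch of such a driver; the class name LogProducer is hypothetical:

import org.apache.log4j.Logger;

public class LogProducer {

    private static final Logger log = Logger.getLogger(LogProducer.class);

    public static void main(String[] args) throws InterruptedException {
        // emit one log line per second so there is always fresh data to tail
        for (int i = 0; ; i++) {
            log.info("log message " + i);
            Thread.sleep(1000);
        }
    }
}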
1.1.4. Package the project as a jar
Package the project as a jar and run it on Linux.
1.2. Flume reads the log and writes it to Kafka
1.2.1. Upload the dependency jars to Flume's lib directory
flumeng-kafka-0.1_1.5.0_0.8.1.jar: the Flume-to-Kafka plugin
kafka_2.10-0.8.1.1.jar: the Kafka jar
scala-compiler-2.10.3.jar, scala-library-2.10.3.jar: the Scala dependencies
1.2.2. Create the agent configuration file
log4j_agent.conf:
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/hadoop/flume/logs/test.log

# Describe the sinks
a1.sinks.k1.type = logger
# sink type: the KafkaSink from the flumeng-kafka plugin
a1.sinks.k2.type = org.apache.flume.plugins.KafkaSink
# Kafka broker list
a1.sinks.k2.metadata.broker.list = master:9092,slave1:9092,slave2:9092,slave3:9092
# log directory
a1.sinks.k2.sink.directory = /home/hadoop/flume/logs
# Kafka partitioner
a1.sinks.k2.partitioner.class = org.apache.flume.plugins.SinglePartition
# serializer
a1.sinks.k2.serializer.class = kafka.serializer.StringEncoder
# acks setting
a1.sinks.k2.request.required.acks = 0
# maximum message size accepted in one request
a1.sinks.k2.max.message.size = 1000000
# synchronous producer
a1.sinks.k2.producer.type = sync
# encoding
a1.sinks.k2.encoding = UTF-8
# topic to send to
a1.sinks.k2.topic.name = testTopic

# Use channels which buffer events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000
a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100

# Bind the source and sinks to the channels
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = replicating
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
1.2.3. Upload the agent configuration file to Linux
cd /home/hadoop/flume/conf
rz -y
Upload log4j_agent.conf.
1.3. Run
1.3.1. Start all Kafka nodes in the cluster
kafka-server-start.sh /home/hadoop/kafka/config/server.properties &
1.3.2. Consume messages through the consumer API
Run the consumer example from the Kafka course, or a minimal high-level consumer like the sketch below.
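A minimal sketch against the Kafka 0.8 high-level consumer API; the class name, the group id testGroup, and the ZooKeeper quorum are assumptions, while the topic matches testTopic from the Flume sink config:

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class TestConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "master:2181,slave1:2181,slave2:2181"); // assumed ZooKeeper quorum
        props.put("group.id", "testGroup");                                    // hypothetical consumer group
        props.put("auto.offset.reset", "smallest");                            // start from the beginning

        ConsumerConnector consumer =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // one consumer thread for the topic the Flume sink writes to
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put("testTopic", 1);

        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                consumer.createMessageStreams(topicCountMap);
        ConsumerIterator<byte[], byte[]> it = streams.get("testTopic").get(0).iterator();
        while (it.hasNext()) {
            System.out.println(new String(it.next().message())); // print every message
        }
    }
}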
1.3.3. Start Flume
flume-ng agent -c /home/hadoop/flume/conf/ -f /home/hadoop/flume/conf/log4j_agent.conf -n a1 -Dflume.root.logger=INFO,console
1.3.4. Run the jar
java -jar xxx.jar
1.3.5. Monitor
Watch the messages arrive in the Kafka consumer console.
2. Kafka + Storm Integration
2.1. Write the pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>storm</groupId>
  <artifactId>Storm-Test</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.3.2</version>
        <configuration>
          <source>1.6</source>
          <target>1.6</target>
          <compilerVersion>1.6</compilerVersion>
        </configuration>
      </plugin>
    </plugins>
  </build>

  <repositories>
    <!-- Repositories where the storm dependencies can be found -->
    <repository>
      <id>clojars.org</id>
      <url>http://clojars.org/repo</url>
    </repository>
    <repository>
      <id>central</id>
      <name>Maven Repository Switchboard</name>
      <url>http://repo1.maven.org/maven2</url>
    </repository>
    <repository>
      <id>maven.oschina.net</id>
      <url>http://maven.oschina.net/content/groups/public/</url>
    </repository>
  </repositories>

  <dependencies>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-lang3</artifactId>
      <version>3.2.1</version>
    </dependency>
    <dependency>
      <groupId>redis.clients</groupId>
      <artifactId>jedis</artifactId>
      <version>2.6.0</version>
    </dependency>
    <!-- Storm dependency; provided because the cluster supplies it at runtime -->
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-core</artifactId>
      <version>0.9.2-incubating</version>
      <scope>provided</scope>
      <exclusions>
        <exclusion>
          <artifactId>log4j-over-slf4j</artifactId>
          <groupId>org.slf4j</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-kafka</artifactId>
      <version>0.9.2-incubating</version>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.9.2</artifactId>
      <version>0.8.1.1</version>
      <exclusions>
        <exclusion>
          <groupId>org.apache.zookeeper</groupId>
          <artifactId>zookeeper</artifactId>
        </exclusion>
        <exclusion>
          <groupId>log4j</groupId>
          <artifactId>log4j</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.zookeeper</groupId>
      <artifactId>zookeeper</artifactId>
      <version>3.4.5</version>
    </dependency>
  </dependencies>
</project>
2.2. Write a Storm program
Write a Storm program whose main method uses the storm-kafka plugin to read data from Kafka and feed it into a spout.
package com.itcast.monitor.main;

import java.util.ArrayList;

import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.StringScheme;
import storm.kafka.ZkHosts;
import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.generated.AlreadyAliveException;
import backtype.storm.generated.InvalidTopologyException;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.topology.TopologyBuilder;

import com.itcast.monitor.bolts.HandlerBolt;
import com.itcast.monitor.utils.PropertiesUtil;

public class WordMain {

    private static final String ZOOKEEPER_ADDRESS = "zookeeper.address";
    private static final String TOPIC_NAME = "topic.name";

    public static void main(String[] args) throws InterruptedException,
            AlreadyAliveException, InvalidTopologyException {
        // read the ZooKeeper address and the topic name from the properties file
        String zookeeperAddr = PropertiesUtil.getProperty(ZOOKEEPER_ADDRESS).toString();
        String topicName = PropertiesUtil.getProperty(TOPIC_NAME).toString();

        // point the spout at Kafka's ZooKeeper ensemble
        BrokerHosts brokerHosts = new ZkHosts(zookeeperAddr);
        SpoutConfig kafkaConfig = new SpoutConfig(brokerHosts, topicName, "", "monitor_local");
        kafkaConfig.scheme = new SchemeAsMultiScheme(new StringScheme()); // decode messages as strings
        kafkaConfig.zkServers = new ArrayList<String>() { // ZooKeeper servers, needed in local mode
            {
                add("192.168.56.201");
                add("192.168.56.202");
                add("192.168.56.203");
            }
        };
        kafkaConfig.zkPort = 2181; // ZooKeeper port

        // build the topology: KafkaSpout -> HandlerBolt
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("readlog", new KafkaSpout(kafkaConfig));
        builder.setBolt("handlerbolt", new HandlerBolt()).shuffleGrouping("readlog");

        Config config = new Config();
        config.setDebug(false);

        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("DubboService-Test", config, builder.createTopology());
        Thread.sleep(2000); // give the local cluster a moment to run
    }
}
2.3. Write the bolt
package com.itcast.monitor.bolts;

import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Tuple;

public class HandlerBolt extends BaseBasicBolt {

    public void execute(Tuple tuple, BasicOutputCollector collector) {
        // print each message read from Kafka
        System.out.println(tuple.getString(0) + "============================================");
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // terminal bolt: emits no output stream
    }
}
2.4. Run the Storm program
Run it from the IDE (Run As).
2.5. Run a Kafka producer
Run the Kafka producer example from day11, or a minimal producer like the sketch below.
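A minimal sketch against the Kafka 0.8 producer API; the class name is hypothetical, the broker list matches the Flume sink config, and the topic is testTopic:

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class TestProducer {

    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("metadata.broker.list", "master:9092,slave1:9092,slave2:9092,slave3:9092"); // as in the Flume sink config
        props.put("serializer.class", "kafka.serializer.StringEncoder"); // send plain strings
        props.put("request.required.acks", "1");                         // wait for the leader's ack

        Producer<String, String> producer = new Producer<String, String>(new ProducerConfig(props));
        for (int i = 0; i < 100; i++) {
            producer.send(new KeyedMessage<String, String>("testTopic", "message " + i));
            Thread.sleep(100); // slow down so the messages are easy to watch in the Storm console
        }
        producer.close();
    }
}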
2.6. View the messages in the Storm console
3. Flume + Kafka + Storm
1. Develop a web project, configure log4j, and write logs through log4j, e.g. log.info(...);
2. Develop a Flume agent that reads the log4j log file and feeds it into Kafka.
3. Develop a Storm topology that reads the data from Kafka, analyzes it, and writes the results to Redis (a bolt sketch follows this list).
4. Read the data from Redis and display it on a page in real time.
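A minimal sketch of the Redis-writing bolt from step 3, using the jedis dependency already declared in the pom; the class name, the Redis host/port, and the list key monitor:log are assumptions:

package com.itcast.monitor.bolts;

import java.util.Map;

import backtype.storm.task.TopologyContext;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Tuple;
import redis.clients.jedis.Jedis;

public class RedisBolt extends BaseBasicBolt {

    private transient Jedis jedis; // not serializable, so create it in prepare()

    @Override
    public void prepare(Map stormConf, TopologyContext context) {
        jedis = new Jedis("192.168.56.201", 6379); // assumed Redis host and port
    }

    public void execute(Tuple tuple, BasicOutputCollector collector) {
        // push each analyzed log line onto a Redis list the web page can poll
        jedis.lpush("monitor:log", tuple.getString(0));
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // terminal bolt: no output stream
    }

    @Override
    public void cleanup() {
        jedis.disconnect(); // release the connection when the topology shuts down
    }
}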