Creating a Big Data Vocational Skills Competition Lab Environment on a Windows PC, Part 6: Flume, Kafka, and Flink Programming

This article describes in detail how to configure big data tools in a Windows environment, covering log collection with Flume, the Kafka message broker, stream processing with Flink, and data storage with Redis. Hands-on examples, including Flume's Avro and netcat tests, Kafka producers and consumers, Flink's WordCount, and starting and connecting to Redis, walk through the full usage workflow of each tool.

1 Flume

Reference: 日志采集工具Flume的安装与使用方法 (Installation and Usage of the Flume Log Collection Tool), Xiamen University Database Lab blog (xmu.edu.cn).

Check the Flume installation:

root@client1:~# flume-ng version
Flume 1.7.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 511d868555dd4d16e6ce4fedc72c2d1454546707
Compiled by bessbd on Wed Oct 12 20:51:10 CEST 2016
From source with checksum 0d21b3ffdc55a07e1d08875872c00523

Avro test

Create the agent configuration file.

root@client1:~# cd hadoop/apache-flume*/conf
root@client1:~/hadoop/apache-flume-1.7.0-bin/conf# vi avro.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
# Note this port number; it will be used again later in the tutorial

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Start Flume agent a1.

root@client1:~/hadoop/apache-flume-1.7.0-bin/conf# cd
root@client1:~# flume-ng agent -c . -f hadoop/apache-flume-1.7.0-bin/conf/avro.conf -n a1 -Dflume.root.logger=INFO,console
Info: Including Hadoop libraries found via (/root/hadoop/hadoop-2.7.7/bin/hadoop) for HDFS access
Info: Including Hive libraries found via (/root/hadoop/apache-hive-2.3.4-bin) for Hive access
+ exec /usr/lib/jvm/java-8-openjdk-amd64/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp '/root:/root/hadoop/apache-flume-1.7.0-bin/lib/*:/root/hadoop/hadoop-2.7.7/etc/hadoop:/root/hadoop/hadoop-2.7.7/share/hadoop/common/lib/*:/root/hadoop/hadoop-2.7.7/share/hadoop/common/*:/root/hadoop/hadoop-2.7.7/share/hadoop/hdfs:/root/hadoop/hadoop-2.7.7/share/hadoop/hdfs/lib/*:/root/hadoop/hadoop-2.7.7/share/hadoop/hdfs/*:/root/hadoop/hadoop-2.7.7/share/hadoop/yarn/lib/*:/root/hadoop/hadoop-2.7.7/share/hadoop/yarn/*:/root/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/lib/*:/root/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/*:/root/hadoop/hadoop-2.7.7/contrib/capacity-scheduler/*.jar:/root/hadoop/apache-hive-2.3.4-bin/lib/*' -Djava.library.path=:/root/hadoop/hadoop-2.7.7/lib/native org.apache.flume.node.Application -f hadoop/apache-flume-1.7.0-bin/conf/avro.conf -n a1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/hadoop/apache-flume-1.7.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/hadoop/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/hadoop/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
22/04/15 09:07:50 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
22/04/15 09:07:50 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:hadoop/apache-flume-1.7.0-bin/conf/avro.conf
22/04/15 09:07:50 INFO conf.FlumeConfiguration: Added sinks: k1 Agent: a1
22/04/15 09:07:50 INFO conf.FlumeConfiguration: Processing:k1
22/04/15 09:07:50 INFO conf.FlumeConfiguration: Processing:k1
22/04/15 09:07:50 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [a1]
22/04/15 09:07:50 INFO node.AbstractConfigurationProvider: Creating channels
22/04/15 09:07:50 INFO channel.DefaultChannelFactory: Creating instance of channel c1 type memory
22/04/15 09:07:50 INFO node.AbstractConfigurationProvider: Created channel c1
22/04/15 09:07:50 INFO source.DefaultSourceFactory: Creating instance of source r1, type avro
22/04/15 09:07:50 INFO sink.DefaultSinkFactory: Creating instance of sink: k1, type: logger
22/04/15 09:07:50 INFO node.AbstractConfigurationProvider: Channel c1 connected to [r1, k1]
22/04/15 09:07:50 INFO node.Application: Starting new configuration:{ sourceRunners:{r1=EventDrivenSourceRunner: { source:Avro source r1: { bindAddress: 0.0.0.0, port: 4141 } }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@441ddd7c counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }
22/04/15 09:07:50 INFO node.Application: Starting Channel c1
22/04/15 09:07:50 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
22/04/15 09:07:50 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: c1 started
22/04/15 09:07:50 INFO node.Application: Starting Sink k1
22/04/15 09:07:50 INFO node.Application: Starting Source r1
22/04/15 09:07:50 INFO source.AvroSource: Starting Avro source r1: { bindAddress: 0.0.0.0, port: 4141 }...
22/04/15 09:07:50 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: r1: Successfully registered new MBean.
22/04/15 09:07:50 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: r1 started
22/04/15 09:07:50 INFO source.AvroSource: Avro source r1 started.

Open another terminal, write a file log.00 in the home directory, and send it to the agent with the avro-client:

root@client1:~# echo "hello world" > log.00
root@client1:~# flume-ng avro-client --conf conf -H localhost -p 4141 -F ./log.00
Info: Including Hadoop libraries found via (/root/hadoop/hadoop-2.7.7/bin/hadoop) for HDFS access
Info: Including Hive libraries found via (/root/hadoop/apache-hive-2.3.4-bin) for Hive access
+ exec /usr/lib/jvm/java-8-openjdk-amd64/bin/java -Xmx20m -cp 'conf:/root/hadoop/apache-flume-1.7.0-bin/lib/*:/root/hadoop/hadoop-2.7.7/etc/hadoop:/root/hadoop/hadoop-2.7.7/share/hadoop/common/lib/*:/root/hadoop/hadoop-2.7.7/share/hadoop/common/*:/root/hadoop/hadoop-2.7.7/share/hadoop/hdfs:/root/hadoop/hadoop-2.7.7/share/hadoop/hdfs/lib/*:/root/hadoop/hadoop-2.7.7/share/hadoop/hdfs/*:/root/hadoop/hadoop-2.7.7/share/hadoop/yarn/lib/*:/root/hadoop/hadoop-2.7.7/share/hadoop/yarn/*:/root/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/lib/*:/root/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/*:/root/hadoop/hadoop-2.7.7/contrib/capacity-scheduler/*.jar:/root/hadoop/apache-hive-2.3.4-bin/lib/*' -Djava.library.path=:/root/hadoop/hadoop-2.7.7/lib/native org.apache.flume.client.avro.AvroCLIClient -H localhost -p 4141 -F ./log.00
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/hadoop/apache-flume-1.7.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/hadoop/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/hadoop/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
22/04/15 09:14:25 WARN api.NettyAvroRpcClient: Using default maxIOWorkers
root@client1:~#

In the first terminal (the agent window), that is, on the log console, the contents of log.00 are printed:

22/04/15 09:14:25 INFO ipc.NettyServer: [id: 0x26eaff20, /127.0.0.1:48154 => /127.0.0.1:4141] OPEN
22/04/15 09:14:25 INFO ipc.NettyServer: [id: 0x26eaff20, /127.0.0.1:48154 => /127.0.0.1:4141] BOUND: /127.0.0.1:4141
22/04/15 09:14:25 INFO ipc.NettyServer: [id: 0x26eaff20, /127.0.0.1:48154 => /127.0.0.1:4141] CONNECTED: /127.0.0.1:48154
22/04/15 09:14:25 INFO ipc.NettyServer: [id: 0x26eaff20, /127.0.0.1:48154 :> /127.0.0.1:4141] DISCONNECTED
22/04/15 09:14:25 INFO ipc.NettyServer: [id: 0x26eaff20, /127.0.0.1:48154 :> /127.0.0.1:4141] UNBOUND
22/04/15 09:14:25 INFO ipc.NettyServer: [id: 0x26eaff20, /127.0.0.1:48154 :> /127.0.0.1:4141] CLOSED
22/04/15 09:14:25 INFO ipc.NettyServer: Connection to /127.0.0.1:48154 disconnected.
22/04/15 09:14:28 INFO sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64                hello world }
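
Besides the flume-ng avro-client command line, an Avro source like this one can also be fed programmatically through Flume's RPC client SDK. Below is a minimal sketch, not part of the original tutorial: it assumes the flume-ng-sdk 1.7.0 jar is on the classpath, the host and port match avro.conf above, and the class name AvroSendExample is hypothetical.

import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;
import java.nio.charset.StandardCharsets;

public class AvroSendExample {
    public static void main(String[] args) throws EventDeliveryException {
        // Connect to the Avro source configured in avro.conf (port 4141)
        RpcClient client = RpcClientFactory.getDefaultInstance("localhost", 4141);
        try {
            // Same payload as log.00; the logger sink prints it as hex plus text
            Event event = EventBuilder.withBody("hello world", StandardCharsets.UTF_8);
            client.append(event);
        } finally {
            client.close();
        }
    }
}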

Netcat source test

Create the agent configuration file.

root@client1:~# cd hadoop/apache-flume*/conf
root@client1:~/hadoop/apache-flume-1.7.0-bin/conf# vi netsource.conf
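
The agent log below shows a netcat source bound to 127.0.0.1:44444, a memory channel, and a logger sink; a minimal netsource.conf consistent with that log would be:

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Netcat source: listens on localhost:44444 and turns each text line into an event
a1.sources.r1.type = netcat
a1.sources.r1.bind = 127.0.0.1
a1.sources.r1.port = 44444

# Logger sink: prints each event to the agent's log console
a1.sinks.k1.type = logger

# Memory channel buffering events between source and sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1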

Start the Flume agent (i.e., open the log console).

root@client1:~/hadoop/apache-flume-1.7.0-bin/conf# cd
root@client1:~# flume-ng agent -c . -f hadoop/apache-flume-1.7.0-bin/conf/netsource.conf -n a1 -Dflume.root.logger=INFO,console
Info: Including Hadoop libraries found via (/root/hadoop/hadoop-2.7.7/bin/hadoop) for HDFS access
Info: Including Hive libraries found via (/root/hadoop/apache-hive-2.3.4-bin) for Hive access
+ exec /usr/lib/jvm/java-8-openjdk-amd64/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp '/root:/root/hadoop/apache-flume-1.7.0-bin/lib/*:/root/hadoop/hadoop-2.7.7/etc/hadoop:/root/hadoop/hadoop-2.7.7/share/hadoop/common/lib/*:/root/hadoop/hadoop-2.7.7/share/hadoop/common/*:/root/hadoop/hadoop-2.7.7/share/hadoop/hdfs:/root/hadoop/hadoop-2.7.7/share/hadoop/hdfs/lib/*:/root/hadoop/hadoop-2.7.7/share/hadoop/hdfs/*:/root/hadoop/hadoop-2.7.7/share/hadoop/yarn/lib/*:/root/hadoop/hadoop-2.7.7/share/hadoop/yarn/*:/root/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/lib/*:/root/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/*:/root/hadoop/hadoop-2.7.7/contrib/capacity-scheduler/*.jar:/root/hadoop/apache-hive-2.3.4-bin/lib/*' -Djava.library.path=:/root/hadoop/hadoop-2.7.7/lib/native org.apache.flume.node.Application -f hadoop/apache-flume-1.7.0-bin/conf/netsource.conf -n a1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/hadoop/apache-flume-1.7.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/hadoop/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/hadoop/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
22/04/15 09:26:23 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
22/04/15 09:26:23 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:hadoop/apache-flume-1.7.0-bin/conf/netsource.conf
22/04/15 09:26:23 INFO conf.FlumeConfiguration: Added sinks: k1 Agent: a1
22/04/15 09:26:23 INFO conf.FlumeConfiguration: Processing:k1
22/04/15 09:26:23 INFO conf.FlumeConfiguration: Processing:k1
22/04/15 09:26:23 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [a1]
22/04/15 09:26:23 INFO node.AbstractConfigurationProvider: Creating channels
22/04/15 09:26:23 INFO channel.DefaultChannelFactory: Creating instance of channel c1 type memory
22/04/15 09:26:23 INFO node.AbstractConfigurationProvider: Created channel c1
22/04/15 09:26:23 INFO source.DefaultSourceFactory: Creating instance of source r1, type netcat
22/04/15 09:26:23 INFO sink.DefaultSinkFactory: Creating instance of sink: k1, type: logger
22/04/15 09:26:23 INFO node.AbstractConfigurationProvider: Channel c1 connected to [r1, k1]
22/04/15 09:26:23 INFO node.Application: Starting new configuration:{ sourceRunners:{r1=EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:r1,state:IDLE} }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@2631fbbf counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }
22/04/15 09:26:23 INFO node.Application: Starting Channel c1
22/04/15 09:26:23 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
22/04/15 09:26:23 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: c1 started
22/04/15 09:26:23 INFO node.Application: Starting Sink k1
22/04/15 09:26:23 INFO node.Application: Starting Source r1
22/04/15 09:26:23 INFO source.NetcatSource: Source starting
22/04/15 09:26:23 INFO source.NetcatSource: Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:44444]

Open another terminal and run: telnet localhost 44444.

root@client1:~# telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

If the terminal reports that telnet is not installed, install it with apt install telnet.

We can now type any text in this terminal and it will show up on the first terminal's log console; for example, type "Hello world":

root@client1:~# telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Hello world
OK

The first terminal's log console then displays the received event, in the same Event: format as in the Avro test above, with the body shown as hex bytes plus the text "Hello world".
