Spark Streaming 02: Flume, a Distributed Log Collection Framework

1 Introduction

1.1 Background

The problem Flume addresses: how to move data from other servers onto Hadoop.

1.2 Overview

Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.

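1) Design goals

  • Reliability
  • Scalability
  • Manageability

2) Core components

  1. source: collects the data
  2. channel: aggregates the data (buffers it temporarily)
  3. sink: outputs the data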
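
To make the division of labor concrete, here is a toy model of the source → channel → sink flow. It is only a sketch in plain Python and does not use Flume's API; the class and function names (MemoryChannel, run_source, run_sink) are hypothetical and exist only for illustration.

```python
from collections import deque


class MemoryChannel:
    """Toy bounded buffer standing in for a Flume channel (not Flume's API)."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self._events = deque()

    def put(self, event):
        # A full channel rejects new events instead of growing without bound.
        if len(self._events) >= self.capacity:
            raise OverflowError("channel is full")
        self._events.append(event)

    def take(self):
        # Return the oldest buffered event, or None when the channel is empty.
        return self._events.popleft() if self._events else None


def run_source(lines, channel):
    """Source: turn each input line into an event and put it on the channel."""
    for line in lines:
        channel.put(line.rstrip("\n"))


def run_sink(channel, out):
    """Sink: drain the channel and hand each event to the output."""
    event = channel.take()
    while event is not None:
        out.append(event)
        event = channel.take()


channel = MemoryChannel(capacity=10)
delivered = []
run_source(["first log line\n", "second log line\n"], channel)
run_sink(channel, delivered)
print(delivered)
```

The source and the sink never talk to each other directly; the channel in between is what lets them run at different speeds.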

2 Installation and Configuration

System Requirements:
Java Runtime Environment - Java 1.6 or later (Java 1.7 Recommended)
Memory - Sufficient memory for configurations used by sources, channels or sinks
Disk Space - Sufficient disk space for configurations used by channels or sinks
Directory Permissions - Read/Write permissions for directories used by agent

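1) Download Flume from the official site or from the CDH (1.6.0-cdh5.7.0) release downloads
2) Extract the archive and set the environment variables (the commands below rely on FLUME_HOME and on flume-ng being on the PATH)
3) In the configuration file conf/flume-env.sh, set JAVA_HOME
4) Verify that the installation succeeded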

flume-ng version


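3 Usage

3.1 Collect data from a network port and output it to the console

1) The key to using Flume is writing the configuration file

  • Configure the source
  • Configure the channel
  • Configure the sink
  • Wire the three components together

    a1: name of the agent
    r1: name of the source
    k1: name of the sink
    c1: name of the channel

2) Edit the configuration file conf/example.conf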

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
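
Most mistakes in a file like this are wiring mistakes: a source or sink that points at a channel that was never declared. The following is a minimal sketch of a checker for that, not a Flume tool; parse_properties and check_wiring are hypothetical helper names, and the parser handles only simple key = value lines. It relies on the convention used above, where a source lists its channels under "channels" and a sink names a single "channel".

```python
def parse_properties(text):
    """Parse 'key = value' lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of wiring problems found for the given agent name."""
    problems = []
    declared = set(props.get(f"{agent}.channels", "").split())
    for source in props.get(f"{agent}.sources", "").split():
        used = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not used:
            problems.append(f"source {source} is not bound to any channel")
        for name in used:
            if name not in declared:
                problems.append(f"source {source} uses undeclared channel {name}")
    for sink in props.get(f"{agent}.sinks", "").split():
        name = props.get(f"{agent}.sinks.{sink}.channel", "")
        if not name:
            problems.append(f"sink {sink} is not bound to a channel")
        elif name not in declared:
            problems.append(f"sink {sink} uses undeclared channel {name}")
    return problems


# Same settings as conf/example.conf above, with the comments removed.
EXAMPLE_CONF = """
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
"""

print(check_wiring(parse_properties(EXAMPLE_CONF), "a1"))  # prints []
```

An empty list means every source and sink of agent a1 refers to a declared channel.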

3) Start Flume

flume-ng agent \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/example.conf \
--name a1 \
-Dflume.root.logger=INFO,console

4) Test with telnet

telnet localhost 44444
Type some text and press Enter
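
If telnet is not installed, the same test can be scripted. This is a minimal sketch using Python's standard socket module; send_line is a hypothetical helper name, and the host and port are the ones from conf/example.conf.

```python
import socket


def send_line(host, port, text, timeout=5.0):
    """Send one newline-terminated line and return the peer's reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(text.encode("utf-8") + b"\n")
        return conn.recv(1024).decode("utf-8").strip()


if __name__ == "__main__":
    try:
        print(send_line("localhost", 44444, "hello flume"))
    except OSError as exc:
        # Raised when nothing is listening on the port or the reply times out.
        print(f"could not reach the agent: {exc}")
```

With the agent from step 3 running, the line should show up on the Flume console. The reply is expected to be OK, on the assumption that the netcat source keeps its default of acknowledging every event.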


5) Check that the text you just typed appears on the Flume console

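3.2 Monitor a file and output newly appended data to the console in real time

1) Agent selection: exec source + memory channel + logger sink

2) Write the configuration file conf/exec-memory-logger.conf (replace file-path with the path of the file to monitor)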

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command=tail -F file-path
a1.sources.r1.shell=/bin/sh -c

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
#a1.channels.c1.capacity = 1000
#a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

3) Start Flume

flume-ng agent \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/exec-memory-logger.conf \
--name a1 \
-Dflume.root.logger=INFO,console

4) Append content to the monitored file
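
Any tool that appends lines will do for this step. As a minimal sketch, the script below appends a few test lines; append_lines is a hypothetical helper, and log_path is a made-up location that must be changed to the same file named in the tail -F command.

```python
import os
import tempfile


def append_lines(path, lines):
    """Append each line to the file, flushing so tail -F sees it promptly."""
    with open(path, "a", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")
            f.flush()


# Hypothetical path for illustration; use the file that the exec source tails.
log_path = os.path.join(tempfile.gettempdir(), "flume-demo.log")
append_lines(log_path, [f"test event {i}" for i in range(3)])
```

Opening the file in append mode matters here: tail -F follows data added to the end of the file, which is what a growing log looks like.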

5) Check whether the content appears on the Flume console

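3.3 Collect logs from server A to server B in real time

1) Technology selection

  • Server A: exec source + memory channel + avro sink
  • Server B: avro source + memory channel + logger sink

The two configurations below both use localhost, so as written both agents run on one machine. For two separate servers, the avro sink's hostname must point at server B, and the avro source must bind to an address reachable from server A.

2) Write the configuration file conf/exec-memory-avro.conf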

# Name the components on this agent
exec-memory-avro.sources = exec-source
exec-memory-avro.sinks = avro-sink
exec-memory-avro.channels = memory-channel

# Describe/configure the source
exec-memory-avro.sources.exec-source.type = exec
exec-memory-avro.sources.exec-source.command = tail -F file-path
exec-memory-avro.sources.exec-source.shell = /bin/sh -c

# Describe the sink
exec-memory-avro.sinks.avro-sink.type = avro
exec-memory-avro.sinks.avro-sink.hostname=localhost
exec-memory-avro.sinks.avro-sink.port=44444

# Use a channel which buffers events in memory
exec-memory-avro.channels.memory-channel.type = memory

# Bind the source and sink to the channel
exec-memory-avro.sources.exec-source.channels = memory-channel
exec-memory-avro.sinks.avro-sink.channel = memory-channel

3) Write the configuration file conf/avro-memory-logger.conf

# Name the components on this agent
avro-memory-logger.sources = avro-source
avro-memory-logger.sinks = logger-sink
avro-memory-logger.channels = memory-channel

# Describe/configure the source
avro-memory-logger.sources.avro-source.type = avro
avro-memory-logger.sources.avro-source.bind = localhost
avro-memory-logger.sources.avro-source.port = 44444

# Describe the sink
avro-memory-logger.sinks.logger-sink.type = logger
#avro-memory-logger.sinks.logger-sink.hostname=localhost
#avro-memory-logger.sinks.logger-sink.port=44444

# Use a channel which buffers events in memory
avro-memory-logger.channels.memory-channel.type = memory

# Bind the source and sink to the channel
avro-memory-logger.sources.avro-source.channels = memory-channel
avro-memory-logger.sinks.logger-sink.channel = memory-channel

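4) Start Flume

*** Start avro-memory-logger first, so that its avro source is listening on port 44444 before the avro sink tries to connect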

flume-ng agent \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/avro-memory-logger.conf \
--name avro-memory-logger \
-Dflume.root.logger=INFO,console

*** Then start exec-memory-avro

flume-ng agent \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/exec-memory-avro.conf \
--name exec-memory-avro \
-Dflume.root.logger=INFO,console

5) Append new content to the monitored file

6) Check whether the new content appears on the console of the avro-memory-logger agent
