Simulating a client that sends data through Flume and stores it on HDFS

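1. First, the Flume tool needs to be installed on the machine.
2. Install Flume:
Change to /usr/hdp/2.4.0.0-169/ (cd /usr/hdp/2.4.0.0-169/) and create the flume directory there by unpacking and renaming the release:
tar zxvf apache-flume-1.6.0-bin.tar.gz
mv apache-flume-1.6.0-bin flume
Go into the conf folder of the flume directory: cd flume/conf
3. Create the test1 file in the conf folder:
vim test1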

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind =0.0.0.0
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Here a1.sinks.k1.type = logger means that received events are printed as log output.
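The last two lines of the configuration bind the source and the sink to the channel, and a typo there is easy to miss. As a rough illustration, the sketch below loads such a file with java.util.Properties and reports sources or sinks bound to a channel the agent does not declare. The class AgentWiringCheck and its methods are hypothetical helpers written for this article and are not part of Flume; Flume does its own validation at startup.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;

// Hypothetical helper (not part of Flume): checks that every source and sink
// of an agent is bound to a channel that the agent declares.
public class AgentWiringCheck {

    // Parses properties-format text such as the contents of conf/test1.
    public static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Splits a whitespace-separated value such as "r1" or "c1 c2" into names.
    static List<String> names(Properties props, String key) {
        String value = props.getProperty(key, "").trim();
        return value.isEmpty() ? new ArrayList<>() : Arrays.asList(value.split("\\s+"));
    }

    // Returns one message per wiring problem; an empty list means none were found.
    public static List<String> check(Properties props, String agent) {
        List<String> problems = new ArrayList<>();
        Set<String> channels = new HashSet<>(names(props, agent + ".channels"));
        for (String source : names(props, agent + ".sources")) {
            List<String> bound = names(props, agent + ".sources." + source + ".channels");
            if (bound.isEmpty()) {
                problems.add("source " + source + " has no channel");
            }
            for (String c : bound) {
                if (!channels.contains(c)) {
                    problems.add("source " + source + " uses undeclared channel " + c);
                }
            }
        }
        for (String sink : names(props, agent + ".sinks")) {
            String c = props.getProperty(agent + ".sinks." + sink + ".channel", "").trim();
            if (c.isEmpty()) {
                problems.add("sink " + sink + " has no channel");
            } else if (!channels.contains(c)) {
                problems.add("sink " + sink + " uses undeclared channel " + c);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        String conf = String.join("\n",
            "a1.sources = r1",
            "a1.sinks = k1",
            "a1.channels = c1",
            "a1.sources.r1.channels = c1",
            "a1.sinks.k1.channel = c1");
        System.out.println(check(parse(conf), "a1")); // prints []
    }
}
```

Note the asymmetry the check relies on: a source uses the plural key channels because it may feed several channels, while a sink uses the singular key channel.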

Return to the flume directory: cd ..
Then run Flume with the following command:
bin/flume-ng agent --conf conf/ --conf-file conf/test1 --name a1 -Dflume.root.logger=INFO,console
The agent is now waiting for input.
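Next, start a client from Eclipse.
The client talks to Flume's avro source over Avro RPC, so the agent it connects to must be configured with an avro source (as in test2 below) rather than the netcat source of test1: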

package com.sxt.flume;

import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;
import java.nio.charset.Charset;

// Official example
/*
 * As of Flume 1.4.0, Avro is the default RPC protocol. 
 * The NettyAvroRpcClient and ThriftRpcClient implement the RpcClient interface. 
 * The client needs to create this object with the host and port of the target Flume agent, 
 * and can then use the RpcClient to send data into the agent. 
 * The following example shows how to use the Flume Client SDK API within a user’s data-generating application:
 */
public class FlumeClientTest {
    public static void main(String[] args) throws Exception{
        MyRpcClientFacade client = new MyRpcClientFacade();
        // Initialize client with the remote Flume agent's host and port
        client.init("192.168.1.241", 44444);

        // Send 10 events to the remote Flume agent. That agent should be
        // configured to listen with an AvroSource.
        // String sampleData = "ERROR";
        String sampleData = "Hello Flume!";
        System.out.println("Sending data");
        for (int i = 0; i < 10; i++) {
            client.sendDataToFlume(sampleData + "    " + i);
        }
        client.cleanUp();
    }
}

class MyRpcClientFacade {
    private RpcClient client;
    private String hostname;
    private int port;

    public void init(String hostname, int port) {
        // Setup the RPC connection
        this.hostname = hostname;
        this.port = port;
        this.client = RpcClientFactory.getDefaultInstance(hostname, port);
        System.out.println("Connection established");
        // Use the following method to create a thrift client (instead of the
        // above line):
        // this.client = RpcClientFactory.getThriftInstance(hostname, port);
    }

    public void sendDataToFlume(String data) {
        // Create a Flume Event object that encapsulates the sample data
        Event event = EventBuilder.withBody(data, Charset.forName("UTF-8"));

        // Send the event
        try {
            client.append(event);
        } catch (EventDeliveryException e) {
            // clean up and recreate the client
            client.close();
            client = null;
            client = RpcClientFactory.getDefaultInstance(hostname, port);
            // Use the following method to create a thrift client (instead of
            // the above line):
            // this.client = RpcClientFactory.getThriftInstance(hostname, port);
        }
    }

    public void cleanUp() {
        // Close the RPC connection
        System.out.println("Disconnecting");
        client.close();
    }

}

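After running the code, connect to Flume with telnet 192.168.1.241 44444 (from a duplicated session on the node). The netcat source of test1 accepts plain text lines typed there.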
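The telnet check can also be done from Java with a plain socket, since a netcat source treats each newline-terminated line as one event and, with its default settings, answers each one with OK. The following is a minimal sketch under that assumption; the class NetcatLineClient is a hypothetical name, and the host and port are the ones used in this article.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a telnet session against a netcat source.
public class NetcatLineClient {

    // Writes each line followed by '\n' and reads one reply line per event.
    public static List<String> sendLines(BufferedReader in, Writer out, List<String> lines) {
        List<String> replies = new ArrayList<>();
        try {
            for (String line : lines) {
                out.write(line);
                out.write('\n');
                out.flush();
                String reply = in.readLine();
                if (reply == null) {
                    break; // the agent closed the connection
                }
                replies.add(reply);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return replies;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java NetcatLineClient <host> <port>");
            return;
        }
        try (Socket socket = new Socket(args[0], Integer.parseInt(args[1]));
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
             Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)) {
            // Each reply should correspond to one event accepted by the source.
            System.out.println(sendLines(in, out, Arrays.asList("Hello Flume!", "second line")));
        }
    }
}
```

Run it as java NetcatLineClient 192.168.1.241 44444 while the test1 agent is up; the sent lines should then show up in the agent's console log.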

II. Using an avro source to write the client's input to a local folder
vim test2

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind =192.168.1.241
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = file_roll
a1.sinks.k1.sink.directory=/home/sink
a1.sinks.k1.sink.rollInterval=0

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

a1.sinks.k1.type = file_roll
a1.sinks.k1.sink.directory=/home/sink
With the two settings above, the received data is stored in the local folder /home/sink.
1) Run Flume with test2:
bin/flume-ng agent --conf conf/ --conf-file conf/test2 --name a1 -Dflume.root.logger=INFO,console
2) Run the Java code.
Check the stored data under /home/sink/.
Each run of the Java code adds 10 more records.
III. Reading the files in a directory and storing their contents on HDFS
vim test4

# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type =spooldir 
a1.sources.r1.spoolDir=/opt/data

# Describe the sink
a1.sinks.k1.type=hdfs 
a1.sinks.k1.hdfs.path=hdfs://192.168.1.245:8020/flume/data
a1.sinks.k1.hdfs.rollSize=10240000
a1.sinks.k1.hdfs.rollInterval=0
a1.sinks.k1.hdfs.rollCount=0
a1.sinks.k1.hdfs.idleTimeout=5
a1.sinks.k1.hdfs.fileType=DataStream


# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
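The three roll settings of the hdfs sink above decide when the sink closes the current file and starts a new one, and a value of 0 switches that trigger off. With rollInterval=0 and rollCount=0, only the size threshold of 10240000 bytes remains active. The sketch below is a simplified model of that decision, written for illustration; RollPolicySketch is a hypothetical class and not Flume's actual implementation.

```java
// Simplified, hypothetical model of the hdfs sink's roll triggers.
// A threshold of 0 disables the corresponding trigger.
public class RollPolicySketch {
    private final long rollSize;     // bytes, as in hdfs.rollSize
    private final long rollCount;    // events, as in hdfs.rollCount
    private final long rollInterval; // seconds, as in hdfs.rollInterval

    public RollPolicySketch(long rollSize, long rollCount, long rollInterval) {
        this.rollSize = rollSize;
        this.rollCount = rollCount;
        this.rollInterval = rollInterval;
    }

    // Returns true as soon as any enabled threshold has been reached.
    public boolean shouldRoll(long bytesWritten, long eventsWritten, long secondsOpen) {
        boolean bySize = rollSize > 0 && bytesWritten >= rollSize;
        boolean byCount = rollCount > 0 && eventsWritten >= rollCount;
        boolean byTime = rollInterval > 0 && secondsOpen >= rollInterval;
        return bySize || byCount || byTime;
    }

    public static void main(String[] args) {
        // Values taken from test4: size-based rolling only.
        RollPolicySketch policy = new RollPolicySketch(10240000L, 0L, 0L);
        System.out.println(policy.shouldRoll(5_000_000L, 40_000L, 3_600L));  // prints false
        System.out.println(policy.shouldRoll(10_240_000L, 80_000L, 7_200L)); // prints true
    }
}
```

The hdfs.idleTimeout=5 setting is separate from these triggers: it closes a file that has received no data for 5 seconds, which is why small test files become visible on HDFS without ever reaching the size threshold.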

Prerequisites: the HDFS cluster must be available, and the data is written through the active NameNode; the directory /opt/data/ must exist and contain files.
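One practical point about filling /opt/data: the spooling directory source expects a file to be complete and unchanged once it appears in the spool directory. A common way to guarantee this is to write the file somewhere else first and then move it in with a single rename. The sketch below shows that pattern; SpoolDirDelivery and the staging directory are assumptions made for this example, and the staging directory should sit on the same filesystem as the spool directory so that the move can be atomic.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: writes a file outside the spool directory and moves
// it in only after it is complete.
public class SpoolDirDelivery {

    // Returns the final path of the file inside the spool directory.
    public static Path deliver(Path stagingDir, Path spoolDir, String fileName, List<String> lines) {
        try {
            Files.createDirectories(stagingDir);
            Files.createDirectories(spoolDir);
            Path staged = stagingDir.resolve(fileName);
            // Write the whole file in the staging area first.
            Files.write(staged, lines, StandardCharsets.UTF_8);
            // Then publish it to the spool directory in one step.
            return Files.move(staged, spoolDir.resolve(fileName), StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: java SpoolDirDelivery <stagingDir> <spoolDir>");
            return;
        }
        // Use a unique name per file so that no earlier file name is reused.
        String name = "events-" + System.currentTimeMillis() + ".log";
        Path delivered = deliver(Paths.get(args[0]), Paths.get(args[1]), name,
            Arrays.asList("Hello Flume!    0", "Hello Flume!    1"));
        System.out.println("Delivered " + delivered);
    }
}
```

For the setup in this article the second argument would be /opt/data; the staging path is up to you.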
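After running Flume with test4, the files appear in the HDFS directory.
The file contents can be viewed with the hdfs command, for example hdfs dfs -cat followed by the file path.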
IV. Sending the data read by the avro source to HDFS
vim test6

# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type =avro
a1.sources.r1.bind =0.0.0.0
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type=hdfs 
a1.sinks.k1.hdfs.path=hdfs://192.168.1.245:8020/flume/data/%Y-%m-%d
a1.sinks.k1.hdfs.rollSize=10240000
a1.sinks.k1.hdfs.rollInterval=0
a1.sinks.k1.hdfs.rollCount=0
#a1.sinks.k1.hdfs.idleTimeout=5
a1.sinks.k1.hdfs.fileType=DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp=true



# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

a1.sinks.k1.hdfs.path=hdfs://192.168.1.245:8020/flume/data/%Y-%m-%d
With this path, the data on HDFS is stored in one directory per day.
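To make the effect of the %Y-%m-%d escapes concrete, the sketch below expands them for a given date. It only illustrates the resulting directory names and is not how Flume implements the substitution; DatedPathSketch is a hypothetical class, it handles just these three escapes, and the fixed date in main is an arbitrary example. Because test6 sets hdfs.useLocalTimeStamp=true, the sink takes the date from the agent's local clock instead of requiring a timestamp header on each event.

```java
import java.time.LocalDate;

// Hypothetical illustration of how %Y, %m and %d map to a dated directory.
public class DatedPathSketch {

    // Expands only the three escapes used in test6.
    public static String resolve(String pattern, LocalDate date) {
        return pattern
            .replace("%Y", String.format("%04d", date.getYear()))
            .replace("%m", String.format("%02d", date.getMonthValue()))
            .replace("%d", String.format("%02d", date.getDayOfMonth()));
    }

    public static void main(String[] args) {
        String path = "hdfs://192.168.1.245:8020/flume/data/%Y-%m-%d";
        // Arbitrary example date.
        System.out.println(resolve(path, LocalDate.of(2016, 7, 5)));
        // prints hdfs://192.168.1.245:8020/flume/data/2016-07-05
        // Directory that events arriving today would go to.
        System.out.println(resolve(path, LocalDate.now()));
    }
}
```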
Run Flume with test6, then run the Java client code and check HDFS.
Each run of the Java code adds 10 more records.
