A Simple Flume Application

 

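The goal is to move data from a source into Flume and sink it to the corresponding Kafka topic. Later in the project the data will be written to HBase.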

 

First, create the directory for the data source and put the files into it.

 

That completes the data source side.

Next, we configure Flume:

events.sources = eventsSource
events.channels = eventsChannel
events.sinks = eventsSink

# Use a channel which buffers events in a directory
events.channels.eventsChannel.type = file
events.channels.eventsChannel.checkpointDir = /var/flume/checkpoint/events
events.channels.eventsChannel.dataDirs = /var/flume/data/events

# Setting the source to spool directory where the file exists
events.sources.eventsSource.type = spooldir
events.sources.eventsSource.deserializer = LINE
events.sources.eventsSource.deserializer.maxLineLength = 6400
events.sources.eventsSource.spoolDir = /events/input/intra/events
events.sources.eventsSource.includePattern = events_[0-9]{4}-[0-9]{2}-[0-9]{2}.csv
events.sources.eventsSource.channels = eventsChannel

# Define / Configure sink
events.sinks.eventsSink.type = org.apache.flume.sink.kafka.KafkaSink
events.sinks.eventsSink.batchSize = 640
events.sinks.eventsSink.brokerList = sandbox-hdp.hortonworks.com:6667
events.sinks.eventsSink.topic = events
events.sinks.eventsSink.channel = eventsChannel

Once this is entered, you will be prompted to save and exit.

(This uses events as the example; the other data sets are configured the same way.)
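Since the other data sets follow the same pattern, the configuration can be generated rather than retyped. The following is only a sketch: `flume_conf` is a hypothetical helper, and it assumes every data set reuses the events values for `maxLineLength`, `batchSize`, and the file-name pattern, which may not hold for every file.

```shell
# Hypothetical helper: print a Flume agent configuration for one data set,
# modelled on the events example. Values such as maxLineLength and batchSize
# are copied from that example and may need tuning per data set.
flume_conf() {
  name="$1"
  cat <<EOF
$name.sources = ${name}Source
$name.channels = ${name}Channel
$name.sinks = ${name}Sink

# Use a channel which buffers events in a directory
$name.channels.${name}Channel.type = file
$name.channels.${name}Channel.checkpointDir = /var/flume/checkpoint/$name
$name.channels.${name}Channel.dataDirs = /var/flume/data/$name

# Setting the source to spool directory where the file exists
$name.sources.${name}Source.type = spooldir
$name.sources.${name}Source.deserializer = LINE
$name.sources.${name}Source.deserializer.maxLineLength = 6400
$name.sources.${name}Source.spoolDir = /events/input/intra/$name
$name.sources.${name}Source.includePattern = ${name}_[0-9]{4}-[0-9]{2}-[0-9]{2}.csv
$name.sources.${name}Source.channels = ${name}Channel

# Define / Configure sink
$name.sinks.${name}Sink.type = org.apache.flume.sink.kafka.KafkaSink
$name.sinks.${name}Sink.batchSize = 640
$name.sinks.${name}Sink.brokerList = sandbox-hdp.hortonworks.com:6667
$name.sinks.${name}Sink.topic = $name
$name.sinks.${name}Sink.channel = ${name}Channel
EOF
}

# Print the configuration for the users data set; redirect to a file to keep it.
flume_conf users
```

Calling `flume_conf train > train.conf` would write the same layout for the train data set.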

That completes the second step, Flume. The last remaining step is to create the topics:

kafka-topics.sh --zookeeper sandbox-hdp.hortonworks.com:2181 --create --topic users --partitions 3 --replication-factor 1
kafka-topics.sh --zookeeper sandbox-hdp.hortonworks.com:2181 --create --topic user_friends_raw --partitions 3 --replication-factor 1
kafka-topics.sh --zookeeper sandbox-hdp.hortonworks.com:2181 --create --topic events --partitions 3 --replication-factor 1
kafka-topics.sh --zookeeper sandbox-hdp.hortonworks.com:2181 --create --topic event_attendees_raw --partitions 3 --replication-factor 1
kafka-topics.sh --zookeeper sandbox-hdp.hortonworks.com:2181 --create --topic test --partitions 3 --replication-factor 1
kafka-topics.sh --zookeeper sandbox-hdp.hortonworks.com:2181 --create --topic train --partitions 3 --replication-factor 1
kafka-topics.sh --zookeeper sandbox-hdp.hortonworks.com:2181 --create --topic user_friends --partitions 3 --replication-factor 1
kafka-topics.sh --zookeeper sandbox-hdp.hortonworks.com:2181 --create --topic event_attendees --partitions 3 --replication-factor 1
kafka-topics.sh --zookeeper sandbox-hdp.hortonworks.com:2181 --list
mkdir -p /var/flume/checkpoint/users
mkdir -p /var/flume/checkpoint/user_friends_raw
mkdir -p /var/flume/checkpoint/user_friends
mkdir -p /var/flume/checkpoint/train
mkdir -p /var/flume/checkpoint/test
mkdir -p /var/flume/checkpoint/events
mkdir -p /var/flume/checkpoint/event_attendees_raw
mkdir -p /var/flume/checkpoint/event_attendees
chmod -R 777 /var/flume/checkpoint/
ll /var/flume/checkpoint/
mkdir -p /var/flume/data/users
mkdir -p /var/flume/data/user_friends_raw
mkdir -p /var/flume/data/user_friends
mkdir -p /var/flume/data/train
mkdir -p /var/flume/data/test
mkdir -p /var/flume/data/events
mkdir -p /var/flume/data/event_attendees_raw
mkdir -p /var/flume/data/event_attendees
chmod -R 777 /var/flume/data/
ll /var/flume/data/
mkdir -p /events/input/intra/users/
cd /BDSP2/
ll

These commands create the topics, create the directories that the Flume configuration refers to (the channel's checkpoint and data directories and the source's spool directory), open up their permissions, and then cd into the directory that holds the source files.
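The setup above repeats the same operations for every data set, so it can be sketched as a loop. This is an illustration rather than the exact procedure used here: `run`, `setup_all`, and the `DRY_RUN` switch are hypothetical names, and creating a spool directory under /events/input/intra/ for every data set is an assumption. By default the script only prints the commands so they can be reviewed first.

```shell
ZK="sandbox-hdp.hortonworks.com:2181"
DATASETS="users user_friends_raw user_friends events event_attendees_raw event_attendees test train"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the topic plus the checkpoint, data, and spool directories per data set.
setup_all() {
  for name in $DATASETS; do
    run kafka-topics.sh --zookeeper "$ZK" --create --topic "$name" --partitions 3 --replication-factor 1
    run mkdir -p "/var/flume/checkpoint/$name" "/var/flume/data/$name" "/events/input/intra/$name"
  done
  run chmod -R 777 /var/flume/checkpoint/ /var/flume/data/
}

setup_all
```

Running it with `DRY_RUN=0` executes the commands instead of printing them.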

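One problem came up along the way: I created a topic with the wrong name and needed to delete it. A plain --delete does not remove the topic completely; it only marks the topic for deletion.

Deleting it takes two steps. First, go to the Kafka broker (the deletion only takes effect when the broker setting delete.topic.enable is true), then run the delete command: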

 

 

kafka-topics.sh --zookeeper sandbox-hdp.hortonworks.com:2181 --delete --topic event_attendees_row 
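To check whether a topic is really gone or only marked, the topic list can be filtered. The sketch below assumes the older ZooKeeper-based `kafka-topics.sh --list` output, which appends ` - marked for deletion` to such topics; `pending_deletion` is a hypothetical helper name, and the suffix may differ on other Kafka versions.

```shell
# Hypothetical helper: read `kafka-topics.sh --list` output on stdin and print
# the names of topics that are still only marked for deletion.
# Assumes the " - marked for deletion" suffix used by older Kafka releases.
pending_deletion() {
  sed -n 's/ - marked for deletion$//p'
}

# Usage (needs a running cluster):
#   kafka-topics.sh --zookeeper sandbox-hdp.hortonworks.com:2181 --list | pending_deletion
```

If the deleted topic still shows up in that output, the broker has not removed it yet.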

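Now we can set the dominoes falling.

We need to feed the source files into the Flume source so that they become streaming data, and then attach a listener to watch the result.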

install -m 777 users.csv /events/input/intra/users/users_2018_10_18.csv
install -m 777 user_friends.csv /events/input/intra/user_friends_raw/user_friends_raw_2018_10_18.csv
install -m 777 event_attendees.csv /events/input/intra/event_attendees/event_attendees_2018_10_18.csv
install -m 777 test.csv /events/input/intra/test/test_2018_10_18.csv
install -m 777 train.csv /events/input/intra/train/train_2018_10_18.csv
install -m 777 events.csv /events/input/intra/events/events_2018_10_18.csv

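As I understand it, install copies (rather than moves) the file to the destination under the new name, and -m sets the permission mode of the copy; 777 is reckless, but convenient on a sandbox. The file name must match the source's includePattern. The pattern in the configuration above uses hyphens in the date, so either the pattern or the file names need adjusting so that they agree.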
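Because the spooling directory source only picks up files whose names match includePattern, it can help to test a file name against the pattern before copying it in. This sketch uses `grep -E` with whole-line matching as a stand-in for Flume's Java regex match, which behaves the same for a simple pattern like this one. `matches_pattern` is a hypothetical helper, and the pattern is the events one with its braces closed properly.

```shell
# Hypothetical helper: succeed if the file name fully matches the pattern.
matches_pattern() {
  printf '%s\n' "$1" | grep -Eqx "$2"
}

# The events includePattern, with braces balanced (assumed intended form).
pattern='events_[0-9]{4}-[0-9]{2}-[0-9]{2}.csv'

for name in events_2018-10-18.csv events_2018_10_18.csv; do
  if matches_pattern "$name" "$pattern"; then
    echo "$name: match"
  else
    echo "$name: no match"
  fi
done
```

With this pattern the hyphenated name matches and the underscored one does not, so the date separator in the file names and in the pattern has to be the same.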

Then attach a listener:

kafka-console-consumer.sh --bootstrap-server sandbox-hdp.hortonworks.com:6667 --topic events --from-beginning

You will see the data stream into the console.

 
