Spark Streaming Real-Time Stream Processing in Action: Notes, Part 5

4-1 Course Outline

Distributed message queue: Kafka

  1. Kafka overview

    Kafka is similar to a messaging system: it is message middleware sitting between producers and consumers.
    Mom: the producer
    You: the consumer
    Steamed buns: the data stream
    Normal case: one bun is produced, one bun is consumed.
    Other cases: production keeps going, but you choke on some bun (a machine failure) and the buns that follow are lost;
    or the buns are made faster than you can eat them, and again buns are lost.
    Fix: get a bowl/basket. Buns go into the basket as they are made, and you take them out whenever you want to eat.
    The basket: Kafka.
    What if the basket is full and no more buns fit?
    Prepare a few more baskets == scaling Kafka out.

    Three capabilities: publish and subscribe, process, store

  2. Kafka architecture and core concepts
    producer: the producer, who makes the buns (mom)
    consumer: the consumer, who eats the buns (you)
    broker: the basket
    topic: a label attached to the buns; buns tagged topica are for you, buns tagged topicb are for your younger brother

  3. Kafka deployment and usage
    Single-node, single-broker deployment and usage
    Single-node, multi-broker deployment and usage
    Multi-node, multi-broker deployment and usage

    Kafka depends on ZooKeeper, so install and start ZooKeeper first.
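Either a standalone ZooKeeper or the one bundled with Kafka will do for these exercises. A minimal sketch of starting each (installation paths and PATH setup are assumptions):

zkServer.sh start

or, using the ZooKeeper scripts shipped inside the Kafka distribution:

zookeeper-server-start.sh -daemon $KAFKA_HOME/config/zookeeper.properties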

Single-node, multi-broker deployment and usage
[hadoop@hadoop000 config]$ cp server.properties server-1.properties
[hadoop@hadoop000 config]$ cp server.properties server-2.properties
[hadoop@hadoop000 config]$ cp server.properties server-3.properties
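Each copy must then be given a unique broker id, listening port, and log directory, or the three brokers will collide on startup. A minimal sketch of the edits; the ports match the broker list used further down, while the log.dirs paths are assumptions:

server-1.properties:
broker.id=1
listeners=PLAINTEXT://:9093
log.dirs=/home/hadoop/app/tmp/kafka-logs-1

server-2.properties:
broker.id=2
listeners=PLAINTEXT://:9094
log.dirs=/home/hadoop/app/tmp/kafka-logs-2

server-3.properties:
broker.id=3
listeners=PLAINTEXT://:9095
log.dirs=/home/hadoop/app/tmp/kafka-logs-3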

kafka-server-start.sh -daemon $KAFKA_HOME/config/server-1.properties
kafka-server-start.sh -daemon $KAFKA_HOME/config/server-2.properties
kafka-server-start.sh -daemon $KAFKA_HOME/config/server-3.properties

(-daemon already runs the broker in the background, so the trailing & is unnecessary.)

Create a topic with replication factor 3: kafka-topics.sh --create --zookeeper hadoop000:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic

[hadoop@hadoop000 config]$ kafka-topics.sh --create --zookeeper hadoop000:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic
Created topic "my-replicated-topic".


[hadoop@hadoop000 ~]$ kafka-topics.sh --describe --zookeeper hadoop000:2181
Topic:hello_topic PartitionCount:1 ReplicationFactor:1 Configs:
Topic: hello_topic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Topic:my-replicated-topic PartitionCount:1 ReplicationFactor:3 Configs:
Topic: my-replicated-topic Partition: 0 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1
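In this output, Leader is the id of the broker that serves all reads and writes for the partition, Replicas lists the broker ids holding a copy of it, and Isr (in-sync replicas) is the subset of replicas currently caught up with the leader and therefore eligible to take over.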
[hadoop@hadoop000 ~]$ kafka-topics.sh --list --zookeeper hadoop000:2181

hello_topic
my-replicated-topic

List all topics: kafka-topics.sh --list --zookeeper hadoop000:2181

Using describe (view the details of every topic):
kafka-topics.sh --describe --zookeeper localhost:2181

View the details of a specified topic: kafka-topics.sh --describe --zookeeper localhost:2181 --topic hello_topic

Produce messages (the console producer talks directly to the brokers):
kafka-console-producer.sh --broker-list hadoop000:9093,hadoop000:9094,hadoop000:9095 --topic my-replicated-topic

Consume: kafka-console-consumer.sh --zookeeper hadoop000:2181 --topic my-replicated-topic

  4. Kafka fault tolerance
    View the details of a single topic:
    kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic

After the original leader is killed, describe shows that the replica on broker 3 has become the new leader, as sketched below.

As long as at least one replica is alive, the topic remains fully usable.
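To reproduce this, kill the broker currently listed as Leader and run describe again; a new leader is elected from the Isr. A sketch, assuming the JDK's jps tool is available and that broker 2 holds the leadership (as in the describe output above):

jps -m | grep server-2.properties        (find the PID of the leader broker)
kill -9 <pid>
kafka-topics.sh --describe --zookeeper hadoop000:2181 --topic my-replicated-topic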

  5. Kafka API programming
    Building the development environment with IDEA + Maven
    Using the producer API
    Using the consumer API

After installing IDEA, create a Maven project; minimal producer and consumer sketches follow below.
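The sketches use the org.apache.kafka:kafka-clients Maven dependency (a 2.x version is assumed; the class names, broker address, topic, and group id are illustrative, not taken from the course code).

SimpleProducer.java:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "hadoop000:9092"); // broker list
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // try-with-resources closes the producer, which also flushes pending sends
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 10; i++) {
                producer.send(new ProducerRecord<>("hello_topic", "key-" + i, "message-" + i));
            }
        }
    }
}

SimpleConsumer.java:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "hadoop000:9092");
        props.put("group.id", "test-group");              // consumer group id
        props.put("auto.offset.reset", "earliest");       // also read messages sent before startup
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("hello_topic"));
            while (true) {
                // poll returns whatever records arrived since the last call
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d, key=%s, value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}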

  6. Kafka in practice

Integrating Flume with Kafka: the avro-memory-kafka agent below receives Avro events from the earlier exec-memory-avro agent and sinks them into the hello_topic Kafka topic.

avro-memory-kafka.conf:

avro-memory-kafka.sources = avro-source
avro-memory-kafka.sinks = kafka-sink
avro-memory-kafka.channels = memory-channel

avro-memory-kafka.sources.avro-source.type = avro
avro-memory-kafka.sources.avro-source.bind = hadoop000
avro-memory-kafka.sources.avro-source.port = 44444

avro-memory-kafka.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
avro-memory-kafka.sinks.kafka-sink.brokerList = hadoop000:9092
avro-memory-kafka.sinks.kafka-sink.topic = hello_topic
avro-memory-kafka.sinks.kafka-sink.batchSize = 5
avro-memory-kafka.sinks.kafka-sink.requiredAcks = 1

avro-memory-kafka.channels.memory-channel.type = memory

avro-memory-kafka.sources.avro-source.channels = memory-channel
avro-memory-kafka.sinks.kafka-sink.channel = memory-channel

Start the avro-memory-kafka agent first:
flume-ng agent \
--name avro-memory-kafka \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/avro-memory-kafka.conf \
-Dflume.root.logger=INFO,console

Then start the exec-memory-avro agent:
flume-ng agent \
--name exec-memory-avro \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/exec-memory-avro.conf \
-Dflume.root.logger=INFO,console
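For reference, the exec-memory-avro agent (built in an earlier note) tails the log file and forwards each line over Avro to hadoop000:44444, where the avro-source above picks it up. A sketch of its config, assuming the monitored file is /home/hadoop/data/data.log:

exec-memory-avro.sources = exec-source
exec-memory-avro.sinks = avro-sink
exec-memory-avro.channels = memory-channel

exec-memory-avro.sources.exec-source.type = exec
exec-memory-avro.sources.exec-source.command = tail -F /home/hadoop/data/data.log
exec-memory-avro.sources.exec-source.shell = /bin/sh -c

exec-memory-avro.sinks.avro-sink.type = avro
exec-memory-avro.sinks.avro-sink.hostname = hadoop000
exec-memory-avro.sinks.avro-sink.port = 44444

exec-memory-avro.channels.memory-channel.type = memory

exec-memory-avro.sources.exec-source.channels = memory-channel
exec-memory-avro.sinks.avro-sink.channel = memory-channel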

Start a console consumer on hello_topic to verify the end-to-end pipeline:
kafka-console-consumer.sh --zookeeper hadoop000:2181 --topic hello_topic

[hadoop@hadoop000 data]$ echo hello123456>>data.log
[hadoop@hadoop000 data]$ echo hello123456>>data.log
[hadoop@hadoop000 data]$ echo hello1236>>data.log
[hadoop@hadoop000 data]$ echo hello1236>>data.log
[hadoop@hadoop000 data]$ echo hello1236>>data.log
[hadoop@hadoop000 data]$ echo hello1236>>data.log
[hadoop@hadoop000 data]$ echo hello1236>>data.log
[hadoop@hadoop000 data]$ echo hello125465>>data.log
[hadoop@hadoop000 data]$ echo hello125465>>data.log
[hadoop@hadoop000 data]$ echo hello125465>>data.log
[hadoop@hadoop000 data]$ echo hello125465>>data.log
[hadoop@hadoop000 data]$

[hadoop@hadoop000 ~]$ kafka-console-consumer.sh --zookeeper hadoop000:2181 --topic hello_topic
hello1236
hello1236
hello1236
hello1236
hello125465
hello125465
hello125465
hello125465
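Note that the hello123456 lines are missing from the consumer output: by default the console consumer reads only messages produced after it attaches, so those lines were most likely written before the consumer started (passing --from-beginning would replay them).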
