Kafka Introduction

Publish & Subscribe

    Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.


Store

    Store streams of data safely in a distributed, replicated, fault-tolerant cluster.

    We can think of Kafka as a piece of message middleware (a buffer) sitting between message producers and message consumers (a given message may have different producers and consumers).

The Kafka Framework

  • producer: the test machines, which generate large volumes of test results
  • consumer: YMS, which analyzes and stores these test results
  • broker: the middleware between the test machines and YMS; it is an application
  • topic: a label attached to the test results, e.g. test files under topic_inline, topic_dev, and topic_offline

Introduction

Apache Kafka is a distributed streaming platform. What exactly does that mean?


    A streaming platform has three key capabilities:

  • Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system. (messaging)
  • Store streams of records in a fault-tolerant durable way. (fault tolerance)
  • Process streams of records as they occur. (real time)
    Kafka is generally used for two broad classes of applications:
  • Building real-time streaming data pipelines that reliably get data between systems or applications
  • Building real-time streaming applications that transform or react to the streams of data.
    First a few concepts:
  • Kafka is run as a cluster on one or more servers (if a single machine cannot hold all the data, more machines are added) that can span multiple datacenters.
  • The Kafka cluster stores streams of records in categories called topics.
  • Each record consists of a key, a value, and a timestamp.

    Kafka has four core APIs:
  • The Producer API allows an application to publish a stream of records to one or more Kafka topics.
  • The Consumer API allows an application to subscribe to one or more topics and process the stream of records produced to them.
  • The Streams API allows an application to act as a stream processor, consuming an input stream from one or more topics and producing an output stream to one or more output topics, effectively transforming the input streams to output streams.
  • The Connector API allows building and running reusable producers or consumers that connect Kafka topics to existing applications or data systems. For example, a connector to a relational database might capture every change to a table.
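
    As a concrete illustration of the Producer API, here is a minimal Java sketch (assuming the official kafka-clients library; the class name and record key are made up for this example, and the inline topic comes from the deployment walkthrough below). Each record carries a key, a value, and a timestamp, which is assigned automatically here:

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class TestResultProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // broker address, see server.properties below
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // try-with-resources flushes and closes the producer on exit
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // key = hypothetical test-machine id, value = the test result
            producer.send(new ProducerRecord<>("inline", "machine-01", "test result payload"));
        }
    }
}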

Topics and Logs

    Let's first dive into the core abstraction Kafka provides for a stream of records - the topic.
    A topic is a category or feed name to which records are published. Topics in Kafka are always multi-subscriber; that is, a topic can have zero, one, or many consumers that subscribe to the data written to it.
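
    To make the multi-subscriber behavior concrete, here is a minimal consumer sketch (assuming kafka-clients 2.x or later; the group name is hypothetical). Every consumer group that subscribes to a topic receives its own full copy of the stream, so running this program twice with two different group.id values yields two independent subscribers:

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class InlineTopicSubscriber {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "yms-analyzer"); // each distinct group gets the full stream
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("inline"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> rec : records) {
                    System.out.printf("key=%s value=%s timestamp=%d%n",
                            rec.key(), rec.value(), rec.timestamp());
                }
            }
        }
    }
}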

Deploying and Using Kafka

  • Single node, single broker: deployment and usage
  • Single node, multiple brokers: deployment and usage
  • Multiple nodes, multiple brokers: deployment and usage

Configuring a Single Node with a Single Broker

    Kafka depends on ZooKeeper, so before using Kafka you need to install ZooKeeper first.

# Put ZooKeeper on the PATH (note: no spaces around = in shell assignments)
export ZK_HOME=....
export PATH=$ZK_HOME/bin:$PATH

# Configure conf/zoo.cfg
# dataDir is the directory where ZooKeeper stores its data
dataDir=...

# Start ZooKeeper, either with Kafka's bundled script:
bin/zookeeper-server-start.sh config/zookeeper.properties
# or with ZooKeeper's own script:
bin/zkServer.sh start

# jps should now show a QuorumPeerMain process,
# which means ZooKeeper started successfully

# Connect with zkCli.sh to verify

# Configure Kafka
export KAFKA_HOME=...
export PATH=$KAFKA_HOME/bin:$PATH

# Configure $KAFKA_HOME/config/server.properties
broker.id=0                      # unique id of this broker
listeners=PLAINTEXT://:9092      # listening port, 9092 by default
host.name=localhost              # the current machine
log.dirs=/tmp/kafka-logs         # directory where Kafka stores its log files
num.partitions=1                 # default number of partitions

zookeeper.connect=localhost:2181 # ZooKeeper address


# Start Kafka
bin/kafka-server-start.sh config/server.properties

# jps (or jps -m) should now also show a Kafka process

# Create a topic; you must specify the ZooKeeper address,
# the replication factor, and the number of partitions
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic inline
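
    Newer Kafka versions can also create topics programmatically through the AdminClient, which talks to a broker instead of ZooKeeper. A minimal sketch, assuming kafka-clients 2.x or later (the class name is made up for this example):

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.Collections;
import java.util.Properties;

public class CreateInlineTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // topic name, number of partitions, replication factor - mirrors the CLI call above
            NewTopic topic = new NewTopic("inline", 1, (short) 1);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}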

# List all topics
bin/kafka-topics.sh --list --zookeeper localhost:2181

# Describe all topics, or a single one
bin/kafka-topics.sh --describe --zookeeper localhost:2181
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic inline

# Send messages (the producer generates messages)
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic inline
This is a message
This is another message

# Consume messages (localhost:2181 is the ZooKeeper address)
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic inline --from-beginning
This is a message
This is another message

# --from-beginning consumes from the start of the log;
# without it, only messages produced after the consumer starts are received
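
    In the Java consumer sketch from the Topics and Logs section, the counterpart of --from-beginning is the auto.offset.reset setting; the only change needed is one extra property:

// Equivalent of the console consumer's --from-beginning flag.
// Only takes effect when the group has no committed offset yet;
// the default "latest" skips to messages produced after startup.
props.put("auto.offset.reset", "earliest");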

Configuring a Single Node with Multiple Brokers

The broker.id property is the unique and permanent name of each node in the cluster.

server-1.properties
	log.dirs=/home/hadoop/app/tmp/kafka-logs-1
	listeners=PLAINTEXT://:9093
	broker.id=1

server-2.properties
	log.dirs=/home/hadoop/app/tmp/kafka-logs-2
	listeners=PLAINTEXT://:9094
	broker.id=2

server-3.properties
	log.dirs=/home/hadoop/app/tmp/kafka-logs-3
	listeners=PLAINTEXT://:9095
	broker.id=3
# -daemon starts each broker in the background
kafka-server-start.sh -daemon $KAFKA_HOME/config/server-1.properties
kafka-server-start.sh -daemon $KAFKA_HOME/config/server-2.properties
kafka-server-start.sh -daemon $KAFKA_HOME/config/server-3.properties

kafka-topics.sh --create --zookeeper hadoop000:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic

kafka-console-producer.sh --broker-list hadoop000:9093,hadoop000:9094,hadoop000:9095 --topic my-replicated-topic
kafka-console-consumer.sh --zookeeper hadoop000:2181 --topic my-replicated-topic



# For each partition, the output shows its leader broker,
# the full replica list, and the in-sync replica set (ISR)
kafka-topics.sh --describe --zookeeper hadoop000:2181 --topic my-replicated-topic



