Kafka Configuration, Usage, and Data Migration


1) Create a topic

> bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

List the existing topics:

> bin/kafka-topics.sh --list --zookeeper localhost:2181

2) Send and consume messages

Console mode:

> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

This is a message

This is another message

> bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning

This is a message

This is another message

3) Multi-broker cluster

Set up a properties file for each additional broker:

config/server-1.properties:
    broker.id=1
    port=9093
    log.dir=/tmp/kafka-logs-1

config/server-2.properties:
    broker.id=2
    port=9094
    log.dir=/tmp/kafka-logs-2

Start the two additional brokers:

> bin/kafka-server-start.sh config/server-1.properties & 

> bin/kafka-server-start.sh config/server-2.properties &
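With three brokers running, create a topic replicated across all of them; this is the my-replicated-topic that the next step describes. The command follows the 0.8.2 quickstart:

> bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic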

 

4) Describe a topic

> bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic

Topic:my-replicated-topic  PartitionCount:1  ReplicationFactor:3  Configs:
    Topic: my-replicated-topic  Partition: 0  Leader: 1  Replicas: 1,2,0  Isr: 1,2,0

The fields in this output:

  • "leader" is the node responsible for all reads and writes for the given partition. Each node will be the leader for a randomly selected portion of the partitions.
  • "replicas" is the list of nodes that replicate the log for this partition regardless of whether they are the leader or even if they are currently alive.
  • "isr" is the set of "in-sync" replicas. This is the subset of the replicas list that is currently alive and caught-up to the leader.

5) Broker Configs

The essential configurations are the following:

broker.id

log.dirs

zookeeper.connect

Each broker.id must be unique within the cluster.
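A minimal server.properties sketch covering just these three required settings (the path and ZooKeeper address are placeholders):

broker.id=0
log.dirs=/tmp/kafka-logs
zookeeper.connect=localhost:2181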

6) Topic config

Create a topic with per-topic config overrides:

> bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic my-topic --partitions 1 --replication-factor 1 --config max.message.bytes=64000 --config flush.messages=1

Alter a topic-level config:

> bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic my-topic --config max.message.bytes=128000

Remove a topic-level config override:

> bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic my-topic --deleteConfig max.message.bytes
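To verify which overrides are in effect, describe the topic; the overrides appear in the Configs field of the output:

> bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic my-topic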

7) Consumer config

The essential consumer configurations are the following:

group.id: A string that uniquely identifies the group of consumer processes to which this consumer belongs. By setting the same group id, multiple processes indicate that they are all part of the same consumer group.

zookeeper.connect: The ZooKeeper connection string, in the form hostname:port.
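A minimal consumer.properties sketch (the group name is a placeholder):

zookeeper.connect=localhost:2181
group.id=my-consumer-group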

8) Producer config

Essential configuration properties for the producer include:

metadata.broker.list

request.required.acks

producer.type

serializer.class

metadata.broker.list:

This is for bootstrapping and the producer will only use it for getting metadata (topics, partitions and replicas). The socket connections for sending the actual data will be established based on the broker information returned in the metadata. The format is host1:port1,host2:port2, and the list can be a subset of brokers or a VIP pointing to a subset of brokers.

request.required.acks:

  • 0, which means that the producer never waits for an acknowledgement from the broker (the same behavior as 0.7). This option provides the lowest latency but the weakest durability guarantees (some data will be lost when a server fails).
  • 1, which means that the producer gets an acknowledgement after the leader replica has received the data. This option provides better durability as the client waits until the server acknowledges the request as successful (only messages that were written to the now-dead leader but not yet replicated will be lost).
  • -1, The producer gets an acknowledgement after all in-sync replicas have received the data. This option provides the greatest level of durability. However, it does not completely eliminate the risk of message loss because the number of in sync replicas may, in rare cases, shrink to 1. If you want to ensure that some minimum number of replicas (typically a majority) receive a write, then you must set the topic-level min.insync.replicas setting. Please read the Replication section of the design documentation for a more in-depth discussion.

The default in 0.8.x is 0: the producer never waits for an acknowledgement, giving the lowest latency but the weakest durability; data may be lost when a broker fails.

With 1, the producer waits for the leader's acknowledgement: better durability, though messages written to a leader that dies before replication completes are still lost.

With -1, the producer waits until all in-sync replicas have received the data, giving the strongest durability.
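A minimal producer.properties sketch for the old (0.8.x) producer API; the broker addresses are placeholders:

metadata.broker.list=localhost:9092,localhost:9093
request.required.acks=1
producer.type=sync
serializer.class=kafka.serializer.DefaultEncoder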

9) Configuring the log cleaner

The log cleaner is disabled by default. To enable it, set the server config:

log.cleaner.enable=true

 

This will start the pool of cleaner threads. To enable log cleaning on a particular topic you can add the log-specific property

log.cleanup.policy=compact

This can be done either at topic creation time or using the alter topic command.
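For example, a sketch of enabling compaction at topic creation time (the topic name is a placeholder; note that the 0.8.2 topic-level config table lists this property as cleanup.policy, so check which key your broker version expects):

> bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic my-compacted-topic --partitions 1 --replication-factor 1 --config cleanup.policy=compact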

 

10) Graceful shutdown

Configuration to enable controlled shutdown:

controlled.shutdown.enable=true

To have partition leadership rebalanced automatically:

auto.leader.rebalance.enable=true
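If automatic rebalancing is not enabled, leadership can be restored to the preferred replicas manually after a restart (per the 0.8.2 ops docs; the ZooKeeper address is a placeholder):

> bin/kafka-preferred-replica-election.sh --zookeeper localhost:2181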

11) Mirroring data between two clusters

 

> bin/kafka-run-class.sh kafka.tools.MirrorMaker --consumer.config consumer-1.properties --consumer.config consumer-2.properties --producer.config producer.properties --whitelist my-topic

--whitelist accepts Java regular expressions for topic selection, e.g.:

mirror topics named A and B using --whitelist 'A|B'

mirror all topics using --whitelist '*'

 

Check a consumer's position:

> bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zkconnect localhost:2181 --group test

Group     Topic     Pid  Offset  logSize  Lag  Owner
my-group  my-topic  0    0       0        0    test_jkreps-mn-1394154511599-60744496-0
my-group  my-topic  1    0       0        0    test_jkreps-mn-1394154521217-1a0be913-0

 

The reference server configuration from the official docs:

# Replication configurations

num.replica.fetchers=4

replica.fetch.max.bytes=1048576

replica.fetch.wait.max.ms=500

replica.high.watermark.checkpoint.interval.ms=5000

replica.socket.timeout.ms=30000

replica.socket.receive.buffer.bytes=65536

replica.lag.time.max.ms=10000

replica.lag.max.messages=4000

controller.socket.timeout.ms=30000

controller.message.queue.size=10

# Log configuration

num.partitions=8

message.max.bytes=1000000

auto.create.topics.enable=true

log.index.interval.bytes=4096

log.index.size.max.bytes=10485760

log.retention.hours=168

log.flush.interval.ms=10000

log.flush.interval.messages=20000

log.flush.scheduler.interval.ms=2000

log.roll.hours=168

log.retention.check.interval.ms=300000

log.segment.bytes=1073741824

# ZK configuration

zookeeper.connection.timeout.ms=6000

zookeeper.sync.time.ms=2000

# Socket server configuration

num.io.threads=8

num.network.threads=8

socket.request.max.bytes=104857600

socket.receive.buffer.bytes=1048576

socket.send.buffer.bytes=1048576

queued.max.requests=16

fetch.purgatory.purge.interval.requests=100

producer.purgatory.purge.interval.requests=100

Java tuning reference:

-Xms4g -Xmx4g -XX:PermSize=48m -XX:MaxPermSize=48m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35

References:

Kafka official documentation (Kafka 0.8.2): http://kafka.apache.org/082/documentation.html

 
