[Kafka Tutorial Series 40] Kafka Datacenters

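This article covers recommended practices for deploying Kafka across datacenters with high availability and low latency in mind. It recommends running a local Kafka cluster in each datacenter and using a mirroring tool to synchronize data between them. It also discusses hardware selection, operating system configuration, disk and filesystem tuning, and network adjustments for high-latency environments to improve performance and stability.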

Datacenters

Summary: a Kafka cluster is best deployed inside a single local network. Do not stretch one cluster across different network environments.

Some deployments will need to manage a data pipeline that spans multiple datacenters. Our recommended approach is to deploy a local Kafka cluster in each datacenter, have the application instances in each datacenter interact only with their local cluster, and mirror data between clusters (see the documentation on the mirror maker tool for how to do this).
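
The "local cluster only" rule can be sketched as a small lookup. This is a minimal illustration, not part of Kafka: the datacenter names, broker addresses, and the helper bootstrap_servers_for are all hypothetical.

```python
# Hypothetical datacenter names and broker addresses, for illustration only.
LOCAL_CLUSTERS = {
    "dc-east": "kafka-east-1:9092,kafka-east-2:9092",
    "dc-west": "kafka-west-1:9092,kafka-west-2:9092",
}


def bootstrap_servers_for(datacenter):
    """Return the broker list of the cluster local to the given datacenter."""
    if datacenter not in LOCAL_CLUSTERS:
        raise ValueError("no local Kafka cluster defined for %r" % datacenter)
    return LOCAL_CLUSTERS[datacenter]


# An application instance running in dc-east only ever talks to its own cluster.
print(bootstrap_servers_for("dc-east"))
```

An application never receives a remote cluster's address this way. Moving data between datacenters is left to the mirroring process.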

 

This deployment pattern lets datacenters act as independent entities and lets us manage and tune inter-datacenter replication centrally. Each facility can stand alone and keep operating even if the inter-datacenter links are unavailable: when that happens, mirroring falls behind until the link is restored, at which point it catches up.
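
The "falls behind, then catches up" behavior can be illustrated with a toy simulation. It is a rough sketch under simplifying assumptions (constant produce rate, constant mirror rate, one outage window). The function name and all numbers are made up for the example and do not come from Kafka.

```python
def simulate_mirror_lag(produce_rate, mirror_rate, outage_start, outage_end, total_seconds):
    """Return the mirror backlog (messages not yet copied) at the end of each second."""
    backlog = 0
    history = []
    for second in range(total_seconds):
        # The local cluster keeps accepting writes whether or not the link is up.
        backlog += produce_rate
        link_up = not (outage_start <= second < outage_end)
        if link_up:
            # The mirror copies as much as it can, up to its own rate.
            backlog -= min(backlog, mirror_rate)
        history.append(backlog)
    return history


lag = simulate_mirror_lag(produce_rate=100, mirror_rate=150,
                          outage_start=5, outage_end=15, total_seconds=40)
print("peak backlog:", max(lag))
print("backlog at the end:", lag[-1])
```

The mirror can only catch up if its rate exceeds the produce rate. With the numbers above, the backlog peaks at the end of the outage and then drains by 50 messages per second.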

 

For applications that need a global view of all data, you can use mirroring to provide clusters that hold aggregate data mirrored from the local clusters in all datacenters. These aggregate clusters are used for reads by applications that require the full data set.
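
As a toy model of what an aggregate cluster holds, the sketch below merges per-datacenter message lists into one view. It uses plain Python lists rather than Kafka itself. The datacenter names and the choice to tag each message with its origin are assumptions made for illustration.

```python
def build_aggregate(local_logs):
    """Combine per-datacenter message lists into one aggregate list.

    Each entry is tagged with the datacenter it came from so a reader of the
    aggregate can still tell the sources apart (an illustrative choice).
    """
    aggregate = []
    for datacenter in sorted(local_logs):
        for message in local_logs[datacenter]:
            aggregate.append((datacenter, message))
    return aggregate


# Hypothetical local data in two datacenters.
local_logs = {
    "dc-east": ["order-1", "order-2"],
    "dc-west": ["order-3"],
}
aggregate = build_aggregate(local_logs)
print(aggregate)
```

Applications that need the full data set read from the aggregate. Writes still go to the local clusters.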

 

This is not the only possible deployment pattern. It is possible to read from or write to a remote Kafka cluster over the WAN, though this obviously adds whatever latency is required to reach that cluster.
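
A back-of-the-envelope calculation shows what that extra latency costs. The sketch assumes a deliberately simplified model in which a client keeps one request in flight at a time and each request waits a full network round trip. The round-trip times are example values, not measurements.

```python
def max_messages_per_second(round_trip_ms, messages_per_request):
    """Upper bound on throughput with one request in flight per round trip."""
    requests_per_second = 1000.0 / round_trip_ms
    return requests_per_second * messages_per_request


# Assumed round-trip times: 1 ms inside a datacenter, 100 ms across a WAN.
print("LAN, 1 message per request:", max_messages_per_second(1, 1))
print("WAN, 1 message per request:", max_messages_per_second(100, 1))
print("WAN, 500 messages per request:", max_messages_per_second(100, 500))
```

In this model, the round trip limits how many requests complete per second. The only way to win throughput back is to carry more messages in each request.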

 

Kafka naturally batches data in both the producer and the consumer, so it can achieve high throughput even over a high-latency connection. To allow this, though, it may be necessary to increase the TCP socket buffer sizes for the producer, consumer, and broker using the socket.send.buffer.bytes and socket.receive.buffer.bytes configurations. The appropriate way to set this is documented here.
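
A common starting point for sizing TCP buffers on a long link is the bandwidth-delay product: the number of bytes that can be in flight on the link at once. The sketch below computes it and places the result under the two setting names quoted above. The link speed and round-trip time are assumed example values, and the helper names are hypothetical. In current Kafka these two names are broker settings, while the client-side equivalents are send.buffer.bytes and receive.buffer.bytes. The operating system may also cap the size actually granted.

```python
def bandwidth_delay_product_bytes(bandwidth_mbit_per_s, round_trip_ms):
    """Bytes that fit 'on the wire' for a link of the given speed and round-trip time."""
    # bits per second * seconds = bits in flight; divide by 8 for bytes.
    return bandwidth_mbit_per_s * 1_000_000 * round_trip_ms // (8 * 1000)


def buffer_settings(bandwidth_mbit_per_s, round_trip_ms):
    """Suggest socket buffer sizes equal to the bandwidth-delay product."""
    size = bandwidth_delay_product_bytes(bandwidth_mbit_per_s, round_trip_ms)
    return {
        "socket.send.buffer.bytes": size,
        "socket.receive.buffer.bytes": size,
    }


# Assumed example link: 1 Gbit/s with an 80 ms round trip.
for name, value in buffer_settings(1000, 80).items():
    print("%s=%d" % (name, value))
```

Treat the result as a first estimate to test against real traffic, not as a definitive value.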

 

It is generally not advisable to run a single Kafka cluster that spans multiple datacenters over a high-latency link. This incurs very high replication latency for both Kafka writes and ZooKeeper writes, and neither Kafka nor ZooKeeper will remain available in all locations if the network between locations is unavailable.
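
The availability problem can be made concrete with ZooKeeper's majority rule: an ensemble only serves requests on a side of a network split that still holds a strict majority of its nodes. The sketch below checks which datacenters would keep a quorum if each were cut off from all the others. The node placements and the function name are hypothetical.

```python
def sides_with_quorum(nodes_per_datacenter):
    """Return the datacenters that alone still hold a strict majority of nodes."""
    total = sum(nodes_per_datacenter.values())
    majority = total // 2 + 1
    return [dc for dc, count in sorted(nodes_per_datacenter.items())
            if count >= majority]


# Hypothetical stretched ensembles, isolated by a failed inter-datacenter link.
print(sides_with_quorum({"dc-east": 2, "dc-west": 1}))  # only one side survives
print(sides_with_quorum({"dc-east": 2, "dc-west": 2}))  # neither side survives
```

However the nodes are placed, at most one location can keep a majority during a split, so a stretched cluster cannot stay available everywhere.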
