How to Run Highly Available Kafka on Kubernetes

Apache Kafka is one of the most popular event-based distributed streaming platforms. It was first developed at LinkedIn and is now used by technology leaders such as Uber, Netflix, Slack, Coursera, and Spotify.

Though very powerful, Kafka is equally complex and requires a highly available, robust platform to run on. Engineers often struggle with the care and feeding of Kafka servers: standing a cluster up and keeping it healthy is no small task.

With microservices in vogue and most companies adopting distributed computing, standing up Kafka as the core messaging backbone has its advantages. Kubernetes is a popular choice for running container-based microservices, and Kafka is an equally popular choice for the eventing platform that connects them.

If you are running your microservices in Kubernetes, it will make sense to run your Kafka cluster within Kubernetes to take advantage of its in-built resilience and high-availability. The Kubernetes pods can also easily interact with the Kafka pods within the cluster using the inbuilt Kubernetes service discovery.

Let’s take a peek at how to build a distributed Kafka cluster on Kubernetes through a hands-on exercise. We will use Helm charts and StatefulSets. Kubernetes will dynamically request persistent volumes from the cloud provider and use them to persist data.

Prerequisites

You will need a running Kubernetes cluster for this exercise. I have used Google Kubernetes Engine 1.16.13-gke.1.

Install Helm

Helm is the package manager for Kubernetes and one of the most popular tools for versioning and sharing manifests. The Kafka community also provides production-tested Helm charts that will help you install a production-ready cluster, and you can customise the Helm installation according to your requirements.

Helm v3 does not require Tiller, so installing it is simply a matter of downloading the binary and putting it on your path.

wget https://get.helm.sh/helm-v3.3.0-linux-amd64.tar.gz
tar -zxvf helm-v3.3.0-linux-amd64.tar.gz
sudo cp -a linux-amd64/helm /usr/local/bin/helm
chmod +x /usr/local/bin/helm
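A quick check confirms the binary is installed and on the path:

```shell
# Print the Helm client version; it should report a v3.x release
helm version --short
```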

Install Kafka

We will now install Kafka using the Helm chart. For that, we will first add the chart repository and use that to download the Kafka Helm chart.

Add the incubator Helm repo.

helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator

Next we’ll download the values.yaml file. This file contains a configuration that we can use to customise our installation. We will use the default values and create a Helm release from there, but feel free to customise it as per your need.

curl https://raw.githubusercontent.com/helm/charts/master/incubator/kafka/values.yaml > values.yaml
helm install kafka incubator/kafka -f values.yaml
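If you want to customise the release, a small override file is usually enough. The keys below are illustrative; verify the exact names against the chart's own values.yaml for your chart version.

```yaml
# custom-values.yaml - illustrative overrides; check the chart's
# values.yaml before relying on these key names
replicas: 3          # number of Kafka broker pods
persistence:
  size: 10Gi         # disk size requested for each broker
```

You would then install with helm install kafka incubator/kafka -f custom-values.yaml instead.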

Kafka will now start coming up. Get the pods and wait until they are all ready.
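One way to follow the rollout is to watch the pods until every replica reports Ready:

```shell
# Watch the Kafka and ZooKeeper pods until all of them are
# Running and Ready (press Ctrl-C to stop watching)
kubectl get pods -w
```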

The setup has spun up three ZooKeeper pods and three Kafka pods. They are spread across the cluster's nodes, which makes the deployment highly available.

Let’s have a look at the persistent volumes so that we understand where the disks have come from.
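The claims and the volumes bound to them can be listed with kubectl:

```shell
# Show the PersistentVolumeClaims created by the stateful sets,
# together with the PersistentVolumes bound to them
kubectl get pvc,pv
```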

As we can see, Kubernetes dynamically claims the persistent volumes from the cloud provider (GCP in this case). If you haven't enabled dynamic provisioning for your cluster, or your cluster does not support it, you can always modify the values.yaml to use statically provisioned volumes that you will need to create before running the Helm install.
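As a sketch, the persistence section of values.yaml might look like the following; the key names can vary between chart versions, and the storage class shown is hypothetical.

```yaml
# Illustrative persistence settings in values.yaml; adjust to
# match your chart version and your pre-created storage
persistence:
  enabled: true
  storageClass: my-static-class   # hypothetical pre-created StorageClass
  size: 1Gi
```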

Let's look at the Kubernetes services.
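A quick way to do that:

```shell
# List the services created by the chart, including the headless
# services that back the stateful sets
kubectl get svc
```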

As we can see, there is a ZooKeeper service called kafka-zookeeper and a Kafka service called kafka. For Kafka cluster management we will interact with the kafka-zookeeper service, and for sending and receiving messages from the cluster we will use the kafka service.

Install the Kafka Client

Now that the Kafka cluster is up and ready, let's install a Kafka client that will help us produce and consume messages from topics.

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: testclient
spec:
  containers:
  - name: kafka
    image: solsson/kafka:0.11.0.0
    command:
    - sh
    - -c
    - "exec tail -f /dev/null"
EOF
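Before running any commands in it, wait until the pod reports Ready:

```shell
# Block until the testclient pod is Ready, timing out after two minutes
kubectl wait --for=condition=Ready pod/testclient --timeout=120s
```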

Test the Kafka Cluster

Finally, time for some testing! Let's create a topic "messages" with one partition and a replication factor of 1.

kubectl exec -it testclient -- ./bin/kafka-topics.sh --zookeeper kafka-zookeeper:2181 --topic messages --create --partitions 1 --replication-factor 1
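To double-check the topic before producing to it, you can describe it:

```shell
# Confirm the partition count, replication factor, and leader placement
kubectl exec -it testclient -- ./bin/kafka-topics.sh --zookeeper kafka-zookeeper:2181 --topic messages --describe
```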

Let’s now create a producer that will publish messages to the topic.

kubectl exec -ti testclient -- ./bin/kafka-console-producer.sh --broker-list kafka:9092 --topic messages

In a separate window, let's open a consumer session so that we can see the messages as we send them.

kubectl exec -ti testclient -- ./bin/kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic messages

Now start sending messages from the producer, and you will see them appear almost instantly on the consumer side.

Congratulations! The Kafka cluster is working correctly.

Conclusion

Kafka on a Kubernetes cluster is an excellent proposition for organisations that already use Kubernetes for their container workloads and want a distributed eventing engine to foster communication between their services.

Event streams are a powerful way of establishing asynchronous communication between your microservices, and Kafka is a robust, production-grade, battle-tested solution.

Running a Kafka cluster on Kubernetes lets you take advantage of the resilience and operational capabilities it provides out of the box, which simplifies a lot of tasks for you.

Thanks for reading! I hope you enjoyed the article.

Translated from: https://medium.com/better-programming/how-to-run-highly-available-kafka-on-kubernetes-a1824db8a3e2
