postgres-operator Internals: Part I

Article link: https://blog.csdn.net/qq_33745102/article/details/127901201

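This article describes how the Kubernetes EndpointSlice mechanism is used to make a Postgres database highly available, focusing on client connection routing so that clients can connect seamlessly after a primary failover.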

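This article covers one part of how postgres-operator builds high availability on top of Kubernetes: routing client traffic.

The overall goal is to use Kubernetes mechanisms to route client database connection requests to the Pod that is currently the PostgreSQL primary.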

Background

Services without selectors

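Ordinary Services pick their Pods through a selector, but some slightly more complex scenarios call for something else:

The Service should route traffic to Pods running in another namespace or another cluster.
A workload is being migrated to Kubernetes and, while it is being evaluated, some backend instances still run as external services.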

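The third scenario, and the one this article focuses on, is the Operator pattern:

When a Service has no selector, Kubernetes does not create the corresponding EndpointSlice automatically, so it has to be created manually.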

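A manually created EndpointSlice is linked to its Service through the label kubernetes.io/service-name. When client traffic reaches a Service without a selector, this link is used to find the matching EndpointSlice, and the traffic is routed to the Pod IPs recorded in it.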
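
The label-based association can be illustrated with a small model. This is only a sketch that uses plain Python dictionaries in place of real API objects; the helper backend_addresses and the second slice ("other-abcde") are hypothetical and exist only for the example. It is not how kube-proxy is implemented.

```python
SERVICE_NAME_LABEL = "kubernetes.io/service-name"


def backend_addresses(service_name, endpoint_slices):
    """Collect endpoint IPs from every slice labeled for the given Service."""
    addresses = []
    for endpoint_slice in endpoint_slices:
        labels = endpoint_slice.get("metadata", {}).get("labels", {})
        if labels.get(SERVICE_NAME_LABEL) != service_name:
            continue  # this slice belongs to a different Service
        for endpoint in endpoint_slice.get("endpoints", []):
            addresses.extend(endpoint.get("addresses", []))
    return addresses


slices = [
    {
        "metadata": {
            "name": "acid-minimal-cluster-4z5c8",
            "labels": {SERVICE_NAME_LABEL: "acid-minimal-cluster"},
        },
        "endpoints": [{"addresses": ["10.8.112.49"]}],
    },
    {
        # Hypothetical unrelated slice, included only to show the filtering.
        "metadata": {
            "name": "other-abcde",
            "labels": {SERVICE_NAME_LABEL: "other"},
        },
        "endpoints": [{"addresses": ["10.8.0.7"]}],
    },
]

print(backend_addresses("acid-minimal-cluster", slices))  # ['10.8.112.49']
```

Only the slice whose label value equals the Service name contributes addresses, which is why no selector is needed on the Service itself.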

Using the client-go library, an Operator can dynamically change the IPs recorded in an EndpointSlice object.

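The patroni setup discussed in this article relies on exactly this. When a failover happens, the IP recorded in the EndpointSlice for the Service acid-minimal-cluster is replaced with the IP of the new primary Pod, so that clients do not notice the switch once the failover has completed.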
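
The address swap during failover can be sketched as follows. This is a minimal in-memory model, not the operator's or patroni's actual code: a real controller would send the change to the Kubernetes API (for example through client-go). The function repoint_to_new_primary, the Pod name acid-minimal-cluster-1 and the IP 10.8.112.50 are assumptions made up for illustration.

```python
import copy


def repoint_to_new_primary(endpoint_slice, pod_name, pod_ip):
    """Return a copy of the slice whose only endpoint is the new primary Pod."""
    updated = copy.deepcopy(endpoint_slice)
    updated["endpoints"] = [{
        "addresses": [pod_ip],
        "targetRef": {
            "kind": "Pod",
            "name": pod_name,
            "namespace": updated["metadata"]["namespace"],
        },
    }]
    return updated


current = {
    "metadata": {
        "name": "acid-minimal-cluster-4z5c8",
        "namespace": "default",
        "labels": {"kubernetes.io/service-name": "acid-minimal-cluster"},
    },
    "endpoints": [{
        "addresses": ["10.8.112.49"],
        "targetRef": {
            "kind": "Pod",
            "name": "acid-minimal-cluster-0",
            "namespace": "default",
        },
    }],
}

# Hypothetical new primary: the Pod name and IP are invented for this example.
updated = repoint_to_new_primary(current, "acid-minimal-cluster-1", "10.8.112.50")
print(updated["endpoints"][0]["addresses"])  # ['10.8.112.50']
```

The Service name, its cluster IP and the kubernetes.io/service-name label stay the same; only the recorded backend changes, which is what keeps the switch invisible to clients.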

Architecture

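First, an architecture diagram of a highly available PostgreSQL deployment with patroni:

Database deployment model
One primary and one replica. Each Pod runs a patroni-agent and the PostgreSQL database, and mounts a persistent volume (PV).
Database traffic routing
Clients reach the database through the port of Service demo, which is a Service without a selector. The replica Service, demo-repl, is an ordinary Service (one with a selector).
Primary/replica switchover
The patroni-agent on the primary and replica Pods performs leader election by updating the annotations of the Endpoints object.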

Use Case

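The use-case diagram is shown below:

The Operator watches postgresql custom resources (defined by the postgresql CRD) and creates:

A StatefulSet object that represents the PostgreSQL cluster.
Secret objects that hold the database usernames and passwords.
The Service and Endpoints objects that route client database connections.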

Actual Manifests

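Clients reach the database primary through the Service acid-minimal-cluster. As described earlier, a Service normally relies on a selector to pick Pods with matching labels, yet the Service object acid-minimal-cluster turns out to have no selector:

So how is the traffic routed? The answer is that the EndpointSlice is created by a controller rather than derived from a selector, and it is linked to the Service through its label kubernetes.io/service-name. In the manifests that follow, the EndpointSlice is owned by the Endpoints object acid-minimal-cluster and carries the managed-by label of the EndpointSlice mirroring controller, which indicates that it was mirrored from that Endpoints object.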

EndpointSlice & Service

without-selector Service

# kubectl get service acid-minimal-cluster  -o yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2022-11-16T08:21:00Z"
  labels:
    application: spilo
    cluster-name: acid-minimal-cluster
    spilo-role: master
    team: acid
  name: acid-minimal-cluster
  namespace: default
spec:
  clusterIP: 10.102.154.118
  clusterIPs:
  - 10.102.154.118
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: postgresql
    port: 5432
    protocol: TCP
    targetPort: 5432
  type: ClusterIP

EndpointSlice

# kubectl get endpointslice acid-minimal-cluster-4z5c8 -o yaml
addressType: IPv4
apiVersion: discovery.k8s.io/v1
endpoints:
- addresses:
  - 10.8.112.49
  targetRef:
    kind: Pod
    name: acid-minimal-cluster-0
    namespace: default
kind: EndpointSlice
metadata:
  labels:
    application: spilo
    cluster-name: acid-minimal-cluster
    # Name of the controller that manages this EndpointSlice (by convention)
    endpointslice.kubernetes.io/managed-by: endpointslicemirroring-controller.k8s.io
    # Set to the name of the associated Service object
    kubernetes.io/service-name: acid-minimal-cluster
    spilo-role: master
    team: acid
  # By convention, the EndpointSlice name is prefixed with the name of the associated Service
  # Here the Service name is acid-minimal-cluster
  name: acid-minimal-cluster-4z5c8
  namespace: default
  ownerReferences:
  - apiVersion: v1
    kind: Endpoints
    name: acid-minimal-cluster  # Note that the owner of this EndpointSlice is the Endpoints object acid-minimal-cluster, not the Service.
ports:
- name: postgresql
  port: 5432
  protocol: TCP

Endpoints

The primary Pod and the replica Pod perform leader election by updating the annotations field of the Endpoints object.

# kubectl get endpoints acid-minimal-cluster -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    acquireTime: "2022-11-18T00:29:41.683512+00:00"
    leader: acid-minimal-cluster-0
    optime: "1090519312"
    renewTime: "2022-11-18T06:18:59.159796+00:00"  # time the lock was last renewed
    transitions: "5"
    ttl: "30"  # lock time-to-live in seconds; the lock counts as expired, i.e. the primary is considered down, once nowTime - renewTime >= ttl
  creationTimestamp: "2022-11-16T08:21:00Z"
  labels:
    application: spilo
    cluster-name: acid-minimal-cluster
    spilo-role: master
    team: acid
  name: acid-minimal-cluster
  namespace: default
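
The expiry rule noted next to the ttl annotation (nowTime - renewTime >= ttl) can be written out as a short check. This is a sketch of that rule only, assuming the annotation values shown in this article; the function name leader_lock_expired is hypothetical and this is not patroni's actual implementation.

```python
from datetime import datetime, timedelta


def leader_lock_expired(annotations, now):
    """Apply the rule now - renewTime >= ttl to the leader-lock annotations."""
    renew_time = datetime.fromisoformat(annotations["renewTime"])
    ttl = timedelta(seconds=int(annotations["ttl"]))
    return now - renew_time >= ttl


# Values copied from the annotations of the Endpoints object in this article.
annotations = {
    "leader": "acid-minimal-cluster-0",
    "renewTime": "2022-11-18T06:18:59.159796+00:00",
    "ttl": "30",
}

renewed = datetime.fromisoformat(annotations["renewTime"])
print(leader_lock_expired(annotations, renewed + timedelta(seconds=10)))  # False
print(leader_lock_expired(annotations, renewed + timedelta(seconds=30)))  # True
```

As long as the current leader keeps renewing renewTime within the TTL, the check stays False; once renewals stop for 30 seconds, the lock is treated as expired and another member can try to take over.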