# PostgreSQL Best Practices

## Prerequisites

- PostgreSQL version >= 9.0
- A monitoring account has been created:

```sql
-- PostgreSQL >= 10
create user datakit with password '<PASSWORD>';
grant pg_monitor to datakit;
grant SELECT ON pg_stat_database to datakit;

-- PostgreSQL < 10
create user datakit with password '<PASSWORD>';
grant SELECT ON pg_stat_database to datakit;
```
## Collector Configuration

### Host-based PostgreSQL collection

Go to the `conf.d/db` directory under the Datakit installation directory, copy `postgresql.conf.sample`, and rename it `postgresql.conf`.

Note: `address` defines the URI used to connect to the PostgreSQL server. Here, `datakit` is the username, `PASSWORD` should be replaced with the actual password, `localhost` is the database server address, and `sslmode=disable` means SSL is not used for the connection.
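The parts of the connection URI can be checked quickly with Python's standard `urllib.parse`; this is just an illustrative sketch using the sample URI from the configuration below:

```python
from urllib.parse import urlparse, parse_qs

# Sample connection URI as used in the collector config
uri = "postgres://datakit:PASSWORD@localhost?sslmode=disable"

parts = urlparse(uri)
print(parts.username)   # the user the collector authenticates as
print(parts.password)   # placeholder; replace with the real password
print(parts.hostname)   # the database server address
print(parse_qs(parts.query)["sslmode"])  # SSL mode for the connection
```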
The `postgresql_lock`, `postgresql_stat`, `postgresql_index`, `postgresql_size`, and `postgresql_statio` metric sets require the `relations` field in the configuration file to be enabled:
```toml
[[inputs.postgresql]]
  ## Server address
  # URI format
  # postgres://[datakit[:PASSWORD]]@localhost[/dbname]?sslmode=[disable|verify-ca|verify-full]
  # or simple string
  # host=localhost user=pqgotest password=... sslmode=... dbname=app_production
  address = "postgres://datakit:PASSWORD@localhost?sslmode=disable"

  ## Ignore databases which are gathered. Do not use with the 'databases' option.
  #
  # ignored_databases = ["db1"]

  ## Specify the list of the databases to be gathered. Do not use with the 'ignored_databases' option.
  #
  # databases = ["db1"]

  ## Specify the name used as the "server" tag.
  #
  # outputaddress = "db01"

  ## Collect interval
  # Time unit: "ns", "us" (or "µs"), "ms", "s", "m", "h"
  #
  interval = "10s"

  ## Relations config
  # A list of relations/tables can be specified to track per-relation metrics. To collect relation
  # metrics, either relation_name or relation_regex must be set.
  # relation_name refers to the name of a relation; relation_regex is a regex rule that only takes
  # effect when relation_name is not set.
  # schemas is used for filtering; it is ignored when empty.
  # relkind can be a list of the following options:
  #   r (ordinary table), i (index), S (sequence), t (TOAST table), p (partitioned table),
  #   m (materialized view), c (composite type), f (foreign table)
  #
  # [[inputs.postgresql.relations]]
  #   relation_name = "<TABLE_NAME>"
  #   relation_regex = "<TABLE_PATTERN>"
  #   schemas = ["public"]
  #   relkind = ["r", "p"]

  ## Set true to enable election
  election = true

  ## Run a custom SQL query and collect corresponding metrics.
  #
  # [[inputs.postgresql.custom_queries]]
  #   sql = '''
  #     select datname, numbackends, blks_read
  #     from pg_stat_database
  #     limit 10
  #   '''
  #   metric = "postgresql_custom_stat"
  #   tags = ["datname"]
  #   fields = ["numbackends", "blks_read"]

  ## Log collection
  #
  # [inputs.postgresql.log]
  #   files = []
  #   pipeline = "postgresql.p"

  ## Custom tags
  #
  [inputs.postgresql.tags]
    # some_tag = "some_value"
    # more_tag = "some_other_value"
    # ...
```
After completing the configuration, restart Datakit.

### Collecting PostgreSQL in a K8s cluster

Enable the PostgreSQL collector by mounting a ConfigMap. In `datakit.yaml`, add the following under `volumeMounts`:

```yaml
- mountPath: /usr/local/datakit/conf.d/db/postgresql.conf
  name: datakit-conf
  subPath: postgresql.conf
  readOnly: true
```

Then define the ConfigMap at the bottom of `datakit.yaml`:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: datakit-conf
  namespace: datakit
data:
  postgresql.conf: |-
    [[inputs.postgresql]]
      address = "postgres://username:xxxxx@postgresql.default.svc.cluster.local?sslmode=disable"
      interval = "10s"
      election = true
      [[inputs.postgresql.relations]]
        #relation_name = "<TABLE_NAME>"
        relation_regex = ".*"
        schemas = ["public"]
        relkind = ["r", "p"]
      [inputs.postgresql.log]
        files = []
        #pipeline = "postgresql.p"
---
```

Note that `election = true` is placed before `[[inputs.postgresql.relations]]`; in TOML, keys written after a table header belong to that table, so placing it after the relations entry would silently misconfigure it.

Apply the manifest:

```shell
kubectl apply -f datakit.yaml
```
## Verification

After Datakit is connected, run `datakit monitor` to check whether data is being reported normally.
## Logs

- By default, PostgreSQL logs are written to `stderr`. To enable file logging, add the following to the PostgreSQL configuration file (e.g. `/var/lib/pgsql/15/data/postgresql.conf`):

```ini
logging_collector = on   # enable writing logs to files
log_directory = 'pg_log' # log directory; an absolute path, or a path relative to PGDATA
log_filename = 'pg.log'  # log file name
log_statement = 'all'    # log all statements
#log_duration = on
log_line_prefix = '%m [%p] %d [%a] %u [%h] %c ' # log line prefix
log_file_mode = 0644

# For Windows
#log_destination = 'eventlog'
```
- The PostgreSQL collector does not collect logs by default. To enable log collection, uncomment `files` in `conf.d/db/postgresql.conf` and set it to the absolute path of the PostgreSQL log file. For example:

```toml
[[inputs.postgresql]]
  ...
  [inputs.postgresql.log]
    files = ["/tmp/pgsql/postgresql.log"]
```

Once log collection is enabled, logs with the log source (`source`) set to `postgresql` are generated.

Note: log collection only supports logs on the host where DataKit is installed.

To merge multi-line logs into single entries, configure the following in `/usr/local/datakit/conf.d/db/postgresql.conf`:

```toml
multiline_match = '''^\S'''
auto_multiline_detection = true
auto_multiline_extra_patterns = []
```
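The effect of `multiline_match = '''^\S'''` can be illustrated with a small sketch: a line that begins with a non-whitespace character starts a new log entry, and any other line is appended to the previous entry. The sample lines below are hypothetical:

```python
import re

# multiline_match = '''^\S''': a line starting with a non-whitespace
# character begins a new entry; indented lines continue the previous one.
first_line = re.compile(r"^\S")

raw_lines = [
    "2021-05-31 15:23:45.110 CST [74305] LOG: statement:",
    "    SELECT psd.*",
    "    FROM pg_stat_database psd",
    "2021-05-31 15:23:46.000 CST [74305] LOG: duration: 1.2 ms",
]

entries = []
for line in raw_lines:
    if first_line.match(line) or not entries:
        entries.append(line)          # start a new log entry
    else:
        entries[-1] += "\n" + line    # continuation line: merge into previous
print(len(entries))  # 2 merged entries
```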
## Log Pipeline Splitting

Pipeline rules:

```
add_pattern("application_name", "\\[\\w+ \\d+ - DB:\\w+\\]")
grok(_, '%{TIMESTAMP_ISO8601:time} %{NOTSPACE:timezone} \\[%{NOTSPACE:process_id}\\] %{NOTSPACE:db_name} %{application_name:application_name} %{NOTSPACE:user} \\[%{NOTSPACE:localhost}\\] %{NOTSPACE:session_id} ')
default_time(time) # use the time field as the timestamp of the output data
```
Raw log:

```
2021-05-31 15:23:45.110 CST [74305] test [pgAdmin 4 - DB:postgres] postgres [127.0.0.1] 60b48f01.12241 LOG: statement:
    SELECT psd.*, 2^31 - age(datfrozenxid) as wraparound, pg_database_size(psd.datname) as pg_database_size
    FROM pg_stat_database psd
    JOIN pg_database pd ON psd.datname = pd.datname
    WHERE psd.datname not ilike 'template%' AND psd.datname not ilike 'rdsadmin'
    AND psd.datname not ilike 'azure_maintenance' AND psd.datname not ilike 'postgres'
```

Splitting result:
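The fields extracted by the grok rule can be approximated with a plain Python regex run against the raw line above. This is only a sketch of the splitting behavior, not the actual Pipeline engine:

```python
import re

line = ("2021-05-31 15:23:45.110 CST [74305] test "
        "[pgAdmin 4 - DB:postgres] postgres [127.0.0.1] 60b48f01.12241 LOG: statement:")

# Rough regex equivalents of the grok patterns used in the rule above
pattern = re.compile(
    r"(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "   # TIMESTAMP_ISO8601
    r"(?P<timezone>\S+) "                                     # NOTSPACE
    r"\[(?P<process_id>\S+)\] "
    r"(?P<db_name>\S+) "
    r"(?P<application_name>\[\w+ \d+ - DB:\w+\]) "            # add_pattern above
    r"(?P<user>\S+) "
    r"\[(?P<localhost>\S+)\] "
    r"(?P<session_id>\S+) "
)

m = pattern.match(line)
print(m.groupdict())  # time, timezone, process_id, db_name, application_name, user, localhost, session_id
```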