Collecting container logs in a K8s environment with ELK + Filebeat: design and deployment

Preface

If you are reading this, you probably already have a reasonable grasp of Kubernetes. Many companies now run their applications on K8s, and it works well for deployment and release, but viewing logs is still inconvenient: host access cannot be handed out freely; a company-level cluster runs many applications with many replicas, so when investigating a problem you often do not know which replica's log to start with; and after a container restarts you may still need the logs from before the restart. All of this makes checking logs in a K8s cluster cumbersome.
That is what motivates the design below.

Solution overview

This solution uses an ELK + Filebeat architecture.
ELK alone can already collect, query, and display logs, but Logstash is heavy on resources (800 MB+ of memory). Elastic reimplemented part of Logstash's functionality in Go as Filebeat, which needs only around 30 MB, a drastic reduction, but its log-parsing capabilities are still limited. So the design here is to collect logs with Filebeat and parse them with Logstash.
Filebeat is deployed as a DaemonSet and collects all application logs on each node, shipping them to Logstash; Logstash processes the data and pushes it to Elasticsearch; finally Kibana is used for display.

Architecture diagram

Here is a simple architecture diagram to illustrate:
[Architecture diagram]
By default, a K8s application's stdout is written to a log file that is exposed through symlinks under /var/log/containers on the host node; deploying one Filebeat on each node to collect that directory is all we need.
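
A quick way to see this on a worker node (a sketch; the pod name, hashes, and container ID below are made up, and on nodes where the Docker data root has been moved the final target sits under /data/docker/containers instead of /var/lib/docker/containers):

# each file under /var/log/containers is a symlink into /var/log/pods
ls -l /var/log/containers | head -n 3
# my-app-7d9c5b6d4f-abcde_default_my-app-<container-id>.log -> /var/log/pods/default_my-app-7d9c5b6d4f-abcde_<pod-uid>/my-app/0.log

# /var/log/pods/... is itself a symlink to the runtime's log file
readlink -f /var/log/containers/my-app-7d9c5b6d4f-abcde_default_my-app-<container-id>.log
# /var/lib/docker/containers/<container-id>/<container-id>-json.log

This two-level symlink chain is also why the Filebeat DaemonSet later mounts /var/log, /var/lib/docker/containers, and /data/docker/containers from the host.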

Installation and deployment

The relevant charts have been uploaded to github.com, where the ELK + Filebeat source can be downloaded, or it can be downloaded from Gitee (码云).

We install with Helm here. Installing Helm itself is trivial and not worth covering; the real work is writing the charts, the details of which are also out of scope for this post. The install commands are at the end; the key parts of the manifests are excerpted below:

Elasticsearch manifest

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: enervated-puma-elasticsearch
  labels:
    heritage: "Tiller"
    release: "enervated-puma"
    chart: "elasticsearch"
    app: "enervated-puma-elasticsearch"
  annotations:
    esMajorVersion: "7"
spec:
  serviceName: enervated-puma-elasticsearch-headless
  selector:
    matchLabels:
      app: "enervated-puma-elasticsearch"
  replicas: 1
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      name: "enervated-puma-elasticsearch"
      labels:
        heritage: "Tiller"
        release: "enervated-puma"
        chart: "elasticsearch"
        app: "enervated-puma-elasticsearch"
      annotations:

    spec:
      securityContext:
        fsGroup: 1000
        runAsUser: 1000

      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - "enervated-puma-elasticsearch"
            topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 120
      volumes:
      initContainers:
      - name: configure-sysctl
        securityContext:
          runAsUser: 0
          privileged: true
        image: "docker.elastic.co/elasticsearch/elasticsearch:7.5.0"
        imagePullPolicy: "Always"
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        resources:
          {}


      containers:
      - name: "elasticsearch"
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000

        image: "docker.elastic.co/elasticsearch/elasticsearch:7.5.0"
        imagePullPolicy: "Always"
        readinessProbe:
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 3
          timeoutSeconds: 5

          exec:
            command:
              - sh
              - -c
              - |
                #!/usr/bin/env bash -e
                # If the node is starting up wait for the cluster to be ready (request params: 'wait_for_status=green&timeout=1s' )
                # Once it has started only check that the node itself is responding
                START_FILE=/tmp/.es_start_file

                http () {
                    local path="${1}"
                    if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
                      BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
                    else
                      BASIC_AUTH=''
                    fi
                    curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200${path}
                }

                if [ -f "${START_FILE}" ]; then
                    echo 'Elasticsearch is already running, lets check the node is healthy and there are master nodes available'
                    http "/_cluster/health?timeout=0s"
                else
                    echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )'
                    if http "/_cluster/health?wait_for_status=green&timeout=1s" ; then
                        touch ${START_FILE}
                        exit 0
                    else
                        echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
                        exit 1
                    fi
                fi
        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        resources:
          limits:
            cpu: 1000m
            memory: 4Gi
          requests:
            cpu: 100m
            memory: 2Gi

        env:
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: cluster.initial_master_nodes
            value: "enervated-puma-elasticsearch-0,"
          - name: discovery.seed_hosts
            value: "enervated-puma-elasticsearch-headless"
          - name: cluster.name
            value: "enervated-puma"
          - name: network.host
            value: "0.0.0.0"
          - name: ES_JAVA_OPTS
            value: "-Xmx1g -Xms1g"
          - name: node.data
            value: "true"
          - name: node.ingest
            value: "true"
          - name: node.master
            value: "true"
        volumeMounts:
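
Before moving on to Logstash, it is worth checking that Elasticsearch actually came up. A minimal smoke test, assuming the chart created a service with the same name as the StatefulSet above (adjust the name if your release differs):

kubectl port-forward svc/enervated-puma-elasticsearch 9200:9200 &
curl -s 'http://127.0.0.1:9200/_cluster/health?pretty'
# expect "status" : "green" (or "yellow" on a single node with replicated indices)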

Logstash manifest

apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    log.mount.launch.status: '[]'
    log.mount.policy.launcher: '[]'
    meta.helm.sh/release-name: lsh-mcp-logstash
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2021-03-17T07:49:03Z"
  generation: 1
  labels:
    app: lsh-mcp-logstash-lsh-mcp-logstash
    app.kubernetes.io/managed-by: Helm
    chart: lsh-mcp-logstash
    heritage: Helm
    release: lsh-mcp-logstash
  name: lsh-mcp-logstash-lsh-mcp-logstash
  namespace: default
  resourceVersion: "2422055"
  selfLink: /apis/apps/v1/namespaces/default/statefulsets/lsh-mcp-logstash-lsh-mcp-logstash
  uid: bce7a7ce-9765-493c-b2b8-8fe2bfdbe0da
spec:
  podManagementPolicy: Parallel
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: lsh-mcp-logstash-lsh-mcp-logstash
      release: lsh-mcp-logstash
  serviceName: lsh-mcp-logstash-lsh-mcp-logstash
  template:
    metadata:
      annotations:
        configchecksum: 4c78faf0f98fb6aa26019a156449fec2197a7b42cbfc7a666c879aadf4875a8
        pipelinechecksum: 9ccfaf557f6835ce7ae626767421d6e50bcd859ffd08564f5271722bcc5f022
      creationTimestamp: null
      labels:
        app: lsh-mcp-logstash-lsh-mcp-logstash
        chart: lsh-mcp-logstash
        heritage: Helm
        release: lsh-mcp-logstash
      name: lsh-mcp-logstash-lsh-mcp-logstash
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - lsh-mcp-logstash-lsh-mcp-logstash
            topologyKey: kubernetes.io/hostname
      containers:
      - env:
        - name: LS_JAVA_OPTS
          value: -Xmx1g -Xms1g
        image: docker.elastic.co/logstash/logstash:7.5.0
        imagePullPolicy: IfNotPresent
        name: lsh-mcp-logstash
        resources:
          limits:
            cpu: "1"
            memory: 1536Mi
          requests:
            cpu: 100m
            memory: 500Mi
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /usr/share/logstash/config/logstash.yml
          name: logstashconfig
          subPath: logstash.yml
        - mountPath: /usr/share/logstash/pipeline
          name: logstashpipeline
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      terminationGracePeriodSeconds: 120
      volumes:
      - configMap:
          defaultMode: 420
          name: lsh-mcp-logstash-lsh-mcp-logstash-config
        name: logstashconfig
      - configMap:
          defaultMode: 420
          name: lsh-mcp-logstash-lsh-mcp-logstash-pipeline
        name: logstashpipeline
  updateStrategy:
    type: RollingUpdate
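
The pipeline itself lives in the lsh-mcp-logstash-lsh-mcp-logstash-pipeline ConfigMap mounted at /usr/share/logstash/pipeline, which is not reproduced in this post. As an illustration only (the filter rules, the index naming, and the Beats port are my assumptions, not the chart's actual content), a pipeline that receives from Filebeat and writes to the Elasticsearch service referenced by Kibana below could look like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: lsh-mcp-logstash-lsh-mcp-logstash-pipeline
  namespace: default
data:
  logstash.conf: |
    input {
      # Filebeat's output.logstash defaults to port 5044
      beats {
        port => 5044
      }
    }
    filter {
      # parsing rules (grok / json / mutate) would go here
    }
    output {
      elasticsearch {
        hosts => ["http://lsh-mcp-elasticsearch-lsh-mcp-elasticsearch:9200"]
        # the real chart names the index after the pod's stdout log file;
        # a date-suffixed pod name is used here purely as an illustration
        index => "%{[kubernetes][pod][name]}-%{+YYYY.MM.dd}"
      }
    }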

Kibana manifest

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    log.mount.launch.status: '[]'
    log.mount.policy.launcher: '[]'
    meta.helm.sh/release-name: lsh-mcp-kibana
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2021-03-17T01:43:57Z"
  generation: 1
  labels:
    app: lsh-mcp-kibana
    app.kubernetes.io/managed-by: Helm
    heritage: Helm
    release: lsh-mcp-kibana
  name: lsh-mcp-kibana-lsh-mcp-kibana
  namespace: default
  resourceVersion: "2533966"
  selfLink: /apis/extensions/v1beta1/namespaces/default/deployments/lsh-mcp-kibana-lsh-mcp-kibana
  uid: 449fe647-4767-4f0f-8b06-b53c2c4a30b8
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: lsh-mcp-kibana
      release: lsh-mcp-kibana
  strategy:
    type: Recreate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: lsh-mcp-kibana
        release: lsh-mcp-kibana
    spec:
      containers:
      - env:
        - name: ELASTICSEARCH_HOSTS
          value: http://lsh-mcp-elasticsearch-lsh-mcp-elasticsearch:9200
        - name: SERVER_HOST
          value: 0.0.0.0
        - name: NODE_OPTIONS
          value: --max-old-space-size=1800
        image: docker.elastic.co/kibana/kibana:7.5.0
        imagePullPolicy: Always
        name: kibana
        ports:
        - containerPort: 5601
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e

              # Disable nss cache to avoid filling dentry cache when calling curl
              # This is required with Kibana Docker using nss < 3.52
              export NSS_SDB_USE_CACHE=no

              http () {
                  local path="${1}"
                  set -- -XGET -s --fail -L

                  if [ -n "${ELASTICSEARCH_USERNAME}" ] && [ -n "${ELASTICSEARCH_PASSWORD}" ]; then
                    set -- "$@" -u "${ELASTICSEARCH_USERNAME}:${ELASTICSEARCH_PASSWORD}"
                  fi

                  STATUS=$(curl --output /dev/null --write-out "%{http_code}" -k "$@" "http://localhost:5601${path}")
                  if [[ "${STATUS}" -eq 200 ]]; then
                    exit 0
                  fi

                  echo "Error: Got HTTP code ${STATUS} but expected a 200"
                  exit 1
              }

              http "/app/kibana"
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 3
          timeoutSeconds: 5
        resources:
          limits:
            cpu: "1"
            memory: 2Gi
          requests:
            cpu: 100m
            memory: 800Mi
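
Once the Deployment reports Ready, the same endpoint the readiness probe uses can be hit manually (a sketch; the resource name follows the release naming above):

kubectl port-forward deploy/lsh-mcp-kibana-lsh-mcp-kibana 5601:5601 &
curl -s -o /dev/null -w '%{http_code}\n' http://127.0.0.1:5601/app/kibana   # expect 200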

Filebeat manifest

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  creationTimestamp: "2021-03-01T10:46:09Z"
  generation: 1
  labels:
    app: lsh-mcp-filebeat-lsh-mcp-filebeat
    chart: lsh-mcp-filebeat-7.5.0
    heritage: Tiller
    release: lsh-mcp-filebeat
  name: lsh-mcp-filebeat
  namespace: default
  resourceVersion: "10275"
  selfLink: /apis/extensions/v1beta1/namespaces/default/daemonsets/lsh-mcp-filebeat
  uid: 78427beb-d2f4-40e0-ad9c-acfb943e0d2a
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: lsh-mcp-filebeat-lsh-mcp-filebeat
  template:
    metadata:
      annotations:
        configChecksum: 2f66a41b31553cfe7a7e89de6648f4208a638e11dc101b89295807fdd7123ad
      creationTimestamp: null
      labels:
        app: lsh-mcp-filebeat-lsh-mcp-filebeat
        chart: lsh-mcp-filebeat-7.5.0
        heritage: Tiller
        release: lsh-cluster-csm-log-agent
      name: lsh-mcp-filebeat-lsh-mcp-filebeat
    spec:
      containers:
      - args:
        - -e
        - -E
        - http.enabled=true
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: docker.elastic.co/beats/filebeat:7.5.0
        imagePullPolicy: Always
        name: lsh-mcp-filebeat
        resources:
          limits:
            cpu: "1"
            memory: 400Mi
          requests:
            cpu: 100m
            memory: 100Mi
        securityContext:
          privileged: false
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /usr/share/filebeat/filebeat.yml
          name: filebeat-config
          readOnly: true
          subPath: filebeat.yml
        - mountPath: /usr/share/filebeat/config
          name: filebeat-reload-config
        - mountPath: /usr/share/filebeat/data
          name: data
        - mountPath: /var/lib/docker/containers
          name: varlibdockercontainers
          readOnly: true
        - mountPath: /data/docker/containers
          name: data-docker-containers
        - mountPath: /var/log
          name: varlog
          readOnly: true
        - mountPath: /var/run/docker.sock
          name: varrundockersock
          readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: lsh-mcp-filebeat-lsh-mcp-filebeat
      serviceAccountName: lsh-mcp-filebeat-lsh-mcp-filebeat
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 384
          name: lsh-mcp-filebeat-config
        name: filebeat-config
      - configMap:
          defaultMode: 420
          name: lsh-mcp-filebeat-reload-config
        name: filebeat-reload-config
      - hostPath:
          path: /var/lib/lsh-mcp-filebeat-lsh-mcp-filebeat-default-data
          type: DirectoryOrCreate
        name: data
      - hostPath:
          path: /var/lib/docker/containers
          type: ""
        name: varlibdockercontainers
      - hostPath:
          path: /var/log
          type: ""
        name: varlog
      - hostPath:
          path: /data/docker/containers
          type: ""
        name: data-docker-containers
      - hostPath:
          path: /var/run/docker.sock
          type: ""
        name: varrundockersock
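
Filebeat ships with built-in self-tests that are handy once the DaemonSet is up. A quick check (with a recent kubectl you can exec into the DaemonSet directly; otherwise pick one of its pods):

kubectl exec ds/lsh-mcp-filebeat -- filebeat test config    # validates filebeat.yml
kubectl exec ds/lsh-mcp-filebeat -- filebeat test output    # checks the connection to Logstash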

Notes from the installation and deployment

The ELK components can be deployed without changes; here I will explain the Filebeat collection settings in filebeat/values.yaml:

  filebeat.yml: |
    filebeat.inputs:
    - type: container
      paths:
        - /var/log/containers/*.log
      processors:
        - add_kubernetes_metadata:
            host: ${NODE_NAME}
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"
      multiline.type: pattern
      multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
      multiline.negate: false
      multiline.match: after

    output.logstash:
      hosts: ["lsh-mcp-logstash-lsh-mcp-logstash"]

Per the official Elastic documentation for collecting container logs, the input type must be set to type: container; combined with the add_kubernetes_metadata processor, each event then carries container and pod information such as the pod's namespace and labels, which makes searching and filtering the logs much easier later.
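
Concretely, an enriched event carries fields roughly like the following (the values are invented; the field names are the standard ones Filebeat emits):

kubernetes.namespace: default
kubernetes.node.name: node-1
kubernetes.pod.name: my-app-7d9c5b6d4f-abcde
kubernetes.container.name: my-app
kubernetes.labels.app: my-app
stream: stdout
message: ...

In Kibana these become the fields you filter on, for example kubernetes.pod.name or kubernetes.namespace.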

multiline.type: pattern
multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
multiline.negate: false
multiline.match: after

The multiline settings merge multi-line entries into a single event. Java applications often emit one logical event, such as a stack trace, across many lines, which is hard to read when each line becomes its own document; this configuration merges them back together. See the official documentation for details.
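
For example, a typical Java stack trace (class names invented) arrives from the container as several lines:

Exception in thread "main" java.lang.IllegalStateException: boom
        at com.example.Foo.bar(Foo.java:42)
        at com.example.Main.main(Main.java:10)
Caused by: java.lang.NullPointerException
        at com.example.Foo.init(Foo.java:17)

Every line after the first either starts with whitespace followed by at or starts with Caused by:, so it matches the pattern; with negate: false and match: after, those matching lines are appended to the preceding non-matching line, and the whole trace ends up as one event.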

Path of the K8s stdout log files: paths: /var/log/containers/*.log

One thing that needs special attention: the K8s stdout log file is reached through two levels of symlinks, so the directories both links point to must also be mounted into the Filebeat pod as directories; otherwise nothing will be collected. The mount configuration is as follows:

volumeMounts:
- name: data
  mountPath: /usr/share/filebeat/data
- name: varlibdockercontainers
  mountPath: /var/lib/docker/containers
  readOnly: true
- name: data-docker-containers
  mountPath: /data/docker/containers
- name: varlog
  mountPath: /var/log
  readOnly: true

volumes:
- name: varlibdockercontainers
  hostPath:
    path: /var/lib/docker/containers
- name: varlog
  hostPath:
    path: /var/log
- name: data-docker-containers
  hostPath:
    path: /data/docker/containers
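
To confirm the mounts are correct, dereference a few of those symlinks from inside a Filebeat pod (a sketch; with a recent kubectl you can exec into the DaemonSet directly, otherwise pick one of its pods):

kubectl exec ds/lsh-mcp-filebeat -- sh -c 'ls -lL /var/log/containers/*.log | head -n 3'
# a "No such file or directory" here means a symlink target directory is not mounted into the pod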

In this deployment everything except Filebeat runs as a single replica. It collects the stdout logs of all pods in the K8s cluster, and Elasticsearch creates an index per pod stdout log file name. If you need more replicas, change replicas: 1 in the corresponding values.yaml.

If the data volume is large, consider adding a buffer such as Kafka or Redis: Filebeat pushes the data to the buffer, and Logstash pulls it from there, processes it, and writes it to ES, as sketched below.
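
A sketch of the Filebeat side of such a setup (the broker addresses and topic name are placeholders; on the Logstash side the beats input would be replaced with the kafka input plugin):

output.kafka:
  hosts: ["kafka-0.kafka:9092", "kafka-1.kafka:9092"]
  topic: "k8s-container-logs"
  required_acks: 1
  compression: gzip
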
Kibana's port is set to 30961.
To expose a different port, change nodePort: 30961 in kibana/values.yaml to whatever port you want to open:

service:
  type: NodePort
  loadBalancerIP: ""
  port: 5601
  nodePort: 30961
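
With the NodePort service in place, Kibana is reachable on any node's IP at that port. A quick check (the service name is assumed to follow the release naming used above):

kubectl get svc lsh-mcp-kibana-lsh-mcp-kibana
# then open http://<node-ip>:30961 in a browser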

Everything mentioned above has already been applied in the charts, so running the install commands below is all that is needed to deploy.

Install commands

We use Helm 3 to install.
Install Elasticsearch:
helm3 install elasticsearch lsh-mcp-elasticsearch
After installation, check that the pods are running.

Install Logstash:
helm3 install logstash lsh-mcp-logstash
After installation, check that the pods are running.

Install Filebeat:
helm3 install filebeat lsh-mcp-filebeat
After installation, check that the pods are running.

Install Kibana:
helm3 install kibana lsh-mcp-kibana
After installation, check that the pods are running.

If you are using Helm 2, the command is helm install --name elasticsearch lsh-mcp-elasticsearch.
The others follow the same pattern: the command starts with helm, --name specifies the release name, and the final argument is the chart directory.
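
A single command is enough to confirm that all four releases are up (names assume the releases installed above):

kubectl get pods -o wide | grep -E 'elasticsearch|logstash|filebeat|kibana'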

The final result, viewed in Kibana:
[Kibana screenshot]
