Kubernetes EFK Log Management System

1. Overview of Kubernetes log management solutions:



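Because logs have to be collected on every node, the fluentd component normally runs on k8s as a DaemonSet, and the k8s master nodes need to run the fluentd container as well;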



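  Some solutions look like this: fluentd ---> logstash (log format conversion, etc.) ---> Redis cache ---> ES cluster storage ---> kibana display. Here logstash filters the logs and converts their format, and redis acts as a buffer: ES writes are slow while fluentd sends logs quickly, so without a buffer in between, data may be delayed or lost;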
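  To make the buffering argument concrete, here is a toy simulation, not a benchmark. The rates (10 events produced and 4 indexed per tick) and the function name `simulate` are made up for illustration, and a plain `deque` stands in for Redis.

```python
from collections import deque


def simulate(produced_per_tick, indexed_per_tick, ticks, use_buffer):
    """Return (indexed, lost, backlog) after `ticks` rounds.

    All rates are illustration values, not measurements.
    """
    buffer = deque()  # stands in for the Redis buffer
    indexed = 0
    lost = 0
    for _ in range(ticks):
        # Sender side (fluentd): emit a burst of events.
        buffer.extend(range(produced_per_tick))
        # Receiver side (ES): index at most `indexed_per_tick` events.
        for _ in range(min(indexed_per_tick, len(buffer))):
            buffer.popleft()
            indexed += 1
        if not use_buffer:
            # Nothing can wait: whatever ES did not take is dropped.
            lost += len(buffer)
            buffer.clear()
    return indexed, lost, len(buffer)


print(simulate(10, 4, 5, use_buffer=False))  # (20, 30, 0)
print(simulate(10, 4, 5, use_buffer=True))   # (20, 0, 30)
```

  With the buffer nothing is lost, but the backlog keeps growing for as long as the sender stays faster than ES, which is the "delay" half of the trade-off.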

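  There are also several kinds of logs to collect from k8s containers. In general, the fluentd log collection client can only collect logs written to STDOUT (standard output). What is standard output? It is the log output you can view with kubectl logs -f Pod_name. However, some Java applications do not write their business logs to STDOUT but to a file instead.
The sidecar container approach also has an obvious drawback: the logs are kept in the original file inside the container and, once they have been re-emitted through stdout, take up disk space a second time, which in effect doubles the disk usage.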
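A minimal sketch of what such a sidecar does, assuming it simply polls the application's log file and copies new bytes to its own stdout. The function name `relay_new_bytes` and the file name `app.log` are hypothetical; real sidecars are often just a `tail -F` on the file.

```python
import os
import sys
import tempfile


def relay_new_bytes(path, offset, out):
    """Copy everything appended to `path` since `offset` into `out`.

    Returns the new offset so the caller can poll in a loop.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    out.write(data.decode("utf-8", errors="replace"))
    return offset + len(data)


# Demo: the "application" appends to a file, the "sidecar" relays it to stdout.
with tempfile.TemporaryDirectory() as workdir:
    log_path = os.path.join(workdir, "app.log")  # hypothetical application log
    with open(log_path, "ab") as app_log:
        app_log.write(b"order 1 created\n")
    position = relay_new_bytes(log_path, 0, sys.stdout)
    with open(log_path, "ab") as app_log:
        app_log.write(b"order 2 created\n")
    position = relay_new_bytes(log_path, position, sys.stdout)
```

Every byte ends up stored twice, once in the application's file and once in the container runtime's stdout log, which is where the doubled disk usage comes from.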

2. Installing fluentd, ES, and kibana with Helm:


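  1. The official Helm chart repository is https://github.com/helm/charts/tree/master/stable/ ; every application you need is under that directory;
  2. The usual workflow is: helm search app_name ---> helm fetch app_name ---> configure values.yaml according to the official documentation ---> install the helm release;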


image:
  repository: k8s.harbor.maimaiti.site/system/fluentd
## Specify an imagePullPolicy (Required)
## It's recommended to change this to 'Always' if the image tag is 'latest'
## ref: http://kubernetes.io/docs/user-guide/images/#updating-images
  tag: v2.5.1
  pullPolicy: IfNotPresent
  ## Optionally specify an array of imagePullSecrets.
  ## Secrets must be manually created in the namespace.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
  # pullSecrets:
  #   - myRegistrKeySecretName

## If using AWS Elasticsearch, all requests to ES need to be signed regardless of whether
## one is using Cognito or not. By setting this to true, this chart will install a sidecar
## proxy that takes care of signing all requests being sent to the AWS ES Domain.
awsSigningSidecar:
  enabled: false
  image:
    repository: abutaha/aws-es-proxy
    tag: 0.9

# Specify to use specific priorityClass for pods
# ref: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/
# If a Pod cannot be scheduled, the scheduler tries to preempt (evict) lower priority
# Pods to make scheduling of the pending Pod possible.
priorityClassName: ""

## Configure resource requests and limits
## ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources: {}
  # limits:
  #   cpu: 100m
  #   memory: 500Mi
  # requests:
  #   cpu: 100m
  #   memory: 200Mi

elasticsearch:
  host: 'elasticsearch-client.kube-system'
  port: 9200
  scheme: 'http'
  ssl_version: TLSv1_2
  user: ""
  password: ""
  buffer_chunk_limit: 2M
  buffer_queue_limit: 8
  logstash_prefix: 'logstash'

# If you want to add custom environment variables, use the env dict
# You can then reference these in your config file e.g.:
#     user "#{ENV['OUTPUT_USER']}"
env:
  # OUTPUT_USER: my_user

# If you want to add custom environment variables from secrets, use the secret list
secret:
# - name: ELASTICSEARCH_PASSWORD
#   secret_name: elasticsearch
#   secret_key: password

rbac:
  create: true

serviceAccount:
  # Specifies whether a ServiceAccount should be created
  create: true
  # The name of the ServiceAccount to use.
  # If not set and create is true, a name is generated using the fullname template
  name: ""

## Specify if a Pod Security Policy for node-exporter must be created
## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/
podSecurityPolicy:
  enabled: false
  annotations: {}
    ## Specify pod annotations
    ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#apparmor
    ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#seccomp
    ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#sysctl
    # seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
    # seccomp.security.alpha.kubernetes.io/defaultProfileName: 'docker/default'
    # apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default'

livenessProbe:
  enabled: true

annotations: {}

podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "24231"

## DaemonSet update strategy
## Ref: https://kubernetes.io/docs/tasks/manage-daemon/update-daemon-set/
updateStrategy:
  type: RollingUpdate

tolerations:
  - key: node-role.kubernetes.io/master
    operator: Exists
    effect: NoSchedule

affinity: {}
  # nodeAffinity:
  #   requiredDuringSchedulingIgnoredDuringExecution:
  #     nodeSelectorTerms:
  #     - matchExpressions:
  #       - key: node-role.kubernetes.io/master
  #         operator: DoesNotExist

nodeSelector: {}

service:
  type: ClusterIP
  ports:
    - name: "monitor-agent"
      port: 24231

serviceMonitor:
  ## If true, a ServiceMonitor CRD is created for a prometheus operator
  ## https://github.com/coreos/prometheus-operator
  enabled: false
  interval: 10s
  path: /metrics
  labels: {}

prometheusRule:
  ## If true, a PrometheusRule CRD is created for a prometheus operator
  ## https://github.com/coreos/prometheus-operator
  enabled: false
  prometheusNamespace: monitoring
  labels: {}
  #  role: alert-rules

configMaps:
  system.conf: |-
    <system>
      root_dir /tmp/fluentd-buffers/
    </system>
  containers.input.conf: |-
    # This configuration file for Fluentd / td-agent is used
    # to watch changes to Docker log files. The kubelet creates symlinks that
    # capture the pod name, namespace, container name & Docker container ID
    # to the docker logs for pods in the /var/log/containers directory on the host.
    # If running this fluentd configuration in a Docker container, the /var/log
    # directory should be mounted in the container.
    # These logs are then submitted to Elasticsearch which assumes the
    # installation of the fluent-plugin-elasticsearch & the
    # fluent-plugin-kubernetes_metadata_filter plugins.
    # See https://github.com/uken/fluent-plugin-elasticsearch &
    # https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter for
    # more information about the plugins.
    # Example
    # =======
    # A line in the Docker log file might look like this JSON:
    # {"log":"2014/09/25 21:15:03 Got request with path wombat\n",
    #  "stream":"stderr",
    #   "time":"2014-09-25T21:15:03.499185026Z"}
    # The time_format specification below makes sure we properly
    # parse the time format produced by Docker. This will be
    # submitted to Elasticsearch and should appear like:
    # $ curl 'http://elasticsearch-logging:9200/_search?pretty'
    # ...
    # {
    #      "_index" : "logstash-2014.09.25",
    #      "_type" : "fluentd",
    #      "_id" : "VBrbor2QTuGpsQyTCdfzqA",
    #      "_score" : 1.0,
    #      "_source":{"log":"2014/09/25 22:45:50 Got request with path wombat\n",
    #                 "stream":"stderr","tag":"docker.container.all",
    #                 "@timestamp":"2014-09-25T22:45:50+00:00"}
    #    },
    # ...
    # The Kubernetes fluentd plugin is used to write the Kubernetes metadata to the log
    # record & add labels to the log record if properly configured. This enables users
    # to filter & search logs on any metadata.
    # For example a Docker container's logs might be in the directory:
    #  /var/lib/docker/containers/997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b
    # and in the file:
    #  997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b-json.log
    # where 997599971ee6... is the Docker ID of the running container.
    # The Kubernetes kubelet makes a symbolic link to this file on the host machine
    # in the /var/log/containers directory which includes the pod name and the Kubernetes
    # container name:
    #    synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
    #    ->
    #    /var/lib/docker/containers/997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b/997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b-json.log
    # The /var/log directory on the host is mapped to the /var/log directory in the container
    # running this instance of Fluentd and we end up collecting the file:
    #   /var/log/containers/synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
    # This results in the tag:
    #  var.log.containers.synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
    # The Kubernetes fluentd plugin is used to extract the namespace, pod name & container name
    # which are added to the log message as a kubernetes field object & the Docker container ID
    # is also added under the docker field object.
    # The final tag is:
    #   kubernetes.var.log.containers.synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
    # And the final log record look like:
    # {
    #   "log":"2014/09/25 21:15:03 Got request with path wombat\n",
    #   "stream":"stderr",
    #   "time":"2014-09-25T21:15:03.499185026Z",
    #   "kubernetes": {
    #     "namespace": "default",
    #     "pod_name": "synthetic-logger-0.25lps-pod",
    #     "container_name": "synth-lgr"
    #   },
    #   "docker": {
    #     "container_id": "997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b"
    #   }
    # }
    # This makes it easier for users to search for logs by pod name or by
    # the name of the Kubernetes container regardless of how many times the
    # Kubernetes pod has been restarted (resulting in a several Docker container IDs).
    # Json Log Example:
    # {"log":"[info:2016-02-16T16:04:05.930-08:00] Some log text here\n","stream":"stdout","time":"2016-02-17T00:04:05.931087621Z"}
    # CRI Log Example:
    # 2016-02-17T00:04:05.931087621Z stdout F [info:2016-02-16T16:04:05.930-08:00] Some log text here
    <source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/containers.log.pos
      tag raw.kubernetes.*
      read_from_head true
      <parse>
        @type multi_format
        <pattern>
          format json
          time_key time
          time_format %Y-%m-%dT%H:%M:%S.%NZ
        </pattern>
        <pattern>
          format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
          time_format %Y-%m-%dT%H:%M:%S.%N%:z
        </pattern>
      </parse>
    </source>

    <source>
      @id nginxtest1.log
      @type tail
      path /var/log/containers/nginxtest1-*.log
      pos_file /var/log/nginxtest1.log.pos
      tag nginxtest1
      read_from_head true
      <parse>
        @type multi_format
        <pattern>
          format json
          time_key time
          time_format %Y-%m-%dT%H:%M:%S.%NZ
        </pattern>
        <pattern>
          format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
          time_format %Y-%m-%dT%H:%M:%S.%N%:z
        </pattern>
      </parse>
    </source>

    <source>
      @id httpdtest1.log
      @type tail
      path /var/log/containers/httpdtest1-*.log
      pos_file /var/log/httpdtest1.log.pos
      tag httpdtest1
      read_from_head true
      <parse>
        @type multi_format
        <pattern>
          format json
          time_key time
          time_format %Y-%m-%dT%H:%M:%S.%NZ
        </pattern>
        <pattern>
          format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
          time_format %Y-%m-%dT%H:%M:%S.%N%:z
        </pattern>
      </parse>
    </source>

    # Detect exceptions in the log output and forward them as one log entry.
    <match raw.kubernetes.**>
      @id raw.kubernetes
      @type detect_exceptions
      remove_tag_prefix raw
      message log
      stream stream
      multiline_flush_interval 5
      max_bytes 500000
      max_lines 1000
    </match>

    # Concatenate multi-line logs
    <filter **>
      @id filter_concat
      @type concat
      key message
      multiline_end_regexp /\n$/
      separator ""
    </filter>

    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      @id filter_kubernetes_metadata
      @type kubernetes_metadata
    </filter>

    # Fixes json fields in Elasticsearch
    <filter kubernetes.**>
      @id filter_parser
      @type parser
      key_name log
      reserve_data true
      remove_key_name_field true
      <parse>
        @type multi_format
        <pattern>
          format json
        </pattern>
        <pattern>
          format none
        </pattern>
      </parse>
    </filter>

  system.input.conf: |-
    # Example:
    # 2015-12-21 23:17:22,066 [salt.state       ][INFO    ] Completed state [net.ipv4.ip_forward] at time 23:17:22.066081
    <source>
      @id minion
      @type tail
      format /^(?<time>[^ ]* [^ ,]*)[^\[]*\[[^\]]*\]\[(?<severity>[^ \]]*) *\] (?<message>.*)$/
      time_format %Y-%m-%d %H:%M:%S
      path /var/log/salt/minion
      pos_file /var/log/salt.pos
      tag salt
    </source>

    # Example:
    # Dec 21 23:17:22 gke-foo-1-1-4b5cbd14-node-4eoj startupscript: Finished running startup script /var/run/google.startup.script
    <source>
      @id startupscript.log
      @type tail
      format syslog
      path /var/log/startupscript.log
      pos_file /var/log/startupscript.log.pos
      tag startupscript
    </source>

    # Examples:
    # time="2016-02-04T06:51:03.053580605Z" level=info msg="GET /containers/json"
    # time="2016-02-04T07:53:57.505612354Z" level=error msg="HTTP Error" err="No such image: -f" statusCode=404
    # TODO(random-liu): Remove this after cri container runtime rolls out.
    <source>
      @id docker.log
      @type tail
      format /^time="(?<time>[^)]*)" level=(?<severity>[^ ]*) msg="(?<message>[^"]*)"( err="(?<error>[^"]*)")?( statusCode=(?<status_code>\d+))?/
      path /var/log/docker.log
      pos_file /var/log/docker.log.pos
      tag docker
    </source>

    # Example:
    # 2016/02/04 06:52:38 filePurge: successfully removed file /var/etcd/data/member/wal/00000000000006d0-00000000010a23d1.wal
    <source>
      @id etcd.log
      @type tail
      # Not parsing this, because it doesn't have anything particularly useful to
      # parse out of it (like severities).
      format none
      path /var/log/etcd.log
      pos_file /var/log/etcd.log.pos
      tag etcd
    </source>

    # Multi-line parsing is required for all the kube logs because very large log
    # statements, such as those that include entire object bodies, get split into
    # multiple lines by glog.
    # Example:
    # I0204 07:32:30.020537    3368 server.go:1048] POST /stats/container/: (13.972191ms) 200 [[Go-http-client/1.1]]
    <source>
      @id kubelet.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kubelet.log
      pos_file /var/log/kubelet.log.pos
      tag kubelet
    </source>

    # Example:
    # I1118 21:26:53.975789       6 proxier.go:1096] Port "nodePort for kube-system/default-http-backend:http" (:31429/tcp) was open before and is still needed
    <source>
      @id kube-proxy.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-proxy.log
      pos_file /var/log/kube-proxy.log.pos
      tag kube-proxy
    </source>

    # Example:
    # I0204 07:00:19.604280       5 handlers.go:131] GET /api/v1/nodes: (1.624207ms) 200 [[kube-controller-manager/v1.1.3 (linux/amd64) kubernetes/6a81b50]]
    <source>
      @id kube-apiserver.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-apiserver.log
      pos_file /var/log/kube-apiserver.log.pos
      tag kube-apiserver
    </source>

    # Example:
    # I0204 06:55:31.872680       5 servicecontroller.go:277] LB already exists and doesn't need update for service kube-system/kube-ui
    <source>
      @id kube-controller-manager.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-controller-manager.log
      pos_file /var/log/kube-controller-manager.log.pos
      tag kube-controller-manager
    </source>

    # Example:
    # W0204 06:49:18.239674       7 reflector.go:245] pkg/scheduler/factory/factory.go:193: watch of *api.Service ended with: 401: The event in requested index is outdated and cleared (the requested history has been cleared [2578313/2577886]) [2579312]
    <source>
      @id kube-scheduler.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-scheduler.log
      pos_file /var/log/kube-scheduler.log.pos
      tag kube-scheduler
    </source>

    # Example:
    # I0603 15:31:05.793605       6 cluster_manager.go:230] Reading config from path /etc/gce.conf
    <source>
      @id glbc.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/glbc.log
      pos_file /var/log/glbc.log.pos
      tag glbc
    </source>

    # Example:
    # TODO Add a proper example here.
    <source>
      @id cluster-autoscaler.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/cluster-autoscaler.log
      pos_file /var/log/cluster-autoscaler.log.pos
      tag cluster-autoscaler
    </source>

    # Logs from systemd-journal for interesting services.
    # TODO(random-liu): Remove this after cri container runtime rolls out.
    <source>
      @id journald-docker
      @type systemd
      matches [{ "_SYSTEMD_UNIT": "docker.service" }]
      <storage>
        @type local
        persistent true
        path /var/log/journald-docker.pos
      </storage>
      read_from_head true
      tag docker
    </source>

    <source>
      @id journald-container-runtime
      @type systemd
      matches [{ "_SYSTEMD_UNIT": "{{ fluentd_container_runtime_service }}.service" }]
      <storage>
        @type local
        persistent true
        path /var/log/journald-container-runtime.pos
      </storage>
      read_from_head true
      tag container-runtime
    </source>

    <source>
      @id journald-kubelet
      @type systemd
      matches [{ "_SYSTEMD_UNIT": "kubelet.service" }]
      <storage>
        @type local
        persistent true
        path /var/log/journald-kubelet.pos
      </storage>
      read_from_head true
      tag kubelet
    </source>

    <source>
      @id journald-node-problem-detector
      @type systemd
      matches [{ "_SYSTEMD_UNIT": "node-problem-detector.service" }]
      <storage>
        @type local
        persistent true
        path /var/log/journald-node-problem-detector.pos
      </storage>
      read_from_head true
      tag node-problem-detector
    </source>

    <source>
      @id kernel
      @type systemd
      matches [{ "_TRANSPORT": "kernel" }]
      <storage>
        @type local
        persistent true
        path /var/log/kernel.pos
      </storage>
      <entry>
        fields_strip_underscores true
        fields_lowercase true
      </entry>
      read_from_head true
      tag kernel
    </source>

  forward.input.conf: |-
    # Takes the messages sent over TCP
    <source>
      @id forward
      @type forward
    </source>

  monitoring.conf: |-
    # Prometheus Exporter Plugin
    # input plugin that exports metrics
    <source>
      @id prometheus
      @type prometheus
    </source>

    <source>
      @id monitor_agent
      @type monitor_agent
    </source>

    # input plugin that collects metrics from MonitorAgent
    <source>
      @id prometheus_monitor
      @type prometheus_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>

    # input plugin that collects metrics for output plugin
    <source>
      @id prometheus_output_monitor
      @type prometheus_output_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>

    # input plugin that collects metrics for in_tail plugin
    <source>
      @id prometheus_tail_monitor
      @type prometheus_tail_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>

  output.conf: |-
    <match nginxtest1>
      @id nginxtest1
      @type elasticsearch
      @log_level info
      include_tag_key true
      type_name _doc
      host "#{ENV['OUTPUT_HOST']}"
      port "#{ENV['OUTPUT_PORT']}"
      scheme "#{ENV['OUTPUT_SCHEME']}"
      ssl_version "#{ENV['OUTPUT_SSL_VERSION']}"
      ssl_verify true
      user "#{ENV['OUTPUT_USER']}"
      password "#{ENV['OUTPUT_PASSWORD']}"
      logstash_format true
      logstash_prefix nginxtest1
      reconnect_on_error true
      <buffer>
        @type file
        path /var/log/fluentd-buffers/nginxtest1.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_max_interval 30
        chunk_limit_size "#{ENV['OUTPUT_BUFFER_CHUNK_LIMIT']}"
        queue_limit_length "#{ENV['OUTPUT_BUFFER_QUEUE_LIMIT']}"
        overflow_action block
      </buffer>
    </match>
    <match httpdtest1>
      @id httpdtest1
      @type elasticsearch
      @log_level info
      include_tag_key true
      type_name _doc
      host "#{ENV['OUTPUT_HOST']}"
      port "#{ENV['OUTPUT_PORT']}"
      scheme "#{ENV['OUTPUT_SCHEME']}"
      ssl_version "#{ENV['OUTPUT_SSL_VERSION']}"
      ssl_verify true
      user "#{ENV['OUTPUT_USER']}"
      password "#{ENV['OUTPUT_PASSWORD']}"
      logstash_format true
      logstash_prefix httpdtest1
      reconnect_on_error true
      <buffer>
        @type file
        path /var/log/fluentd-buffers/httpdtest1.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_max_interval 30
        chunk_limit_size "#{ENV['OUTPUT_BUFFER_CHUNK_LIMIT']}"
        queue_limit_length "#{ENV['OUTPUT_BUFFER_QUEUE_LIMIT']}"
        overflow_action block
      </buffer>
    </match>
    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level info
      include_tag_key true
      type_name _doc
      host "#{ENV['OUTPUT_HOST']}"
      port "#{ENV['OUTPUT_PORT']}"
      scheme "#{ENV['OUTPUT_SCHEME']}"
      ssl_version "#{ENV['OUTPUT_SSL_VERSION']}"
      ssl_verify true
      user "#{ENV['OUTPUT_USER']}"
      password "#{ENV['OUTPUT_PASSWORD']}"
      logstash_format true
      logstash_prefix "#{ENV['LOGSTASH_PREFIX']}"
      reconnect_on_error true
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_max_interval 30
        chunk_limit_size "#{ENV['OUTPUT_BUFFER_CHUNK_LIMIT']}"
        queue_limit_length "#{ENV['OUTPUT_BUFFER_QUEUE_LIMIT']}"
        overflow_action block
      </buffer>
    </match>

# extraVolumes:
#   - name: es-certs
#     secret:
#       defaultMode: 420
#       secretName: es-certs
# extraVolumeMounts:
#   - name: es-certs
#     mountPath: /certs
#     readOnly: true


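  1. As usual, change every image to the private registry image;
  2. The section that configures the connection to the ES cluster; the key point is that it connects to the client service;
  3. A service is defined for fluentd;
  4. The most important part is the configMap, that is, the fluentd configuration file. It defines the buffer directory: fluentd keeps its own buffer files, and logs are first written to the buffer and then sent to ES. If the buffer is set too small, it easily causes blocking;
  5. The default fluentd configuration already collects container logs from /var/log/containers/*.log, because any log that our containers write to STDOUT ends up by default on the host under /var/log/containers/, in a log file whose name starts with the Pod name
  6. With the default fluentd configuration, all logs are collected into a single index, named logstash-YYYY.MM.DD by default. After the index pattern is created in Kibana, all logs are lumped together; if you need
  7. So we ran a test: my k8s cluster runs an nginx application and an Apache httpd application, and the two applications were each given their own index by using different TAGs. This is the key part of our EFK logging solution;
  8. Besides pod logs, the fluentd configuration also collects by default the logs of services such as kube-controller-manager, kube-scheduler, kube-apiserver, kube-proxy, kubelet, etcd, and docker;
  9. output.conf mainly sets a different index prefix (logstash_prefix) for each source, for example the nginx and httpd sources; the default logstash_prefix is logstash
  10. When defining the outputs there are a few more parameters that may need tuning, such as flush_thread_count, flush_interval, chunk_limit_size, and the other settings for fluentd's local buffer; if they are tuned well, the logs will not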
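  The routing described in items 6, 7, and 9 can be sketched as follows. This is a simplified model, not fluentd's implementation: it assumes the first matching match block wins and that the index is named logstash_prefix plus the date as YYYY.MM.DD; the names `ROUTES` and `index_for` are hypothetical, and only exact tags and the `**` catch-all are modeled.

```python
from datetime import date

# (tag pattern, logstash_prefix), in the same order as the match blocks
# in output.conf.
ROUTES = [
    ("nginxtest1", "nginxtest1"),
    ("httpdtest1", "httpdtest1"),
    ("**", "logstash"),  # catch-all with the default prefix
]


def index_for(tag, day):
    """Return the index a record with this tag would be written to."""
    for pattern, prefix in ROUTES:
        # Simplification: real fluentd patterns support more wildcards.
        if pattern == "**" or pattern == tag:
            return f"{prefix}-{day:%Y.%m.%d}"
    raise LookupError(f"no route for tag {tag!r}")


day = date(2019, 4, 30)
print(index_for("nginxtest1", day))  # nginxtest1-2019.04.30
print(index_for("httpdtest1", day))  # httpdtest1-2019.04.30
print(index_for("kubelet", day))     # logstash-2019.04.30
```

  Because the catch-all comes last, only records that no application-specific block claimed fall through to the shared logstash-* index.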
LAST DEPLOYED: Tue Apr 30 17:55:30 2019
NAMESPACE: kube-system

==> v1/ConfigMap
NAME                   DATA  AGE
fluentd-elasticsearch  6     2d

==> v1/ServiceAccount
NAME                   SECRETS  AGE
fluentd-elasticsearch  1        2d

==> v1/ClusterRole
NAME                   AGE
fluentd-elasticsearch  2d

==> v1/ClusterRoleBinding
NAME                   AGE
fluentd-elasticsearch  2d

==> v1/Service
NAME                   TYPE       CLUSTER-IP     EXTERNAL-IP  PORT(S)    AGE
fluentd-elasticsearch  ClusterIP  <none>       24231/TCP  2d

==> v1/DaemonSet
NAME                   DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE SELECTOR  AGE
fluentd-elasticsearch  8        8        8      8           8          <none>         2d

==> v1/Pod(related)
NAME                         READY  STATUS   RESTARTS  AGE
fluentd-elasticsearch-2trp8  1/1    Running  0         2d
fluentd-elasticsearch-2xgtb  1/1    Running  0         2d
fluentd-elasticsearch-589jc  1/1    Running  0         2d
fluentd-elasticsearch-ctkv8  1/1    Running  0         2d
fluentd-elasticsearch-d5dvz  1/1    Running  0         2d
fluentd-elasticsearch-kgdxp  1/1    Running  0         2d
fluentd-elasticsearch-r2c8h  1/1    Running  0         2d
fluentd-elasticsearch-z8p7b  1/1    Running  0         2d

1. To verify that Fluentd has started, run:

  kubectl --namespace=kube-system get pods -l "app.kubernetes.io/name=fluentd-elasticsearch,app.kubernetes.io/instance=fluentd-elasticsearch"

THIS APPLICATION CAPTURES ALL CONSOLE OUTPUT AND FORWARDS IT TO elasticsearch . Anything that might be identifying,
including things like IP addresses, container images, and object names will NOT be anonymized.
2. Get the application URL by running these commands:
  export POD_NAME=$(kubectl get pods --namespace kube-system -l "app.kubernetes.io/name=fluentd-elasticsearch,app.kubernetes.io/instance=fluentd-elasticsearch" -o jsonpath="{.items[0].metadata.name}")
  echo "Visit to use your application"
  kubectl port-forward $POD_NAME 8080:80


# Default values for elasticsearch.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
appVersion: "6.7.0"

## Define serviceAccount names for components. Defaults to component's fully qualified name.
serviceAccounts:
  client:
    create: true
  master:
    create: true
  data:
    create: true

## Specify if a Pod Security Policy for node-exporter must be created
## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/
podSecurityPolicy:
  enabled: false
  annotations: {}
    ## Specify pod annotations
    ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#apparmor
    ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#seccomp
    ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#sysctl
    # seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
    # seccomp.security.alpha.kubernetes.io/defaultProfileName: 'docker/default'
    # apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default'

image:
  # repository: "k8s.harbor.maimaiti.site/system/elasticsearch-oss"
  repository: "k8s.harbor.maimaiti.site/system/elasticsearch"
  tag: "6.7.0"
  pullPolicy: "IfNotPresent"
  # If specified, use these secrets to access the image
  # pullSecrets:
  #   - registry-secret

testFramework:
  image: "dduportal/bats"
  tag: "0.4.0"

initImage:
  repository: "busybox"
  tag: "latest"
  pullPolicy: "Always"

cluster:
  name: "elasticsearch"
  # If you want X-Pack installed, switch to an image that includes it, enable this option and toggle the features you want
  # enabled in the environment variables outlined in the README
  xpackEnable: true
  # Some settings must be placed in a keystore, so they need to be mounted in from a secret.
  # Use this setting to specify the name of the secret
  # keystoreSecret: eskeystore
  config: {}
  # Custom parameters, as string, to be added to ES_JAVA_OPTS environment variable
  additionalJavaOpts: ""
  # Command to run at the end of deployment
  bootstrapShellCommand: ""
  env:
    # IMPORTANT: https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#minimum_master_nodes
    # To prevent data loss, it is vital to configure the discovery.zen.minimum_master_nodes setting so that each master-eligible
    # node knows the minimum number of master-eligible nodes that must be visible in order to form a cluster.
    MINIMUM_MASTER_NODES: "2"
  # List of plugins to install via dedicated init container
  plugins: []
    # - ingest-attachment
    # - mapper-size

client:
  name: client
  replicas: 2
  serviceType: ClusterIP
  ## If coupled with serviceType = "NodePort", this will set a specific nodePort to the client HTTP port
  # httpNodePort: 30920
  loadBalancerIP: {}
  loadBalancerSourceRanges: {}
## (dict) If specified, apply these annotations to the client service
#  serviceAnnotations:
#    example: client-svc-foo
  heapSize: "512m"
  # additionalJavaOpts: "-XX:MaxRAM=512m"
  antiAffinity: "soft"
  nodeAffinity: {}
  nodeSelector: {}
  tolerations: []
  initResources: {}
    # limits:
    #   cpu: "25m"
    #   # memory: "128Mi"
    # requests:
    #   cpu: "25m"
    #   memory: "128Mi"
  resources:
    limits:
      cpu: "1"
      # memory: "1024Mi"
    requests:
      cpu: "25m"
      memory: "512Mi"
  priorityClassName: ""
  ## (dict) If specified, apply these annotations to each client Pod
  # podAnnotations:
  #   example: client-foo
  podDisruptionBudget:
    enabled: false
    minAvailable: 1
    # maxUnavailable: 1
  ingress:
    enabled: false
    # user: NAME
    # password: PASSWORD
    annotations: {}
      # kubernetes.io/ingress.class: nginx
      # kubernetes.io/tls-acme: "true"
    path: /
    hosts:
      - chart-example.local
    tls: []
    #  - secretName: chart-example-tls
    #    hosts:
    #      - chart-example.local

master:
  name: master
  exposeHttp: false
  replicas: 3
  heapSize: "512m"
  # additionalJavaOpts: "-XX:MaxRAM=512m"
  persistence:
    enabled: false
    accessMode: ReadWriteOnce
    name: data
    size: "4Gi"
    storageClass: "dynamic"
  readinessProbe:
    httpGet:
      path: /_cluster/health?local=true
      port: 9200
    initialDelaySeconds: 5
  antiAffinity: "soft"
  nodeAffinity: {}
  nodeSelector: {}
  tolerations: []
  initResources: {}
    # limits:
    #   cpu: "25m"
    #   # memory: "128Mi"
    # requests:
    #   cpu: "25m"
    #   memory: "128Mi"
  resources:
    limits:
      cpu: "1"
      # memory: "1024Mi"
    requests:
      cpu: "25m"
      memory: "512Mi"
  priorityClassName: ""
  ## (dict) If specified, apply these annotations to each master Pod
  # podAnnotations:
  #   example: master-foo
  podManagementPolicy: OrderedReady
  podDisruptionBudget:
    enabled: false
    minAvailable: 2  # Same as `cluster.env.MINIMUM_MASTER_NODES`
    # maxUnavailable: 1
  updateStrategy:
    type: OnDelete

data:
  name: data
  exposeHttp: false
  replicas: 2
  heapSize: "1536m"
  # additionalJavaOpts: "-XX:MaxRAM=1536m"
  persistence:
    enabled: false
    accessMode: ReadWriteOnce
    name: data
    size: "30Gi"
    storageClass: "dynamic"
  readinessProbe:
    httpGet:
      path: /_cluster/health?local=true
      port: 9200
    initialDelaySeconds: 5
  terminationGracePeriodSeconds: 3600
  antiAffinity: "soft"
  nodeAffinity: {}
  nodeSelector: {}
  tolerations: []
  initResources: {}
    # limits:
    #   cpu: "25m"
    #   # memory: "128Mi"
    # requests:
    #   cpu: "25m"
    #   memory: "128Mi"
  resources:
    limits:
      cpu: "1"
      # memory: "2048Mi"
    requests:
      cpu: "25m"
      memory: "1536Mi"
  priorityClassName: ""
  ## (dict) If specified, apply these annotations to each data Pod
  # podAnnotations:
  #   example: data-foo
  podDisruptionBudget:
    enabled: false
    # minAvailable: 1
    maxUnavailable: 1
  podManagementPolicy: OrderedReady
  updateStrategy:
    type: OnDelete
  hooks:  # post-start and pre-stop hooks
    drain:  # drain the node before stopping it and re-integrate it into the cluster after start
      enabled: true

## Sysctl init container to setup vm.max_map_count
# see https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html
# and https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration-memory.html#mlockall
sysctlInitContainer:
  enabled: true
## Additional init containers
extraInitContainers: |


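  1. Change all images to the private registry images;
  2. Define the name of the ES cluster;
  3. Define the name, replica count, and JVM memory usage of the client service Pods;
  4. Define the name and replica count of the master service Pods. The number of masters is normally odd, 3 or 5. It is best to enable persistent storage, using a ceph RBD sc (StorageClass)
  5. Define the name and replica count of the data node Pods; it is best to enable persistent storage. Note that hooks are enabled for the data nodes here, that is, the data node
  6. kibana and fluentd both connect to the elasticsearch-client svc; 9200 is the port that serves requests and 9300 is the cluster port. There is also a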
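  The advice in item 4 to run an odd number of masters follows from the quorum rule behind discovery.zen.minimum_master_nodes, which is a strict majority of the master-eligible nodes. A small sketch of that arithmetic; the helper names are hypothetical:

```python
def minimum_master_nodes(master_eligible):
    """Quorum size: a strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many masters can be lost while a quorum is still reachable."""
    return master_eligible - minimum_master_nodes(master_eligible)


for n in (2, 3, 4, 5):
    print(n, minimum_master_nodes(n), tolerated_failures(n))
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

  Going from 3 to 4 masters raises the quorum without tolerating any additional failure, which is why 3 or 5 is the usual choice. For the 3 replicas configured here the quorum is 2, matching the minAvailable: 2 in the master section.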
[root@master-01 fluentd-elasticsearch]# helm status elasticsearch
LAST DEPLOYED: Tue Apr 30 17:17:13 2019
NAMESPACE: kube-system

==> v1/ConfigMap
NAME                DATA  AGE
elasticsearch       4     2d
elasticsearch-test  1     2d

==> v1/ServiceAccount
NAME                  SECRETS  AGE
elasticsearch-client  1        2d
elasticsearch-data    1        2d
elasticsearch-master  1        2d

==> v1/Service
NAME                     TYPE       CLUSTER-IP     EXTERNAL-IP  PORT(S)   AGE
elasticsearch-client     ClusterIP  <none>       9200/TCP  2d
elasticsearch-discovery  ClusterIP  None           <none>       9300/TCP  2d

==> v1beta1/Deployment
NAME                  DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
elasticsearch-client  2        2        2           2          2d

==> v1beta1/StatefulSet
NAME                  DESIRED  CURRENT  AGE
elasticsearch-data    2        2        2d
elasticsearch-master  3        3        2d

==> v1/Pod(related)
NAME                                   READY  STATUS   RESTARTS  AGE
elasticsearch-client-6bb89766f9-wfbxh  1/1    Running  0         2d
elasticsearch-client-6bb89766f9-xvz6c  1/1    Running  0         2d
elasticsearch-data-0                   1/1    Running  0         2d
elasticsearch-data-1                   1/1    Running  0         2d
elasticsearch-master-0                 1/1    Running  0         2d
elasticsearch-master-1                 1/1    Running  0         2d
elasticsearch-master-2                 1/1    Running  0         2d

The elasticsearch cluster has been installed.

Elasticsearch can be accessed:

  * Within your cluster, at the following DNS name at port 9200:


  * From outside the cluster, run these commands in the same shell:

    export POD_NAME=$(kubectl get pods --namespace kube-system -l "app=elasticsearch,component=client,release=elasticsearch" -o jsonpath="{.items[0].metadata.name}")
    echo "Visit to use Elasticsearch"
    kubectl port-forward --namespace kube-system $POD_NAME 9200:9200
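
The access notes in this output can be condensed into two small helpers. This is only a sketch: `svc_url` and `es_health` are hypothetical names, the service name, namespace, and port are taken from the helm status output, and `_cluster/health` is the standard Elasticsearch health endpoint. `es_health` is defined but not called, since it needs curl and a reachable cluster.

```shell
#!/bin/sh
# Hypothetical helper: build the in-cluster URL http://<service>.<namespace>:<port>.
svc_url() {
  printf 'http://%s.%s:%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: query the Elasticsearch cluster health API at a base URL.
# Needs curl and a reachable Elasticsearch, so it is only defined here, not called.
es_health() {
  curl -sS --connect-timeout 5 "$1/_cluster/health?pretty"
}

# Service name, namespace, and port as listed in the helm status output.
svc_url elasticsearch-client kube-system 9200
```

From a Pod inside the cluster, something like `es_health "$(svc_url elasticsearch-client kube-system 9200)"` should work. From outside, run the port-forward command first and point `es_health` at the forwarded local port instead.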
image:
  # repository: "k8s.harbor.maimaiti.site/system/kibana-oss"
  repository: "k8s.harbor.maimaiti.site/system/kibana"
  tag: "6.7.0"
  pullPolicy: "IfNotPresent"

testFramework:
  image: "dduportal/bats"
  tag: "0.4.0"

commandline:
  args: []

env: {}
  # All Kibana configuration options are adjustable via env vars.
  # To adjust a config option to an env var uppercase + replace `.` with `_`
  # Ref: https://www.elastic.co/guide/en/kibana/current/settings.html
  # ELASTICSEARCH_URL: http://elasticsearch-client:9200
  # SERVER_PORT: 5601
  # SERVER_DEFAULTROUTE: "/app/kibana"

files:
  kibana.yml:
    ## Default Kibana configuration from kibana-docker.
    server.name: kibana
    server.host: "0"
    elasticsearch.url: http://elasticsearch-client.kube-system:9200

    ## Custom config properties below
    ## Ref: https://www.elastic.co/guide/en/kibana/current/settings.html
    # server.port: 5601
    # logging.verbose: "true"
    # server.defaultRoute: "/app/kibana"

deployment:
  annotations: {}

service:
  type: NodePort
  nodePort: 30001
  # clusterIP: None
  # portName: kibana-svc
  externalPort: 443
  internalPort: 5601
  # authProxyPort: 5602 To be used with authProxyEnabled and a proxy extraContainer
  ## External IP addresses of service
  ## Default: nil
  # externalIPs:
  # -
  ## LoadBalancer IP if service.type is LoadBalancer
  ## Default: nil
  # loadBalancerIP:
  annotations: {}
    # Annotation example: setup ssl with aws cert when service.type is LoadBalancer
    # service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:us-east-1:EXAMPLE_CERT
  labels: {}
    ## Label example: show service URL in `kubectl cluster-info`
    # kubernetes.io/cluster-service: "true"
  ## Limit load balancer source ips to list of CIDRs (where available)
  # loadBalancerSourceRanges: []
  selector: {}

ingress:
  enabled: false
  # hosts:
    # - kibana.localhost.localdomain
    # - localhost.localdomain/kibana
  # annotations:
  #   kubernetes.io/ingress.class: nginx
  #   kubernetes.io/tls-acme: "true"
  # tls:
    # - secretName: chart-example-tls
    #   hosts:
    #     - chart-example.local

serviceAccount:
  # Specifies whether a service account should be created
  create: true
  # The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  # If set and create is false, the service account must be existing

livenessProbe:
  enabled: false
  path: /status
  initialDelaySeconds: 30
  timeoutSeconds: 10

readinessProbe:
  enabled: false
  path: /status
  initialDelaySeconds: 30
  timeoutSeconds: 10
  periodSeconds: 10
  successThreshold: 5

# Enable an authproxy. Specify container in extraContainers
authProxyEnabled: false

extraContainers: |
# - name: proxy
#   image: quay.io/gambol99/keycloak-proxy:latest
#   args:
#     - --resource=uri=/*
#     - --discovery-url=https://discovery-url
#     - --client-id=client
#     - --client-secret=secret
#     - --listen=
#     - --upstream-url=
#   ports:
#     - name: web
#       containerPort: 9090

extraVolumeMounts: []

extraVolumes: []

resources: {}
  # limits:
  #   cpu: 100m
  #   memory: 300Mi
  # requests:
  #   cpu: 100m
  #   memory: 300Mi

priorityClassName: ""

# Affinity for pod assignment
# Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
# affinity: {}

# Tolerations for pod assignment
# Ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
tolerations: []

# Node labels for pod assignment
# Ref: https://kubernetes.io/docs/user-guide/node-selection/
nodeSelector: {}

podAnnotations: {}
replicaCount: 1
revisionHistoryLimit: 3

# Custom labels for pod assignment
podLabels: {}

# To export a dashboard from a running Kibana 6.3.x use:
# curl --user <username>:<password> -XGET https://kibana.yourdomain.com:5601/api/kibana/dashboards/export?dashboard=<some-dashboard-uuid> > my-dashboard.json
# A dashboard is defined by a name and a string with the json payload or the download url
dashboardImport:
  enabled: false
  timeout: 60
  xpackauth:
    enabled: true
    username: myuser
    password: mypass
  dashboards: {}
    # k8s: https://raw.githubusercontent.com/monotek/kibana-dashboards/master/k8s-fluentd-elasticsearch.json

# List of plugins to install using initContainer
# NOTE : We notice that lower resource constraints given to the chart + plugins are likely not going to work well.
plugins:
  # set to true to enable plugins installation
  enabled: false
  # set to true to remove all kibana plugins before installation
  reset: false
  # Use <plugin_name,version,url> to add/upgrade plugin
  values:
    # - elastalert-kibana-plugin,1.0.1,https://github.com/bitsensor/elastalert-kibana-plugin/releases/download/1.0.1/elastalert-kibana-plugin-1.0.1-6.4.2.zip
    # - logtrail,0.1.31,https://github.com/sivasamyk/logtrail/releases/download/v0.1.31/logtrail-6.6.0-0.1.31.zip
    # - other_plugin

persistentVolumeClaim:
  # set to true to use pvc
  enabled: false
  # set to true to use your own pvc
  existingClaim: false
  annotations: {}

  accessModes:
    - ReadWriteOnce
  size: "5Gi"
  ## If defined, storageClassName: <storageClass>
  ## If set to "-", storageClassName: "", which disables dynamic provisioning
  ## If undefined (the default) or set to null, no storageClassName spec is
  ##   set, choosing the default provisioner.  (gp2 on AWS, standard on
  ##   GKE, AWS & OpenStack)
  # storageClass: "-"

# default security context
securityContext:
  enabled: false
  allowPrivilegeEscalation: false
  runAsUser: 1000
  fsGroup: 2000

extraConfigMapMounts: []
  # - name: logtrail-configs
  #   configMap: kibana-logtrail
  #   mountPath: /usr/share/kibana/plugins/logtrail/logtrail.json
  #   subPath: logtrail.json

# Add your own init container or uncomment and modify the given example.
initContainers: {}
  ## Don't start kibana till Elasticsearch is reachable.
  ## Ensure that it is available at http://elasticsearch:9200
  # es-check:  # <- will be used as container name
  #   image: "appropriate/curl:latest"
  #   imagePullPolicy: "IfNotPresent"
  #   command:
  #     - "/bin/sh"
  #     - "-c"
  #     - |
  #       is_down=true
  #       while "$is_down"; do
  #         if curl -sSf --fail-early --connect-timeout 5 http://elasticsearch:9200; then
  #           is_down=false
  #         else
  #           sleep 5
  #         fi
  #       done


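  1. Switch the images to the private registry address;
  2. In the kibana configuration file, set the address used to connect to ES. Note that it connects to port 9200 of elasticsearch-client;
  3. kibana only needs to be deployed as a stateless Deployment. As mentioned earlier, ES must use the StatefulSet resource type and fluentd must use the DaemonSet resource type;
  4. You can define an ingress, because kibana is the component that users actually access;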
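
Regarding item 2, the comments in the values file note that any Kibana setting can also be supplied as an environment variable by uppercasing its name and replacing `.` with `_`. A minimal sketch of that naming rule follows. The helper name `kibana_env_name` is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: map a Kibana setting name to its environment variable name
# by uppercasing it and replacing every "." with "_".
kibana_env_name() {
  printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' | tr '.' '_'
}

kibana_env_name elasticsearch.url    # ELASTICSEARCH_URL
kibana_env_name server.port          # SERVER_PORT
kibana_env_name server.defaultRoute  # SERVER_DEFAULTROUTE
```

The three results match the commented-out names under `env:` in the values file, so the ES address could be set either in kibana.yml or through `ELASTICSEARCH_URL`.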
