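For background, see my translation of the relevant official documentation: Installing Fluent Bit on Kubernetes.
Run kubectl version to check the versions in use. The output shows a client version and a server version: the client version is that of kubectl itself, and the server version is that of the Kubernetes API server on the master node.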
[test@gov-master29 new-logging]$ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:53:14Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
[test@gov-master29 new-logging]$
1. Download the files
GitHub repository: https://github.com/fluent/fluent-bit-kubernetes-logging.
Download the following five files from the repository into a directory on your server; I put them in /home/data/fluentd.
[test@gov-master29 fluentd]$ cd /home/data/fluentd
[test@gov-master29 fluentd]$ ll
total 20
-rw-rw-r--. 1 test test 3869 Jul 28 11:06 fluent-bit-configmap.yaml
-rw-rw-r--. 1 test test 1894 Jul 28 14:17 fluent-bit-ds.yaml
-rw-rw-r--. 1 test test 270 Jul 28 11:01 fluent-bit-role-binding.yaml
-rw-rw-r--. 1 test test 189 Jul 28 11:00 fluent-bit-role.yaml
-rw-rw-r--. 1 test test 92 Jul 28 10:23 fluent-bit-service-account.yaml
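2. Modify the files
Next, adjust the contents of these files for your environment, such as the namespace and the Fluent Bit configuration.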
2.1. fluent-bit-service-account.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: gtcom-logging
2.2. fluent-bit-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-read
rules:
- apiGroups: [""]
  resources:
  - namespaces
  - pods
  verbs: ["get", "list", "watch"]
2.3. fluent-bit-role-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-read
subjects:
- kind: ServiceAccount
  name: fluent-bit
  namespace: gtcom-logging
2.4. fluent-bit-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: gtcom-logging
  labels:
    k8s-app: fluent-bit
data:
  # Configuration files: server, input, filters and output
  # ======================================================
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     info
        Daemon        off
        Parsers_File  parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020
    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-kafka.conf
  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               gtcom.*
        Path              /var/log/containers/gtcom-*.log
        Parser            docker
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10
        Buffer_Chunk_Size 32k
        Buffer_Max_Size   100k
        Path_Key          log_flie_path
    [INPUT]
        Name              tail
        Tag               dataquality.*
        Path              /var/log/containers/dataquality-*.log
        Parser            docker
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10
        Buffer_Chunk_Size 32k
        Buffer_Max_Size   100k
        Path_Key          log_flie_path
  filter-kubernetes.conf: |
    [FILTER]
        Name                kubernetes
        Match               gtcom.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     gtcom.var.log.containers.
        Merge_Log           Off
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off
        Labels              Off
    [FILTER]
        Name                parser
        Match               gtcom.*
        Key_Name            log
        Parser              field_parse
        Reserve_Data        On
    [FILTER]
        Name                record_modifier
        Match               gtcom.*
        Record              source k8s-cluster-gtcom
    [FILTER]
        Name                kubernetes
        Match               dataquality.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     dataquality.var.log.containers.
        Merge_Log           Off
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off
        Labels              Off
    [FILTER]
        Name                parser
        Match               dataquality.*
        Key_Name            log
        Parser              key_switch
        Reserve_Data        On
    [FILTER]
        Name                record_modifier
        Match               dataquality.*
        Record              source k8s-cluster-dataquality
  output-kafka.conf: |
    [OUTPUT]
        Name            kafka
        Match           *
        Brokers         192.168.50.10:9092,192.168.50.11:9092,192.168.50.12:9092
        Topics          fluent-bit-k8s
        Timestamp_Key   timestamp
        Retry_Limit     false
        # hides errors "Receive failed: Disconnected" when kafka kills idle connections
        rdkafka.log.connection.close false
        # producer buffer is not included in http://fluentbit.io/documentation/0.12/configuration/memory_usage.html#estimating
        rdkafka.queue.buffering.max.kbytes 10240
        # for logs you'll probably want this to be 0 or 1, not more
        rdkafka.request.required.acks 1
  parsers.conf: |
    [PARSER]
        Name        field_parse
        Format      regex
        Regex       /^(?<log_time>\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2},\d+) \[(?<thread>[^\]]+)\] (?<level>[A-Z]+)\s+(?<class>[a-zA-Z][a-zA-Z0-9\._]*)\s+\S+\s+(?<message>.*)/
    [PARSER]
        Name        key_switch
        Format      regex
        Regex       /^(?<message>.*)/
    [PARSER]
        Name        apache
        Format      regex
        Regex       ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key    time
        Time_Format %d/%b/%Y:%H:%M:%S %z
    [PARSER]
        Name        apache2
        Format      regex
        Regex       ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key    time
        Time_Format %d/%b/%Y:%H:%M:%S %z
    [PARSER]
        Name        apache_error
        Format      regex
        Regex       ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$
    [PARSER]
        Name        nginx
        Format      regex
        Regex       ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key    time
        Time_Format %d/%b/%Y:%H:%M:%S %z
    [PARSER]
        Name        json
        Format      json
        Time_Key    time
        Time_Format %d/%b/%Y:%H:%M:%S %z
    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On
    [PARSER]
        Name        syslog
        Format      regex
        Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key    time
        Time_Format %b %d %H:%M:%S
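Before deploying, it can help to check the custom field_parse regex against a real log line. The sketch below is only an approximation in Python, not how Fluent Bit itself runs the parser: Fluent Bit uses the Onigmo engine, so the named groups are rewritten from (?<name>...) to Python's (?P<name>...) and redundant escapes are dropped. The helper name parse_log_field is hypothetical; the sample line is the application log shown in section 4.

```python
import re

# Python spelling of the field_parse regex: Onigmo's (?<name>...) becomes
# (?P<name>...); escapes such as \- and \. inside classes are simplified.
FIELD_PARSE = re.compile(
    r"^(?P<log_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+) "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<class>[a-zA-Z][a-zA-Z0-9._]*)\s+"
    r"\S+\s+"
    r"(?P<message>.*)"
)


def parse_log_field(log):
    """Return the named fields, or None when the line does not match."""
    match = FIELD_PARSE.match(log)
    return match.groupdict() if match else None


# Application log line taken from the test in section 4.
SAMPLE = (
    "2021-08-07 12:42:32,917 [算法治理 (48/50)#1] INFO "
    "com.gtcom.governance.common.Processor - "
    "[messageId=61042605055]流程[社交治理流程]阶段[算法治理]处理总耗时: [0]ms"
)

fields = parse_log_field(SAMPLE)
for key, value in fields.items():
    print(f"{key}: {value}")
```

The five group names correspond to the log_time, thread, level, class and message keys that later appear in the Kafka record.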
2.5. fluent-bit-ds.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: gtcom-logging
  labels:
    k8s-app: fluent-bit-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      k8s-app: fluent-bit-logging
  template:
    metadata:
      labels:
        k8s-app: fluent-bit-logging
        version: v1
        kubernetes.io/cluster-service: "true"
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "2020"
        prometheus.io/path: /api/v1/metrics/prometheus
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:1.8
        imagePullPolicy: Always
        ports:
        - containerPort: 2020
        resources:
          requests:
            cpu: 5m
            memory: 10Mi
          limits:
            cpu: 50m
            memory: 60Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /home/data/docker/containers
          readOnly: true
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
        - name: host-time
          mountPath: /etc/localtime
          readOnly: true
      terminationGracePeriodSeconds: 10
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /home/data/docker/containers
      - name: host-time
        hostPath:
          path: /etc/localtime
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-config
      serviceAccountName: fluent-bit
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      - operator: "Exists"
        effect: "NoExecute"
      - operator: "Exists"
        effect: "NoSchedule"
3. Deploy the service
For the first deployment, run the following (the gtcom-logging namespace must already exist):
[test@gov-master29 fluentd]$ kubectl create -f fluent-bit-service-account.yaml
[test@gov-master29 fluentd]$ kubectl create -f fluent-bit-role.yaml
[test@gov-master29 fluentd]$ kubectl create -f fluent-bit-role-binding.yaml
[test@gov-master29 fluentd]$ kubectl create -f fluent-bit-configmap.yaml
[test@gov-master29 fluentd]$ kubectl create -f fluent-bit-ds.yaml
After that, if a file has been modified, apply the change with replace:
[test@gov-master29 fluentd]$ kubectl replace --force -f fluent-bit-service-account.yaml
[test@gov-master29 fluentd]$ kubectl replace --force -f fluent-bit-role.yaml
[test@gov-master29 fluentd]$ kubectl replace --force -f fluent-bit-role-binding.yaml
[test@gov-master29 fluentd]$ kubectl replace --force -f fluent-bit-configmap.yaml
[test@gov-master29 fluentd]$ kubectl replace --force -f fluent-bit-ds.yaml
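Later, whenever the configmap file changes (which happens often while testing and tuning the Fluent Bit configuration), replace the ConfigMap as above, stop the pod in the web console, and then re-run the ds file to recreate and start the pod.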
[test@gov-master29 fluentd]$ kubectl replace --force -f fluent-bit-ds.yaml
daemonset.apps/fluent-bit replaced
[test@gov-master29 fluentd]$
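Workloads, DaemonSets:
Pods:
ConfigMaps: you can edit the Fluent Bit configuration directly here and then restart the containers in the pod for the change to take effect, which is convenient for testing a Fluent Bit configuration: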
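Changes made here are only temporary: once the workload is gone and is recreated from the files on the server, the old configuration comes back. So after a test succeeds, copy the new Fluent Bit configuration into fluent-bit-configmap.yaml on the server, so that the next deployment from these files uses the latest configuration.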
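The create and replace sequences above always walk the same five files in the same order, so they can be scripted. The following is a minimal sketch, assuming kubectl is on the PATH and the script is run from the directory that holds the manifests; the names MANIFESTS, build_commands and deploy are hypothetical helpers, not part of kubectl or Fluent Bit. It defaults to a dry run that only prints the commands.

```python
import subprocess

# The five manifests, in the order used in this article.
MANIFESTS = [
    "fluent-bit-service-account.yaml",
    "fluent-bit-role.yaml",
    "fluent-bit-role-binding.yaml",
    "fluent-bit-configmap.yaml",
    "fluent-bit-ds.yaml",
]


def build_commands(first_deploy):
    """Return one kubectl argument list per manifest."""
    # First deployment uses "create"; later changes use "replace --force".
    action = ["create"] if first_deploy else ["replace", "--force"]
    return [["kubectl", *action, "-f", manifest] for manifest in MANIFESTS]


def deploy(first_deploy, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in build_commands(first_deploy):
        print(" ".join(command))
        if not dry_run:
            # check=True stops at the first failing kubectl call.
            subprocess.run(command, check=True)


# Dry run: shows what a first deployment would execute.
deploy(first_deploy=True)
```

Calling deploy(first_deploy=False, dry_run=False) would run the replace sequence for real; keeping the dry run as the default avoids touching the cluster by accident.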
4. Test
Application logs as printed on the K8S console:
# K8S log directory
[test@gov-node36 containers]$ pwd
/var/log/containers
[test@gov-node36 containers]$ sudo tail -f gtcom-governance-social-k8s-kafka-taskmanager-2-3_gtcom-governance_flink-task-manager-8adadb28921d2417efac12604f61e97806351e616e70895bbabaef07226bc2ce.log
[sudo] password for test:
...log output appears here...
Formatted, a single entry looks like this:
{
"log":"2021-08-07 12:42:32,917 [算法治理 (48/50)#1] INFO com.gtcom.governance.common.Processor - [messageId=61042605055]流程[社交治理流程]阶段[算法治理]处理总耗时: [0]ms",
"stream":"stdout",
"time":"2021-08-07T04:42:32.917075844Z"
}
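The docker parser configured in section 2.4 reads the time field of this JSON entry, and the Kafka output then writes the record time under Timestamp_Key as epoch seconds. The sketch below reproduces that conversion in Python as a sanity check; it is not Fluent Bit's own code. The function name docker_time_to_epoch is hypothetical, the log field is shortened here, and the UTC+8 offset is an assumption inferred from the 8-hour gap between the application timestamp and the docker timestamp in this sample.

```python
import json
from datetime import datetime, timedelta, timezone


def docker_time_to_epoch(ts):
    """Convert a docker json-file timestamp (UTC, nanoseconds) to epoch seconds."""
    base, _, frac = ts.rstrip("Z").partition(".")
    # strptime's %f accepts at most 6 digits, so the 9-digit fraction is added by hand.
    whole = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    fraction = float("0." + frac) if frac else 0.0
    return whole.timestamp() + fraction


# One line of the container log file; the "log" value is shortened for this sketch.
line = (
    '{"log":"2021-08-07 12:42:32,917 [...] INFO ...",'
    '"stream":"stdout","time":"2021-08-07T04:42:32.917075844Z"}'
)

record = json.loads(line)
epoch = docker_time_to_epoch(record["time"])

# Assumed local zone of the application (UTC+8), to compare with the log's own time.
local = datetime.fromtimestamp(epoch, timezone(timedelta(hours=8)))

print(round(epoch, 6))
print(local.strftime("%Y-%m-%d %H:%M:%S"))
```

The epoch value can be compared with the timestamp key of the record consumed from Kafka, and the local time with the log_time that the application wrote.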
Consume the data from Kafka:
[test@node-10 ~]$ cd /home/data/kafka/bin/
[test@node-10 bin]$ ./kafka-console-consumer.sh --bootstrap-server 192.168.50.10:9092,192.168.50.11:9092,192.168.50.12:9092 --topic fluent-bit-k8s
A log record received from Kafka:
{
"timestamp":1628311352.917076,
"log_time":"2021-08-07 12:42:32,917",
"thread":"算法治理 (48/50)#1",
"level":"INFO",
"class":"com.gtcom.governance.common.Processor",
"message":"[messageId=61042605055]流程[社交治理流程]阶段[算法治理]处理总耗时: [0]ms",
"log_flie_path":"/var/log/containers/gtcom-governance-social-k8s-kafka-taskmanager-2-3_gtcom-governance_flink-task-manager-8adadb28921d2417efac12604f61e97806351e616e70895bbabaef07226bc2ce.log",
"stream":"stdout",
"time":"2021-08-07T04:42:32.917075844Z",
"kubernetes":{
"pod_name":"gtcom-governance-social-k8s-kafka-taskmanager-2-3",
"namespace_name":"gtcom-governance",
"pod_id":"a0a243e2-29ac-42fc-b409-af2a4344e85c",
"host":"gov-node36",
"container_name":"flink-task-manager",
"docker_id":"8adadb28921d2417efac12604f61e97806351e616e70895bbabaef07226bc2ce",
"container_hash":"192.168.50.28:30002/gtcom/gtcom-governance@sha256:fbbdaca76e0ae67c4949a5bcf7e8e7fc6cf73973c0cbab637410120e2e866f9f",
"container_image":"192.168.50.28:30002/gtcom/gtcom-governance:2.3.44"
},
"source":"k8s-cluster-gtcom"
}
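Several of the kubernetes fields in this record come from the log file name. The tail input expands the * in Tag gtcom.* to the file path with slashes turned into dots, and the kubernetes filter strips Kube_Tag_Prefix from the tag and splits what remains. The sketch below imitates that in Python to show why the prefix must be gtcom.var.log.containers.; the pattern is a simplified assumption rather than the filter's actual regex, and tag_for and metadata_from_tag are hypothetical names.

```python
import re

KUBE_TAG_PREFIX = "gtcom.var.log.containers."

# Simplified pattern for <pod>_<namespace>_<container>-<64-char id>.log
FILE_NAME = re.compile(
    r"^(?P<pod_name>[^_]+)_(?P<namespace_name>[^_]+)_"
    r"(?P<container_name>.+)-(?P<docker_id>[a-z0-9]{64})\.log$"
)


def tag_for(path, tag_prefix="gtcom."):
    """Expand the '*' of 'Tag gtcom.*': the path with '/' replaced by '.'."""
    return tag_prefix + path.lstrip("/").replace("/", ".")


def metadata_from_tag(tag):
    """Strip Kube_Tag_Prefix and split the remaining file name into its parts."""
    if not tag.startswith(KUBE_TAG_PREFIX):
        return None
    match = FILE_NAME.match(tag[len(KUBE_TAG_PREFIX):])
    return match.groupdict() if match else None


# Value of log_flie_path in the record above.
path = (
    "/var/log/containers/"
    "gtcom-governance-social-k8s-kafka-taskmanager-2-3_gtcom-governance_"
    "flink-task-manager-"
    "8adadb28921d2417efac12604f61e97806351e616e70895bbabaef07226bc2ce.log"
)

meta = metadata_from_tag(tag_for(path))
for key, value in meta.items():
    print(f"{key}: {value}")
```

The remaining fields, such as pod_id, host and container_image, are not in the file name; the filter fetches them from the API server, which is why the ClusterRole in section 2.2 grants read access to pods and namespaces.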