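Summary of documentation notes kept during operations work
- I. docker
- II. yum download only
- III. keepalived master/backup mode example
- IV. Supervisor
- V. Allowing a non-root user to start services on ports below 1024
- VI. Linux passwordless SSH login
- VII. ELK
- VIII. Ansible
- IX. Startup script examples
- X. Linux scheduled tasks
- XI. BookStack (书栈网)
- XII. nginx
- XIII. redis
- XIV. Extracting and compressing archives
- XV. tomcat monitoring
- XVI. Offline installation of pip packages
- Miscellaneous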
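I. docker
Installing docker
Installing the docker environment on CentOS
Official docker-ce installation guide: https://docs.docker.com/install/linux/docker-ce/centos/
Method 1
Online installation from a configured yum repository
1. If an older version of docker is installed, remove it first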
yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine
2. Install yum-utils and the related components
yum install -y yum-utils \
device-mapper-persistent-data \
lvm2
3. Add the yum repository
yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
4. Install docker-ce
yum install docker-ce
Method 2
Installation with the online script
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker your-user
# Start docker and enable it at boot
systemctl start docker
systemctl enable docker
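After either online installation method, it can help to confirm that the client is on the PATH and that the daemon answers. The following is a minimal sketch, not part of the official procedure; the function name `check_docker` is hypothetical.

```shell
#!/bin/sh
# check_docker: hypothetical helper, not part of docker itself.
# Returns 0 only if the docker client is on PATH and the daemon responds.
check_docker() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker client not found on PATH" >&2
        return 1
    fi
    # `docker info` contacts the daemon, so it fails when dockerd is not running.
    if ! docker info >/dev/null 2>&1; then
        echo "docker client found, but the daemon is not responding" >&2
        return 2
    fi
    echo "docker is installed and the daemon is running"
}
```

Run `check_docker` after `systemctl start docker`; a return code of 2 usually means the service has not started yet or the current user cannot reach the docker socket.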
Method 3
Offline installation: download the docker binary package. This is the fastest and least effort, though it is an unconventional approach.
Download: https://download.docker.com/linux/static/stable/x86_64/
Official documentation: https://docs.docker.com/install/linux/docker-ce/binaries/
From the official documentation:
Download the static binary archive. Go to https://download.docker.com/linux/static/stable/ (or change stable to edge or test), choose your hardware platform, and download the .tgz file relating to the version of Docker CE you want to install.
tar xzvf /path/to/<FILE>.tgz
sudo cp docker/* /usr/bin/
sudo dockerd &
Uninstalling docker
sudo yum remove docker-ce
sudo rm -rf /var/lib/docker
1. node_exporter
Running node_exporter in docker
docker run -d \
--net="host" \
--pid="host" \
--name=node-exporter \
-v "/:/host:ro,rslave" \
quay.io/prometheus/node-exporter \
--path.rootfs /host
2. grafana
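Running grafana in docker
# With --network host, published ports (-p) are discarded, so -p 3000:3000 is omitted; grafana listens on host port 3000 directly
docker run -d --name=grafana --network host \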
-e "GF_SERVER_ROOT_URL=http://grafana.server.name" \
-e "GF_SECURITY_ADMIN_PASSWORD=secret" \
grafana/grafana
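2.1 Cascading grafana variables
2.1.1 Prometheus configuration
In the prometheus configuration file, set an item label under job_name.static_configs.labels
Queried data then carries both an item label and an instance label
Add the Prometheus data source in grafana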
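As an illustration of the `item` label described above, the sketch below writes a scrape-config fragment in which each target group carries its own `item` value. The job name, the targets, and the `redis`/`log` values are placeholders chosen for illustration, and the helper name `write_item_label_example` is hypothetical.

```shell
#!/bin/sh
# write_item_label_example FILE: hypothetical helper that writes a sample
# scrape_configs fragment to FILE. Targets and label values are placeholders.
write_item_label_example() {
    cat > "$1" <<'EOF'
scrape_configs:
  - job_name: 'services'
    static_configs:
      - targets: ['192.168.255.128:9121', '192.168.255.129:9121']
        labels:
          item: 'redis'
      - targets: ['192.168.255.128:9100']
        labels:
          item: 'log'
EOF
}
```

With a configuration of this shape, a query such as `up{item="redis"}` returns one series per redis target, each with its own `instance` label, which is what the cascading variables rely on.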
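2.1.2 Define the variable item from the values of the item label
1. Open the variable settings
2. Name the variable item
3. Set Type to Query
4. Select the Prometheus data source added above
5. Enter up as the query (adjust to your setup)
6. Filter the results with the regex: /.*item="([^"]*).*/
7. Resulting values
2.1.3 Define the variable server from the instance values under the item label
Same as the previous step; the key point is that the Query must reference the item variable defined there
Query: up{item="[[item]]"}
# [[ ]] references a variable
Regex: /.*instance="([^"]*).*/
2.1.4 Result
Selecting redis for item makes server list every instance with item="redis"; selecting log works the same way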
tips:
Variable regex examples
Data: a{service="XXX"}
/.*service="([^"]*).*/ yields XXX
Data: a{instance="101.22.34.22:8080"}
/.*instance="([^"]*):.*/ yields 101.22.34.22
/.*instance="([^"]*).*/ yields 101.22.34.22:8080
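The regexes above can be tried outside grafana before pasting them into a variable definition. The sketch below applies the same patterns with `sed -E`; grafana itself evaluates JavaScript regular expressions, so this is only an approximation that happens to agree for these simple patterns. The helper names `label_value` and `instance_host` are hypothetical.

```shell
#!/bin/sh
# label_value SERIES LABEL: print the value of LABEL in a series string,
# mirroring the grafana regex /.*LABEL="([^"]*).*/ (hypothetical helper).
label_value() {
    printf '%s\n' "$1" | sed -E "s/.*$2=\"([^\"]*).*/\\1/"
}

# instance_host SERIES: print the instance value without its port,
# mirroring /.*instance="([^"]*):.*/ (hypothetical helper).
instance_host() {
    printf '%s\n' "$1" | sed -E 's/.*instance="([^"]*):.*/\1/'
}

series='up{instance="101.22.34.22:8080",item="redis"}'
label_value "$series" item        # prints redis
label_value "$series" instance    # prints 101.22.34.22:8080
instance_host "$series"           # prints 101.22.34.22
```

The capture group stops at the first closing quote because `[^"]*` cannot cross it; adding `:` after the group forces it to stop before the port.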
3. blackbox_exporter
Running blackbox_exporter in docker
docker pull prom/blackbox-exporter
docker run -it -p 9115:9115 -v /root/blackbox.yml:/etc/blackbox_exporter/config.yml prom/blackbox-exporter
# /root/blackbox.yml is your own configuration file
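For readers who do not yet have a blackbox.yml, the sketch below writes a minimal configuration with a single HTTP module and builds the exporter's `/probe` URL for a target. The module name `http_2xx`, the 5s timeout, and the helper names are assumptions for illustration, not taken from the original notes.

```shell
#!/bin/sh
# write_blackbox_example FILE: hypothetical helper that writes a minimal
# blackbox_exporter configuration with one HTTP probe module.
write_blackbox_example() {
    cat > "$1" <<'EOF'
modules:
  http_2xx:
    prober: http
    timeout: 5s
EOF
}

# probe_url EXPORTER TARGET [MODULE]: print the exporter's /probe URL.
# The target is passed through as-is (no URL encoding), which is enough
# for plain host names and simple URLs.
probe_url() {
    printf 'http://%s/probe?target=%s&module=%s\n' "$1" "$2" "${3:-http_2xx}"
}
```

Once the container is running, `curl "$(probe_url localhost:9115 example.com)"` should return the probe metrics, including `probe_success`.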
4. cadvisor
docker run \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:ro \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--volume=/dev/disk/:/dev/disk:ro \
--publish=8090:8080 \
--detach=true \
--name=cadvisor \
google/cadvisor
5. alertmanager
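- Running in docker
# Pull the image
docker pull prom/alertmanager
# Run the image; a.yml is the alertmanager configuration file in the current directory
docker run -it -p 9093:9093 -v `pwd`/a.yml:/etc/alertmanager/alertmanager.yml --name alertmanager prom/alertmanager:latest
- Running directly on Linux
# Download the release package (also listed on the Prometheus website)
wget https://github.com/prometheus/alertmanager/releases/download/v0.15.2/alertmanager-0.15.2.linux-amd64.tar.gz
# Extract it, then change into the extracted directory
tar xzf alertmanager-0.15.2.linux-amd64.tar.gz
cd alertmanager-0.15.2.linux-amd64
./alertmanager --config.file=<alertmanager_config_file.yml>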
- Example configuration for an alert webhook URL
global:
  resolve_timeout: 5m
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'
receivers:
- name: 'web.hook'
  webhook_configs:
  - url: 'http://192.168.255.1:8080/alert/'  # URL called when an alert fires
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname', 'dev', 'instance']
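To check that the webhook receiver is actually called, a test alert can be posted to alertmanager by hand. The sketch below builds the JSON body and sends it with curl. The helper names are hypothetical, and the endpoint is an assumption: the v0.15.x release referenced in these notes serves `/api/v1/alerts`, while newer releases expose `/api/v2/alerts` instead.

```shell
#!/bin/sh
# test_alert_json NAME SEVERITY: print a one-element alert list in the JSON
# shape alertmanager accepts on its alerts endpoint (hypothetical helper).
test_alert_json() {
    printf '[{"labels":{"alertname":"%s","severity":"%s"},"annotations":{"summary":"manual test alert"}}]\n' "$1" "$2"
}

# send_test_alert HOST:PORT NAME SEVERITY: POST the alert with curl.
# Assumption: the server exposes /api/v1/alerts (see the note above).
send_test_alert() {
    test_alert_json "$2" "$3" |
        curl -fsS -X POST -H 'Content-Type: application/json' \
             --data-binary @- "http://$1/api/v1/alerts"
}
```

For example, `send_test_alert 192.168.255.129:9093 ManualTest warning` should cause alertmanager to call the configured webhook URL once `group_wait` has elapsed.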
Example: sending email
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.163.com:25'
  smtp_from: 'XXX@163.com'
  smtp_auth_username: 'XXX@163.com'
  smtp_auth_password: '12345678'
  smtp_require_tls: false
#templates:
#- 'templates/*.tmpl'
route:
  receiver: 'default-receiver'
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h
receivers:
- name: 'default-receiver'
  email_configs:
  - to: '87654321@qq.com, 12345678@qq.com'
    # html: '{{ template "alert.html" . }}'
    headers: { Subject: "[WARN] email test" }
6. prometheus
6.1 Running prometheus server in docker
Example prometheus server configuration file, prometheus.yml
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'codelab-monitor'
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['192.168.255.129:9090']
        labels:
          instance: 'prometheus'
  - job_name: 'linux1'
    static_configs:
      - targets: ['192.168.255.128:9100']
        labels:
          instance: 'sys1'
  - job_name: 'linux2'
    static_configs:
      - targets: ['192.168.255.129:9100']
        labels:
          instance: 'sys2'
alerting:
  alertmanagers:
  - static_configs:
    - targets: ["192.168.255.129:9093"]
# When running in Docker, the alert rules file must be available at this path inside the container: /etc/prometheus/rules.yml
rule_files:
  - /etc/prometheus/rules.yml
Example prometheus server alert rules file, rules.yml
groups:
- name: test-rule
  rules:
  - alert: ServerDown
    expr: up == 0
    for: 2m
    labels:
      team: node
    annotations:
      summary: "{{$labels.instance}}: Down"
      description: "{{$labels.instance}}: Down (current value is: {{ $value }})"
  - alert: UrlConnectable
    expr: probe_success == 0
    for: 2m
    labels:
      team: blackbox
    annotations:
      summary: "{{$labels.instance}}: Lost connection"
      description: "{{$labels.instance}}: Lost connection (current value is: {{ $value }})"
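Check for syntax errors: ./promtool check config prometheus.yml (Prometheus 2.x syntax, matching the groups-style rules file above; 1.x used check-config)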
Running prometheus server
# prometheus.yml and rules.yml both live in /root/prometheus/ (the same directory)
docker run -it -p 9090:9090 -v /root/