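Collection methods: reading the /proc directory, making system calls, running command-line tools, remote black-box probing, and remotely pulling data over a specific protocol
Reading /proc
/proc: an in-memory pseudo-filesystem that exposes Linux runtime data, such as memory data, NIC traffic, and machine load
Memory metrics:
- Gauge type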
cat /proc/meminfo
MemTotal: 7954676 kB
MemFree: 211136 kB
MemAvailable: 2486688 kB
Buffers: 115068 kB
Cached: 2309836 kB
...
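To turn this output into gauges, a collector only has to split each line on the colon. The following is a minimal sketch in Python, not a production agent; the function names `parse_meminfo`, `read_meminfo`, and `mem_available_percent` are hypothetical, and the sample text repeats the values shown above.

```python
SAMPLE = """\
MemTotal:        7954676 kB
MemFree:          211136 kB
MemAvailable:    2486688 kB
Buffers:          115068 kB
Cached:          2309836 kB
"""


def parse_meminfo(text):
    """Return {field: integer value} for each 'Name: value [kB]' line.

    Most fields are reported in kB; a few (for example HugePages_Total)
    are plain counts and carry no unit.
    """
    gauges = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip anything that is not 'Name: <digits> ...', such as a '...' line.
        if not sep or not parts or not parts[0].isdigit():
            continue
        gauges[name.strip()] = int(parts[0])
    return gauges


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())


def mem_available_percent(gauges):
    """Share of total memory that is still available, in percent."""
    return 100.0 * gauges["MemAvailable"] / gauges["MemTotal"]


if __name__ == "__main__":
    g = parse_meminfo(SAMPLE)
    print(g["MemTotal"], round(mem_available_percent(g), 2))
```

Because these are gauges, each sample can be reported as-is; no second-pass calculation is needed.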
NIC traffic metrics:
- Counter type: cumulative value since boot
- A rate is derived with a second-pass calculation such as irate
head -n3 /proc/net/dev
-- Receive = inbound, Transmit = outbound
Inter-| Receive | Transmit
face | bytes packets errs drop fifo frame compressed multicast | bytes packets errs drop fifo colls carrier compressed
eth0: 697407964307 2580235035 0 0 0 0 0 0 1969289573661 3137865547 0 0 0 0 0 0
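Since these values are counters, a single read says little; the useful number is the difference between two reads divided by the interval. Below is a rough Python sketch of that second-pass calculation. It is similar in spirit to irate but is not the PromQL implementation; the names `parse_net_dev` and `per_second_rate` are hypothetical, and treating a decreasing counter as a restart from zero is an assumption of this sketch.

```python
SAMPLE_LINE = (
    "eth0: 697407964307 2580235035 0 0 0 0 0 0 "
    "1969289573661 3137865547 0 0 0 0 0 0"
)


def parse_net_dev(text):
    """Return {iface: {rx_bytes, rx_packets, tx_bytes, tx_packets}}.

    Each interface line has 8 receive columns followed by 8 transmit columns.
    The two header lines contain no colon and are skipped.
    """
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        iface, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) < 16 or not fields[0].isdigit():
            continue
        values = [int(v) for v in fields[:16]]
        counters[iface.strip()] = {
            "rx_bytes": values[0],
            "rx_packets": values[1],
            "tx_bytes": values[8],
            "tx_packets": values[9],
        }
    return counters


def per_second_rate(prev, cur, seconds):
    """Rate between two counter samples taken `seconds` apart."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta = cur - prev
    if delta < 0:
        # Assumption: the counter was reset (e.g. reboot) and restarted at 0.
        delta = cur
    return delta / seconds


if __name__ == "__main__":
    print(parse_net_dev(SAMPLE_LINE)["eth0"])
    print(per_second_rate(1000, 4000, 15))
```

In a live collector the two samples would come from reading /proc/net/dev twice, a fixed interval apart.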
Command line
Get the listening state of port 9090:
ss -tln | grep 9090
Check partition usage:
df -k
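Drawbacks:
- Portability: the output format of a command-line tool differs from version to version
- Performance: every command has to fork a process, which is relatively slow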
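Both drawbacks can be avoided by asking the kernel directly instead of parsing command output, which is the system-call route listed among the collection methods. The following is a small Python sketch, assuming the standard-library `shutil.disk_usage` (which relies on statvfs on Unix) is an acceptable stand-in for `df -k`; the function name `disk_used_percent` is hypothetical.

```python
import shutil


def disk_used_percent(path="/"):
    """Return used space on the filesystem containing `path`, in percent.

    No process is forked and no text is parsed, so the result does not
    depend on the version of any command-line tool.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        # Some pseudo-filesystems report a size of zero.
        return 0.0
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    print(round(disk_used_percent("/"), 1))
```

The figure can differ slightly from the Use% column of df, which treats blocks reserved for root differently.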
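Black-box probing
Black-box monitoring: treat the monitored target as a black box; without knowing how it works internally, run only simple probes against it
- Probe methods: ICMP, TCP, HTTP
- Products: Blackbox Exporter, Categraf, Datadog-Agent
Ping:
- ICMP protocol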
ping -c 3 www.baidu.com
#....
# 3 packets sent, packet loss rate
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 1.382/1.393/1.405/0.009 ms
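The two summary lines carry the metrics a probe usually reports: packet loss and round-trip time. Here is a minimal Python sketch that extracts them, assuming the summary format shown above (as printed by the Linux iputils ping); other ping implementations word these lines differently. The function name `parse_ping_summary` is hypothetical.

```python
import re

SAMPLE = """\
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 1.382/1.393/1.405/0.009 ms
"""

_LOSS_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received,"
    r".*?([\d.]+)% packet loss"
)
_RTT_RE = re.compile(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms")


def parse_ping_summary(text):
    """Return loss and RTT figures found in ping's summary lines."""
    result = {}
    m = _LOSS_RE.search(text)
    if m:
        result["transmitted"] = int(m.group(1))
        result["received"] = int(m.group(2))
        result["loss_percent"] = float(m.group(3))
    m = _RTT_RE.search(text)
    if m:
        # The RTT line is absent when every packet is lost.
        result["rtt_min_ms"] = float(m.group(1))
        result["rtt_avg_ms"] = float(m.group(2))
        result["rtt_max_ms"] = float(m.group(3))
    return result


if __name__ == "__main__":
    print(parse_ping_summary(SAMPLE))
```

This approach still forks the ping command and depends on its output format, so it inherits the command-line drawbacks described earlier.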
TCP probe:
telnet www.baidu.com 22
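telnet is interactive, so an automated probe normally just attempts the TCP handshake itself. A minimal Python sketch of such a probe follows, using only the standard `socket` module; the function name `tcp_probe` and the 3-second default timeout are assumptions of this sketch.

```python
import socket
import time


def tcp_probe(host, port, timeout=3.0):
    """Try to open a TCP connection; return (success, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # create_connection resolves the name and completes the handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False, time.monotonic() - start


if __name__ == "__main__":
    ok, elapsed = tcp_probe("127.0.0.1", 9090)
    print(ok, round(elapsed, 4))
```

The boolean maps naturally to an "up" gauge and the elapsed time to a connection-latency gauge.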
HTTP probe: test connectivity and check whether the response body contains "success"
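An HTTP probe of this kind can be sketched with the Python standard library alone. The function name `http_probe` and the shape of the returned dictionary are hypothetical, and bypassing any proxy configured in the environment is a design assumption made so that the probe measures the target directly.

```python
import urllib.error
import urllib.request

# An empty ProxyHandler ignores http_proxy/https_proxy from the environment.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_probe(url, expect="success", timeout=5.0):
    """GET `url`; report reachability, status code, and body match."""
    try:
        with _OPENER.open(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return {
                "reachable": True,
                "status": resp.status,
                "body_ok": expect in body,
            }
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return {"reachable": True, "status": exc.code, "body_ok": False}
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return {"reachable": False, "status": None, "body_ok": False}


if __name__ == "__main__":
    print(http_probe("http://127.0.0.1:9090/"))
```

Keeping reachability, status, and body match as separate fields lets an alert distinguish "service is down" from "service answers but is unhealthy".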
Pulling over a protocol
Pull the /_cluster/health endpoint of Elasticsearch
curl -ucpucode:cpu123 http://cpucode:9200/_cluster/health -s | jq .
{
"cluster_name": "elasticsearch-cluster",
"status": "yellow",
"timed_out": false,
"number_of_nodes": 3,
"number_of_data_nodes": 3,
"active_primary_shards": 430,
"active_shards": 430,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 430,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 50
}
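A response like this has to be flattened into numeric metrics before it can be stored or alerted on. The following Python sketch shows one way to do that; the function name `health_to_metrics` and the numeric encoding of `status` (green = 0, yellow = 1, red = 2) are assumptions of this sketch, not something Elasticsearch defines.

```python
import json

# Hypothetical encoding so the status can be stored as a gauge.
STATUS_CODE = {"green": 0, "yellow": 1, "red": 2}

SAMPLE = """
{
  "cluster_name": "elasticsearch-cluster",
  "status": "yellow",
  "number_of_nodes": 3,
  "active_shards": 430,
  "unassigned_shards": 430,
  "active_shards_percent_as_number": 50
}
"""


def health_to_metrics(body):
    """Pick numeric metrics out of a /_cluster/health JSON body."""
    doc = json.loads(body)
    return {
        # -1 marks a status string this sketch does not know about.
        "status_code": STATUS_CODE.get(doc.get("status"), -1),
        "number_of_nodes": doc["number_of_nodes"],
        "active_shards": doc["active_shards"],
        "unassigned_shards": doc["unassigned_shards"],
        "active_shards_percent": doc["active_shards_percent_as_number"],
    }


if __name__ == "__main__":
    print(health_to_metrics(SAMPLE))
```

With the sample above, the yellow status and the 430 unassigned shards both surface as plain numbers that a threshold rule can act on.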
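Target-specific commands
Check MySQL connection status:
- Threads_connected: number of connections currently open
- Max_used_connections: the highest number of simultaneous connections ever reached
- Connections: total number of connection attempts received so far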
show global status like '%onn%';
+-----------------------------------------------+---------------------+
| Variable_name | Value |
+-----------------------------------------------+---------------------+
| Aborted_connects | 3212 |
| Connection_errors_accept | 0 |
| Connection_errors_internal | 0 |
| Connection_errors_max_connections | 0 |
| Connection_errors_peer_address | 0 |
| Connection_errors_select | 0 |
| Connection_errors_tcpwrap | 0 |
| Connections | 3281 |
| Locked_connects | 0 |
| Max_used_connections | 13 |
| Max_used_connections_time | 2022-10-30 16:41:35 |
| Performance_schema_session_connect_attrs_lost | 0 |
| Ssl_client_connects | 0 |
| Ssl_connect_renegotiates | 0 |
| Ssl_finished_connects | 0 |
| Threads_connected | 1 |
+-----------------------------------------------+---------------------+
16 rows in set (0.01 sec)
Check MySQL connection variables:
max_connections: maximum number of connections; default: 151
show global variables like '%onn%';
+-----------------------------------------------+-----------------+
| Variable_name | Value |
+-----------------------------------------------+-----------------+
| character_set_connection | utf8 |
| collation_connection | utf8_general_ci |
| connect_timeout | 10 |
| disconnect_on_expired_password | ON |
| init_connect | |
| max_connect_errors | 100 |
| max_connections | 5000 |
| max_user_connections | 0 |
| performance_schema_session_connect_attrs_size | 512 |
+-----------------------------------------------+-----------------+
9 rows in set (0.01 sec)
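The status values and the variables are most useful together: Threads_connected divided by max_connections shows how close the server is to refusing new connections. Below is a rough Python sketch that works on the text tables printed by the mysql client. A real collector would more likely run the queries through a client library and read rows directly; the names `parse_mysql_table` and `connection_usage` are hypothetical.

```python
STATUS_SAMPLE = """\
| Variable_name        | Value |
| Connections          | 3281  |
| Max_used_connections | 13    |
| Threads_connected    | 1     |
"""

VARIABLES_SAMPLE = """\
| Variable_name   | Value |
| init_connect    |       |
| max_connections | 5000  |
"""


def parse_mysql_table(text):
    """Return {name: value} from a two-column mysql client text table.

    Border lines ('+----+') and the header row are skipped. Values that
    themselves contain '|' are not handled by this sketch.
    """
    rows = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Variable_name":
            continue
        rows[cells[0]] = cells[1]
    return rows


def connection_usage(status, variables, key="Threads_connected"):
    """Fraction of max_connections used by `key` (current or peak)."""
    return int(status[key]) / int(variables["max_connections"])


if __name__ == "__main__":
    st = parse_mysql_table(STATUS_SAMPLE)
    var = parse_mysql_table(VARIABLES_SAMPLE)
    print(connection_usage(st, var))
    print(connection_usage(st, var, key="Max_used_connections"))
```

Passing `key="Max_used_connections"` gives the historical peak as a fraction of the limit, which helps judge whether max_connections is sized sensibly.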
Check Redis memory metrics:
info memory
# Memory
used_memory:1345568
used_memory_human:1.28M
used_memory_rss:3653632
used_memory_rss_human:3.48M
used_memory_peak:1504640
used_memory_peak_human:1.43M
used_memory_peak_perc:89.43%
used_memory_overhead:1103288
used_memory_startup:1095648
used_memory_dataset:242280
used_memory_dataset_perc:96.94%
...
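The INFO output is a list of `key:value` lines under `#` section headers, so it parses easily. A minimal Python sketch follows; the function names `parse_redis_info` and `rss_to_used_ratio` are hypothetical, and the sample repeats a few of the values shown above.

```python
SAMPLE = """\
# Memory
used_memory:1345568
used_memory_human:1.28M
used_memory_rss:3653632
used_memory_peak:1504640
used_memory_peak_perc:89.43%
"""


def parse_redis_info(text):
    """Return {key: value-as-string} from INFO output."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, '# Section' headers, and anything without a colon.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


def rss_to_used_ratio(info):
    """Memory held by the process (RSS) relative to memory Redis allocated."""
    return int(info["used_memory_rss"]) / int(info["used_memory"])


if __name__ == "__main__":
    info = parse_redis_info(SAMPLE)
    print(info["used_memory"], round(rss_to_used_ratio(info), 2))
```

Redis reports a comparable figure itself as mem_fragmentation_ratio; computing it from the raw byte counts is shown here only to illustrate deriving a metric from pulled data.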
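This article covered reading the Linux /proc directory to obtain memory and network traffic data, using command-line tools to check system state, and using black-box probes (ICMP, TCP, HTTP) to monitor service availability. It also showed how to pull data over a specific protocol, such as the health status of Elasticsearch, and how to query MySQL connection data and Redis memory data with target-specific commands.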
2021




