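Hikari provides a setMetricRegistry method that lets us inject a Dropwizard Metrics MetricRegistry to collect connection pool metrics. This makes it fairly easy to monitor the runtime state of the pool.
About MetricRegistry
For background on MetricRegistry, see my earlier post on the topic.
Setting the MetricRegistry
When configuring the data source, we can create a MetricRegistry that fits our needs and pass it in through the configuration.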
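import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Slf4jReporter;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import java.util.concurrent.TimeUnit;
import javax.sql.DataSource;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class DataSourceConfig {
    private static final Logger LOGGER = LoggerFactory.getLogger(DataSourceConfig.class);

    @Value("${datasource.name:test}")
    String poolName;

    @Bean
    public DataSource getDataSource() {
        HikariConfig hikariConfig = new HikariConfig();
        hikariConfig.setJdbcUrl("jdbc:postgresql://localhost:5433/test?charactorEncoding=utf-8&useSSL=false");
        hikariConfig.setUsername("postgres");
        hikariConfig.setPassword("postgres");
        hikariConfig.setDriverClassName("org.postgresql.Driver");
        hikariConfig.setAutoCommit(true);
        // The pool name becomes the prefix of every metric name
        hikariConfig.setPoolName(poolName);
        hikariConfig.setMetricRegistry(initMetricRegistry(poolName));
        return new HikariDataSource(hikariConfig);
    }

    /**
     * Builds the MetricRegistry for the pool and starts a reporter that logs its metrics.
     *
     * @param poolName pool name, used as the metric name prefix
     * @return the registry to hand to HikariCP
     */
    public MetricRegistry initMetricRegistry(String poolName) {
        MetricRegistry metricRegistry = new MetricRegistry();
        Slf4jReporter reporter = Slf4jReporter.forRegistry(metricRegistry)
                // Report only the metrics that belong to this pool
                .filter((name, metric) -> name.startsWith(poolName + ".pool"))
                // Logger that receives the metric lines
                .outputTo(LOGGER)
                // Unit used for rates (events per second)
                .convertRatesTo(TimeUnit.SECONDS)
                // Unit used for durations
                .convertDurationsTo(TimeUnit.MILLISECONDS)
                .build();
        // Log the metrics every 30 seconds
        reporter.start(30, TimeUnit.SECONDS);
        return metricRegistry;
    }
}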
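initMetricRegistry configures the filter that selects which metrics are reported, the Logger they are written to, the time unit used for rates (TPS-style figures), and the time unit used for durations. The last call sets how often the statistics are written out.
If you build the data source directly, you can also enable metric collection on the HikariDataSource itself with dataSource.setMetricRegistry().
Output
The reporter eventually prints output like the following. The layout of each line depends on the logging configuration of the project.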
[2022-05-11 13:44:11,880][traceId:][metrics-logger-reporter-1-thread-1][INFO ][urce.DynamicDataSourceRegister.log:373]-type=GAUGE, name=testdb.pool.ActiveConnections, value=0
[2022-05-11 13:44:11,880][traceId:][metrics-logger-reporter-1-thread-1][INFO ][urce.DynamicDataSourceRegister.log:373]-type=GAUGE, name=testdb.pool.IdleConnections, value=5
[2022-05-11 13:44:11,880][traceId:][metrics-logger-reporter-1-thread-1][INFO ][urce.DynamicDataSourceRegister.log:373]-type=GAUGE, name=testdb.pool.MaxConnections, value=100
[2022-05-11 13:44:11,880][traceId:][metrics-logger-reporter-1-thread-1][INFO ][urce.DynamicDataSourceRegister.log:373]-type=GAUGE, name=testdb.pool.MinConnections, value=5
[2022-05-11 13:44:11,880][traceId:][metrics-logger-reporter-1-thread-1][INFO ][urce.DynamicDataSourceRegister.log:373]-type=GAUGE, name=testdb.pool.PendingConnections, value=0
[2022-05-11 13:44:11,880][traceId:][metrics-logger-reporter-1-thread-1][INFO ][urce.DynamicDataSourceRegister.log:373]-type=GAUGE, name=testdb.pool.TotalConnections, value=5
[2022-05-11 13:44:11,880][traceId:][metrics-logger-reporter-1-thread-1][INFO ][urce.DynamicDataSourceRegister.log:373]-type=HISTOGRAM, name=testdb.pool.ConnectionCreation, count=19, min=5, max=7, mean=5.7644448194490385, stddev=0.7224233113758666, median=6.0, p75=6.0, p95=7.0, p98=7.0, p99=7.0, p999=7.0
[2022-05-11 13:44:11,880][traceId:][metrics-logger-reporter-1-thread-1][INFO ][urce.DynamicDataSourceRegister.log:373]-type=HISTOGRAM, name=testdb.pool.Usage, count=23, min=0, max=464, mean=18.999946773090524, stddev=0.027297873535122703, median=19.0, p75=19.0, p95=19.0, p98=19.0, p99=19.0, p999=19.0
[2022-05-11 13:44:11,880][traceId:][metrics-logger-reporter-1-thread-1][INFO ][urce.DynamicDataSourceRegister.log:373]-type=METER, name=testdb.pool.ConnectionTimeoutRate, count=0, mean_rate=0.0, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second
[2022-05-11 13:44:11,880][traceId:][metrics-logger-reporter-1-thread-1][INFO ][urce.DynamicDataSourceRegister.log:373]-type=TIMER, name=testdb.pool.Wait, count=23, min=0.0034, max=2.8665, mean=1.4988036080240779, stddev=0.0018504058560577733, median=1.4988, p75=1.4988, p95=1.4988, p98=1.4988, p99=1.4988, p999=1.4988, mean_rate=0.003250194916506673, m1=1.5977015120292538E-21, m5=5.569177197236021E-7, m15=3.308753844558376E-4, rate_unit=events/second, duration_unit=milliseconds
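If these lines need to be consumed by a script or an alerting job, the key=value payload after the logger prefix is simple to parse. The following is a minimal sketch that uses only the JDK. The class name MetricLineParser is hypothetical, and the code assumes that the payload starts at `type=` and that fields are separated by a comma and a space, as in the sample above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: parses the key=value payload of one reporter log line. */
final class MetricLineParser {

    /** Returns the fields in order of appearance, or an empty map if no payload is found. */
    static Map<String, String> parse(String logLine) {
        Map<String, String> fields = new LinkedHashMap<>();
        // Skip the logging prefix; its layout depends on the project's log configuration
        int start = logLine.indexOf("type=");
        if (start < 0) {
            return fields;
        }
        for (String pair : logLine.substring(start).split(", ")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                fields.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "[2022-05-11 13:44:11,880][traceId:][metrics-logger-reporter-1-thread-1][INFO ]"
                + "[urce.DynamicDataSourceRegister.log:373]-type=GAUGE, name=testdb.pool.IdleConnections, value=5";
        Map<String, String> fields = parse(line);
        System.out.println(fields.get("name") + " = " + fields.get("value"));
    }
}
```

Values are kept as strings on purpose, because gauges, histograms, meters, and timers carry different sets of fields.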
Metric reference
Metric names are printed in the format
{pool name}.pool.{metric}
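Metric | Meaning | Operational use |
---|---|---|
ActiveConnections | Number of connections currently in use | If this stays at the maximum pool size for long periods, consider increasing the pool size |
IdleConnections | Number of idle connections | If this is consistently high, consider lowering the configured minimum number of connections |
MaxConnections | Configured maximum number of connections | |
MinConnections | Configured minimum number of connections | |
PendingConnections | Number of threads waiting for a connection | If this keeps climbing, the pool has no idle connections left |
TotalConnections | Current total number of connections | |
ConnectionCreation | Time taken to create a new connection | Mainly reflects the network latency between this service and the database |
ConnectionTimeoutRate | Rate of timeouts while waiting to obtain a connection from the pool | If timeouts are frequent, check whether the pool is exhausted and whether the database or the network is misbehaving |
Usage | How long a connection is held, from being borrowed until it is returned to the pool | Shows how long the application keeps connections out of the pool; large values point to slow queries or long transactions |
Wait | Time spent waiting to obtain a connection | Read together with PendingConnections to assess the state of the pool |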
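"Stays at the maximum for long periods" can be made concrete by sampling the ActiveConnections gauge at a fixed interval and measuring how often the pool was fully in use. The following sketch is one way to do that with plain Java. The class name PoolSaturation and the idea of keeping samples in a list are assumptions for illustration, not something HikariCP provides.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: measures how often sampled ActiveConnections reached MaxConnections. */
final class PoolSaturation {

    /** Fraction of samples in which every connection was in use (0.0 when there are no samples). */
    static double saturatedFraction(List<Integer> activeSamples, int maxConnections) {
        if (activeSamples.isEmpty()) {
            return 0.0;
        }
        long saturated = activeSamples.stream()
                .filter(active -> active >= maxConnections)
                .count();
        return (double) saturated / activeSamples.size();
    }

    public static void main(String[] args) {
        // Example samples, e.g. one ActiveConnections reading per reporting interval
        List<Integer> samples = Arrays.asList(100, 100, 50, 100);
        System.out.println("saturated fraction: " + saturatedFraction(samples, 100));
    }
}
```

What fraction counts as "too often" depends on the workload, so any alert threshold built on this value has to be chosen per service.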
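Reading Wait together with PendingConnections
Many threads queued, short waits: data access is frequent, and increasing the pool size is worth trying.
Few threads queued, long waits: some connections are held by slow queries or large transactions.
Many threads queued, long waits: the load may simply be too high with many slow queries, but frequent slow queries usually indicate a problem in the application or the business logic, so both should be reviewed. A network fault that slows down every query can also produce this pattern.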
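The three cases above can be turned into a small decision function. This is only a sketch: the class name PoolDiagnosis, the Verdict names, and both thresholds are hypothetical and need to be tuned for a real service.

```java
/** Hypothetical helper: maps PendingConnections and Wait to the cases described above. */
final class PoolDiagnosis {

    enum Verdict {
        HEALTHY,
        POOL_TOO_SMALL,                     // many queued, short waits
        SLOW_QUERIES_OR_LARGE_TRANSACTIONS, // few queued, long waits
        OVERLOAD_OR_NETWORK_PROBLEM         // many queued, long waits
    }

    // Assumed thresholds, chosen only for illustration
    static final int PENDING_THRESHOLD = 10;
    static final double WAIT_MILLIS_THRESHOLD = 100.0;

    static Verdict diagnose(int pendingConnections, double waitMillis) {
        boolean manyQueued = pendingConnections >= PENDING_THRESHOLD;
        boolean longWait = waitMillis >= WAIT_MILLIS_THRESHOLD;
        if (manyQueued && longWait) {
            return Verdict.OVERLOAD_OR_NETWORK_PROBLEM;
        }
        if (manyQueued) {
            return Verdict.POOL_TOO_SMALL;
        }
        if (longWait) {
            return Verdict.SLOW_QUERIES_OR_LARGE_TRANSACTIONS;
        }
        return Verdict.HEALTHY;
    }

    public static void main(String[] args) {
        System.out.println(diagnose(25, 3.0));
        System.out.println(diagnose(1, 450.0));
        System.out.println(diagnose(25, 450.0));
    }
}
```

In practice the wait value would come from a percentile of the Wait timer (for example p99) rather than from a single observation, so that one slow request does not change the verdict.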
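Output field reference
Field | Meaning |
---|---|
count | Number of recorded events or values |
min | Smallest recorded value |
max | Largest recorded value |
mean | Mean |
stddev | Standard deviation |
median | Median |
p75 | 75th percentile |
p95 | 95th percentile |
p98 | 98th percentile |
p99 | 99th percentile |
p999 | 99.9th percentile |
mean_rate | Average event rate since the metric was created |
m1 | 1-minute moving average rate |
m5 | 5-minute moving average rate |
m15 | 15-minute moving average rate |
duration_unit | Unit of the duration values |
rate_unit | Unit of the rate values (events/second means events per second) |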
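The m1, m5, and m15 fields are exponentially weighted moving averages, so they react to recent activity and decay toward zero when no events arrive, which is consistent with the tiny m1 value for Wait in the sample output. The sketch below illustrates the idea in plain Java. It is a simplified illustration and not the library's code; the class name MovingRate and the fixed 5-second tick are assumptions.

```java
/** Hypothetical illustration of an exponentially weighted moving average rate. */
final class MovingRate {
    // Assumed sampling interval between two tick() calls
    private static final double TICK_SECONDS = 5.0;

    private final double alpha;
    private double ratePerSecond;
    private boolean initialized;
    private long uncounted;

    /** @param windowMinutes averaging window, e.g. 1, 5, or 15 */
    MovingRate(double windowMinutes) {
        this.alpha = 1 - Math.exp(-TICK_SECONDS / 60.0 / windowMinutes);
    }

    /** Records n events that happened since the last tick. */
    void mark(long n) {
        uncounted += n;
    }

    /** Folds the events of the last interval into the moving average. */
    void tick() {
        double instantRate = uncounted / TICK_SECONDS;
        uncounted = 0;
        if (initialized) {
            ratePerSecond += alpha * (instantRate - ratePerSecond);
        } else {
            // The first interval seeds the average
            ratePerSecond = instantRate;
            initialized = true;
        }
    }

    double ratePerSecond() {
        return ratePerSecond;
    }

    public static void main(String[] args) {
        MovingRate m1 = new MovingRate(1);
        m1.mark(50);
        m1.tick();
        System.out.println("after burst: " + m1.ratePerSecond());
        for (int i = 0; i < 12; i++) {
            m1.tick(); // one idle minute
        }
        System.out.println("after one idle minute: " + m1.ratePerSecond());
    }
}
```

With a 1-minute window, one idle minute scales the rate by a factor of e^-1, which is why a burst of activity fades from m1 much faster than from m15.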
Reading pool metrics manually
Sometimes the business requires us to read the metric data directly from the DataSource and process it ourselves.
@Autowired
DataSource dataSource;

@Value("${datasource.name:test}")
String poolName;

// Note: `log` is assumed to be an SLF4J logger declared on the enclosing class
@RequestMapping("/getInfo")
public String getInfo() throws SQLException {
    // Metric names have the form {poolName}.pool.{metric}
    String prefix = poolName + ".pool.";
    // getMetricRegistry() returns Object, hence the cast
    MetricRegistry metricRegistry = (MetricRegistry) ((HikariDataSource) dataSource).getMetricRegistry();

    // Gauges: current state of the pool
    SortedMap<String, Gauge> gauges = metricRegistry.getGauges();
    log.info("ActiveConnections : " + gauges.get(prefix + "ActiveConnections").getValue());
    log.info("IdleConnections : " + gauges.get(prefix + "IdleConnections").getValue());
    log.info("MaxConnections : " + gauges.get(prefix + "MaxConnections").getValue());
    log.info("MinConnections : " + gauges.get(prefix + "MinConnections").getValue());
    log.info("PendingConnections : " + gauges.get(prefix + "PendingConnections").getValue());
    log.info("TotalConnections : " + gauges.get(prefix + "TotalConnections").getValue());

    // Histograms: connection creation time and connection usage time
    SortedMap<String, Histogram> histograms = metricRegistry.getHistograms();
    Histogram connectionCreation = histograms.get(prefix + "ConnectionCreation");
    // The snapshot exposes min, max, mean, and the percentiles
    Snapshot snapshot = connectionCreation.getSnapshot();
    log.info("ConnectionCreation : count=" + connectionCreation.getCount() + ", mean=" + snapshot.getMean());
    Histogram usage = histograms.get(prefix + "Usage");
    log.info("Usage : " + usage.getCount());

    // Meter: timeouts while obtaining a connection
    SortedMap<String, Meter> meters = metricRegistry.getMeters();
    Meter timeoutRate = meters.get(prefix + "ConnectionTimeoutRate");
    log.info("ConnectionTimeoutRate : " + timeoutRate.getCount());

    // Timer: time spent waiting for a connection
    SortedMap<String, Timer> timers = metricRegistry.getTimers();
    Timer wait = timers.get(prefix + "Wait");
    log.info("Wait : " + wait.getCount());
    return "";
}