Original article: http://cookbook.logstash.net/recipes/statsd-metrics/
----
Goal: help people understand StatsD metrics
Audience: anyone who wants to use the StatsD plugin
Preface: StatsD Metrics
I will use examples to explain what these StatsD metrics mean: increment, timing, count
Example scenario 1 (count metric)
Over a 10-second period, we receive one number per second. Let's assume the start time is "t1". In table form, it looks like this:
Moment | Number |
---- | :------------ |
t1 | 1 |
t1+1s | 2 |
t1+2s | 3 |
t1+3s | 4 |
t1+4s | 5 |
t1+5s | 6 |
t1+6s | 7 |
t1+7s | 8 |
t1+8s | 9 |
t1+9s | 10 |
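The value of the count metric for our 10-second period is the sum of all these numbers:
1+2+3+4+5+6+7+8+9+10 = 55
Example scenario 2 (increment metric)
Over a 10-second period, we receive one status per second, where each status is a number.
Let's take HTTP status codes as an example: 200, 404, 302. Again, we assume the start time is "t1".
In table form, it looks like this: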
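The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the idea, not how StatsD is implemented; the names `received` and `count_metric` are made up for this example.

```python
# Values received once per second during the 10-second period (from the table).
received = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


def count_metric(values):
    """Return the sum of every value received during one period."""
    return sum(values)


print(count_metric(received))  # 55
```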
Moment | Status |
---- | :------------ |
t1 | 200 |
t1+1s | 200 |
t1+2s | 404 |
t1+3s | 200 |
t1+4s | 200 |
t1+5s | 302 |
t1+6s | 200 |
t1+7s | 302 |
t1+8s | 200 |
t1+9s | 200 |
The increment metric counts how many times each status occurred during the 10-second period:
- Status 200: 7 times
- Status 404: 1 time
- Status 302: 2 times
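As a rough sketch, the same tally can be reproduced with one counter per status that goes up by one on each event. The names `statuses` and `increment_metric` are hypothetical and exist only for this illustration.

```python
from collections import Counter

# Statuses received once per second during the 10-second period (from the table).
statuses = [200, 200, 404, 200, 200, 302, 200, 302, 200, 200]


def increment_metric(events):
    """Keep one counter per status and add 1 each time that status is seen."""
    counters = Counter()
    for status in events:
        counters[status] += 1
    return counters


for status, times in increment_metric(statuses).items():
    print(f"Status {status}: {times}")
```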
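Example scenario 3 (timing metric)
This scenario combines examples 1 and 2. Over a 10-second period we receive one status per second, but each status also carries a second number. Let's assume this number is the response time of the HTTP request.
In table form, it looks like this: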
Moment | Status | Response Time |
---- | :------ | :------------- |
t1 | 200 | 15ms |
t1+1s | 200 | 10ms |
t1+2s | 404 | 10ms |
t1+3s | 200 | 20ms |
t1+4s | 200 | 30ms |
t1+5s | 302 | 10ms |
t1+6s | 200 | 15ms |
t1+7s | 302 | 10ms |
t1+8s | 200 | 10ms |
t1+9s | 200 | 20ms |
In our example, the timing metric gives the following results:
- Highest: 30ms
- Lowest: 10ms
- Mean: (15+10+10+20+30+10+15+10+10+20) / 10 = 150 / 10 = 15ms
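The three values above can be checked with a small Python sketch. It assumes the response times from the table, expressed in milliseconds; `timing_metric` is a hypothetical helper and only mirrors the highest, lowest and mean calculation described here.

```python
# Response times in milliseconds, one per second (from the table).
response_times_ms = [15, 10, 10, 20, 30, 10, 15, 10, 10, 20]


def timing_metric(samples):
    """Summarize the samples of one period: highest, lowest, mean and count."""
    return {
        "upper": max(samples),
        "lower": min(samples),
        "mean": sum(samples) / len(samples),
        "count": len(samples),
    }


print(timing_metric(response_times_ms))
```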
Example scenario 4 (apache log)
Here is an excerpt from an apache access log file:
#Remote_host Request_time Request Status Response_bytes Response_time
10.10.10.1 [13/Feb/2013:10:27:02 +0200] "GET / HTTP/1.1" 200 "566" 10000
10.10.10.1 [13/Feb/2013:10:27:02 +0200] "GET /icons/blank.gif HTTP/1.1" 304 "195" 5000
10.10.10.1 [13/Feb/2013:10:27:02 +0200] "GET /icons/folder.gif HTTP/1.1" 304 "123" 4000
10.10.10.1 [13/Feb/2013:10:27:03 +0200] "GET / HTTP/1.1" 200 "520" 11000
10.10.10.1 [13/Feb/2013:10:27:03 +0200] "GET /icons/folder.gif HTTP/1.1" 304 "151" 6000
10.10.10.1 [13/Feb/2013:10:27:03 +0200] "GET /icons/blank.gif HTTP/1.1" 304 "158" 5000
10.10.10.1 [13/Feb/2013:10:27:03 +0200] "GET / HTTP/1.1" 200 "502" 12000
10.10.10.1 [13/Feb/2013:10:27:03 +0200] "GET /icons/folder.gif HTTP/1.1" 304 "226" 4000
10.10.10.1 [13/Feb/2013:10:27:03 +0200] "GET /icons/blank.gif HTTP/1.1" 304 "107" 5000
Let's define a logstash configuration for this excerpt:
output {
  statsd {
    type => "apache-access-ext"
    host => "localhost"
    port => 8125
    namespace => "logstash"
    timing => [ "apache.%{sitename}.servetime", "%{reqmusecst}" ]
    increment => "apache.%{sitename}.response.%{response}"
    count => [ "apache.%{sitename}.bytes", "%{bytes}" ]
  }
}
Where:
- reqmusecst - the value of the "Response_time" column
- response - the value of the "Status" column
- bytes - the value of the "Response_bytes" column
- sitename - the value is site1
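To make the field mapping more concrete, here is a hedged sketch of the plain-text StatsD messages that one event could turn into. StatsD's wire format is `<name>:<value>|<type>`, with `c` for counters and `ms` for timers. The function `statsd_lines`, the `event` dictionary, and the way the prefix is assembled from the namespace and a sender string (`101010_1`, taken from the metric names used in this article) are assumptions for illustration, not the plugin's actual code.

```python
def statsd_lines(event, namespace="logstash", sender="101010_1"):
    """Build the StatsD messages for one event (illustrative only)."""
    # Assumed prefix layout: <namespace>.<sender>.apache.<sitename>
    prefix = f"{namespace}.{sender}.apache.{event['sitename']}"
    return [
        # timing => [ "apache.%{sitename}.servetime", "%{reqmusecst}" ]
        f"{prefix}.servetime:{event['reqmusecst']}|ms",
        # increment => "apache.%{sitename}.response.%{response}"
        f"{prefix}.response.{event['response']}:1|c",
        # count => [ "apache.%{sitename}.bytes", "%{bytes}" ]
        f"{prefix}.bytes:{event['bytes']}|c",
    ]


# Field values taken from the first line of the log excerpt.
event = {"sitename": "site1", "reqmusecst": 10000, "response": 200, "bytes": 566}
for line in statsd_lines(event):
    print(line)
```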
For the data in our 10-second period (10:27:00 to 10:27:10), StatsD produces the following results:
- stats_count.logstash.101010_1.apache.site1.response.200 = 3 (status 200 was received 3 times in our 10-second period)
- stats_count.logstash.101010_1.apache.site1.response.304 = 6 (status 304 was received 6 times in our 10-second period)
- stats_count.logstash.101010_1.apache.site1.bytes = 566+195+123+520+151+158+502+226+107 = 2548 bytes
- stats.timers.logstash.101010_1.apache.site1.servetime.lower = 4000 (the lowest response time in our 10-second period is 4000 ms)
- stats.timers.logstash.101010_1.apache.site1.servetime.upper = 12000 (the highest response time in our 10-second period is 12000 ms)
- stats.timers.logstash.101010_1.apache.site1.servetime.mean = (10000 + 5000 + 4000 + 11000 + 6000 + 5000 + 12000 + 4000 + 5000) / 9 = 62000 / 9 ≈ 6888.9 (mean response time in our 10-second period)
- stats.timers.logstash.101010_1.apache.site1.servetime.count = 9 (total number of responses in our 10-second period)
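These figures can be re-derived from the log excerpt with a short Python sketch. The regular expression is an assumption that matches only the line layout shown in this article, and `aggregate` is a hypothetical helper; it reproduces the arithmetic, not StatsD's internals.

```python
import re
from collections import Counter

LOG_EXCERPT = '''\
10.10.10.1 [13/Feb/2013:10:27:02 +0200] "GET / HTTP/1.1" 200 "566" 10000
10.10.10.1 [13/Feb/2013:10:27:02 +0200] "GET /icons/blank.gif HTTP/1.1" 304 "195" 5000
10.10.10.1 [13/Feb/2013:10:27:02 +0200] "GET /icons/folder.gif HTTP/1.1" 304 "123" 4000
10.10.10.1 [13/Feb/2013:10:27:03 +0200] "GET / HTTP/1.1" 200 "520" 11000
10.10.10.1 [13/Feb/2013:10:27:03 +0200] "GET /icons/folder.gif HTTP/1.1" 304 "151" 6000
10.10.10.1 [13/Feb/2013:10:27:03 +0200] "GET /icons/blank.gif HTTP/1.1" 304 "158" 5000
10.10.10.1 [13/Feb/2013:10:27:03 +0200] "GET / HTTP/1.1" 200 "502" 12000
10.10.10.1 [13/Feb/2013:10:27:03 +0200] "GET /icons/folder.gif HTTP/1.1" 304 "226" 4000
10.10.10.1 [13/Feb/2013:10:27:03 +0200] "GET /icons/blank.gif HTTP/1.1" 304 "107" 5000
'''

# Columns: Remote_host [Request_time] "Request" Status "Response_bytes" Response_time
LINE_RE = re.compile(
    r'^(?P<host>\S+) \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) "(?P<bytes>\d+)" (?P<servetime>\d+)$'
)


def aggregate(log_text):
    """Compute the increment, count and timing values for one period."""
    responses = Counter()
    total_bytes = 0
    times = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        responses[match["status"]] += 1
        total_bytes += int(match["bytes"])
        times.append(int(match["servetime"]))
    return {
        "responses": dict(responses),
        "bytes": total_bytes,
        "lower": min(times) if times else None,
        "upper": max(times) if times else None,
        "mean": sum(times) / len(times) if times else None,
        "count": len(times),
    }


print(aggregate(LOG_EXCERPT))
```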
StatsD also computes some additional values:
- stats.logstash.101010_1.apache.site1.response.200 = 3 / 10 = 0.3 (number of responses with status 200 per second)
- stats.logstash.101010_1.apache.site1.response.304 = 6 / 10 = 0.6 (number of responses with status 304 per second)
- stats.logstash.101010_1.apache.site1.bytes = 2548 / 10 = 254.8 (bytes per second)
- stats.timers.logstash.101010_1.apache.site1.servetime.count_ps = 9 / 10 = 0.9 (responses per second)
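Each of these per-second values is simply the total for the period divided by the length of the period. A minimal sketch, assuming the 10-second period used throughout this article; `per_second` and `totals` are names invented for the example.

```python
PERIOD_SECONDS = 10  # length of the period used in this article

# Totals for the period, taken from the results listed earlier.
totals = {
    "response.200": 3,
    "response.304": 6,
    "bytes": 2548,
    "responses": 9,
}


def per_second(total, period_seconds=PERIOD_SECONDS):
    """Convert a per-period total into a per-second rate."""
    return total / period_seconds


for name, total in totals.items():
    print(f"{name}: {per_second(total)} per second")
```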