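Want to know how many users are online in BO each day, or in each hour? BO's auditing feature can provide this, but auditing degrades performance. If an Apache server sits in front as a load balancer and access logging is enabled, we can easily analyze the log with awk and get the data we want. The code below covers my three requirements:
1. Count how many users come online each day
2. Count how many users are online in each hour
3. Compute the average time taken by the report save action
gawk handles all three easily.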
1. Apache log format
The log uses the common format, which looks like this:
10.1.1.1 - - [09/Dec/2011:07:11:26 -1200] "GET /OpenDocument/opendoc/openDocument.jsp?iDocID=144758&boRefresh=Y HTTP/1.1" 200 3382
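To make the field layout concrete, here is a small sketch that feeds the sample line above to awk and prints the pieces the program relies on. It assumes awk's default whitespace splitting, which puts the IP in $1, the timestamp (with its leading "[") in $4, and the request URL in $7. The variable names line and fields are only for illustration.

```shell
# Sample line in common log format (the same example line as above).
line='10.1.1.1 - - [09/Dec/2011:07:11:26 -1200] "GET /OpenDocument/opendoc/openDocument.jsp?iDocID=144758&boRefresh=Y HTTP/1.1" 200 3382'

# With default whitespace splitting, $4 is "[09/Dec/2011:07:11:26",
# so every substr() offset below counts from the leading "[".
fields=$(printf '%s\n' "$line" | awk '{
    print "ip="   $1
    print "day="  substr($4, 2, 2)
    print "mon="  substr($4, 5, 3)
    print "year=" substr($4, 9, 4)
    print "time=" substr($4, 14, 8)
    print "url="  $7
}')
printf '%s\n' "$fields"
```

The offsets only hold while the timestamp keeps this exact layout. A custom LogFormat would need different field numbers and offsets.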
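2. The gawk program
The IP address and the timestamp are enough for the first two requirements (each distinct IP is counted as one user), and the requested JSP page tells us which action the user performed, so all three requirements can be met. The code is as follows: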
#!/usr/bin/gawk -f
# $1 is the client IP
# $4 is the timestamp, e.g. [09/Dec/2011:07:11:26
#   year:  substr($4, 9, 4)
#   month: Mons[substr($4, 5, 3)]
#   day:   substr($4, 2, 2)
#   time:  ltime = substr($4, 14, 8); gsub(/:/, " ", ltime)
# $7 is the request URL
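# Convert a log timestamp to epoch seconds. mktime() interprets it in the
# local time zone; the offset in $5 is ignored, which is fine for differences.
function getTime(date,    year, month, day, ltime) {
    year = substr(date, 9, 4)
    month = Mons[substr(date, 5, 3)]
    day = substr(date, 2, 2)
    ltime = substr(date, 14, 8)
    gsub(/:/, " ", ltime)
    return mktime(year " " month " " day " " ltime)
}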
BEGIN {
    Mons["Jan"] = 1; Mons["Feb"] = 2; Mons["Mar"] = 3;
    Mons["Apr"] = 4; Mons["May"] = 5; Mons["Jun"] = 6;
    Mons["Jul"] = 7; Mons["Aug"] = 8; Mons["Sep"] = 9;
    Mons["Oct"] = 10; Mons["Nov"] = 11; Mons["Dec"] = 12
}
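{
    currIp = $1
    currDate = getTime($4)
    hour = substr($4, 14, 2)
    # record each distinct IP seen in each hour
    user[hour, currIp]
    # record each distinct IP seen in the whole log
    ip[currIp]
    # report save time: from checkProcessSave.jsp to reportSaveAlert.html,
    # tracked per IP so that concurrent users do not mix up their timings
    if ($7 ~ /cdz_adv\/checkProcessSave\.jsp/) {
        startTime[currIp] = currDate
    }
    if ($7 ~ /reportSaveAlert\.html\?/ && (currIp in startTime)) {
        totalTime += currDate - startTime[currIp]
        times += 1
        delete startTime[currIp]
    }
}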
END {
    print length(ip) " users online today."
    # count distinct IPs per hour (for-in order is unspecified, so the
    # hours may print out of order; pipe through sort if needed)
    for (i in user) {
        split(i, lists, SUBSEP)
        count[lists[1]] += 1
    }
    for (i in count)
        print i ": " count[i] " users"
    # print the average report save time
    if (times > 0)
        print totalTime / times "s average report save time."
    else
        print "No reports were saved."
}
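The program above counts distinct IPs over the whole input, so its per-day figure is only right when the log file covers a single day. If a log spans several days, one possible variation is to key the count on the date as well. The following is a minimal sketch of that idea, not part of the original program: the sample log lines and the array names seen and users are made up for illustration.

```shell
# Hypothetical sample log covering two days (made-up data).
log='10.1.1.1 - - [09/Dec/2011:07:11:26 -1200] "GET /a.jsp HTTP/1.1" 200 10
10.1.1.2 - - [09/Dec/2011:08:00:00 -1200] "GET /a.jsp HTTP/1.1" 200 10
10.1.1.1 - - [09/Dec/2011:09:30:00 -1200] "GET /b.jsp HTTP/1.1" 200 10
10.1.1.3 - - [10/Dec/2011:07:00:00 -1200] "GET /a.jsp HTTP/1.1" 200 10'

daily=$(printf '%s\n' "$log" | awk '
{
    day = substr($4, 2, 11)          # e.g. 09/Dec/2011
    if (!((day, $1) in seen)) {      # first request from this IP on this day
        seen[day, $1] = 1
        users[day]++
    }
}
END {
    for (d in users)
        print d, users[d]
}' | sort)
printf '%s\n' "$daily"
```

The final sort is there only because awk's for-in loop visits keys in an unspecified order. It is a plain text sort of dd/Mon/yyyy strings, so it orders days within one month but not across months.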
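With the raw data in hand, you can build your own dashboard for analysis. Being a consultant isn't easy: you have to know a bit of everything.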