开发工具篇第七讲:阿里云日志查询与分析

公司使用的是阿里云日志服务

1、日志应用场景

  • 1、应用报错,问题排查
  • 2、校验定时任务等是否按时、正常运行
  • 3、真线监控
  • 4、分析真线数据(用于压测or性能分析)

2、常用日志库

操作手册:日志服务

test环境:http://corp.cai-inc.com/devops/dev_k8slog.html

  • zcy-dev-app-log(应用日志)

staging环境:http://corp.cai-inc.com/devops/staginglog.html

  • k8s-staging-app-java(应用日志)
  • zcy-staging-tracelog(dubbo日志)
  • zcy-staging-nginx-acces(rest接口日志)

生产环境:http://corp.cai-inc.com/devops/logapp.html

  • k8s-prod-app-log(应用日志)
  • prod-tracelog(dubbo日志)
  • zcy-prod-nginx-access(rest接口日志)
  • zcy-front-store(前端埋点日志)

3、日志字段解析

表1:app_log: 应用日志

字段释义
_container_ip_容器地址
_container_name_容器名称
_image_name_镜像名称
_namespace_环境
_pod_name_实例名称
_pod_uid_实例uid
time日志产生时间
class产生日志的类名
javaclassnum产生日志的类名:行数
levelINFO、WARN、ERROR、DEBGUG
message日志内容
traceid链路id

表2:trace_log: dubbo日志

字段释义
appName应用名称
invokeType身份(provider、consumer)
methodName方法名
params入参
paramtypes入参类型
serviceNamedubbo接口名称
traceName本应用内方法名
traceId链路id

表3:access_log: rest接口日志

字段释义
cookie_uidcookie里面的uid
http_host主域
http_user_agent请求头:user-agent
uri接口路径
request_method请求方法
query_string请求(Get接口)
request_body请求(post接口)
resp_body接口响应
status状态码
request_id链路id

4、常用语法

1、联合查询:

* |
SELECT
   "A库”.a字段,
   "B库".b字段

FROM  "A库"

  INNER JOIN "B库" ON "A库".traceid = "B库".request_id 
  # (具体是traceid还是request_id,取决于本库的链路id)

WHERE “A库 or B库".key = 'value'

2、模糊查询:

# *匹配任意字符串;?匹配任意单个字符
* and _container_name_: 应用名称 and message: xxxxxxxx*(或者?匹配单个字符) 

3、查询响应时间:

# 查询以/api/loan开头的接口,以响应时间倒序输出
* and uri:/api/loan* | select from_unixtime(__time__) as time, max_by(uri,request_time) as uri, avg(upstream_response_time) as RT group by __time__ order by RT desc  

4、查询复杂字符串:

  • 1)含有空格,冒号等:

    • 用单(双)引号将value包起来:
    * and _container_name_: web-loan and message:"接收结果消息:Message [topic=ZCY_SUPPLIER_CREATE_SHOP_Production"
    
  • 2)含有单(双)引号,对象等:

    • 先进行转义,再用单(双)引号将value包起来:
    * and _container_name_: web-loan and message:"{\"shopName\":\"湖南鑫排云数字科技有限公司\",\"userName\":\"hnxpysm123\",\"userId\":10009303682,\"orgId\":160790343460864}"
    

5、常用函数

  • 1、计数函数:

    • count\count_if函数:用于计算符合条件的数据共有多少条
    * | SELECT COUNT('%项目采购消息%') as Pcount
    
    # 统计message中,有多少条包含 “项目采购消息”
    * | SELECT COUNT_if(message = '项目采购消息') as Pcount  
    
  • 2、正则表达式函数:

    • regexp_extract_all函数:正则表达式函数
# 查询访问用户chrome浏览器版本,并计算出不同版本号的访问次数
* and http_user_agent: Chrome 
| select  regexp_extract(http_user_agent, 'Chrome/\d+(\.\d+)*')  as version, count(*) as count 
group by version order by count desc  
  • 3、安全检测函数:

    • security_check_ip函数:安全检测函数
# 检测访问来源IP是否安全(依托于全球白帽子共享安全资产库)
* | SELECT http_x_forwarded_for  WHERE security_check_ip(split_part(http_x_forwarded_for, ',' ,1)) = 1 

6、实战

6.1、查询dubbo接口耗时
  • 常用来判断接口是否有性能问题
* and appName: item-microservice and invokeType: provider 
| select serviceName, methodName, split_part(traceName, ':', 1) as outerApp,COUNT(*) as count, avg(spendTime) as maxSpendTime  
FROM prod-tracelog 
where spendTime > 200 
group by serviceName, methodName, outerApp 
order by maxSpendTime desc
6.2、通过 nginx 日志统计 rest 接口耗时
* | select case 
when split_part(upstream_addr,':',2) = '8021' then 'web-protocol' 
when split_part(upstream_addr,':',2) = '9005' then 'web-aggregated' 
when split_part(upstream_addr,':',2) = '8060' then 'web-agreement' 
when split_part(upstream_addr,':',2) = '8082' then 'web-item-admin' 
when split_part(upstream_addr,':',2) = '8109' then 'web-spi' 
end as appname, approx_percentile(request_time, array[0.9]) as rtp90,
approx_percentile(request_time, array[0.95]) as rtp95,
approx_percentile(request_time, array[0.99]) as rtp99,
MAX(request_time) as max,min(request_time) as min,avg(request_time) as avg,
uri from zcy-prod-nginx-access 
where split_part(upstream_addr,':',2) in ('8021', '9005', '8060', '8082', '8109') GROUP by appname,uri 
order by rtp95 desc 
limit 100
6.3、通过 tracelog 统计统计dubbo接口耗时
  • 通常比对发版前后,系统整体的耗时变化情况
* | select appName,serviceName,methodName, COUNT(*) as queryCount, approx_percentile(spendTime, 0.99) as rtp99,
approx_percentile(spendTime,0.95) as rtp95,
approx_percentile(spendTime, 0.9) as rtp90,
MAX(spendTime) as max,min(spendTime) as min,avg(spendTime) as avg 
from prod-tracelog 
where appName in ('item-dump-microservice', 'item-search-microservice', 'item-microservice', 'item-service', 'web-item-admin', 'zcy-agreement-center', 'zcy-agreement-web', 'zcy-protocol-web', 'zcy-protocol-center' ) 
GROUP by appName,serviceName,methodName 
order by rtp99 desc 
limit 100
  • 例如:查询标准中心接口耗时
* | select appName,serviceName,methodName, COUNT(*) as queryCount, approx_percentile(spendTime, 0.99) as rtp99,
approx_percentile(spendTime,0.95) as rtp95,
approx_percentile(spendTime, 0.9) as rtp90,
MAX(spendTime) as max,min(spendTime) as min,avg(spendTime) as avg 
from prod-tracelog 
where appName in ( 'item-standard-center' ) and serviceName in ('cn.gov.zcy.service.category.facade.FrontCategoryReadServiceFacade')  
GROUP by appName,serviceName,methodName 
order by rtp99 desc 
limit 100
6.4、查询日志中特定报错的脚本
* and message:"java.lang.NumberFormatException" and message not "/api/micro/category/agCategoryTree" 
| SELECT COUNT(*) as number, max(message) as m1,_container_name_ as app,max(traceid) as traceid 
WHERE strpos('item-microservice-center##beacon-center##web-aggregated## web-item-admin##item-standard-center',_container_name_)>0 
GROUP BY app LIMIT 1
6.5、本查询出当前方法的QPS
* and serviceName: “cn.gov.zcy.service.external.item.facade.OpenItemWriteServiceFacade” 
 and methodName : “createItem”
 and invokeType : provider 
| SELECT date_format(time-time % 1 ,%m-%d %H:%i:%s’) as time, count(1) as totalQuery 
GROUP BY time 
ORDER BY totalQuery DESC 
limit 20
  • 2
    点赞
  • 6
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

程序员 jet_qi

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值