Paper link: http://www.academia.edu/download/36281506/jmzhu_icse2015.pdf
Abstract
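The authors first introduce the background.
The paper proposes a "learning to log" framework that aims to provide guidance on logging.
One implementation of it is their tool, LogAdvisor, which learns where to log from existing logging instances.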
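The factors behind the target problem are captured as three kinds of features:
- structural features
- textual features
- syntactic features
Machine learning (feature selection, classifier training) is then applied to solve it.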
Evaluation:
- 2 systems from MS (Microsoft)
- 2 systems from GitHub
In total: 19.1M LOC (lines of code) and 100.6K logging statements.
The reported results are good.
Introduction
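The opening argues that logging should be neither too sparse nor too verbose, and backs this up with examples; it is somewhat long-winded.
The authors had previously conducted a survey.
Even MS has no clear, strict standard for logging.
In some forum posts, developers were found discussing best practices for logging.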
developers still need to make their own decisions on where to log and what to log,
which in most cases depend on their own domain knowledge
So logging is an important problem.
Observations and Motivation
The logging in the systems above has stood the test of real-world use.
Observations:
- Pervasiveness of logging
Mainly covers…
a line of logging code in every 58 LOC
- where to log
exceptions
return-value-check snippets
- why not to log everything
They even took this question to StackOverflow...
- Logging decision and the context
logging decision is
highly dependent on the context of this code snippet, including
the exception type
Motivation
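Existing logging relies on the developer's domain expertise.
The goal is a tool that gives more valuable logging suggestions and lowers the level of expertise required from developers.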
Learning to Log
overview
Instances collection (choosing the training set)
- exception snippets
records the exception context after an exception is captured in the catch block
- return-value-check snippets
the situation where an unexpected value (e.g., -1/null/false/empty) is returned from a function call; information is logged when the call returns such an abnormal value
Label identification (labeling)
logged
contains a logging statement
unlogged
does not contain a logging statement
searching some keywords in all method names, such as
log/logging, trace, write/writeline
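The keyword-based labeling above can be sketched roughly as follows. This is only an illustration under stated assumptions: the substring matching rule, the function names `is_logging_call` and `label_snippet`, and the idea of passing in a list of called method names are all hypothetical, not the paper's actual implementation.

```python
# Keywords taken from the notes; how they are matched is an assumption.
LOGGING_KEYWORDS = ("log", "logging", "trace", "write", "writeline")


def is_logging_call(method_name: str) -> bool:
    """Return True if the (qualified) method name contains a logging keyword."""
    lowered = method_name.lower()
    return any(keyword in lowered for keyword in LOGGING_KEYWORDS)


def label_snippet(called_methods) -> str:
    """Label a snippet 'logged' if any method it calls looks like a logging call."""
    if any(is_logging_call(name) for name in called_methods):
        return "logged"
    return "unlogged"


print(label_snippet(["Logger.Error", "Cleanup"]))        # logged
print(label_snippet(["System.IO.Path.GetFullPath"]))     # unlogged
```

Plain substring matching is crude: a name such as `Dialog.Show` would also match `log`, so a real labeling step would likely need stricter rules or manual checks.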
Feature extraction
The details on feature extraction are described
in Section III-B
Feature selection
Model training
classification model
Decision Tree
Logging suggestion (prediction)
predictive model to perform accurate logging predictions
Structural features
error type
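(the frequency of each error type is used as a feature)
associated methods
help in understanding what the function does and which operations it performs
method names are used as features
collected by following the order of calls, BFS
For example: System.IO.Path.GetFullPath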
- namespace,
- class name,
- its (short) method name.
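A minimal sketch of that decomposition, assuming the fully qualified name is available as a dot-separated string (the helper name `split_method_name` and the returned keys are made up for illustration):

```python
def split_method_name(qualified: str) -> dict:
    """Split a dot-separated qualified method name into its three parts.

    Assumes the last segment is the short method name and the one before it
    is the class name; everything earlier is treated as the namespace.
    """
    parts = qualified.split(".")
    return {
        "namespace": ".".join(parts[:-2]),
        "class_name": parts[-2] if len(parts) >= 2 else "",
        "method_name": parts[-1],
    }


print(split_method_name("System.IO.Path.GetFullPath"))
# {'namespace': 'System.IO', 'class_name': 'Path', 'method_name': 'GetFullPath'}
```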
Textual features
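variable names, variable types, method names, and so on found in the code
combined with the structural features above to form sentences
bag-of-words model
tokenization, stop-word removal, tf-idf….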
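The bag-of-words pipeline can be illustrated with a small standard-library sketch. Everything specific here is an assumption: the camelCase tokenizer, the tiny `STOP_WORDS` set, and the unsmoothed tf-idf variant (term frequency times log(N / document frequency)) are chosen for brevity and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real one would be much larger.
STOP_WORDS = {"get", "set", "the", "a", "of"}


def tokenize(text: str) -> list:
    """Split identifiers on punctuation and camelCase, lowercase, drop stop words."""
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", text)
    return [w.lower() for w in words if w.lower() not in STOP_WORDS]


def tf_idf(documents: list) -> list:
    """Return one {term: weight} dict per document (unsmoothed tf-idf)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


docs = ["System.IO.Path.GetFullPath fileName", "System.IO.File.Open fileName"]
for vector in tf_idf(docs):
    print(vector)
```

Terms that occur in every document (such as `system` here) get weight 0, which is the usual effect of the idf factor.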
Syntactic features
- SettingFlag. We identify whether there is an assignment statement with an assigned value like -1/null/false/empty.
- Throw. We identify whether there is a throw statement.
- Return. We identify whether any special value (e.g., -1/null/false/empty) is returned.
- RecoverFlag. We check whether there is a new try statement inside.
- OtherOperation. We check whether there is any other operations included except the above five ones.
- EmptyBlock. We find that the developers sometimes catch and then do nothing. We thus identify whether the catch block is empty.
All of the above are Boolean variables
& NumOfMethods
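To make the flags concrete, here is a rough sketch that computes them from an already-parsed block. The input representation (a list of tuples such as `("assign", "null")` or `("call", "Cleanup")`), the function name, and the exact reading of OtherOperation are assumptions for illustration; the paper extracts these from real C# code.

```python
SPECIAL_VALUES = {"-1", "null", "false", "empty"}
# Statement kinds covered by the other flags (assumed interpretation).
KNOWN_KINDS = {"assign", "throw", "return", "try"}


def syntactic_features(statements: list) -> dict:
    """Compute the Boolean flags plus NumOfMethods for one code block.

    Each statement is a tuple whose first item is its kind; assign/return
    statements carry the value as the second item (hypothetical format).
    """
    kinds = [stmt[0] for stmt in statements]
    return {
        "SettingFlag": any(
            s[0] == "assign" and s[1] in SPECIAL_VALUES for s in statements
        ),
        "Throw": "throw" in kinds,
        "Return": any(
            s[0] == "return" and s[1] in SPECIAL_VALUES for s in statements
        ),
        "RecoverFlag": "try" in kinds,
        "OtherOperation": any(kind not in KNOWN_KINDS for kind in kinds),
        "EmptyBlock": not statements,
        "NumOfMethods": kinds.count("call"),
    }


block = [("assign", "null"), ("call", "Cleanup"), ("throw",)]
print(syntactic_features(block))
```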
Feature selection
There are too many features, so the dimensionality is too high
- set a minimum frequency threshold
- information gain (decision tree)
reduce the feature dimensionality to around 1000
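The two-step selection (frequency threshold, then information gain ranking) might look like the sketch below. It assumes binary features stored as columns of 0/1 values; the names `select_features`, `min_count`, and `top_k` are hypothetical, and the paper's exact thresholds are not reproduced here.

```python
import math
from collections import Counter


def entropy(labels: list) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values: list, labels: list) -> float:
    """Entropy of the labels minus the entropy left after splitting on the feature."""
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_features(columns: dict, labels: list, min_count: int = 2, top_k: int = 1000) -> list:
    """Drop rare features, then keep the top_k features by information gain."""
    frequent = {name: col for name, col in columns.items() if sum(col) >= min_count}
    ranked = sorted(
        frequent,
        key=lambda name: information_gain(frequent[name], labels),
        reverse=True,
    )
    return ranked[:top_k]


labels = [1, 1, 0, 0]
columns = {"perfect": [1, 1, 0, 0], "useless": [1, 0, 1, 0], "rare": [1, 0, 0, 0]}
print(select_features(columns, labels, min_count=2, top_k=1))  # ['perfect']
```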
Noise Handling
implicitly assume good logging quality in the training data
CLNI
The ratio of logged to unlogged instances is severely imbalanced
SMOTE synthesizes additional logged instances to restore the balance
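The core idea of SMOTE is to create a new minority-class point by interpolating between an existing minority point and one of its nearest minority neighbours. The following is a simplified standard-library sketch of that idea, not the full algorithm or the paper's setup; the function name, `k`, and the seeding are illustrative assumptions.

```python
import random


def squared_distance(a: list, b: list) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority: list, n_synthetic: int, k: int = 2, seed: int = 0) -> list:
    """Generate synthetic minority samples by neighbour interpolation.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen base point.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: squared_distance(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


logged = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
for point in smote(logged, n_synthetic=3):
    print(point)
```

Each synthetic point lies on a line segment between two real logged instances, so the minority class is extended without simply duplicating samples.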
Evaluation
RQ1: What is the accuracy of LogAdvisor?
RQ2: What is the effect of different learning models?
RQ3: What is the effect of noise handling?
RQ4: How does LogAdvisor perform in the cross-project learning scenario?
Using DT (Decision Tree):
- good performance
- ease of interpretation
10-fold cross-validation
balanced accuracy (BA) [19], which
is the average of the proportion of logged instances and the
proportion of unlogged instances that are correctly classified.
(both proportions count only the correctly classified instances)
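Written out, with logged as the positive class, this is BA = (TP / (TP + FN) + TN / (TN + FP)) / 2. A small sketch of the computation follows; the 1/0 encoding of logged/unlogged and the function name are assumptions made for the example.

```python
def balanced_accuracy(y_true: list, y_pred: list) -> float:
    """Average of the per-class recall for logged (1) and unlogged (0).

    Assumes both classes occur at least once in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    logged_recall = tp / (tp + fn)
    unlogged_recall = tn / (tn + fp)
    return (logged_recall + unlogged_recall) / 2


# One logged instance among ten; a model that never suggests logging.
y_true = [1] + [0] * 9
y_pred = [0] * 10
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

Plain accuracy would be 0.9 for that degenerate model, which is why BA is the more informative metric on imbalanced data: a score of 0.5 corresponds to no real predictive power.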
Results of RQ1: Prediction Accuracy
Baseline:
- Random
- Errlog
Every one of them scores above 0.5.
Results of RQ2: The Effect of Different Learning Models
Results of RQ3: The Effect of Noise Handling
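After processing, the proportion of noisy data is very small
The threshold is then tuned so that the proportion of noisy data reaches 5%
With that setting, noise handling gives somewhat better results on all instances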
Results of RQ4: Cross-Project Evaluation
User Study
Saves both time and money
Users all gave positive feedback
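Discussion:
- quality of logging
- different software systems (the study only covers C#; other languages, systems, and so on remain)
- what to log
error messages, stack information, and so on; there is work on LogEnhancer, which can automatically fill in log information
- potential room for improvement:
- other factors that affect whether to log
- Interdependence of logging statements
- Runtime logging (when logs held in memory should be written out)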
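Summary:
The paper simply applies data mining algorithms to logging; none of the algorithms are particularly difficult.
It counts as an attempt at a new area: although logging had already been analyzed by others in 2012, this should be among the earliest work to study logging experimentally.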