Top-journal paper: Human Competitiveness of Genetic Programming in Spectrum-Based Fault Localisation... [TOSEM 2017]


Today we take a look at a 2017 paper from TOSEM, a top journal in software engineering: “Human Competitiveness of Genetic Programming in Spectrum Based Fault Localisation: Theoretical and Empirical Analysis”.

1 Author information


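The first author, Shin Yoo, is also an author of the ICST 2014 paper “Ask the Mutants: Mutating Faulty Programs for Fault Localization”. That paper is about mutation-based fault localisation; the idea of applying GP to fault localisation is the one that, to my surprise, made it into a top journal here.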

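2 The abstract is written rather assertively: lots of “first time” claims, steady and confident

1) So FL (fault localisation) falls under Search Based Software Engineering (SBSE)?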
We report on the application of Genetic Programming to Software Fault Localisation, a problem in the area of
Search Based Software Engineering (SBSE).

convey: UK [kənˈveɪ], US [kənˈve]
to make ideas, feelings, etc. known to sb
(formal) to transmit; to carry; to transport
to take, carry or transport sb/sth from one place to another

counterparts: UK [ˈkaʊntəpɑːts], US [ˈkaʊntəpɑːts]
a person of equivalent standing, or a thing with an equivalent role, on the other side (plural of counterpart); peers
Someone’s or something’s counterpart is another person or thing that has a similar function or position in a different place.

We give both empirical and theoretical evidence for the human
competitiveness of the evolved fault localisation formulæ under the single fault scenario, compared to those
generated by human ingenuity and reported in many papers, published over more than a decade.

ingenuity: UK [ˌɪndʒəˈnjuːəti], US [ˌɪndʒəˈnuːəti]
the ability to invent things or solve problems in clever new ways
[U] inventiveness; cleverness; resourcefulness

@1 Though there have been previous human competitive results claimed for SBSE problems, this is the first time that evolved solutions have been formally proved to be human competitive. Does this mean that, because the paper carried out an empirical study, it counts as “formally proved”???
@2 We further prove that no future human investigation could outperform the evolved solutions. I did not understand this sentence.
@3 We complement these proofs with an empirical analysis of both human and evolved solutions, which indicates that the evolved solutions are not only theoretically human competitive, but also convey similar practical benefits to human-evolved counterparts. So the evolved solutions are strong not only in theory but also in practice? I am still not sure I have understood this correctly, especially “human competitiveness”, which is a curious phrase.

3 What the paper does

This paper presents a theoretically optimal, human competitive and practical approach to Spectrum Based Fault Localisation (SBFL) [24, 28] using Genetic Programming (GP) [31, 43].

Our work is situated within a growing trend in software engineering, Search Based Software Engineering
(SBSE) [4, 16, 23, 35], which uses computational search techniques (with a particular emphasis on
evolutionary computation [22]). It provides the first provably and theoretically optimal results in
the field of SBSE.

You can see from this that Mark Harman's earlier work is closely related to this paper, since the paper cites his articles very heavily.

4 Is SBFL really this good?

SBFL is important because it offers automated assistance to the debugging process, which is currently labour-intensive, expensive and time-consuming. SBFL has been advocated as a technique for helping humans find faults faster [20, 42] and also as a supporting technology for automated program repair [32, 50], which automatically fixes certain classes of fault (also using techniques such as GP).

[42], for its part, has been cited 304 times.

5 Seeing once again that SBFL works at the statement level! Surprising.

The SBFL suspiciousness formula defines the ‘suspiciousness’ of each statement in terms of
observations from software testing, thereby forming the ‘key ingredient’ of SBFL. The suspiciousness
formula is also known as a risk formula, in the sense that it seeks to capture the ‘risk’ that the
statement causes the bug.
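
To make the quoted definition concrete, here is a minimal sketch of how a suspiciousness score can be computed from test observations. It assumes a boolean coverage matrix and uses Ochiai, one widely cited human-designed risk formula, purely as an example; the function names and the toy data are hypothetical and are not taken from the paper or from these notes.

```python
import math

def spectrum_counts(coverage, outcomes):
    """Return (ef, ep, nf, np) for every statement.

    coverage[t][s] is True if test t executes statement s.
    outcomes[t] is True if test t passes.
    ef/ep: failing/passing tests that execute the statement.
    nf/np: failing/passing tests that do not execute it.
    """
    n_stmts = len(coverage[0])
    counts = []
    for s in range(n_stmts):
        ef = ep = nf = np_ = 0
        for row, passed in zip(coverage, outcomes):
            if row[s] and not passed:
                ef += 1
            elif row[s] and passed:
                ep += 1
            elif not row[s] and not passed:
                nf += 1
            else:
                np_ += 1
        counts.append((ef, ep, nf, np_))
    return counts

def ochiai(ef, ep, nf, np_):
    """Ochiai risk formula: ef / sqrt((ef + nf) * (ef + ep))."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom > 0 else 0.0

# Toy data (assumed): 4 tests, 3 statements; tests 1 and 2 fail.
coverage = [
    [True, True, False],
    [True, False, True],
    [True, False, True],
    [True, True, False],
]
outcomes = [True, False, False, True]

counts = spectrum_counts(coverage, outcomes)
scores = [ochiai(*c) for c in counts]
for s, (c, score) in enumerate(zip(counts, scores)):
    print(f"statement {s}: counts={c} suspiciousness={score:.3f}")
```

In this toy data, statement 2 is executed only by failing tests and therefore gets the highest score, while statement 1 is executed only by passing tests and gets zero.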


6 Finally learned how to say “the statement that is actually wrong” in English: truly faulty statement

A good risk formula will tend to elevate the reported suspiciousness of
truly faulty statements and depress that of innocent statements.
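
A small sketch of what “elevate” and “depress” mean in practice: rank the statements by score and see where the truly faulty one lands. The spectra below are invented for illustration (two failing and three passing tests, single fault), and the two formulas, Tarantula and Op2, are standard human-designed ones from the SBFL literature rather than anything quoted in these notes. Helper names are hypothetical.

```python
def tarantula(ef, ep, nf, np_):
    """Tarantula: failing-execution ratio normalised by failing + passing ratios."""
    fail_ratio = ef / (ef + nf) if ef + nf else 0.0
    pass_ratio = ep / (ep + np_) if ep + np_ else 0.0
    total = fail_ratio + pass_ratio
    return fail_ratio / total if total else 0.0

def op2(ef, ep, nf, np_):
    """Op2: ef - ep / (P + 1), where P = ep + np is the number of passing tests."""
    return ef - ep / (ep + np_ + 1)

def worst_case_rank(scores, faulty):
    """1-based rank of the faulty statement, counting ties pessimistically."""
    return sum(1 for s in scores if s >= scores[faulty])

# (ef, ep, nf, np) per statement; statement 2 is the (assumed) faulty one.
# Single-fault scenario: every failing test executes it, so nf == 0 there.
spectra = [(2, 2, 0, 1), (1, 0, 1, 3), (2, 1, 0, 2), (0, 3, 2, 0)]
faulty = 2

tarantula_rank = worst_case_rank([tarantula(*s) for s in spectra], faulty)
op2_rank = worst_case_rank([op2(*s) for s in spectra], faulty)
print("Tarantula rank of faulty statement:", tarantula_rank)
print("Op2 rank of faulty statement:", op2_rank)
```

Here Tarantula scores innocent statement 1 above the faulty one because no passing test happens to execute it, whereas Op2 puts the faulty statement first. This is only one toy case, not evidence about the formulas in general.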

7 The authors' contributions, once more

We report on a GP solution that searches for formulæ, which we have implemented, showing that it finds known maximal formulæ (previously found by humans) and also novel maximal formulæ (not previously found by humans).
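
To get a feel for what “searching for formulæ” involves, below is a heavily simplified sketch. It is not the paper's GP setup: it uses plain random search over small expression trees (no population, crossover or mutation), with the mean worst-case rank of the faulty statement as fitness. The operator set, the training spectra and every name in the code are assumptions made for illustration only.

```python
import random

TERMINALS = ["ef", "ep", "nf", "np", 1.0]
OPERATORS = ["+", "-", "*", "/"]

def random_tree(rng, depth):
    """Build a random formula tree: a terminal or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(OPERATORS),
            random_tree(rng, depth - 1),
            random_tree(rng, depth - 1))

def evaluate(tree, env):
    """Evaluate a formula tree against one statement's spectrum counts."""
    if isinstance(tree, tuple):
        op, left, right = tree
        a, b = evaluate(left, env), evaluate(right, env)
        if op == "+":
            return a + b
        if op == "-":
            return a - b
        if op == "*":
            return a * b
        return a / b if b != 0 else 1.0  # protected division
    return env[tree] if isinstance(tree, str) else tree

def fitness(tree, programs):
    """Mean worst-case rank of the faulty statement; lower is better."""
    total = 0
    for spectra, faulty in programs:
        scores = [evaluate(tree, dict(zip(("ef", "ep", "nf", "np"), s)))
                  for s in spectra]
        total += sum(1 for x in scores if x >= scores[faulty])
    return total / len(programs)

def random_search(programs, iterations=200, seed=0):
    """Keep the best randomly generated formula, starting from plain 'ef'."""
    rng = random.Random(seed)
    best = "ef"
    best_fit = fitness(best, programs)
    for _ in range(iterations):
        candidate = random_tree(rng, depth=3)
        f = fitness(candidate, programs)
        if f < best_fit:
            best, best_fit = candidate, f
    return best, best_fit

# Invented training data: (spectra, index of the faulty statement).
programs = [
    ([(2, 2, 0, 1), (1, 0, 1, 3), (2, 1, 0, 2), (0, 3, 2, 0)], 2),
    ([(3, 2, 0, 0), (3, 1, 0, 1), (1, 0, 2, 2)], 1),
]

# Op2 written as a tree, for comparison with whatever the search finds.
OP2_TREE = ("-", "ef", ("/", "ep", ("+", ("+", "ep", "np"), 1.0)))

baseline_fit = fitness("ef", programs)
op2_fit = fitness(OP2_TREE, programs)
best, best_fit = random_search(programs)
print("baseline 'ef' fitness:", baseline_fit)
print("Op2 fitness:", op2_fit)
print("best found:", best, "fitness:", best_fit)
```

A real GP run would evolve a population of such trees with selection, crossover and mutation, and would train on spectra from many faulty program versions. The representation (formula trees over ef, ep, nf, np) and the rank-based fitness are the parts this sketch is meant to illustrate.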

We report on a set of experiments on real software systems to evaluate the formulæ found by humans and by GP.

Our empirical evaluation indicates that one class of formulæ (found by GP and also by humans) performs best overall. Finally, we prove that, under the single fault scenario, there does not exist a superior formula to the current known maximal formulæ found by humans and/or by GP.

Therefore, GP-evolved formulæ are not only human competitive, but no further human analysis could yield superior alternatives.

While human competitiveness of SBSE has been empirically shown before [7, 8, 15, 40], we believe this is the first claim backed by a formal mathematical proof.

superior: UK [suːˈpɪəriə(r)], US [suːˈpɪriə(r)]
better in quality than sb/sth else; greater than sb/sth else
higher in rank, importance or position

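8 This software transplantation paper is a must-read

9 What the authors do well: at the end of the introduction they do not forget to add a paragraph explaining why their work matters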

SBFL is an area of software engineering that has been well studied by humans over many
years, and for which human ingenuity has produced publishable advances that have subsequently
turned out to include both sub-optimal as well as optimal results. It is an important area that has
motivated (and continues to motivate) many leading researchers to attack the problem of finding
suitable formulæ with attractive theoretical and practical properties. We believe that this makes it
exciting and encouraging that GP has been able to find results that are provably human competitive,
theoretically unbeatable, and also practically valuable.


4.1.1 Subject Programs. In these experiments, we use five subject programs from Software
Infrastructure Repository (SIR) [13]. Table 4 describes the functionalities and sizes of these programs:
the size is measured in Source Lines of Code, excluding whitespaces, using SLOCCount [47]. Table 4
also presents the size of test suites employed. We adopted all test cases provided by SIR, including the
“universe” test plan and the additional test cases.

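If I remember correctly, the ICSE 2017 paper on improving fault localization techniques already reported that the fault localisation formulas produced by GP do reasonably well on artificial bugs but do not work well on real faults.

11 Hasty summary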






