Preface
Today I came across an article in the IEEE Computer journal: Software-Testing Contests: Observations and Lessons Learned. It also comes with a software testing database, which I think is valuable, so I am noting it down here.
Overview of the article
Article link:
https://ieeexplore.ieee.org/abstract/document/8848154/authors#authors
Citation:
@article{wang_software-testing_2019,
title = {Software-{Testing} {Contests}: {Observations} and {Lessons} {Learned}},
volume = {52},
number = {10},
journal = {Computer},
author = {Wang, Xingya and Sun, Weisong and Hu, Linghuan and Zhao, Yuan and Wong, W Eric and Chen, Zhenyu},
year = {2019},
pages = {61–69}
}
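I won't go into the details; here is the outline:
Background: The authors have organized many software testing contests, so they share their experience here and also carry out an empirical study based on the contest data.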
RQs:
- Does branch coverage have a strong correlation with mutation score in unit testing?
- Does test order at class level have an impact on the effectiveness of unit testing?
Evaluation metrics:
- branch coverage
- mutation score
- combination of both scores.
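For readers unfamiliar with the two base metrics, here is a minimal sketch of their usual textbook definitions. The function names and the handling of equivalent mutants are my own assumptions, not taken from the article, and this note does not record how the contest combines the two scores, so no combined formula is shown.

```python
def branch_coverage(covered_branches: int, total_branches: int) -> float:
    """Fraction of branches executed at least once by the test suite."""
    if total_branches <= 0:
        raise ValueError("total_branches must be positive")
    return covered_branches / total_branches


def mutation_score(killed: int, total: int, equivalent: int = 0) -> float:
    """Fraction of non-equivalent mutants killed by the test suite.

    Assumption: equivalent mutants (those no test can kill) are
    excluded from the denominator, as in the common definition.
    """
    effective = total - equivalent
    if effective <= 0:
        raise ValueError("need at least one non-equivalent mutant")
    return killed / effective


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(branch_coverage(45, 60))
    print(mutation_score(30, 52, equivalent=2))
```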
Conclusion:
Mutation testing has been proposed to measure the fault detection strength of test cases based on the mutation score. However, mutation testing might not be feasible due to its high execution cost. Our analysis using 846 manually created test suites shows that there is a significant and moderate to strongly positive correlation between branch coverage and mutation score. This suggests that branch coverage can still be used as an alternative when mutation testing is not feasible.
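To make the correlation analysis concrete, here is a small sketch of how one could check the relationship between branch coverage and mutation score on a set of test suites. It computes Pearson and Spearman coefficients with the standard library only. The data values are made up for illustration, and this note does not record which coefficient the authors used, so treat the choice as an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))


if __name__ == "__main__":
    # Hypothetical per-test-suite scores, NOT data from the STCDR.
    branch = [0.20, 0.40, 0.50, 0.70, 0.90]
    mutation = [0.10, 0.30, 0.35, 0.50, 0.80]
    print("Pearson :", round(pearson(branch, mutation), 3))
    print("Spearman:", round(spearman(branch, mutation), 3))
```

With real data one would load the per-suite branch coverage and mutation score from the repository and pass the two columns to these functions; a significance test would be needed on top of this to support a claim like the one quoted above.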
In addition to the correlation analysis between branch coverage and mutation score, we also analyzed the testing effectiveness of different test orders and their popularity. Three interesting observations were drawn from the analysis results. First, the answering the easiest question first strategy performed better than the answering the most difficult question first strategy. Second, forward UML-based test orders performed better than backward UML-based test orders. Third, the test order “Other” achieved the highest average score of 37.94, and it is noticeably higher than the second highest average score of 27.20. This observation raises the question of whether there were specific test orders we did not identify. In our experimental design, we analyzed eight test orders based on feedback from practitioners in the industry, researchers in academia, and some contestants. We are confident that we did not miss any specific test order in our analysis. If this is true, this could suggest that flexible test order could achieve good effectiveness. Nevertheless, we will conduct more experiments to further investigate this observation.
The software testing contest dataset (Software Testing Contest Data Repository)
Our contests are more valuable because we have created a data repository, the Software Testing Contest Data Repository (STCDR) at http://www.iselab.cn/contest/data/. It includes data collected from STC 2016 and STC 2017 that can be accessed by the public.
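URL: http://www.iselab.cn/contest/data/
Summary
I think this database is worthwhile. The authors have already mined some patterns from it, but there is still plenty of room for further analysis.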