Abstract
findings:
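1. About one third of the non-deadlock bugs violate the order the developers intended (order violations)
2. 34% of the non-deadlock bugs involve multiple variables, which existing tools do not necessarily detect
3. 92% of the bugs can be exposed by enforcing an order among a small group of memory accesses
4. 73% of the non-deadlock bugs cannot be fixed by simply adding or changing locks, and a fix is not necessarily correct on the first attempt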
1. Introduction
1.1 motivation
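many open, unsolved issues
1. Concurrency bug detection: can existing bug detection tools find all real-world concurrency bugs? What types of concurrency bugs exist? Are there concurrency bugs that are still undetected? How can existing diagnosis and fixing tools help fix bugs, and what information do developers need in order to fix them?
2. Concurrent program testing and model checking: existing techniques focus on the sequential aspects of programs and cannot effectively address the concurrent aspects (interleavings of multiple threads or processes)
The main challenge of concurrency testing is covering a program's interleaving space; exposing a bug depends on both a bug-exposing input and a bug-triggering execution interleaving, and considering all cases is impractical
open question in concurrency testing: whether testing only a small number of representative interleavings still exposes most bugs
Finally, the manifestation conditions of real-world concurrency bugs: what conditions are needed besides program inputs, and how many threads, variables, and accesses are involved?
3. Concurrent programming language design: transactional memory (TM). Which portion of the bugs can TM prevent? What should real-world TM designs pay attention to? Besides TM, what other language support can help developers write correct code?
memory-bug tools: Purify, Valgrind, CCured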
few studies conducted on real world concurrency bug characteristics.
because:
- difficult to collect, difficult to get understood and solved
- not easy to understand
1.2 Contributions
- characteristic study
- bug patterns
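1, 2. 97% of the non-deadlock bugs fall into one of two patterns: atomicity violation or order violation; 32% are order-violation bugs
- manifestation
3. 96% of the concurrency bugs manifest if a certain partial order is enforced between two threads (implication: pairwise testing)
4. 22% of the deadlock bugs are caused by one thread blocking on a resource it already holds (implication: single-thread deadlock detection and testing techniques)
5. 66% of the non-deadlock bugs involve concurrent accesses to a single variable
6. 34% of the non-deadlock bugs involve concurrent accesses to multiple variables (implication: new techniques are needed to address multi-variable bugs)
7. 97% of the deadlock bugs involve two threads circularly waiting for at most 2 resources (implication: pairwise testing on the order of resource acquisition and release)
8. 92% of the concurrency bugs manifest under a certain order among at most 4 memory accesses (implication: testing orders over small groups of memory accesses reduces the interleaving space from exponential to polynomial)
9. 73% of the non-deadlock bugs were fixed by techniques other than adding or changing locks (implication: bug detection and diagnosis tools need to provide bug pattern and manifestation information beyond locks)
10. 61% of the deadlock bugs were fixed by preventing one thread from acquiring one resource, but such a fix may introduce non-deadlock concurrency bugs (implication: deadlock fixes need care)
- bug avoidance
11. TM can help avoid 39% of the concurrency bugs
12. TM can help with 42% of the concurrency bugs, provided some concerns are addressed
13. Because of their different bug patterns, TM cannot help with 19% of the bugs (implication: develop order semantics support for C/C++)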
2 Methodology
2.1 bug sources
- MySQL,Apache,Mozilla,OpenOffice
- database, web server, client applications
- bugs:
- randomly selected 500 bug reports by keyword search, then kept the bugs caused by developers' wrong assumptions about concurrent execution: 105 bugs
- Bug Fix Study
- non-deadlock
- condition check (while-flag; optimistic concurrency)
- code switch
- design change
- lock strategy
- non-deadlock
2.2 characteristic categories
- bug pattern
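- atomicity violations and order violations can be told apart by reference to synchronization models (reader-writer, producer-consumer)
- do not classify data race as a bug pattern (a data race can cause a concurrency bug, but it can also be a benign race; data-race-free does not mean concurrency-bug-free)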
- manifestation
- threads
- variable
- accesses
- bug fix strategy
- how TM can help
- others: failure impact, bug diagnosis process
2.3 Threats to validity
1. representativeness of applications
2. concurrency bugs used in our study
3. examination methodology
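- application representativeness: the study chooses server and client-based concurrent applications written in C/C++
may not reflect other types: scientific applications, OS, other programming languages (e.g., Java)
- bug representativeness: bugs were selected from the bug databases and are good samples of fixed bugs
non-fixed or non-reported bugs might be different, but are not likely as important as the reported bugs
- examination methodology: examined every piece of information: programmers' clear explanations, forum discussions, source code patches, multiple versions of source code, and bug-triggering test cases
Summary: two broad classes of applications; bug characteristics are consistent across the four applications; quantitative characteristics are not emphasized; the conclusions should be read together with the methodology and the chosen applications.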
3 bug pattern study
finding 1
finding 2
Categories: order, atomicity, others
Examples:
1. atomicity
2. use before init
3. others (deadlock)
4. write-write order violation
5. order
note that the above order bugs are different from data race bugs and atomicity violation bugs.
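The two patterns can be sketched as simulated interleavings. This is a minimal illustration and not code from the studied applications: the `run` helper, the step functions, and the state keys (`proc_info`, `handle`) are all hypothetical names. Each "thread" is a list of steps, and a schedule fixes the order in which steps execute, so the buggy interleaving is reproduced deterministically.

```python
def run(schedule, threads, state):
    """Execute one step of the named thread per schedule entry (a simulated interleaving)."""
    pc = {name: 0 for name in threads}
    for name in schedule:
        threads[name][pc[name]](state)
        pc[name] += 1
    return state


# Atomicity violation: T1 assumes its check and its use run without interruption.
def t1_check(s):
    s["checked"] = s["proc_info"] is not None


def t1_use(s):
    if s["checked"]:
        # "crash" models a NULL dereference: T2 cleared the value after the check.
        s["result"] = "printed" if s["proc_info"] is not None else "crash"


def t2_clear(s):
    s["proc_info"] = None


def atomicity_demo(schedule):
    state = {"proc_info": "query", "checked": False, "result": "skipped"}
    threads = {"T1": [t1_check, t1_use], "T2": [t2_clear]}
    return run(schedule, threads, state)["result"]


# Order violation: the child is assumed to read only after the parent has initialized.
def parent_init(s):
    s["handle"] = "thread-handle"


def child_use(s):
    s["seen"] = s["handle"]


def order_demo(schedule):
    state = {"handle": None, "seen": "unset"}
    threads = {"parent": [parent_init], "child": [child_use]}
    return run(schedule, threads, state)["seen"]


if __name__ == "__main__":
    print(atomicity_demo(["T1", "T2", "T1"]))  # T2 runs between check and use
    print(order_demo(["child", "parent"]))     # use before init
```

In the atomicity case the bug appears only when T2 runs between T1's two steps; in the order case it appears whenever the child runs first, and no lock around either access would prevent that.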
4 bug manifestation study
threads
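finding 3: most bugs are caused by the interleaving of only 2 threads
A heavier concurrent workload does not by itself expose concurrency bugs more quickly; it only raises the probability of hitting the specific order. The threads involved in the bug are still 2.
Manifestation depends not only on the program's own memory accesses but also on the environment (e.g., other programs calling into this program and modifying the same location); handling such cases needs special system support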
finding 4:
variables
finding 5: single variable, as in Figures 1, 2, and 4
finding 6
finding 7: deadlock bugs involve the acquisition and release of at most two resources
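The deadlock findings (one thread blocking on itself, or two threads circularly waiting on at most two resources) suggest a pairwise check over lock acquisition orders. Below is a rough sketch under stated assumptions: each thread's behavior is given as a hypothetical trace of `("acquire", lock)` / `("release", lock)` events, locks are non-reentrant, and the function names are illustrative rather than taken from any tool.

```python
from itertools import combinations


def lock_order_edges(trace):
    """Return (held, wanted) pairs: lock `wanted` was requested while `held` was held."""
    held, edges = [], set()
    for op, lock in trace:
        if op == "acquire":
            for h in held:
                edges.add((h, lock))
            held.append(lock)
        else:
            held.remove(lock)
    return edges


def potential_deadlocks(traces):
    """Report self-deadlocks and two-thread, two-resource circular waits."""
    edges = {name: lock_order_edges(t) for name, t in traces.items()}
    found = []
    # One thread re-acquiring a non-reentrant lock it already holds.
    for name in sorted(edges):
        for a, b in sorted(edges[name]):
            if a == b:
                found.append((name, name, a, b))
    # Two threads taking the same two locks in opposite orders.
    for n1, n2 in combinations(sorted(edges), 2):
        for a, b in sorted(edges[n1]):
            if a != b and (b, a) in edges[n2]:
                found.append((n1, n2, a, b))
    return found


if __name__ == "__main__":
    traces = {
        "T1": [("acquire", "A"), ("acquire", "B"), ("release", "B"), ("release", "A")],
        "T2": [("acquire", "B"), ("acquire", "A"), ("release", "A"), ("release", "B")],
    }
    print(potential_deadlocks(traces))
```

The check only looks at pairs of threads and pairs of resources, which is the small scope the findings point to; it reports potential deadlocks, not ones that necessarily occur in a given run.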
accesses
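finding 8.1: non-deadlock bugs can be reliably triggered by enforcing an order among at most 4 memory accesses
finding 8.2: deadlock bugs can be reliably triggered by enforcing an order among at most 4 resource acquisitions/releases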
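A rough count shows why restricting testing to small groups of accesses matters. The sketch below is only illustrative arithmetic: it compares the number of complete interleavings of two threads with the number of ordered groups of at most 4 accesses. The function names are hypothetical, and the second count is a loose upper bound, since it ignores program order within a thread.

```python
import math


def total_interleavings(n1, n2):
    """Interleavings of two threads with n1 and n2 accesses, each keeping program order."""
    return math.comb(n1 + n2, n1)


def small_group_orders(total_accesses, k=4):
    """Upper bound on ordered groups of at most k accesses: O(total_accesses ** k)."""
    return sum(math.perm(total_accesses, j) for j in range(1, k + 1))


if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, total_interleavings(n, n), small_group_orders(2 * n))
```

The first count grows exponentially with the number of accesses per thread, while the second grows as a fourth-degree polynomial in the total number of accesses.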
5 bug fix study
A fix is not necessarily effective
5.1 fix strategies
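finding 9: adding or changing locks is not the major fix strategy
Locks do not enforce order; adding locks can hurt performance and introduce deadlocks
- condition check (while-flag to fix order-related bugs; performs better than lock-based fixes)
- code switch (order-related: move the statement that assigns the shared variable earlier)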
- algorithm/data-structure design change
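The condition-check strategy for an order-related bug can be sketched as follows. This is an assumption-laden illustration, not a patch from the studied applications: it uses Python's `threading.Event` in place of a hand-written while-flag loop, and `run_with_flag`, `shared`, and `initialized` are hypothetical names. The child is started first on purpose to show that the intended order still holds.

```python
import threading


def run_with_flag():
    shared = {"handle": None}
    initialized = threading.Event()  # plays the role of the flag
    seen = []

    def child():
        # Condition check: block until the parent signals that init is done.
        initialized.wait()
        seen.append(shared["handle"])

    def parent():
        shared["handle"] = "thread-handle"
        initialized.set()

    c = threading.Thread(target=child)
    p = threading.Thread(target=parent)
    c.start()  # the child may run before the parent has initialized anything
    p.start()
    c.join()
    p.join()
    return seen[0]


if __name__ == "__main__":
    print(run_with_flag())
```

No lock is involved: the fix enforces an order between the two threads, which a mutex around the accesses alone would not do.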
finding 10
5.2 mistakes during bug fixing
5.3 discussion:bug avoidance
TM
focus on the basic atomicity and isolation properties of TM
finding 11
finding 12
finding 13: TM is suited to atomicity violations
6 other characteristics
bug impacts
some concurrency bugs are very difficult to repeat
test cases are critical to bug diagnosis
programmers lack diagnosis tools
memory bugs: Valgrind, Purify
7 related work
bug characteristic studies
improving concurrent program reliability