Analysis of a WhatsApp ANR: an odd problem caused by a deadlock between synchronized and beginTransaction in MediaProvider

I recently received a WhatsApp ANR report. The trace looked roughly like this:

"main" prio=5 tid=1 Blocked
  | group="main" sCount=1 dsCount=0 obj=0x75543b80 self=0xa5a05400
  | sysTid=24292 nice=-10 cgrp=default sched=0/0 handle=0xa8ed0534
  | state=S schedstat=( 1047561789 813894852 1943 ) utm=60 stm=44 core=0 HZ=100
  | stack=0xbe351000-0xbe353000 stackSize=8MB
  | held mutexes=
  at com.whatsapp.data.h.Q(MessageStore.java:6069)
  - waiting to lock <0x06cb743e> (a com.whatsapp.data.d) held by thread 41
  at com.whatsapp.data.h.s(MessageStore.java:6060)
  at com.whatsapp.kj$e.a(ConversationsFragment.java:1305)
  at com.whatsapp.kj$e.getView(ConversationsFragment.java:1120)
  at android.widget.HeaderViewListAdapter.getView(HeaderViewListAdapter.java:220)
  at android.widget.AbsListView.obtainView(AbsListView.java:2497)
  at android.widget.ListView.makeAndAddView(ListView.java:2012)
  at android.widget.ListView.fillDown(ListView.java:721)
  at android.widget.ListView.fillFromTop(ListView.java:782)
  at android.widget.ListView.layoutChildren(ListView.java:1772)
  at android.widget.AbsListView.onLayout(AbsListView.java:2255)
  at com.whatsapp.observablelistview.ObservableListView.onLayout(ObservableListView.java:242)
  at android.view.View.layout(View.java:17969)
.
.
.

Thread 41 looks like this:

"WhatsApp Worker #2" prio=5 tid=41 Native
  | group="main" sCount=1 dsCount=0 obj=0x2ac0a700 self=0x8a7c4000
  | sysTid=24337 nice=10 cgrp=default sched=0/0 handle=0x89325920
  | state=S schedstat=( 488869957 208541146 1891 ) utm=27 stm=21 core=3 HZ=100
  | stack=0x89223000-0x89225000 stackSize=1038KB
  | held mutexes=
  at android.os.BinderProxy.transactNative(Native method)
  at android.os.BinderProxy.transact(Binder.java:622)
  at android.content.ContentProviderProxy.delete(ContentProviderNative.java:542)
  at android.content.ContentResolver.delete(ContentResolver.java:1424)
  at com.whatsapp.data.h.b(MessageStore.java:4770)
  at com.whatsapp.data.h.a(MessageStore.java:3092)
  - locked <0x06cb743e> (a com.whatsapp.data.d)
  at com.whatsapp.data.h.v(MessageStore.java:9276)
  - locked <0x06cb743e> (a com.whatsapp.data.d)
  at com.whatsapp.aio.run(unavailable:-1)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)
  at com.whatsapp.util.br$2.a(WhatsAppWorkers.java:53)
  at com.whatsapp.util.bt.run(unavailable:-1)
  at java.lang.Thread.run(Thread.java:761)
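
The link between the two traces (main is waiting for a monitor that thread 41 holds) can also be pulled out mechanically when a trace file contains many threads. The following is a minimal sketch in plain Java; the class name `MonitorWaits` and its method are hypothetical helpers written for this note, and the regular expressions assume the ART trace line format shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MonitorWaits {
    // Thread header line, e.g.: "main" prio=5 tid=1 Blocked
    private static final Pattern HEADER =
            Pattern.compile("^\"([^\"]+)\".*\\btid=(\\d+)\\b");
    // Wait line, e.g.:
    //   - waiting to lock <0x06cb743e> (a com.whatsapp.data.d) held by thread 41
    private static final Pattern WAITING =
            Pattern.compile("- waiting to lock <(0x[0-9a-fA-F]+)>.*held by thread (\\d+)");

    /** Maps the tid of each waiting thread to the tid that holds the monitor it wants. */
    public static Map<Integer, Integer> waitsFor(String trace) {
        Map<Integer, Integer> edges = new LinkedHashMap<>();
        int currentTid = -1; // tid of the thread whose stack is being read
        for (String line : trace.split("\\R")) {
            Matcher header = HEADER.matcher(line.trim());
            if (header.find()) {
                currentTid = Integer.parseInt(header.group(2));
                continue;
            }
            Matcher waiting = WAITING.matcher(line);
            if (waiting.find() && currentTid >= 0) {
                edges.put(currentTid, Integer.parseInt(waiting.group(2)));
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        String trace = String.join("\n",
                "\"main\" prio=5 tid=1 Blocked",
                "  at com.whatsapp.data.h.Q(MessageStore.java:6069)",
                "  - waiting to lock <0x06cb743e> (a com.whatsapp.data.d) held by thread 41",
                "\"WhatsApp Worker #2\" prio=5 tid=41 Native",
                "  at android.os.BinderProxy.transactNative(Native method)",
                "  - locked <0x06cb743e> (a com.whatsapp.data.d)");
        System.out.println(waitsFor(trace)); // prints {1=41}
    }
}
```

Here the single edge 1 -> 41 says that main is blocked on thread 41, and thread 41 itself is not waiting on a Java monitor: it is sitting in a Binder call to a content provider, which is where the investigation goes next.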


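I also checked CPU, I/O and so on. The load was not high, so this did not look like a busy system; it looked like the app's own problem. But wait, what is this: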

04-07 10:40:04.431  7022 12063 W SQLiteConnectionPool: The connection pool for database '/data/user/0/com.android.providers.media/databases/external.db' has been unable to grant a connection to thread 494 (MediaScannerService) with flags 0x2 for 6510.3594 seconds.
04-07 10:40:04.431  7022 12063 W SQLiteConnectionPool: Connections: 0 active, 1 idle, 3 available.

04-07 13:21:36.643  7022  7022 W SQLiteConnectionPool: The connection pool for database '/data/user/0/com.android.providers.media/databases/internal.db' has been unable to grant a connection to thread 1 (main) with flags 0x6 for 750.02905 seconds.
04-07 13:21:36.643  7022  7022 W SQLiteConnectionPool: Connections: 0 active, 1 idle, 1 available.

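That is alarming: MediaProvider had been waiting more than 6000 seconds without getting a database connection. Looking back at the WhatsApp trace, wasn't it stuck while calling into some provider? Could the two be related?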

But why could MediaProvider not obtain a database connection from the pool? Unfortunately, the traces I had did not include the MediaProvider stack.

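Since the reproduction path was unknown, I went through the various possibilities, and the most likely one was a deadlock. After comparing the current code with stock Android code, I found that MTK had added a db.beginTransaction() call inside the synchronized (sGetTableAndWhereParam) block in MediaProvider.delete: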


public int delete(Uri uri, String userWhere, String[] whereArgs) {
    ...

    synchronized (sGetTableAndWhereParam) {
        db.beginTransaction();   // added by MTK, not present in stock Android

    }
    ...
}

Unfortunately, applyBatch() begins its transactions first:

public ContentProviderResult[] applyBatch(ArrayList<ContentProviderOperation> operations) {
    idb.beginTransaction();
    edb.beginTransaction();
    // the code here may then indirectly call delete()
    . . .
}

At this point the conditions for a deadlock are in place:

Resource 1: db.beginTransaction()

Resource 2: synchronized (sGetTableAndWhereParam)

Thread 1: MediaProvider.delete() holds sGetTableAndWhereParam and requests beginTransaction

Thread 2: MediaProvider.applyBatch() --> delete() holds beginTransaction and requests sGetTableAndWhereParam

Code that simulates this deadlock: "An example of a deadlock between synchronized and beginTransaction"
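
As a standalone illustration of the lock-order inversion described above (this is not MediaProvider code and not the linked example), here is a minimal sketch in plain Java. Everything in it is an assumption made for the demo: `TABLE_PARAM_MONITOR` plays the role of sGetTableAndWhereParam, and a `ReentrantLock` named `TRANSACTION` stands in for the exclusive database transaction. The delete-like thread uses `tryLock` with a timeout so that the program reports the inversion and exits instead of hanging forever as the real provider did.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderSketch {
    // Hypothetical stand-ins, not the real MediaProvider fields.
    private static final Object TABLE_PARAM_MONITOR = new Object();       // plays sGetTableAndWhereParam
    private static final ReentrantLock TRANSACTION = new ReentrantLock(); // plays the DB transaction

    /** Returns true if the delete-like thread could not get the transaction in time. */
    public static boolean runOnce(long timeoutMs) {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        AtomicBoolean stuck = new AtomicBoolean(false);

        // Thread 1: like delete(): monitor first, then the transaction.
        Thread deleteLike = new Thread(() -> {
            synchronized (TABLE_PARAM_MONITOR) {
                try {
                    bothHoldFirst.countDown();
                    bothHoldFirst.await(); // wait until the other thread holds TRANSACTION
                    if (TRANSACTION.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                        TRANSACTION.unlock();
                    } else {
                        stuck.set(true); // with a blocking lock() this would wait forever
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        // Thread 2: like applyBatch() --> delete(): transaction first, then the monitor.
        Thread applyBatchLike = new Thread(() -> {
            TRANSACTION.lock();
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await(); // wait until the other thread holds the monitor
                synchronized (TABLE_PARAM_MONITOR) {
                    // the nested delete() body would run here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                TRANSACTION.unlock();
            }
        });

        deleteLike.start();
        applyBatchLike.start();
        try {
            deleteLike.join();
            applyBatchLike.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return stuck.get();
    }

    public static void main(String[] args) {
        System.out.println("lock-order inversion observed: " + runOnce(300));
    }
}
```

The latch guarantees that each thread already holds its first resource before asking for the second, so the delete-like thread always times out. In the sketch the timeout is the only thing that breaks the cycle; synchronized has no timeout, and in the scenario described above neither side gives up, which matches the multi-thousand-second waits reported by SQLiteConnectionPool.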
