ANR Issues Caused by SharedPreferences apply

snail201211

Category: Android学习笔记 (Android study notes)

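Reposted from:
Author: 字节跳动技术团队 (ByteDance Technical Team)
Link: https://www.jianshu.com/p/9ae0f6842689
Source: 简书 (Jianshu)
Copyright on Jianshu belongs to the author. For any form of reproduction, please contact the author for authorization and cite the source.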

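The ANR rate in our project has stayed stubbornly high. According to our statistics, several of the top-ranked ANRs are caused by SharedPreferences (SP for short below). In what follows we work through the cause step by step and then show how to fix it.

The stack trace is shown below. On the crash collection platform there are several similar stack traces; the only difference between them is the entry method in ActivityThread. Besides ActivityThread's handleSleeping method, the entry points include handleServiceArgs, handleStopService, and handleStopActivity.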

[Image: ANR stack trace from the crash collection platform]

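These ActivityThread methods are invoked when the lifecycle of an Activity or Service changes. Judging from the stack traces, a component lifecycle change makes the main thread wait on the queue held by QueuedWork, and an ANR occurs when that wait times out. So how does QueuedWork work? We start from the source code.

What SP's apply actually does

We start at the source of the problem: SP's apply method.

apply first creates a Runnable named awaitCommit and adds it to QueuedWork. awaitCommit contains a waiting lock (a latch) that has to be released somewhere else. The QueuedWork.waitToFinish() call we saw in the stack trace simply waits until every awaitCommit in this queue has been released.

It then calls SharedPreferencesImpl.this.enqueueDiskWrite to create the task that actually persists the SP data.

In fact, both commit and apply end up calling enqueueDiskWrite; the difference is that commit passes null as the second argument. The method uses that second argument to tell commit from apply: for commit it runs writeToFile synchronously, while for apply it adds writeToFile to a task queue and runs it asynchronously. This is the real difference between commit and apply.

When writeToFile finishes it releases the waiting lock. After that, the run method of the Runnable passed as the second argument is called back, and the corresponding waiting task is removed from QueuedWork.

To sum up: calling apply creates a waiting lock and places it in QueuedWork, wraps the actual persistence work in a task that runs on an asynchronous queue, and releases the lock when the task finishes. When an Activity handles onStop, or a Service is stopped or handles onStartCommand, QueuedWork.waitToFinish() is executed and waits for all waiting locks to be released.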
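The mechanism summarized above can be reproduced on a plain JVM. The following is a minimal sketch, not the Android implementation: `ApplyBlockingDemo`, `pendingFinishers`, and `writer` are hypothetical stand-ins for `SharedPreferencesImpl`, the QueuedWork queue, and its single-threaded executor, and `Thread.sleep` stands in for the file write.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ApplyBlockingDemo {
    // Stand-in (assumption) for the queue of waiters kept by QueuedWork.
    static final ConcurrentLinkedQueue<Runnable> pendingFinishers = new ConcurrentLinkedQueue<>();

    // Stand-in (assumption) for the single-threaded executor that performs disk writes.
    static final ExecutorService writer = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "disk-writer");
        t.setDaemon(true); // let the JVM exit when main returns
        return t;
    });

    // Mimics apply(): register a waiter first, then queue the slow write.
    static void apply(long writeMillis) {
        CountDownLatch writtenToDisk = new CountDownLatch(1);
        Runnable awaitCommit = () -> {
            try {
                writtenToDisk.await(); // blocks whichever thread runs this Runnable
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        pendingFinishers.add(awaitCommit);
        writer.execute(() -> {
            try {
                Thread.sleep(writeMillis); // simulated file I/O
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            writtenToDisk.countDown();            // release the waiting lock
            pendingFinishers.remove(awaitCommit); // drop the waiter if still queued
        });
    }

    // Mimics waitToFinish(): run every pending waiter on the caller's thread.
    // Returns how long the caller was blocked, in milliseconds.
    static long waitToFinish() {
        long start = System.nanoTime();
        Runnable finisher;
        while ((finisher = pendingFinishers.poll()) != null) {
            finisher.run();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            apply(100); // five quick apply() calls, each queuing a 100 ms write
        }
        long blockedMs = waitToFinish(); // what a lifecycle callback would do
        System.out.println("caller blocked for about " + blockedMs + " ms");
    }
}
```

Each apply() call returns immediately, but the thread that later calls waitToFinish() pays for every write still in flight. On Android that thread is the main thread.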

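How to fix it: clear the wait queue

As the analysis shows, simply replacing commit with apply is not a cure-all: calling apply too often easily leads to ANRs. All ANRs of this kind are triggered through QueuedWork.waitToFinish(). If we manually clear the queue it holds before this function is called, does that solve the problem? The answer is yes.

Activity onStop, as well as Service stop and onStartCommand, are all dispatched through ActivityThread. ActivityThread holds a Handler field; we obtain it with a hook and set a callback on it, because Handler's dispatchMessage gives the callback a chance to handle the message before handleMessage runs.

1. Trigger the queue cleanup inside the Callback.

2. The cleanup itself has to call into QueuedWork through reflection.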
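The reflection step can be sketched as follows. This is an illustration under stated assumptions, not the original team's code: `FakeQueuedWork` is a hypothetical stand-in so the sketch runs on a plain JVM, and the field name `sPendingWorkFinishers` is taken from the QueuedWork source quoted in this article. On a real device the target would be `android.app.QueuedWork`, whose internals are hidden and may differ between Android versions. The helper would be called from the Handler.Callback installed on ActivityThread's Handler.

```java
import java.lang.reflect.Field;
import java.util.Collection;
import java.util.concurrent.ConcurrentLinkedQueue;

public class ClearPendingFinishersDemo {
    // Hypothetical stand-in for android.app.QueuedWork (assumption, for illustration only).
    static class FakeQueuedWork {
        private static final ConcurrentLinkedQueue<Runnable> sPendingWorkFinishers =
                new ConcurrentLinkedQueue<>();

        static void add(Runnable finisher) {
            sPendingWorkFinishers.add(finisher);
        }

        static int pendingCount() {
            return sPendingWorkFinishers.size();
        }
    }

    // Reflectively empties a static collection field and returns how many entries were dropped.
    static int clearStaticQueue(Class<?> owner, String fieldName) {
        try {
            Field field = owner.getDeclaredField(fieldName);
            field.setAccessible(true);
            Collection<?> queue = (Collection<?>) field.get(null); // static field: no instance needed
            int dropped = queue.size();
            queue.clear();
            return dropped;
        } catch (ReflectiveOperationException e) {
            // Field missing or inaccessible: surface it rather than failing silently.
            throw new IllegalStateException("cannot clear " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        FakeQueuedWork.add(() -> {});
        FakeQueuedWork.add(() -> {});
        int dropped = clearStaticQueue(FakeQueuedWork.class, "sPendingWorkFinishers");
        System.out.println("dropped " + dropped + ", remaining " + FakeQueuedWork.pendingCount());
        // prints "dropped 2, remaining 0"
    }
}
```

Clearing only removes the waiters. The write tasks already handed to the executor still run, which is why persistence is not simply abandoned.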

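What problems does clearing the waiting locks cause?

Both commit and apply can cause ANRs, yet from the earliest Android releases through Android 8.0 Google has never fixed this bug. What could go wrong if we work around it ourselves? Google blocks the main thread to finish SP writes before an Activity or Service is stopped, and the only reason we can think of is to make persistence as reliable as possible. A crash at runtime can also leave SP data unpersisted, and persistence is itself an I/O operation that can fail. So how does clearing the waiting-lock queue affect persistence? We verified this with a pair of experiments.

When the process starts, it generates a random number and stores it twice, once with commit and once with apply. On the next process start, it reads back both stored values and compares them: if they match, the apply write was persisted; if they differ, the apply write failed.

Experiment 1: cleanup of the waiting-lock queue enabled.

Experiment 2: cleanup of the waiting-lock queue disabled.

Both experiments ran in production at the same time and at the same scale, and we measured the apply failure rate.

Experiment 1: the failure rate was 1.84%.

Experiment 2: the failure rate was 1.79%.

As the numbers show, apply already has a fairly high failure rate on its own, and clearing the waiting-lock queue has little effect on persistence.

The Toutiao app now has the cleanup strategy enabled for all users, and since launch we have not received any user feedback attributable to it.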

How SharedPreferences blocks the main thread

https://www.jianshu.com/p/63ee8587de3f

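We recently found that many of our ANRs point to SharedPreferences. With some questions in mind, we explored the following:

Why does SharedPreferences block the main thread?
Does SharedPreferences have an in-memory cache? How does it read and write, and does it write to the file immediately?
How does it keep data in sync, and how can we avoid ANRs caused by SharedPreferences?

SharedPreferences from creation, to reading, to writing

Creating an SP

First, let's see how a SharedPreferences instance is created, in ContextImpl.getSharedPreferences():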

@Override
    public SharedPreferences getSharedPreferences(File file, int mode) {
        checkMode(mode);
        SharedPreferencesImpl sp;
        synchronized (ContextImpl.class) {
            final ArrayMap<File, SharedPreferencesImpl> cache = getSharedPreferencesCacheLocked();
            sp = cache.get(file);
            if (sp == null) {
                sp = new SharedPreferencesImpl(file, mode);
                cache.put(file, sp);
                return sp;
            }
        }
        if ((mode & Context.MODE_MULTI_PROCESS) != 0 ||
            getApplicationInfo().targetSdkVersion < android.os.Build.VERSION_CODES.HONEYCOMB) {
            // If somebody else (some other process) changed the prefs
            // file behind our back, we reload it.  This has been the
            // historical (if undocumented) behavior.
            sp.startReloadIfChangedUnexpectedly();
        }
        return sp;
    }

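It is cached, so the file is not read or written on every access. A static map caches every SP in the application: the key is derived from the SP's name (the name is first mapped to a cached File, and that File is the key), and the value is the SP object holding that file's contents. We should therefore avoid creating many small SPs and merge them where possible; otherwise this static map grows large.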
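The caching behaviour can be illustrated on a plain JVM. This is a simplified sketch, not ContextImpl itself: `PrefsCacheDemo` and `FakePrefs` are hypothetical names, and a `HashMap` stands in for the `ArrayMap` used in the excerpt above.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

public class PrefsCacheDemo {
    // Hypothetical stand-in for SharedPreferencesImpl: one in-memory map per file.
    static class FakePrefs {
        final File file;
        final Map<String, Object> values = new HashMap<>();

        FakePrefs(File file) {
            this.file = file;
        }
    }

    // Process-wide cache keyed by file; entries are never evicted in this sketch.
    private static final Map<File, FakePrefs> cache = new HashMap<>();

    static FakePrefs getPrefs(File file) {
        synchronized (PrefsCacheDemo.class) {
            // Create the instance on first request, reuse it afterwards.
            return cache.computeIfAbsent(file, FakePrefs::new);
        }
    }

    static int cachedCount() {
        synchronized (PrefsCacheDemo.class) {
            return cache.size();
        }
    }

    public static void main(String[] args) {
        FakePrefs first = getPrefs(new File("settings.xml"));
        FakePrefs second = getPrefs(new File("settings.xml"));
        getPrefs(new File("user.xml"));
        System.out.println("same instance: " + (first == second)); // prints "same instance: true"
        System.out.println("cached files: " + cachedCount());      // prints "cached files: 2"
    }
}
```

Every distinct file adds one entry that stays in memory for the life of the process, which is the cost the paragraph above warns about.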

Next, the SP constructor:

    SharedPreferencesImpl(File file, int mode) {
        mFile = file;
        mBackupFile = makeBackupFile(file);
        mMode = mode;
        mLoaded = false;
        mMap = null;
        startLoadFromDisk();
    }

    // On initialization, a thread is started to read the XML file.
    private void startLoadFromDisk() {
        synchronized (this) {
            mLoaded = false;
        }
        new Thread("SharedPreferencesImpl-load") {
            public void run() {
                loadFromDisk();
            }
        }.start();
    }

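The constructor shows that it starts a thread to read the file data, that is, the previously stored file, into memory. (This also shows that SP has an in-memory cache.)

Reading from an SP:

Every read locks the current SP object and then checks whether the local file has finished loading.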

@Nullable
    public String getString(String key, @Nullable String defValue) {
        synchronized (this) {
            awaitLoadedLocked();
            String v = (String)mMap.get(key);
            return v != null ? v : defValue;
        }
    }

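awaitLoadedLocked() here waits for the SP to finish being set up. The SP constructor has already started a thread to load the local file; this call just waits for that load to complete.

Once loading is complete, values are served from memory.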
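The load-then-wait pattern described above can be sketched with a plain monitor. This is a simplified illustration, not the AOSP code: `LazyLoadedPrefs` is a hypothetical class, the sleep simulates reading the XML file, and the loaded content is hard-coded.

```java
import java.util.HashMap;
import java.util.Map;

public class LazyLoadedPrefs {
    private final Object lock = new Object();
    private boolean loaded = false;
    private Map<String, String> map;

    // Starts a background load, as the SP constructor does.
    public LazyLoadedPrefs(long loadMillis) {
        Thread loader = new Thread(() -> loadFromDisk(loadMillis), "prefs-load");
        loader.setDaemon(true);
        loader.start();
    }

    private void loadFromDisk(long loadMillis) {
        try {
            Thread.sleep(loadMillis); // simulated file read (assumption)
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Map<String, String> fromDisk = new HashMap<>();
        fromDisk.put("user", "alice"); // hard-coded content for the demo
        synchronized (lock) {
            map = fromDisk;
            loaded = true;
            lock.notifyAll(); // wake every reader waiting for the load
        }
    }

    // Blocks the calling thread until the background load has finished.
    public String getString(String key, String defValue) {
        synchronized (lock) {
            while (!loaded) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return defValue;
                }
            }
            String v = map.get(key);
            return v != null ? v : defValue;
        }
    }

    public static void main(String[] args) {
        LazyLoadedPrefs prefs = new LazyLoadedPrefs(200);
        long start = System.nanoTime();
        String user = prefs.getString("user", "unknown"); // waits for the load
        long waitedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(user + " after waiting about " + waitedMs + " ms");
    }
}
```

The first read right after construction pays for the file load; later reads only take the lock and hit the in-memory map. If that first read happens on the main thread and the file is large, the wait is a main-thread stall.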

Writing to an SP:

We normally write to an SP through an editor.

First, let's see how the editor is created:

    public Editor edit() {
        // TODO: remove the need to call awaitLoadedLocked() when
        // requesting an editor.  will require some work on the
        // Editor, but then we should be able to do:
        //
        //      context.getSharedPreferences(..).edit().putString(..).apply()
        //
        // ... all without blocking.
        synchronized (this) {
            awaitLoadedLocked();
        }

        return new EditorImpl();
    }

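This shows that even if you only write and never read, you still have to wait until the local file has been loaded.

The editor keeps the changes in a map. On submission it first commits them to memory and then schedules an asynchronous write.

An editor can hold changes to several keys and submit them as a single commit. If several submissions can be merged into one, merge them, because every call to apply or commit creates a new commit along with its locks.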
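The effect of batching can be shown with a small in-memory model. This is a sketch under assumptions: `BatchedEditDemo` and its nested `Editor` are hypothetical stand-ins that only count commits, whereas on Android the equivalent chain would be `edit().putString(..).putInt(..).apply()` on a real SharedPreferences.

```java
import java.util.HashMap;
import java.util.Map;

public class BatchedEditDemo {
    private final Map<String, Object> memory = new HashMap<>();
    private int commitCount = 0;

    // Collects changes locally; nothing is committed until apply() is called.
    public class Editor {
        private final Map<String, Object> modified = new HashMap<>();

        public Editor put(String key, Object value) {
            modified.put(key, value);
            return this;
        }

        public void apply() {
            synchronized (BatchedEditDemo.this) {
                memory.putAll(modified); // commit to memory
                commitCount++;           // each apply() would schedule one disk write
            }
        }
    }

    public Editor edit() {
        return new Editor();
    }

    public synchronized int commitCount() {
        return commitCount;
    }

    public synchronized Object get(String key) {
        return memory.get(key);
    }

    public static void main(String[] args) {
        BatchedEditDemo separate = new BatchedEditDemo();
        separate.edit().put("a", 1).apply();
        separate.edit().put("b", 2).apply();
        separate.edit().put("c", 3).apply();

        BatchedEditDemo batched = new BatchedEditDemo();
        batched.edit().put("a", 1).put("b", 2).put("c", 3).apply();

        System.out.println(separate.commitCount() + " commits vs " + batched.commitCount());
        // prints "3 commits vs 1"
    }
}
```

Both variants end with the same three values in memory, but the batched one creates a single commit, and therefore a single waiter and a single write task.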

Now let's focus on its apply method:

public void apply() {
            final MemoryCommitResult mcr = commitToMemory();
            final Runnable awaitCommit = new Runnable() {
                    public void run() {
                        try {
                            // Blocks the caller: whoever calls run() is blocked
                            mcr.writtenToDiskLatch.await();
                        } catch (InterruptedException ignored) {
                        }
                    }
                };

            QueuedWork.add(awaitCommit);

            Runnable postWriteRunnable = new Runnable() {
                    public void run() {
                        awaitCommit.run();
                        QueuedWork.remove(awaitCommit);
                    }
                };

            SharedPreferencesImpl.this.enqueueDiskWrite(mcr, postWriteRunnable);

            // Okay to notify the listeners before it's hit disk
            // because the listeners should always get the same
            // SharedPreferences instance back, which has the
            // changes reflected in memory.
            notifyListeners(mcr);
        }

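It first creates a Runnable named awaitCommit, whose main job is to block the caller (writtenToDiskLatch.await() blocks whoever calls it), and adds awaitCommit to QueuedWork's queue. It then creates postWriteRunnable, which mainly does cleanup. The last statement calls enqueueDiskWrite():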

private void enqueueDiskWrite(final MemoryCommitResult mcr,
                                  final Runnable postWriteRunnable) {
        final Runnable writeToDiskRunnable = new Runnable() {
                public void run() {
                    synchronized (mWritingToDiskLock) {
                        writeToFile(mcr);
                    }
                    synchronized (SharedPreferencesImpl.this) {
                        mDiskWritesInFlight--;
                    }
                    if (postWriteRunnable != null) {
                        postWriteRunnable.run();
                    }
                }
            };

        final boolean isFromSyncCommit = (postWriteRunnable == null);

        // Typical #commit() path with fewer allocations, doing a write on
        // the current thread.
        if (isFromSyncCommit) {
            boolean wasEmpty = false;
            synchronized (SharedPreferencesImpl.this) {
                wasEmpty = mDiskWritesInFlight == 1;
            }
            if (wasEmpty) {
                writeToDiskRunnable.run();
                return;
            }
        }

        QueuedWork.singleThreadExecutor().execute(writeToDiskRunnable);
    }

Yet another Runnable is created here, so let's untangle how they call each other.

[Image: sp.png, call relationships among the Runnables]

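As the diagram shows, only the asynchronous file write submitted to the single-threaded executor (writeToDiskRunnable) actually becomes an asynchronous task; the other two Runnables merely have their run methods called.

One asynchronous write proceeds as follows. The file is written first; when the write completes, setDiskWriteResult() is called, which counts the latch down by one to mark this write as done. Then postWriteRunnable is called to clean up the queue. It calls await() inside the awaitCommit Runnable, but because the latch has just been released, this does not block. That completes the asynchronous task for one apply.

But why is the awaitCommit Runnable stored in a static queue? This is the key to how the main thread gets blocked.

The main parts of the QueuedWork class: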

/**
 * Internal utility class to keep track of process-global work that's
 * outstanding and hasn't been finished yet.
 *
 * This was created for writing SharedPreference edits out
 * asynchronously so we'd have a mechanism to wait for the writes in
 * Activity.onPause and similar places, but we may use this mechanism
 * for other things in the future.
 *
 * @hide
 */
 
    // The set of Runnables that will finish or wait on any async
    // activities started by the application.
    private static final ConcurrentLinkedQueue<Runnable> sPendingWorkFinishers =
            new ConcurrentLinkedQueue<Runnable>();
            
    /**
     * Add a runnable to finish (or wait for) a deferred operation
     * started in this context earlier.  Typically finished by e.g.
     * an Activity#onPause.  Used by SharedPreferences$Editor#startCommit().
     *
     * Note that this doesn't actually start it running.  This is just
     * a scratch set for callers doing async work to keep updated with
     * what's in-flight.  In the common case, caller code
     * (e.g. SharedPreferences) will pretty quickly call remove()
     * after an add().  The only time these Runnables are run is from
     * waitToFinish(), below.
     */
    public static void add(Runnable finisher) {
        sPendingWorkFinishers.add(finisher);
    }
    
    
    /**
     * Finishes or waits for async operations to complete.
     * (e.g. SharedPreferences$Editor#startCommit writes)
     *
     * Is called from the Activity base class's onPause(), after
     * BroadcastReceiver's onReceive, after Service command handling,
     * etc.  (so async work is never lost)
     */
    public static void waitToFinish() {
        Runnable toFinish;
        while ((toFinish = sPendingWorkFinishers.poll()) != null) {
            toFinish.run();
        }
    }

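This shows that the goal is to make sure written content is not lost: the await of every apply is stored and then run in turn, and if any write has not finished, the caller, that is, the main thread, is blocked.

So where exactly is it called?

Let's look at the stack trace that appears again and again in our crash logs: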

at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:202)
at android.app.SharedPreferencesImpl$EditorImpl$1.run(SharedPreferencesImpl.java:364)
at android.app.QueuedWork.waitToFinish(QueuedWork.java:88)
at android.app.ActivityThread.handleStopActivity(ActivityThread.java:3246)
at android.app.ActivityThread.access$1100(ActivityThread.java:141)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1239)

Here we see ActivityThread.handleStopActivity(), and sure enough, this method contains the call into QueuedWork that performs the await:

    private void handleStopActivity(IBinder token, boolean show, int configChanges, int seq) {
        ActivityClientRecord r = mActivities.get(token);
        if (!checkAndUpdateLifecycleSeq(seq, r, "stopActivity")) {
            return;
        }
        r.activity.mConfigChangeFlags |= configChanges;

        StopInfo info = new StopInfo();
        performStopActivityInner(r, info, show, true, "handleStopActivity");

        if (localLOGV) Slog.v(
            TAG, "Finishing stop of " + r + ": show=" + show
            + " win=" + r.window);

        updateVisibility(r, show);

        // Make sure any pending writes are now committed.
        if (!r.isPreHoneycomb()) {
            QueuedWork.waitToFinish();
        }

        // Schedule the call to tell the activity manager we have
        // stopped.  We don't do this immediately, because we want to
        // have a chance for any other pending work (in particular memory
        // trim requests) to complete before you tell the activity
        // manager to proceed and allow us to go fully into the background.
        info.activity = r;
        info.state = r.state;
        info.persistentState = r.persistentState;
        mH.post(info);
        mSomeActivitiesChanged = true;
    }

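When is this method called?

It is called after the system sends the corresponding command to the app.

Now let's see which methods handleStopActivity calls:
Call chain of handleStopActivity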

ActivityThread.handleStopActivity
    ActivityThread.performStopActivityInner
        ActivityThread.callCallActivityOnSaveInstanceState
            Instrumentation.callActivityOnSaveInstanceState
                Activity.performSaveInstanceState
                    Activity.onSaveInstanceState

        ActivityThread.performStop
            Activity.performStop
                Instrumentation.callActivityOnStop
                    Activity.onStop

    updateVisibility

    H.post(StopInfo)
        AMP.activityStopped
            AMS.activityStopped
                ActivityStack.activityStoppedLocked
                AMS.trimApplications
                    ProcessRecord.kill
                    ApplicationThread.scheduleExit
                        Looper.myLooper().quit()

                    AMS.cleanUpApplicationRecordLocked
                    AMS.updateOomAdjLocked

We can see that once handleStopActivity is called, it calls back into some familiar methods:

Activity.onSaveInstanceState
Activity.onStop

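To summarize:

When SP is written asynchronously with apply, every apply() call creates a commit. Whenever a system message arrives (handleStopActivity, handlePauseActivity), the app checks whether the submitted apply writes have completed, and if they have not, the main thread is blocked.

Author: ironman_
Link: https://www.jianshu.com/p/63ee8587de3f
Source: 简书 (Jianshu)
Copyright on Jianshu belongs to the author. For any form of reproduction, please contact the author for authorization and cite the source.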
