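Symptom:
On the in-vehicle head unit, plugging and unplugging a USB drive that contains music occasionally makes the system screen go black for a moment.
The logs show that the vold process unmounted two volumes: once for the USB drive, and once for /storage/emulated/0: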
I vold : Start killProcesses: /mnt/media_rw/050F-4BB4
I vold : Start killProcesses: /storage/emulated/0
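Analysis:
/storage/emulated/0 is what /sdcard maps to, so in theory it should never be unmounted while the system is running normally. Looking further up in the log, there is this line: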
StorageUserConnection: Service: [ComponentInfo{com.android.providers.media.module/com.android.providers.media.fuse.ExternalStorageServiceImpl}] disconnected. User [0]
This log line means that StorageManagerService noticed that the com.android.providers.media.module process was gone. The corresponding code is:
// StorageUserConnection.java
public void onServiceDisconnected(ComponentName name) {
    // Service crashed or process was killed, #onServiceConnected will be called
    // Don't need to re-bind.
    Slog.i(TAG, "Service: [" + name + "] disconnected. User [" + mUserId + "]");
    handleDisconnection();
}

private void handleDisconnection() {
    // Clear all sessions because we will need a new device fd since
    // StorageManagerService will reset the device mount state and #startSession
    // will be called for any required mounts.
    // Notify StorageManagerService so it can restart all necessary sessions
    close();
    resetUserSessions();
}
Tracing resetUserSessions gives the following call chain:
StorageUserConnection::resetUserSessions
StorageManagerService::resetUser
--------------- binder call -------------------
StorageManagerService::resetIfBootedAndConnected
StorageSessionController::onReset
IVold::unmount
--------------- binder call -------------------
VolumeBase::unmount
EmulatedVolume::doUnmount
KillProcessesUsingPath
KillProcessesWithOpenFiles
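In other words, the com.android.providers.media.module process was killed, and StorageManagerService (a Binder service inside SystemServer) noticed this through the mServiceConnection callback it had passed to bindService. That is what led to /storage/emulated/0 being unmounted.
Going further up in the log to find out why com.android.providers.media.module was killed, these lines show up: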
I vold : Start KillProcessesUsingPath: public:8,1 /mnt/media_rw/050F-4BB4
I vold : Start killProcesses: /mnt/media_rw/050F-4BB4
W vold : Found symlink /proc/2487/fd/93 referencing /mnt/media_rw/050F-4BB4
W vold : Sending Interrupt to pid 2487 (rs.media.module, /system/bin/app_process64)
So when the USB drive was pulled out, vold reacted: it detected that the com.android.providers.media.module process (pid 2487) had a file open on the USB drive volume, and killed it.
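Something is off here. com.android.providers.media.module is an important process in the Android system, and it should not be restarted just because a USB drive was unplugged. Reading the code of KillProcessesWithOpenFiles, there is a check: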
int KillProcessesWithOpenFiles(const std::string& prefix, int signal, bool killFuseDaemon) {
    ...
    if (found) {
        if (!IsFuseDaemon(pid) || killFuseDaemon) {  // checks whether the process is the FUSE daemon
            pids.insert(pid);
        } else {
            LOG(WARNING) << "Found FUSE daemon with open file. Skipping...";
        }
    }
    ...
}

// TODO: Use a better way to determine if it's media provider app.
bool IsFuseDaemon(const pid_t pid) {
    auto path = StringPrintf("/proc/%d/mounts", pid);
    char* tmp;
    if (lgetfilecon(path.c_str(), &tmp) < 0) {  // could this check be unreliable?
        return false;
    }
    bool result = android::base::StartsWith(tmp, kMediaProviderAppCtx)
            || android::base::StartsWith(tmp, kMediaProviderCtx);
    freecon(tmp);
    return result;
}
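Because the problem is intermittent, the current suspicion is that the lgetfilecon call inside IsFuseDaemon is not reliable and sometimes fails to identify the MediaProvider process. This is still a hypothesis and has not been confirmed.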
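To make the consequence of such a misidentification explicit, the condition `!IsFuseDaemon(pid) || killFuseDaemon` quoted above can be written out as a small decision function. This is only an illustration: ShouldKill is a hypothetical name, and the value of killFuseDaemon on this call path is not shown in the logs, so false is an assumption.

```cpp
// Hypothetical restatement of the condition in KillProcessesWithOpenFiles.
// A process with an open file on the volume is added to the kill list unless
// it is recognized as the FUSE daemon and the caller did not ask to kill it.
bool ShouldKill(bool is_fuse_daemon, bool kill_fuse_daemon) {
    return !is_fuse_daemon || kill_fuse_daemon;
}
```

If IsFuseDaemon returns false for MediaProvider (for example because lgetfilecon failed), ShouldKill(false, false) is true and the process lands in the kill list. When it is recognized correctly, ShouldKill(true, false) is false and it is skipped with the "Found FUSE daemon with open file. Skipping..." warning.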
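The planned next step is to stop using lgetfilecon to decide whether a process is MediaProvider, and instead read the cmdline file under /proc/$pid and decide based on the process name.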