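System Upgrade Software Flow
This chapter walks through the Recovery system upgrade flow with reference to the source code. Technical difficulties and details that come up along the way are covered in separate articles, linked at the corresponding places in the text.
The figure below shows the whole software flow, from the app detecting an OTA package pushed by the server to the device booting into the new system version.
The article explains each module that appears in the figure in detail.
(Figure: software flow)
1. The app downloads the upgrade package and calls the RecoverySystem API
The business logic that checks for an OTA push and downloads the package from the server is implemented by each OEM, so the analysis below starts at the point where the upgrade is triggered.
After downloading the package, the app calls the installPackage method of the framework class RecoverySystem with the path of the downloaded package. The app's job ends there.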
// android.os.RecoverySystem
RecoverySystem.installPackage(Context context, File packageFile)
2. Framework RecoverySystem triggers the upgrade
Google AOSP RecoverySystem
Function installPackage():
// frameworks/base/core/java/android/os/RecoverySystem.java
public static void installPackage(Context context, File packageFile, boolean processed)
throws IOException {
synchronized (sRequestLock) {
/* 1. Build the recovery upgrade command in its fixed format */
LOG_FILE.delete();
// Must delete the file in case it was created by system server.
UNCRYPT_PACKAGE_FILE.delete();
String filename = packageFile.getCanonicalPath();
Log.w(TAG, "!!! REBOOTING TO INSTALL " + filename + " !!!");
// If the package name ends with "_s.zip", it's a security update.
boolean securityUpdate = filename.endsWith("_s.zip");
// If the package is stored on the data partition it needs special handling; the reason and mechanism are explained below.
// If the package is on the /data partition, the package needs to
// be processed (i.e. uncrypt'd). The caller specifies if that has
// been done in 'processed' parameter.
if (filename.startsWith("/data/")) {
// If the package has already been processed, just check that the output file exists.
if (processed) {
if (!BLOCK_MAP_FILE.exists()) {
Log.e(TAG, "Package claimed to have been processed but failed to find "
+ "the block map file.");
throw new IOException("Failed to find block map file");
}
} else {
// Package preprocessing is done by the uncryptd service. Its input is the file UNCRYPT_PACKAGE_FILE
// and its output is the file BLOCK_MAP_FILE; both are initialized here. uncryptd is described below.
FileWriter uncryptFile = new FileWriter(UNCRYPT_PACKAGE_FILE);
try {
uncryptFile.write(filename + "\n");
} finally {
uncryptFile.close();
}
// UNCRYPT_PACKAGE_FILE needs to be readable and writable
// by system server.
if (!UNCRYPT_PACKAGE_FILE.setReadable(true, false)
|| !UNCRYPT_PACKAGE_FILE.setWritable(true, false)) {
Log.e(TAG, "Error setting permission for " + UNCRYPT_PACKAGE_FILE);
}
BLOCK_MAP_FILE.delete();
}
// For a preprocessed package the argument becomes "@" + BLOCK_MAP_FILE (/cache/recovery/block.map).
// This is purely a convention; see the uncryptd discussion below for how it works.
// If the package is on the /data partition, use the block map
// file as the package name instead.
filename = "@/cache/recovery/block.map";
}
final String filenameArg = "--update_package=" + filename + "\n";
final String localeArg = "--locale=" + Locale.getDefault().toLanguageTag() + "\n";
final String securityArg = "--security\n";
String command = filenameArg + localeArg;
if (securityUpdate) {
command += securityArg;
}
/* 2. Write the upgrade command into the BCB (the head of the misc partition) through RECOVERY_SERVICE */
RecoverySystem rs = (RecoverySystem) context.getSystemService(
Context.RECOVERY_SERVICE);
if (!rs.setupBcb(command)) {
throw new IOException("Setup BCB failed");
}
/* 3. Trigger a reboot through POWER_SERVICE */
// Having set up the BCB (bootloader control block), go ahead and reboot
PowerManager pm = (PowerManager) context.getSystemService(Context.POWER_SERVICE);
String reason = PowerManager.REBOOT_RECOVERY_UPDATE;
// On TV, reboot quiescently if the screen is off
if (context.getPackageManager().hasSystemFeature(PackageManager.FEATURE_LEANBACK)) {
WindowManager wm = (WindowManager) context.getSystemService(Context.WINDOW_SERVICE);
if (wm.getDefaultDisplay().getState() != Display.STATE_ON) {
reason += ",quiescent";
}
}
// Enter the shutdown/reboot flow
pm.reboot(reason);
throw new IOException("Reboot failed (no permissions?)");
}
}
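installPackage does three things:
2.1 Building the Recovery upgrade command
The absolute path of the package tells installPackage whether the package lives on the data partition or on another storage medium (USB drive, TF card, etc.), and that decides whether the package needs preprocessing.
If preprocessing is needed, the value of the update_package argument is replaced with the fixed value "@/cache/recovery/block.map". The update_package, locale, and security arguments are then formatted into an upgrade command string with a fixed layout.
1) Why does uncryptd preprocess the package?
The Android data partition is encrypted (FDE/FBE), and AOSP recovery does not implement partition decryption. Recovery therefore cannot access data on the data partition and cannot load the package directly from its file system. (In recovery mode, an FDE device cannot mount the data partition at all, and on an FBE device the file contents on the data partition read as garbage.)
- So the package data is decrypted before entering recovery, and the location of the package on the storage medium is written to the fixed file /cache/recovery/block.map. Once in recovery, the package location is parsed from block.map and the package data can be loaded.
- If the package is stored on an unencrypted TF card or USB drive, no extra processing is needed, and recovery can load the data directly from that device's file system.
2) How does uncryptd preprocess the package?
Input: the file UNCRYPT_PACKAGE_FILE.
Output: the result is written to the file BLOCK_MAP_FILE (/cache/recovery/block.map).
At this point only the input and output files are prepared. The preprocessing itself runs when the device reboots in step 2.3, described below.
UNCRYPT_PACKAGE_FILE: the actual path of the package in the file system;
BLOCK_MAP_FILE: how the package data is distributed across blocks on the storage device (mainly the block numbers);
3) Each argument in the upgrade command string has the form --key=value or --key, and the arguments are separated by newlines.
4) What the processed parameter is for
uncryptd preprocesses (decrypts) the package at shutdown, and a large package makes shutdown slow. The package can therefore be preprocessed in advance and installPackage called with processed set to true, in which case uncryptd is not started at shutdown and shutdown speed is unaffected.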
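The command-building rules in 2.1 can be condensed into a small standalone sketch. It is written in C++ only to keep it self-contained; BuildRecoveryCommand and its two string helpers are hypothetical names, not AOSP APIs, and the locale tag is passed in rather than read from the system.

```cpp
#include <string>

// Hypothetical helpers for illustration only (not part of AOSP).
static bool StartsWith(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

static bool EndsWith(const std::string& s, const std::string& suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Mirrors the rules installPackage applies when it formats the command string.
std::string BuildRecoveryCommand(const std::string& package_path,
                                 const std::string& locale_tag) {
  // The security flag is derived from the original file name, before the path is rewritten.
  const bool security_update = EndsWith(package_path, "_s.zip");

  // Packages on /data are referred to through the block map instead of their real path.
  std::string filename = package_path;
  if (StartsWith(filename, "/data/")) filename = "@/cache/recovery/block.map";

  // One "--key=value" or "--key" argument per line.
  std::string command = "--update_package=" + filename + "\n";
  command += "--locale=" + locale_tag + "\n";
  if (security_update) command += "--security\n";
  return command;
}
```

A package under /data therefore always yields the same update_package value, while a package on external storage keeps its own path.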
2.2 Writing the upgrade command into the BCB through RECOVERY_SERVICE
The formatted command string is written into the BCB (bootloader control block) region at the head of the misc partition.
RecoverySystem:
// frameworks/base/core/java/android/os/RecoverySystem.java
// 1. RecoverySystem calls setupBcb
public static void installPackage(Context context, File packageFile, boolean processed) {
...
RecoverySystem rs = (RecoverySystem) context.getSystemService(Context.RECOVERY_SERVICE);
rs.setupBcb(command);
...
}
// 2. Calls the setupBcb interface of RecoverySystemService
private boolean setupBcb(String command) {
return mService.setupBcb(command);
}
RecoverySystemService:
// frameworks/base/services/core/java/com/android/server/recoverysystem/RecoverySystemService.java
// 1. Calls setupOrClearBcb to write the upgrade command string into the BCB
public boolean setupBcb(String command) {
if (DEBUG) Slog.d(TAG, "setupBcb: [" + command + "]");
return setupOrClearBcb(true, command);
}
// 2. setupOrClearBcb actually starts the native service setup-bcb and writes the command string into the BCB through it
private boolean setupOrClearBcb(boolean isSetup, String command) {
// 2.1 Check whether the uncrypt/setup-bcb/clear-bcb service is already running.
// If one is still running after waiting, an earlier request is in progress, so abort this operation.
final boolean available = checkAndWaitForUncryptService();
if (!available) {
Slog.e(TAG, "uncrypt service is unavailable.");
return false;
}
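// 2.2 isSetup decides whether arguments are written to or erased from the BCB, and which service is started.
// (uncrypt/setup-bcb/clear-bcb are all the same binary;
// they just run different tasks depending on the arguments passed in. See below.)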
if (isSetup) {
mInjector.systemPropertiesSet("ctl.start", "setup-bcb");
} else {
mInjector.systemPropertiesSet("ctl.start", "clear-bcb");
}
// 2.3 After starting setup-bcb or clear-bcb, talk to it over a socket
// Connect to the uncrypt service socket.
UncryptSocket socket = mInjector.connectService();
if (socket == null) {
Slog.e(TAG, "Failed to connect to uncrypt socket");
return false;
}
try {
// When writing the BCB, send the upgrade command to the setup-bcb service over the socket
// Send the BCB commands if it's to setup BCB.
if (isSetup) {
socket.sendCommand(command);
}
// Read the result of setup-bcb/clear-bcb from the socket
// Read the status from the socket.
int status = socket.getPercentageUncrypted();
// Ack receipt of the status code. uncrypt waits for the ack so
// the socket won't be destroyed before we receive the code.
socket.sendAck();
// Success value defined by setup-bcb/clear-bcb: 100 is returned on success
// (100 is simply the success code defined by the service; it is not a measurement).
if (status == 100) {
Slog.i(TAG, "uncrypt " + (isSetup ? "setup" : "clear")
+ " bcb successfully finished.");
} else {
// Error in /system/bin/uncrypt.
Slog.e(TAG, "uncrypt failed with status: " + status);
}
}
}
How the native setup-bcb/clear-bcb services write and erase BCB data is not covered here; see the article: (to be continued)
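Although the native side is left for another article, the end result of 2.2 can be sketched from what the rest of this article relies on: the bootloader looks for "boot-recovery" in the command field, and recovery expects one option per line after a leading "recovery" line. The following is a minimal sketch under those assumptions. BcbSketch and FillBcbForRecovery are hypothetical names, and the field sizes simply follow the RecoveryMessage struct quoted in section 3; the real layout on a given device may differ.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Simplified in-memory BCB for illustration; not the real AOSP struct.
struct BcbSketch {
  char command[32];
  char status[32];
  char recovery[1024];
};

// Fills the BCB so the bootloader sees "boot-recovery" and recovery later finds
// one option per line after the leading "recovery" line.
// Returns false if the options do not fit into the recovery field.
bool FillBcbForRecovery(const std::vector<std::string>& options, BcbSketch* bcb) {
  std::string recovery = "recovery\n";
  for (const std::string& opt : options) {
    recovery += opt;
    if (opt.empty() || opt.back() != '\n') recovery += '\n';
  }
  // Keep at least one byte for the terminating '\0'.
  if (recovery.size() >= sizeof(bcb->recovery)) return false;

  std::memset(bcb, 0, sizeof(*bcb));
  std::strncpy(bcb->command, "boot-recovery", sizeof(bcb->command) - 1);
  std::memcpy(bcb->recovery, recovery.data(), recovery.size());
  return true;
}
```

Writing this struct to the head of the misc partition is the part that the setup-bcb service performs on the device.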
2.3 Triggering a device reboot through POWER_SERVICE
The key point of this step is starting uncryptd to preprocess the package.
RecoverySystem:
// frameworks/base/core/java/android/os/RecoverySystem.java
public static void installPackage(Context context, File packageFile, boolean processed) {
...
PowerManager pm = (PowerManager) context.getSystemService(Context.POWER_SERVICE);
String reason = PowerManager.REBOOT_RECOVERY_UPDATE;
pm.reboot(reason);
}
PowerManager:
// frameworks/base/core/java/android/os/PowerManager.java
public void reboot(@Nullable String reason) {
mService.reboot(false, reason, true);
}
PowerManagerService:
// frameworks/base/services/core/java/com/android/server/power/PowerManagerService.java
public void reboot(boolean confirm, @Nullable String reason, boolean wait) {
shutdownOrRebootInternal(HALT_MODE_REBOOT, confirm, reason, wait);
}
private void shutdownOrRebootInternal(final @HaltMode int haltMode, final boolean confirm,
@Nullable final String reason, boolean wait) {
...
// Start the shutdown thread ShutdownThread
Runnable runnable = new Runnable() {
@Override
public void run() {
if (haltMode == HALT_MODE_REBOOT) {
ShutdownThread.reboot(getUiContext(), reason, confirm);
}
}
};
// ShutdownThread must run on a looper capable of displaying the UI.
Message msg = Message.obtain(UiThread.getHandler(), runnable);
msg.setAsynchronous(true);
UiThread.getHandler().sendMessage(msg);
// PowerManager.reboot() is documented not to return so just wait for the inevitable.
if (wait) {
while (true) {
runnable.wait();
}
}
}
ShutdownThread:
// frameworks/base/services/core/java/com/android/server/power/ShutdownThread.java
public final class ShutdownThread extends Thread {
private static final ShutdownThread sInstance = new ShutdownThread();
// 1. Reboot the device
public static void reboot(final Context context, String reason, boolean confirm) {
mReboot = true;
mRebootSafeMode = false;
mRebootHasProgressBar = false;
mReason = reason;
shutdownInner(context, confirm);
}
// 2. Show the shutdown progress dialog (uncryptd can take a while to process the package)
private static void shutdownInner(final Context context, boolean confirm) {
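beginShutdownSequence(context);
}
// beginShutdownSequence (abridged): show the dialog, then start the shutdown thread
private static void beginShutdownSequence(Context context) {
sInstance.mProgressDialog = showShutdownDialog(context);
sInstance.start();
}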
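// 3. As described above, if UNCRYPT_PACKAGE_FILE exists and BLOCK_MAP_FILE does not, uncryptd
// still has to preprocess the package. The reboot is then flagged as needing a progress dialog, and
// the same flag, mRebootHasProgressBar, later decides whether uncryptd is started.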
private static ProgressDialog showShutdownDialog(Context context) {
// mReason could be "recovery-update" or "recovery-update,quiescent".
if (mReason != null && mReason.startsWith(PowerManager.REBOOT_RECOVERY_UPDATE)) {
// We need the progress bar if uncrypt will be invoked during the
// reboot, which might be time-consuming.
mRebootHasProgressBar = RecoverySystem.UNCRYPT_PACKAGE_FILE.exists()
&& !(RecoverySystem.BLOCK_MAP_FILE.exists());
}
...
}
// 4. The work done by the ShutdownThread thread
/**
* Makes sure we handle the shutdown gracefully.
* Shuts off power regardless of radio state if the allotted time has passed.
*/
public void run() {
// Record the reason for this reboot
{
String reason = (mReboot ? "1" : "0") + (mReason != null ? mReason : "");
SystemProperties.set(SHUTDOWN_ACTION_PROPERTY, reason);
}
// Enter uncryptd here and start preprocessing the package
if (mRebootHasProgressBar) {
sInstance.setRebootProgress(MOUNT_SERVICE_STOP_PERCENT, null);
// If it's to reboot to install an update and uncrypt hasn't been
// done yet, trigger it now.
uncrypt();
}
// Finally, shut down or reboot
rebootOrShutdown(mContext, mReboot, mReason);
}
// 5. Perform the low-level reboot (falling back to shutdown if it fails)
public static void rebootOrShutdown(final Context context, boolean reboot, String reason) {
if (reboot) {
Log.i(TAG, "Rebooting, reason: " + reason);
PowerManagerService.lowLevelReboot(reason);
Log.e(TAG, "Reboot failed, will attempt shutdown instead");
reason = null;
}
...
}
}
Function uncrypt():
Starts uncryptd through RecoverySystem to preprocess the package, listens for progress, and updates the progress bar shown in the dialog.
// frameworks/base/services/core/java/com/android/server/power/ShutdownThread.java
private void uncrypt() {
Log.i(TAG, "Calling uncrypt and monitoring the progress...");
// Define the listener for uncryptd progress; it updates the shutdown progress bar
final RecoverySystem.ProgressListener progressListener =
new RecoverySystem.ProgressListener() {
@Override
public void onProgress(int status) {
if (status >= 0 && status < 100) {
// Scale down to [MOUNT_SERVICE_STOP_PERCENT, 100).
status = (int)(status * (100.0 - MOUNT_SERVICE_STOP_PERCENT) / 100);
status += MOUNT_SERVICE_STOP_PERCENT;
CharSequence msg = mContext.getText(
com.android.internal.R.string.reboot_to_update_package);
sInstance.setRebootProgress(status, msg);
} else if (status == 100) {
CharSequence msg = mContext.getText(
com.android.internal.R.string.reboot_to_update_reboot);
sInstance.setRebootProgress(status, msg);
} else {
// Ignored
}
}
};
// Start uncryptd through RecoverySystem's processPackage interface
final boolean[] done = new boolean[1];
done[0] = false;
Thread t = new Thread() {
@Override
public void run() {
RecoverySystem rs = (RecoverySystem) mContext.getSystemService(
Context.RECOVERY_SERVICE);
String filename = null;
try {
filename = FileUtils.readTextFile(RecoverySystem.UNCRYPT_PACKAGE_FILE, 0, null);
// Pass the UNCRYPT_PACKAGE_FILE prepared by RecoverySystem.installPackage and the progress
// listener to processPackage. uncryptd uses the contents of UNCRYPT_PACKAGE_FILE as its
// input and reports progress back through progressListener.
rs.processPackage(mContext, new File(filename), progressListener);
} catch (IOException e) {
Log.e(TAG, "Error uncrypting file", e);
}
done[0] = true;
}
};
t.start();
try {
t.join(MAX_UNCRYPT_WAIT_TIME);
} catch (InterruptedException unused) {
}
if (!done[0]) {
Log.w(TAG, "Timed out waiting for uncrypt.");
final int uncryptTimeoutError = 100;
String timeoutMessage = String.format("uncrypt_time: %d\n" + "uncrypt_error: %d\n",
MAX_UNCRYPT_WAIT_TIME / 1000, uncryptTimeoutError);
try {
FileUtils.stringToFile(RecoverySystem.UNCRYPT_STATUS_FILE, timeoutMessage);
} catch (IOException e) {
Log.e(TAG, "Failed to write timeout message to uncrypt status", e);
}
}
}
How uncryptd preprocesses the package is not covered here; see the article: (to be continued)
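The progress arithmetic in the onProgress listener is easy to misread, so here is a standalone sketch of it. The value 20 for the base percentage is an assumption made for the example (the real constant MOUNT_SERVICE_STOP_PERCENT is defined in ShutdownThread.java), and ScaleUncryptProgress is a hypothetical name.

```cpp
// Assumed value for illustration; check ShutdownThread.java for the real constant.
constexpr int kMountServiceStopPercent = 20;

// Maps an uncrypt progress value in [0, 100) onto [kMountServiceStopPercent, 100),
// so the bar never moves backwards once the earlier shutdown stages are done.
// Returns 100 unchanged, and -1 for values the listener ignores.
int ScaleUncryptProgress(int status) {
  if (status >= 0 && status < 100) {
    int scaled = static_cast<int>(status * (100.0 - kMountServiceStopPercent) / 100);
    return scaled + kMountServiceStopPercent;
  }
  if (status == 100) return 100;
  return -1;
}
```

With a base of 20, an uncrypt progress of 50 is shown as 60 on the shutdown dialog.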
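3. The bootloader reads the BCB and boots into the Recovery system
The bootloader stage is not implemented by AOSP; the code comes from the chip vendor. Only the Qualcomm bootloader flow during an upgrade is outlined here. Other platforms (MTK, Samsung Exynos) differ in implementation, but the flow is essentially the same.
Function LinuxLoaderEntry (…):
The bootloader's entry point for booting the kernel.
// Entry point where the bootloader loads the Linux kernel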
LinuxLoaderEntry (IN EFI_HANDLE ImageHandle, IN EFI_SYSTEM_TABLE *SystemTable) {
// 1. Determine the boot mode from the boot reason.
// An ordinary "adb reboot recovery" is resolved to recovery mode here,
// whereas an upgrade is resolved by step 2 below.
Status = GetRebootReason (&BootReason);
// 2. Read the BCB from the misc partition to decide whether to enter Recovery
Status = RecoveryInit (&BootIntoRecovery);
if (!BootIntoFastboot) {
BootInfo Info = {0};
// 3. Set the boot parameters.
// If BootIntoRecovery is true, boot into the Recovery system;
// otherwise boot into the Main system.
Info.MultiSlotBoot = MultiSlotBoot;
Info.BootIntoRecovery = BootIntoRecovery;
Info.BootReasonAlarm = BootReasonAlarm;
// 4. Verify the partition image signature
Status = LoadImageAndAuth (&Info);
// 5. Load the kernel from storage into memory and jump to it
BootLinux (&Info);
}
}
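Function RecoveryInit (…):
Purpose: decide from the BCB contents in the misc partition whether to boot into recovery mode.
Implementation: RecoveryInit fills a RecoveryMessage struct directly with the raw data at the head of the misc partition (RecoveryMessage is the in-memory representation of the BCB as stored on the device). It then checks whether the command field starts with the string "boot-recovery" to decide between the recovery system and the main system. (As shown earlier, the BCB data at the head of the misc partition is written by the framework class RecoverySystem through the setup-bcb service.)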
#define RECOVERY_BOOT_RECOVERY "boot-recovery"
/* Recovery Message */
struct RecoveryMessage {
CHAR8 command[32];
CHAR8 status[32];
CHAR8 recovery[1024];
};
EFI_STATUS
RecoveryInit (BOOLEAN *BootIntoRecovery)
{
EFI_STATUS Status;
struct RecoveryMessage *Msg = NULL;
EFI_GUID Ptype = gEfiMiscPartitionGuid;
MemCardType CardType = UNKNOWN;
VOID *PartitionData = NULL;
UINT32 PageSize;
CardType = CheckRootDeviceType ();
if (CardType == NAND) {
Status = GetNandMiscPartiGuid (&Ptype);
if (Status != EFI_SUCCESS) {
return Status;
}
}
GetPageSize (&PageSize);
/* Get the first 2 pages of the misc partition.
* If the device type is NAND then read the recovery message from page 1,
* Else read from the page 0
*/
Status = ReadFromPartition (&Ptype, (VOID **)&PartitionData, (PageSize * 2));
if (Status != EFI_SUCCESS) {
DEBUG ((EFI_D_ERROR, "Error Reading from misc partition: %r\n", Status));
return Status;
}
if (!PartitionData) {
DEBUG ((EFI_D_ERROR, "Error in loading Data from misc partition\n"));
return EFI_INVALID_PARAMETER;
}
Msg = (CardType == NAND) ?
(struct RecoveryMessage *) ((CHAR8 *) PartitionData + PageSize) :
(struct RecoveryMessage *) PartitionData;
// Ensure NULL termination
Msg->command[sizeof (Msg->command) - 1] = '\0';
if (Msg->command[0] != 0 && Msg->command[0] != 255)
DEBUG ((EFI_D_VERBOSE, "Recovery command: %d %a\n", sizeof (Msg->command),
Msg->command));
if (!AsciiStrnCmp (Msg->command, RECOVERY_BOOT_RECOVERY,
AsciiStrLen (RECOVERY_BOOT_RECOVERY))) {
*BootIntoRecovery = TRUE;
}
FreePool (PartitionData);
PartitionData = NULL;
Msg = NULL;
return Status;
}
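Function BootLinux (…):
Loads the ramdisk and kernel, stored in different partitions on disk, into fixed memory regions, sets the cmdline passed to the kernel, and finally jumps to the kernel through a function pointer to the kernel's start address in memory. From there the boot flow enters the kernel stage.
As described in the software architecture article, the kernel and ramdisk of the recovery system and the main system are loaded into memory from different partitions: the recovery system's from the recovery partition, the main system's from the boot partition. The difference lies in the ramdisk (directory layout, configuration files, executables, and so on). The kernel itself is identical; only its runtime flow differs because the cmdline differs.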
EFI_STATUS
BootLinux (BootInfo *Info) {
....
LinuxKernel = (LINUX_KERNEL) (UINT64)BootParamlistPtr.KernelLoadAddr;
LinuxKernel ((UINT64)BootParamlistPtr.DeviceTreeLoadAddr, 0, 0, 0);
}
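4. The kernel loads the ramdisk, starts init, and launches the recovery process
(to be continued)
5. Entering the Recovery upgrade flow
Starting with Android Q, Google added fastbootd to recovery mode for flashing partitions such as system and vendor on devices that use dynamic partitions. The main function therefore calls StartFastboot or start_recovery to enter one of the two sub-modes.
Function main():
Uses the arguments to choose between userspace fastboot mode (StartFastboot) and recovery mode (start_recovery). After fastboot/recovery mode exits, the return value decides whether to reboot or shut down.
fastbootd:
A service that opens a USB port in user space and implements the bootloader fastboot data transfer protocol. In this mode fastboot.exe can be used to flash partition images onto the device. It is not covered in detail here.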
// bootable/recovery/recovery_main.cpp
int main(int argc, char** argv) {
// Initialize logging
// We don't have logcat yet under recovery; so we'll print error on screen and log to stdout
// (which is redirected to recovery.log) as we used to do.
android::base::InitLogging(argv, &UiLogger);
// Redirect the program's standard output to the temporary log file /tmp/recovery.log
// redirect_stdio should be called only in non-sideload mode. Otherwise we may have two logger
// instances with different timestamps.
redirect_stdio(Paths::Get().temporary_log_file().c_str());
// Load partition information from fstab
load_volume_table();
// Read the upgrade command stored in the BCB of the misc partition and save it in the vector args
std::vector<std::string> args = get_args(argc, argv, &stage);
while (true) {
// We start adbd in recovery for the device with userdebug build or a unlocked bootloader.
std::string usb_config =
fastboot ? "fastboot" : IsRoDebuggable() || IsDeviceUnlocked() ? "adb" : "none";
std::string usb_state = android::base::GetProperty("sys.usb.state", "none");
if (usb_config != usb_state) {
if (!SetUsbConfig("none")) {
LOG(ERROR) << "Failed to clear USB config";
}
if (!SetUsbConfig(usb_config)) {
LOG(ERROR) << "Failed to set USB config to " << usb_config;
}
}
// The arguments in args identify recovery mode, so start_recovery is entered and given the
// upgrade command vector read from the BCB of the misc partition.
auto ret = fastboot ? StartFastboot(device, args) : start_recovery(device, args);
// Upgrade finished: shut down, reboot, etc.
switch (ret) {
case Device::REBOOT:
ui->Print("Rebooting...\n");
Reboot("userrequested,recovery");
break;
}
}
// Should be unreachable.
return EXIT_SUCCESS;
}
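5.1 Reading the upgrade command from the BCB of the misc partition
Function get_args():
As the function comment says, the upgrade command can come from three sources. They are tried in order, and the later ones are consulted only if no arguments have been found yet.
get_args looks for the upgrade command in these three places, in order:
- the process command-line arguments
- the BCB of the misc partition
- COMMAND_FILE (/cache/recovery/command)
In the upgrade flow the command actually always comes from item 2, the BCB of the misc partition.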
// bootable/recovery/recovery_main.cpp
// Parses the command line argument from various sources; and reads the stage field from BCB.
// command line args come from, in decreasing precedence:
// - the actual command line
// - the bootloader control block (one per line, after "recovery")
// - the contents of COMMAND_FILE (one per line)
static std::vector<std::string> get_args(const int argc, char** const argv, std::string* stage) {
CHECK_GT(argc, 0);
bootloader_message boot = {};
std::string err;
// 1. Fill the bootloader_message struct boot with the BCB data at the head of the misc partition
if (!read_bootloader_message(&boot, &err)) {
LOG(ERROR) << err;
// If fails, leave a zeroed bootloader_message.
boot = {};
}
if (stage) {
*stage = std::string(boot.stage);
}
std::string boot_command;
if (boot.command[0] != 0) {
if (memchr(boot.command, '\0', sizeof(boot.command))) {
boot_command = std::string(boot.command);
} else {
boot_command = std::string(boot.command, sizeof(boot.command));
}
LOG(INFO) << "Boot command: " << boot_command;
}
if (boot.status[0] != 0) {
std::string boot_status = std::string(boot.status, sizeof(boot.status));
LOG(INFO) << "Boot status: " << boot_status;
}
// 2. Use the process command-line arguments as the default upgrade arguments (usually empty)
std::vector<std::string> args(argv, argv + argc);
// 3. If there are no command-line arguments, take the upgrade command from the "recovery" field of the misc BCB
// --- if arguments weren't supplied, look in the bootloader control block
if (args.size() == 1) {
boot.recovery[sizeof(boot.recovery) - 1] = '\0'; // Ensure termination
std::string boot_recovery(boot.recovery);
std::vector<std::string> tokens = android::base::Split(boot_recovery, "\n");
if (!tokens.empty() && tokens[0] == "recovery") {
for (auto it = tokens.begin() + 1; it != tokens.end(); it++) {
// Skip empty and '\0'-filled tokens.
if (!it->empty() && (*it)[0] != '\0') args.push_back(std::move(*it));
}
LOG(INFO) << "Got " << args.size() << " arguments from boot message";
} else if (boot.recovery[0] != 0) {
LOG(ERROR) << "Bad boot message: \"" << boot_recovery << "\"";
}
}
// 4. If nothing was found above, read the arguments from COMMAND_FILE
// --- if that doesn't work, try the command file (if we have /cache).
if (args.size() == 1 && HasCache()) {
std::string content;
if (ensure_path_mounted(COMMAND_FILE) == 0 &&
android::base::ReadFileToString(COMMAND_FILE, &content)) {
std::vector<std::string> tokens = android::base::Split(content, "\n");
// All the arguments in COMMAND_FILE are needed (unlike the BCB message,
// COMMAND_FILE doesn't use filename as the first argument).
for (auto it = tokens.begin(); it != tokens.end(); it++) {
// Skip empty and '\0'-filled tokens.
if (!it->empty() && (*it)[0] != '\0') args.push_back(std::move(*it));
}
LOG(INFO) << "Got " << args.size() << " arguments from " << COMMAND_FILE;
}
}
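// 5. Write the arguments back to the BCB of the misc partition. This is designed for commands that came
// from the command line or COMMAND_FILE: the command stays in the misc partition until recovery exits
// normally, so after an interruption the device can resume and finish the command automatically, until
// it completes and the misc partition is erased. *** This makes the upgrade more reliable. ***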
// Write the arguments (excluding the filename in args[0]) back into the
// bootloader control block. So the device will always boot into recovery to
// finish the pending work, until FinishRecovery() is called.
std::vector<std::string> options(args.cbegin() + 1, args.cend());
if (!update_bootloader_message(options, &err)) {
LOG(ERROR) << "Failed to set BCB message: " << err;
}
// Finally, if no arguments were specified, check whether we should boot
// into fastboot or rescue mode.
if (args.size() == 1 && boot_command == "boot-fastboot") {
args.emplace_back("--fastboot");
} else if (args.size() == 1 && boot_command == "boot-rescue") {
args.emplace_back("--rescue");
}
return args;
}
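Step 3 of get_args can be tried out in isolation with the sketch below. It replaces android::base::Split with standard-library calls so it builds outside the Android tree; ParseBcbRecoveryField is a hypothetical name and only covers the "recovery" field, not the command-line or COMMAND_FILE sources.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns the options stored in the BCB "recovery" field, or an empty vector
// if the field does not start with a "recovery" line.
std::vector<std::string> ParseBcbRecoveryField(const std::string& field) {
  std::vector<std::string> options;
  std::istringstream in(field);
  std::string line;
  // The first line must be the literal marker "recovery".
  if (!std::getline(in, line) || line != "recovery") return options;
  while (std::getline(in, line)) {
    // Skip empty and '\0'-filled tokens, as get_args does.
    if (!line.empty() && line[0] != '\0') options.push_back(line);
  }
  return options;
}
```

Fed the string that installPackage produced, prefixed with the "recovery" line, it yields the --update_package and --locale options that start_recovery parses next.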
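5.2 mmap the upgrade package into memory
The flow enters start_recovery(), which then calls mmap to map the package data from storage into the process address space. See the source walkthrough below.
Function start_recovery():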
Device::BuiltinAction start_recovery(Device* device, const std::vector<std::string>& args) {
// 1. Get the package path from the "update_package" argument
static constexpr struct option OPTIONS[] = {
{ "update_package", required_argument, nullptr, 0 },
};
const char* update_package = nullptr;
auto args_to_parse = StringVectorToNullTerminatedArray(args);
// Parse everything before the last element (which must be a nullptr). getopt_long(3) expects a
// null-terminated char* array, but without counting null as an arg (i.e. argv[argc] should be
// nullptr).
while ((arg = getopt_long(args_to_parse.size() - 1, args_to_parse.data(), "", OPTIONS,
&option_index)) != -1) {
switch (arg) {
...
case 0: {
std::string option = OPTIONS[option_index].name;
if (option == "install_with_fuse") {
...
} else if (option == "update_package") {
update_package = optarg;
}
}
}
}
InstallResult status = INSTALL_SUCCESS;
// next_action indicates the next target to reboot into upon finishing the install. It could be
// overridden to a different reboot target per user request.
Device::BuiltinAction next_action = shutdown_after ? Device::SHUTDOWN : Device::REBOOT;
if (update_package != nullptr) {
// It's not entirely true that we will modify the flash. But we want
// to log the update attempt since update_package is non-NULL.
save_current_log = true;
if (int required_battery_level; retry_count == 0 && !IsBatteryOk(&required_battery_level)) {
ui->Print("battery capacity is not enough for installing package: %d%% needed\n",
required_battery_level);
// Log the error code to last_install when installation skips due to low battery.
log_failure_code(kLowBattery, update_package);
status = INSTALL_SKIPPED;
} else if (retry_count == 0 && bootreason_in_blacklist()) {
// Skip update-on-reboot when bootreason is kernel_panic or similar
ui->Print("bootreason is in the blacklist; skip OTA installation\n");
log_failure_code(kBootreasonInBlacklist, update_package);
status = INSTALL_SKIPPED;
} else {
// retry_count records whether the device rebooted during the upgrade
// It's a fresh update. Initialize the retry_count in the BCB to 1; therefore we can later
// identify the interrupted update due to unexpected reboots.
if (retry_count == 0) {
set_retry_bootloader_message(retry_count + 1, args);
}
if (update_package[0] == '@') {
ensure_path_mounted(update_package + 1);
} else {
ensure_path_mounted(update_package);
}
// 2. As the name CreateMemoryPackage suggests, mmap the package into memory and manage
// the mapped package through the memory_package object.
if (install_with_fuse) {
...
} else if (auto memory_package = Package::CreateMemoryPackage(
update_package,
std::bind(&RecoveryUI::SetProgress, ui, std::placeholders::_1));
memory_package != nullptr) {
// 3. InstallPackage: as the name suggests, start installing the package
status = InstallPackage(memory_package.get(), update_package, should_wipe_cache,
retry_count, ui);
} else {
...
}
if (status != INSTALL_SUCCESS) {
ui->Print("Installation aborted.\n");
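// 4. An I/O error during the upgrade can sometimes stop it from making progress. Such errors
// usually do not recur when the data is written again after a reboot, so Google designed a mechanism
// for interrupting and resuming an upgrade. Here, on an I/O or similar error, the device reboots and retries.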
// When I/O error or bspatch/imgpatch error happens, reboot and retry installation
// RETRY_LIMIT times before we abandon this OTA update.
static constexpr int RETRY_LIMIT = 4;
if (status == INSTALL_RETRY && retry_count < RETRY_LIMIT) {
copy_logs(save_current_log);
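// Increment retry_count. When the upgrade resumes after the reboot, this marker shows that
// this is a retry after a reboot, i.e. the resume mechanism has kicked in.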
retry_count += 1;
set_retry_bootloader_message(retry_count, args);
// Print retry count on screen.
ui->Print("Retry attempt %d\n", retry_count);
// Reboot back into recovery to retry the update.
Reboot("recovery");
}
}
}
}
...
}
Package::CreateMemoryPackage
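This method essentially calls mmap to map the package data into process memory. Recall, though, that the framework preprocessed (decrypted) a package stored on the data partition and that the package path handed to recovery is "@/cache/recovery/block.map".
This is a clever trick and gets its own article; see: (to be continued).
INSTALL_RETRY
This is a special upgrade-failure code. Google designed a mechanism for resuming an interrupted upgrade, so the upgrade can continue after interruptions (deliberate or not) such as a device reboot or the process being killed. Here, when a system I/O error occurs, the device is rebooted deliberately and the upgrade is retried. For the resume mechanism, see: (to be continued).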
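The retry decision in start_recovery boils down to one comparison, sketched below. The enum and NextRetryCount are hypothetical stand-ins for the real InstallResult type and the inline logic; only the limit of 4 is taken from the excerpt above.

```cpp
// Stand-in for the InstallResult values used in the excerpt.
enum InstallResultSketch { kInstallSuccess, kInstallError, kInstallCorrupt, kInstallRetry };

constexpr int kRetryLimit = 4;  // same value as RETRY_LIMIT in the excerpt

// Returns the retry_count to store in the BCB before rebooting back into recovery,
// or -1 if no retry should happen (success, a non-retryable error, or limit reached).
int NextRetryCount(InstallResultSketch status, int retry_count) {
  if (status == kInstallRetry && retry_count < kRetryLimit) return retry_count + 1;
  return -1;
}
```

Because a fresh update starts with retry_count set to 1 in the BCB, the counter also tells recovery after any reboot that it is resuming rather than starting over.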
5.3 Verifying package integrity and authenticity
InstallResult InstallPackage(Package* package, const std::string_view package_id,
bool should_wipe_cache, int retry_count, RecoveryUI* ui) {
...
bool updater_wipe_cache = false;
result = VerifyAndInstallPackage(package, &updater_wipe_cache, &log_buffer, retry_count,
&max_temperature, ui);
should_wipe_cache = should_wipe_cache || updater_wipe_cache;
...
}
static InstallResult VerifyAndInstallPackage(Package* package, bool* wipe_cache,
std::vector<std::string>* log_buffer, int retry_count,
int* max_temperature, RecoveryUI* ui) {
// Verify package.
if (!verify_package(package, ui)) {
log_buffer->push_back(android::base::StringPrintf("error: %d", kZipVerificationFailure));
return INSTALL_CORRUPT;
}
// Verify and install the contents of the package.
ui->Print("Installing update...\n");
if (retry_count > 0) {
ui->Print("Retry attempt: %d\n", retry_count);
}
ui->SetEnableReboot(false);
auto result = TryUpdateBinary(package, wipe_cache, log_buffer, retry_count, max_temperature, ui);
ui->SetEnableReboot(true);
ui->Print("\n");
return result;
}
bool verify_package(Package* package, RecoveryUI* ui) {
static constexpr const char* CERTIFICATE_ZIP_FILE = "/system/etc/security/otacerts.zip";
std::vector<Certificate> loaded_keys = LoadKeysFromZipfile(CERTIFICATE_ZIP_FILE);
if (loaded_keys.empty()) {
return false;
}
int err = verify_file(package, loaded_keys);
if (err != VERIFY_SUCCESS) {
return false;
}
return true;
}
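This checks that the package signature is valid; in essence it is an RSA signature verification of the package.
- First, the server signs the package with the private key and embeds the certificate at the end of the package;
- during verification, the certificate is extracted from the end of the package and the public key is taken from the certificate;
- the public key is then checked against the list of public keys stored on the device;
- finally, that public key is used to verify the signature.
For the technical details of package signature verification, see: (to be continued)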
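As a small taste of the "taken from the end of the package" step, the sketch below parses a 6-byte footer. The layout is an assumption that this article does not spell out: two little-endian bytes giving the signature start (counted back from the end of the file), the marker bytes 0xff 0xff, then two little-endian bytes giving the zip comment size. OtaFooterSketch and ParseOtaFooter are hypothetical names, and the cryptographic check itself is not shown.

```cpp
#include <cstddef>
#include <cstdint>

struct OtaFooterSketch {
  size_t signature_start;  // distance from the end of the file to the signature
  size_t comment_size;     // size of the zip comment that carries the signature
};

// Parses the last 6 bytes of a signed package under the assumed layout.
// Returns false if the 0xff 0xff marker is missing.
bool ParseOtaFooter(const uint8_t footer[6], OtaFooterSketch* out) {
  if (footer[2] != 0xff || footer[3] != 0xff) return false;
  out->signature_start = footer[0] + (static_cast<size_t>(footer[1]) << 8);
  out->comment_size = footer[4] + (static_cast<size_t>(footer[5]) << 8);
  return true;
}
```

A verifier would use these two offsets to locate the signature block and then hash everything that precedes the comment.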
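5.4 Forking the update-binary child process to upgrade the system
Google's upgrade design has many flexible and clever touches, such as the interrupt-and-resume mechanism mentioned above. The update binary introduced next is another one. Throughout the upgrade, the recovery process (/system/bin/recovery) only acts as a flow controller. The actual work is done by update-binary, which is packed into the upgrade package at META-INF/com/google/android/update-binary.
update-binary runs as follows:
- recovery mmaps the package into memory (covered above);
- recovery calls TryUpdateBinary() -> SetUpNonAbUpdateCommands() to extract update-binary from the package to /tmp/update-binary on the device;
- recovery forks a child process to run update-binary and sets up a pipe for inter-process communication with it;
- update-binary mmaps the package into its own address space, then reads data from the package to update the block-device data of the relevant partitions, upgrading the system;
- update-binary sends progress, data, and so on to its parent, recovery, through the pipe, and recovery updates the progress bar on screen;
- recovery calls waitpid(pid, &status, 0) to wait for the update-binary child to finish, then uses the exit status to decide whether the upgrade succeeded.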
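The fork, pipe, and waitpid steps listed above can be reproduced with plain POSIX calls. In this minimal sketch the child is a stand-in that writes two protocol lines itself instead of exec'ing /tmp/update-binary; RunUpdaterSketch is a hypothetical name, and the messages are only sample data.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Runs a stand-in "updater" child and returns the lines it wrote to the pipe.
// *exit_code receives the child's exit code, or -1 if it did not exit normally.
std::vector<std::string> RunUpdaterSketch(int* exit_code) {
  std::vector<std::string> lines;
  *exit_code = -1;
  int fds[2];
  if (pipe(fds) != 0) return lines;

  pid_t pid = fork();
  if (pid == -1) {
    close(fds[0]);
    close(fds[1]);
    return lines;
  }
  if (pid == 0) {
    // Child: the real flow would execv the extracted update-binary here.
    close(fds[0]);
    const char* msgs = "progress 0.5 10\nui_print hello\n";
    ssize_t unused = write(fds[1], msgs, strlen(msgs));
    (void)unused;
    close(fds[1]);
    _exit(0);
  }

  // Parent: close the write end so reading hits EOF once the child is done.
  close(fds[1]);
  FILE* from_child = fdopen(fds[0], "r");
  if (from_child != nullptr) {
    char buffer[1024];
    while (fgets(buffer, sizeof(buffer), from_child) != nullptr) {
      std::string line(buffer);
      if (!line.empty() && line.back() == '\n') line.pop_back();
      lines.push_back(line);
    }
    fclose(from_child);
  } else {
    close(fds[0]);
  }

  // Reap the child and translate its wait status into an exit code.
  int status = 0;
  if (waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
    *exit_code = WEXITSTATUS(status);
  }
  return lines;
}
```

Closing the unused pipe ends in each process matters: if the parent kept its write end open, the read loop would never see EOF and waitpid would never be reached.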
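Benefits of packing update-binary into the upgrade package:
An Android system upgrade is fairly complex, especially incremental upgrades that patch at the storage-block level, and it is hard to guarantee there are no bugs. A serious bug could leave users' devices unable to upgrade, which would have a large impact. Upgrades exist to fix system bugs, so it would be awkward if a bug in recovery itself prevented the device from upgrading.
Because update-binary is packed into the package, extracted into memory at upgrade time, and then used to perform the upgrade, even a serious bug in update-binary can be fixed in the next package pushed to users, and the system can still be upgraded to the new version.
Function TryUpdateBinary():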
// If the package contains an update binary, extract it and run it.
static InstallResult TryUpdateBinary(Package* package, bool* wipe_cache,
std::vector<std::string>* log_buffer, int retry_count,
int* max_temperature, RecoveryUI* ui) {
std::map<std::string, std::string> metadata;
auto zip = package->GetZipArchiveHandle();
if (!ReadMetadataFromPackage(zip, &metadata)) {
LOG(ERROR) << "Failed to parse metadata in the zip file";
return INSTALL_CORRUPT;
}
bool is_ab = android::base::GetBoolProperty("ro.build.ab_update", false);
if (is_ab) {
CHECK(package->GetType() == PackageType::kFile);
}
// Verify against the metadata in the package first.
if (is_ab && !CheckPackageMetadata(metadata, OtaType::AB)) {
log_buffer->push_back(android::base::StringPrintf("error: %d", kUpdateBinaryCommandFailure));
return INSTALL_ERROR;
}
ReadSourceTargetBuild(metadata, log_buffer);
// The updater in child process writes to the pipe to communicate with recovery.
android::base::unique_fd pipe_read, pipe_write;
// Explicitly disable O_CLOEXEC using 0 as the flags (last) parameter to Pipe
// so that the child updater process will receive a non-closed fd.
if (!android::base::Pipe(&pipe_read, &pipe_write, 0)) {
PLOG(ERROR) << "Failed to create pipe for updater-recovery communication";
return INSTALL_CORRUPT;
}
// The updater-recovery communication protocol.
//
// progress <frac> <secs>
// fill up the next <frac> part of the progress bar over <secs> seconds. If <secs> is
// zero, use `set_progress` commands to manually control the progress of this segment of the
// bar.
//
// set_progress <frac>
// <frac> should be between 0.0 and 1.0; sets the progress bar within the segment defined by
// the most recent progress command.
//
// ui_print <string>
// display <string> on the screen.
//
// wipe_cache
// a wipe of cache will be performed following a successful installation.
//
// clear_display
// turn off the text display.
//
// enable_reboot
// packages can explicitly request that they want the user to be able to reboot during
// installation (useful for debugging packages that don't exit).
//
// retry_update
// updater encounters some issue during the update. It requests a reboot to retry the same
// package automatically.
//
// log <string>
// updater requests logging the string (e.g. cause of the failure).
//
std::string package_path = package->GetPath();
std::vector<std::string> args;
if (auto setup_result =
is_ab ? SetUpAbUpdateCommands(package_path, zip, pipe_write.get(), &args)
: SetUpNonAbUpdateCommands(package_path, zip, retry_count, pipe_write.get(), &args);
!setup_result) {
log_buffer->push_back(android::base::StringPrintf("error: %d", kUpdateBinaryCommandFailure));
return INSTALL_CORRUPT;
}
pid_t pid = fork();
if (pid == -1) {
PLOG(ERROR) << "Failed to fork update binary";
log_buffer->push_back(android::base::StringPrintf("error: %d", kForkUpdateBinaryFailure));
return INSTALL_ERROR;
}
if (pid == 0) {
umask(022);
pipe_read.reset();
// Convert the std::string vector to a NULL-terminated char* vector suitable for execv.
auto chr_args = StringVectorToNullTerminatedArray(args);
execv(chr_args[0], chr_args.data());
// We shouldn't use LOG/PLOG in the forked process, since they may cause the child process to
// hang. This deadlock results from an improperly copied mutex in the ui functions.
// (Bug: 34769056)
fprintf(stdout, "E:Can't run %s (%s)\n", chr_args[0], strerror(errno));
_exit(EXIT_FAILURE);
}
pipe_write.reset();
std::atomic<bool> logger_finished(false);
std::thread temperature_logger(log_max_temperature, max_temperature, std::ref(logger_finished));
*wipe_cache = false;
bool retry_update = false;
char buffer[1024];
FILE* from_child = android::base::Fdopen(std::move(pipe_read), "r");
while (fgets(buffer, sizeof(buffer), from_child) != nullptr) {
std::string line(buffer);
size_t space = line.find_first_of(" \n");
std::string command(line.substr(0, space));
if (command.empty()) continue;
// Get rid of the leading and trailing space and/or newline.
std::string args = space == std::string::npos ? "" : android::base::Trim(line.substr(space));
if (command == "progress") {
std::vector<std::string> tokens = android::base::Split(args, " ");
double fraction;
int seconds;
if (tokens.size() == 2 && android::base::ParseDouble(tokens[0].c_str(), &fraction) &&
android::base::ParseInt(tokens[1], &seconds)) {
ui->ShowProgress(fraction * (1 - VERIFICATION_PROGRESS_FRACTION), seconds);
} else {
LOG(ERROR) << "invalid \"progress\" parameters: " << line;
}
} else if (command == "set_progress") {
std::vector<std::string> tokens = android::base::Split(args, " ");
double fraction;
if (tokens.size() == 1 && android::base::ParseDouble(tokens[0].c_str(), &fraction)) {
ui->SetProgress(fraction);
} else {
LOG(ERROR) << "invalid \"set_progress\" parameters: " << line;
}
} else if (command == "ui_print") {
ui->PrintOnScreenOnly("%s\n", args.c_str());
fflush(stdout);
} else if (command == "wipe_cache") {
*wipe_cache = true;
} else if (command == "clear_display") {
ui->SetBackground(RecoveryUI::NONE);
} else if (command == "enable_reboot") {
// packages can explicitly request that they want the user
// to be able to reboot during installation (useful for
// debugging packages that don't exit).
ui->SetEnableReboot(true);
} else if (command == "retry_update") {
retry_update = true;
} else if (command == "log") {
if (!args.empty()) {
// Save the logging request from updater and write to last_install later.
log_buffer->push_back(args);
} else {
LOG(ERROR) << "invalid \"log\" parameters: " << line;
}
} else {
LOG(ERROR) << "unknown command [" << command << "]";
}
}
fclose(from_child);
int status;
waitpid(pid, &status, 0);
logger_finished.store(true);
finish_log_temperature.notify_one();
temperature_logger.join();
if (retry_update) {
return INSTALL_RETRY;
}
if (WIFEXITED(status)) {
if (WEXITSTATUS(status) != EXIT_SUCCESS) {
LOG(ERROR) << "Error in " << package_path << " (status " << WEXITSTATUS(status) << ")";
return INSTALL_ERROR;
}
} else if (WIFSIGNALED(status)) {
LOG(ERROR) << "Error in " << package_path << " (killed by signal " << WTERMSIG(status) << ")";
return INSTALL_ERROR;
} else {
LOG(FATAL) << "Invalid status code " << status;
}
return INSTALL_SUCCESS;
}
bool SetUpNonAbUpdateCommands(const std::string& package, ZipArchiveHandle zip, int retry_count,
int status_fd, std::vector<std::string>* cmd) {
CHECK(cmd != nullptr);
// In non-A/B updates we extract the update binary from the package.
static constexpr const char* UPDATE_BINARY_NAME = "META-INF/com/google/android/update-binary";
ZipEntry binary_entry;
if (FindEntry(zip, UPDATE_BINARY_NAME, &binary_entry) != 0) {
LOG(ERROR) << "Failed to find update binary " << UPDATE_BINARY_NAME;
return false;
}
const std::string binary_path = Paths::Get().temporary_update_binary();
unlink(binary_path.c_str());
android::base::unique_fd fd(
open(binary_path.c_str(), O_CREAT | O_WRONLY | O_TRUNC | O_CLOEXEC, 0755));
if (fd == -1) {
PLOG(ERROR) << "Failed to create " << binary_path;
return false;
}
if (auto error = ExtractEntryToFile(zip, &binary_entry, fd); error != 0) {
LOG(ERROR) << "Failed to extract " << UPDATE_BINARY_NAME << ": " << ErrorCodeString(error);
return false;
}
// When executing the update binary contained in the package, the arguments passed are:
// - the version number for this interface
// - an FD to which the program can write in order to update the progress bar.
// - the name of the package zip file.
// - an optional argument "retry" if this update is a retry of a failed update attempt.
*cmd = {
binary_path,
std::to_string(kRecoveryApiVersion),
std::to_string(status_fd),
package,
};
if (retry_count > 0) {
cmd->push_back("retry");
}
return true;
}
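The argument list assembled above can be reproduced in isolation. The following is a minimal sketch, not the AOSP implementation: `BuildUpdaterArgs` and `ToExecvArray` are hypothetical helper names, and the second one only approximates what `StringVectorToNullTerminatedArray` is used for in the excerpt.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds the argument list in the same order as the excerpt
// above (binary path, API version, status fd, package path, optional "retry").
std::vector<std::string> BuildUpdaterArgs(const std::string& binary_path, int api_version,
                                          int status_fd, const std::string& package,
                                          int retry_count) {
  assert(!binary_path.empty());
  std::vector<std::string> args = {binary_path, std::to_string(api_version),
                                   std::to_string(status_fd), package};
  if (retry_count > 0) {
    args.push_back("retry");
  }
  return args;
}

// Hypothetical helper: converts the strings to a NULL-terminated char* array as
// expected by execv(). The pointers alias `args`, so `args` must outlive the result.
std::vector<char*> ToExecvArray(std::vector<std::string>& args) {
  std::vector<char*> result;
  result.reserve(args.size() + 1);
  for (auto& arg : args) {
    result.push_back(&arg[0]);
  }
  result.push_back(nullptr);
  return result;
}
```

With these helpers, the child branch after fork() would call `execv(argv[0], argv.data())` on the converted array. The count of 4 or 5 entries is what update-binary later checks as `argc`.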
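update-binary:
Source path: bootable/recovery/updater/
update-binary is the component that actually carries out the upgrade, and its internal flow is fairly involved. As the source below shows, update-binary reads the upgrade package path from its process arguments, constructs and initializes an Updater object, and then calls Updater.RunUpdate to start the upgrade.
How update-binary completes the system upgrade is covered in a separate article: (to be continued).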
// bootable/recovery/updater/updater_main.cpp
static void UpdaterLogger(android::base::LogId /* id */, android::base::LogSeverity /* severity */,
const char* /* tag */, const char* /* file */, unsigned int /* line */,
const char* message) {
fprintf(stdout, "%s\n", message);
}
int main(int argc, char** argv) {
// Various things log information to stdout or stderr more or less
// at random (though we've tried to standardize on stdout). The
// log file makes more sense if buffering is turned off so things
// appear in the right order.
setbuf(stdout, nullptr);
setbuf(stderr, nullptr);
// We don't have logcat yet under recovery. Update logs will always be written to stdout
// (which is redirected to recovery.log).
android::base::InitLogging(argv, &UpdaterLogger);
// Run the libcrypto KAT(known answer tests) based self tests.
if (BORINGSSL_self_test() != 1) {
LOG(ERROR) << "Failed to run the boringssl self tests";
return EXIT_FAILURE;
}
if (argc != 4 && argc != 5) {
LOG(ERROR) << "unexpected number of arguments: " << argc;
return EXIT_FAILURE;
}
char* version = argv[1];
if ((version[0] != '1' && version[0] != '2' && version[0] != '3') || version[1] != '\0') {
// We support version 1, 2, or 3.
LOG(ERROR) << "wrong updater binary API; expected 1, 2, or 3; got " << argv[1];
return EXIT_FAILURE;
}
int fd;
if (!android::base::ParseInt(argv[2], &fd)) {
LOG(ERROR) << "Failed to parse fd in " << argv[2];
return EXIT_FAILURE;
}
std::string package_name = argv[3];
bool is_retry = false;
if (argc == 5) {
if (strcmp(argv[4], "retry") == 0) {
is_retry = true;
} else {
LOG(ERROR) << "unexpected argument: " << argv[4];
return EXIT_FAILURE;
}
}
// Configure edify's functions.
RegisterBuiltins();
RegisterInstallFunctions();
RegisterBlockImageFunctions();
RegisterDynamicPartitionsFunctions();
RegisterDeviceExtensions();
auto sehandle = selinux_android_file_context_handle();
selinux_android_set_sehandle(sehandle);
Updater updater(std::make_unique<UpdaterRuntime>(sehandle));
if (!updater.Init(fd, package_name, is_retry)) {
return EXIT_FAILURE;
}
if (!updater.RunUpdate()) {
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
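5.5 Exit the update-binary child process, save logs, and clear the BCB in the misc partition
As the function TryUpdateBinary above shows, after the recovery process forks the update-binary child process, it enters a while loop that reads data sent by the child through a pipe, parses each line into a command, and performs the corresponding operation.
The recovery process opens the read end of the pipe without O_NONBLOCK, so the reads are blocking. As long as the write end stays open in the child, the while loop does not exit: recovery is either blocked waiting for data or parsing a command it has just read, until the child process exits.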
FILE* from_child = android::base::Fdopen(std::move(pipe_read), "r");
while (fgets(buffer, sizeof(buffer), from_child) != nullptr) {
std::string line(buffer);
size_t space = line.find_first_of(" \n");
std::string command(line.substr(0, space));
if (command.empty()) continue;
// Get rid of the leading and trailing space and/or newline.
std::string args = space == std::string::npos ? "" : android::base::Trim(line.substr(space));
if (command == "progress") {
std::vector<std::string> tokens = android::base::Split(args, " ");
double fraction;
int seconds;
if (tokens.size() == 2 && android::base::ParseDouble(tokens[0].c_str(), &fraction) &&
android::base::ParseInt(tokens[1], &seconds)) {
ui->ShowProgress(fraction * (1 - VERIFICATION_PROGRESS_FRACTION), seconds);
} else {
LOG(ERROR) << "invalid \"progress\" parameters: " << line;
}
}
...
}
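The blocking behaviour can be observed outside recovery with a small POSIX program. This is a minimal sketch under stated assumptions: `ReadCommandsFromChild` is a hypothetical name, the child writes three hard-coded protocol lines instead of running an updater, and the parent only records the command words rather than acting on them.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Forks a child that writes a few command lines into a pipe and exits. The parent
// reads with blocking fgets() until EOF, then collects the child's wait status.
std::vector<std::string> ReadCommandsFromChild(int* status) {
  assert(status != nullptr);
  std::vector<std::string> commands;
  int fds[2];
  if (pipe(fds) != 0) return commands;

  pid_t pid = fork();
  if (pid == -1) {
    close(fds[0]);
    close(fds[1]);
    return commands;
  }
  if (pid == 0) {
    close(fds[0]);
    const char msg[] = "progress 0.5 10\nui_print hello\nwipe_cache\n";
    ssize_t written = write(fds[1], msg, sizeof(msg) - 1);
    // Exiting closes the child's write end, which is what gives the parent EOF.
    _exit(written == static_cast<ssize_t>(sizeof(msg) - 1) ? 0 : 1);
  }

  // The parent must close its own copy of the write end, or fgets() never sees EOF.
  close(fds[1]);
  FILE* from_child = fdopen(fds[0], "r");
  if (from_child == nullptr) {
    close(fds[0]);
    waitpid(pid, status, 0);
    return commands;
  }
  char buffer[1024];
  while (fgets(buffer, sizeof(buffer), from_child) != nullptr) {
    std::string line(buffer);
    size_t space = line.find_first_of(" \n");
    std::string command = line.substr(0, space);
    if (!command.empty()) commands.push_back(command);
  }
  fclose(from_child);
  waitpid(pid, status, 0);
  return commands;
}
```

The detail worth noting is the `close(fds[1])` in the parent, which corresponds to `pipe_write.reset()` in TryUpdateBinary. Without it the loop would block forever even after the child exits.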
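When the update-binary process finishes and exits, the write end of the pipe is closed. The recovery process then leaves the while loop that listens for messages from the child, and execution continues to: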
int status;
waitpid(pid, &status, 0);
logger_finished.store(true);
finish_log_temperature.notify_one();
temperature_logger.join();
if (retry_update) {
return INSTALL_RETRY;
}
if (WIFEXITED(status)) {
if (WEXITSTATUS(status) != EXIT_SUCCESS) {
LOG(ERROR) << "Error in " << package_path << " (status " << WEXITSTATUS(status) << ")";
return INSTALL_ERROR;
}
} else if (WIFSIGNALED(status)) {
LOG(ERROR) << "Error in " << package_path << " (killed by signal " << WTERMSIG(status) << ")";
return INSTALL_ERROR;
} else {
LOG(FATAL) << "Invalid status code " << status;
}
return INSTALL_SUCCESS;
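As shown, recovery calls waitpid(pid, &status, 0) to obtain the child's exit status and uses it to decide whether the upgrade succeeded. Control then returns from install/install.cpp to start_recovery() in recovery.cpp.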
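The status decoding can be sketched as a small standalone function. This is a simplified illustration rather than the AOSP code: the `Result` enum and both function names are hypothetical, and the signal and invalid-status branches of the original are collapsed into a single error result.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cstdlib>

// Hypothetical, simplified stand-in for the install result codes.
enum class Result { kSuccess, kError, kRetry };

// Mirrors the order of checks in the excerpt: a retry request wins, then a normal
// exit with EXIT_SUCCESS counts as success, and everything else is an error.
Result ClassifyChildExit(int status, bool retry_update) {
  if (retry_update) return Result::kRetry;
  if (WIFEXITED(status) && WEXITSTATUS(status) == EXIT_SUCCESS) return Result::kSuccess;
  return Result::kError;
}

// Helper for experimenting: forks a child that exits with `code` and returns the
// raw wait status, or -1 if fork() or waitpid() fails.
int RunChildThatExitsWith(int code) {
  assert(code >= 0 && code <= 255);
  pid_t pid = fork();
  if (pid == 0) _exit(code);
  int status = 0;
  if (pid == -1 || waitpid(pid, &status, 0) == -1) return -1;
  return status;
}
```

For example, `ClassifyChildExit(RunChildThatExitsWith(1), false)` yields `Result::kError`, matching the "Error in ... (status 1)" path of the original.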
Device::BuiltinAction start_recovery(Device* device, const std::vector<std::string>& args) {
...
// Determine the next action.
// - If the state is INSTALL_REBOOT, device will reboot into the target as specified in
// `next_action`.
// - If the recovery menu is visible, prompt and wait for commands.
// - If the state is INSTALL_NONE, wait for commands (e.g. in user build, one manually boots
// into recovery to sideload a package or to wipe the device).
// - In all other cases, reboot the device. Therefore, normal users will observe the device
// rebooting a) immediately upon successful finish (INSTALL_SUCCESS); or b) an "error" screen
// for 5s followed by an automatic reboot.
if (status != INSTALL_REBOOT) {
if (status == INSTALL_NONE || ui->IsTextVisible()) {
auto temp = PromptAndWait(device, status);
if (temp != Device::NO_ACTION) {
next_action = temp;
}
}
}
// Save logs and clean up before rebooting or shutting down.
FinishRecovery(ui);
return next_action;
}
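InstallPackage() returns values such as INSTALL_SUCCESS, INSTALL_ERROR, and INSTALL_RETRY; this is the value of the status variable in the source above.
- If the upgrade fails and the text UI is visible: PromptAndWait() is entered. The screen shows a message, and the next step happens only after the user confirms (this step matters little and is not covered further).
- If the upgrade succeeds, or once PromptAndWait() returns: FinishRecovery() is entered to prepare for leaving recovery.
Function FinishRecovery():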
// Clear the recovery command and prepare to boot a (hopefully working) system,
// copy our log file to cache as well (for the system to read). This function is
// idempotent: call it as many times as you like.
static void FinishRecovery(RecoveryUI* ui) {
std::string locale = ui->GetLocale();
// Save the locale to cache, so if recovery is next started up without a '--locale' argument
// (e.g., directly from the bootloader) it will use the last-known locale.
if (!locale.empty() && HasCache()) {
LOG(INFO) << "Saving locale \"" << locale << "\"";
if (ensure_path_mounted(LOCALE_FILE) != 0) {
LOG(ERROR) << "Failed to mount " << LOCALE_FILE;
} else if (!android::base::WriteStringToFile(locale, LOCALE_FILE)) {
PLOG(ERROR) << "Failed to save locale to " << LOCALE_FILE;
}
}
copy_logs(save_current_log);
// Reset to normal system boot so recovery won't cycle indefinitely.
std::string err;
if (!clear_bootloader_message(&err)) {
LOG(ERROR) << "Failed to clear BCB message: " << err;
}
// Remove the command file, so recovery won't repeat indefinitely.
if (HasCache()) {
if (ensure_path_mounted(COMMAND_FILE) != 0 || (unlink(COMMAND_FILE) && errno != ENOENT)) {
LOG(WARNING) << "Can't unlink " << COMMAND_FILE;
}
ensure_path_unmounted(CACHE_ROOT);
}
sync(); // For good measure.
}
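FinishRecovery performs two key operations:
- Copy the log file /tmp/recovery.log, which has so far been written in memory, to /cache/recovery.
The recovery log is first written to the in-memory file /tmp/recovery.log rather than directly under /cache/recovery for the following reason. As the main function shows, the log is written to the file in real time through redirection. Writing it directly under /cache/recovery would conflict with a routine recovery action, wiping (formatting) the cache partition: when wipeCache runs, the cache partition would be busy and could not be unmounted, so the wipe would fail.
- clear_bootloader_message erases the BCB data in the misc partition.
The BCB in the misc partition must be erased promptly when the upgrade flow ends. Otherwise, on the next reboot the BootLoader would find the BCB data and boot back into the recovery system. It must not be erased too early either, because the BCB is a key part of the mechanism for resuming an interrupted upgrade.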
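The timing argument can be made concrete with an in-memory model. This is a sketch under loud assumptions: `FakeBcb` is a hypothetical struct that only borrows the first three field sizes from AOSP's bootloader_message, nothing is read from or written to a real misc partition, and `BootsIntoRecovery` is an assumed simplification of what a bootloader checks.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical in-memory stand-in for the BCB kept in the misc partition.
struct FakeBcb {
  char command[32];
  char status[32];
  char recovery[768];
};

// Records an upgrade request, in the spirit of what the framework writes before rebooting.
void WriteUpdateRequest(FakeBcb* bcb, const std::string& package) {
  assert(bcb != nullptr);
  std::memset(bcb, 0, sizeof(*bcb));
  std::snprintf(bcb->command, sizeof(bcb->command), "boot-recovery");
  std::snprintf(bcb->recovery, sizeof(bcb->recovery), "recovery\n--update_package=%s\n",
                package.c_str());
}

// Zeroes the block, which is the effect clearing the BCB is meant to have.
void ClearBcb(FakeBcb* bcb) {
  assert(bcb != nullptr);
  std::memset(bcb, 0, sizeof(*bcb));
}

// Assumed bootloader decision: boot recovery only while the command field asks for it.
bool BootsIntoRecovery(const FakeBcb& bcb) {
  return std::strncmp(bcb.command, "boot-recovery", sizeof(bcb.command)) == 0;
}
```

While the request is still present, a power loss followed by a reboot leads back into recovery, which is what lets an interrupted upgrade resume. After `ClearBcb`, the same check selects the normal boot path.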
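5.6 Reboot the device back into the main system
After start_recovery returns, control goes back to recovery_main.cpp, which reboots into the target system according to the return value of start_recovery (normally the main system).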
// recovery_main.cpp
auto ret = fastboot ? StartFastboot(device, args) : start_recovery(device, args);
if (ret == Device::KEY_INTERRUPTED) {
ret = action.exchange(ret);
if (ret == Device::NO_ACTION) {
continue;
}
}
switch (ret) {
case Device::SHUTDOWN:
ui->Print("Shutting down...\n");
Shutdown("userrequested,recovery");
break;
case Device::SHUTDOWN_FROM_FASTBOOT:
ui->Print("Shutting down...\n");
Shutdown("userrequested,fastboot");
break;
case Device::REBOOT_BOOTLOADER:
ui->Print("Rebooting to bootloader...\n");
Reboot("bootloader");
break;
case Device::REBOOT_FASTBOOT:
ui->Print("Rebooting to recovery/fastboot...\n");
Reboot("fastboot");
break;
case Device::REBOOT_RECOVERY:
ui->Print("Rebooting to recovery...\n");
Reboot("recovery");
break;
case Device::REBOOT_RESCUE: {
// Not using `Reboot("rescue")`, as it requires matching support in kernel and/or
// bootloader.
bootloader_message boot = {};
strlcpy(boot.command, "boot-rescue", sizeof(boot.command));
std::string err;
if (!write_bootloader_message(boot, &err)) {
LOG(ERROR) << "Failed to write bootloader message: " << err;
// Stay under recovery on failure.
continue;
}
ui->Print("Rebooting to recovery/rescue...\n");
Reboot("recovery");
break;
}
case Device::ENTER_FASTBOOT:
if (android::fs_mgr::LogicalPartitionsMapped()) {
ui->Print("Partitions may be mounted - rebooting to enter fastboot.");
Reboot("fastboot");
} else {
LOG(INFO) << "Entering fastboot";
fastboot = true;
}
break;
case Device::ENTER_RECOVERY:
LOG(INFO) << "Entering recovery";
fastboot = false;
break;
case Device::REBOOT:
ui->Print("Rebooting...\n");
Reboot("userrequested,recovery");
break;
case Device::REBOOT_FROM_FASTBOOT:
ui->Print("Rebooting...\n");
Reboot("userrequested,fastboot");
break;
default:
ui->Print("Rebooting...\n");
Reboot("unknown" + std::to_string(ret));
break;
}
void Reboot(std::string_view target) {
std::string cmd = "reboot," + std::string(target);
// Honor the quiescent mode if applicable.
if (target != "bootloader" && target != "fastboot" &&
android::base::GetBoolProperty("ro.boot.quiescent", false)) {
cmd += ",quiescent";
}
if (!android::base::SetProperty(ANDROID_RB_PROPERTY, cmd)) {
LOG(FATAL) << "Reboot failed";
}
while (true) pause();
}
bool Shutdown(std::string_view target) {
std::string cmd = "shutdown," + std::string(target);
return android::base::SetProperty(ANDROID_RB_PROPERTY, cmd);
}
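The string that Reboot() hands to the property system can be examined without touching any property. The following is a minimal sketch: `BuildRebootCommand` is a hypothetical helper, and the `quiescent` flag is passed in as a parameter instead of being read from `ro.boot.quiescent`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: composes the reboot command the same way Reboot() does,
// without actually setting the power-control property.
std::string BuildRebootCommand(const std::string& target, bool quiescent) {
  assert(!target.empty());
  std::string cmd = "reboot," + target;
  // Quiescent mode is not applied when rebooting to bootloader or fastboot.
  if (quiescent && target != "bootloader" && target != "fastboot") {
    cmd += ",quiescent";
  }
  return cmd;
}
```

For the normal post-upgrade path (`Device::REBOOT`), the target is "userrequested,recovery", so the resulting command is "reboot,userrequested,recovery".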
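6. BootLoader boots the Main System
This flow is much the same as in "3. BootLoader reads the BCB and boots into the Recovery System". The difference is that the BCB in the misc partition was already erased when recovery exited, so the BootLoader boots the kernel in the boot partition, which then brings up the main system.
7. Init starts the flash_recovery service to upgrade the recovery partition
The vendor_flash_recovery service is defined in an rc file, and its executable is /vendor/bin/install-recovery.sh. The definition of flash_recovery is not unique; some vendors define it in system. The task and the principle are the same in either case: upgrade the recovery partition to the new version.
Taking AOSP as an example, flash_recovery is defined in vendor and renamed vendor_flash_recovery: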
# bootable/recovery/applypatch/vendor_flash_recovery.rc
service vendor_flash_recovery /vendor/bin/install-recovery.sh
class main
oneshot
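7.1 Why does flash_recovery exist?
Why not put the recovery partition image in the upgrade package and let recovery upgrade itself while it upgrades the system? There are two main reasons:
- Upgrade stability.
If the device reboots unexpectedly while the recovery partition is being upgraded, the partition is only half written and its data is corrupted. The interrupted-upgrade recovery mechanism described above then cannot work, because the device can no longer boot into the recovery system. The flash_recovery mechanism avoids this problem.
- System security.
Enthusiasts on forums often flash a third-party recovery in order to install third-party ROMs or extract data from the phone, which is very unsafe. This mechanism mitigates the problem to some extent: on every boot, flash_recovery checks whether the SHA1 of the recovery partition data matches the expected value, and restores the recovery partition if it does not.
This raises the question of how flash_recovery upgrades or restores the recovery partition. See section 7.3.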
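7.2 When is the flash_recovery service started?
As vendor_flash_recovery.rc shows, vendor_flash_recovery belongs to the main class, so it starts working when main class services are started.
As init.rc shows, if the partition is not encrypted, main class services start when the nonencrypted trigger fires. Otherwise, startup is controlled by the property vold.decrypt, which is set during the encryption/decryption flow.
- on nonencrypted is triggered by the function queue_fs_event in builtins.cpp.
- The property vold.decrypt is set in system/vold/cryptfs.cpp.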
// system/core/init/builtins.cpp
static Result<void> queue_fs_event(int code, bool userdata_remount)
# system/core/rootdir/init.rc
on nonencrypted
class_start main
class_start late_start
on property:vold.decrypt=trigger_restart_min_framework
# A/B update verifier that marks a successful boot.
exec_start update_verifier
class_start main
on property:vold.decrypt=trigger_restart_framework
# A/B update verifier that marks a successful boot.
exec_start update_verifier
class_start_post_data hal
class_start_post_data core
class_start main
class_start late_start
setprop service.bootanim.exit 0
start bootanim
on property:vold.decrypt=trigger_shutdown_framework
class_reset late_start
class_reset main
class_reset_post_data core
class_reset_post_data hal
7.3 How does flash_recovery upgrade the recovery partition?
The vendor_flash_recovery service is started and runs the script install-recovery.sh to upgrade the recovery partition.
# bootable/recovery/applypatch/vendor_flash_recovery.rc
service vendor_flash_recovery /vendor/bin/install-recovery.sh
class main
oneshot
The content of the install-recovery.sh script is as follows:
#!/system/bin/sh
if ! applypatch --check EMMC:/dev/block/bootdevice/by-name/recovery:100663296:5859245349a4196c1e30f0ff21d727016e740e25; then
applypatch \
--patch /system/recovery-from-boot.p \
--source EMMC:/dev/block/bootdevice/by-name/boot:100663296:a362e080d203e34fbdcce47278cda2bda566409a \
--target EMMC:/dev/block/bootdevice/by-name/recovery:100663296:5859245349a4196c1e30f0ff21d727016e740e25 && \
log -t recovery "Installing new recovery image: succeeded" || \
log -t recovery "Installing new recovery image: failed"
else
log -t recovery "Recovery image already installed"
fi
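The script upgrades the recovery partition in the following steps:
1). Compute the SHA1 of the recovery partition and check whether it matches the expected value.
The SHA1 value is hard-coded in the script and is fixed at build time: when the build server generates the recovery partition data, it computes the SHA1 of the content and writes it directly into install-recovery.sh as the script is generated.
2). If the SHA1 matches, the recovery partition has already been upgraded, and the script ends.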
applypatch --check EMMC:/dev/block/bootdevice/by-name/recovery:100663296:5859245349a4196c1e30f0ff21d727016e740e25
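"--check": tells applypatch to verify the partition data. It is followed by the partition path, the size, and the expected SHA1.
3). If the SHA1 does not match, the recovery partition has not been upgraded or its data is corrupted, so the recovery partition is upgraded.
applypatch loads data from the boot partition, applies the patch (/system/recovery-from-boot.p), and writes the resulting data to the recovery partition.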
applypatch \
--patch /system/recovery-from-boot.p \
--source EMMC:/dev/block/bootdevice/by-name/boot:100663296:a362e080d203e34fbdcce47278cda2bda566409a \
--target EMMC:/dev/block/bootdevice/by-name/recovery:100663296:5859245349a4196c1e30f0ff21d727016e740e25
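"--patch": the patch file. The build server generates this differential patch with a diff tool from the raw data of the boot and recovery images.
"--source": the source file. The argument consists of the device path, the size, and the SHA1, which indicates that the integrity and validity of the source data are also verified when the patch is applied.
"--target": the target file. The data produced by applying the patch to the source is written to the target. The argument again consists of the device path, the size, and the SHA1, which indicates that the target data is checked after the upgrade. Note that the "--target" argument is identical to the "--check" argument.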
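The argument format used by all three options can be taken apart with a few lines of code. This is a minimal sketch, assuming the four-field layout `TYPE:device:size:sha1` seen in the script; `PartitionSpec` and `ParsePartitionSpec` are hypothetical names and not the parser that applypatch itself uses.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical holder for the four colon-separated fields of a partition argument.
struct PartitionSpec {
  std::string type;    // e.g. "EMMC"
  std::string device;  // block device path
  uint64_t size;       // number of bytes covered by the SHA1
  std::string sha1;    // expected SHA1 as 40 hex characters
  bool valid;
};

PartitionSpec ParsePartitionSpec(const std::string& spec) {
  PartitionSpec out{"", "", 0, "", false};
  size_t first = spec.find(':');
  size_t last = spec.rfind(':');
  if (first == std::string::npos || last == first) return out;
  size_t middle = spec.rfind(':', last - 1);
  if (middle == std::string::npos || middle <= first) return out;

  std::string size_str = spec.substr(middle + 1, last - middle - 1);
  std::string sha1 = spec.substr(last + 1);
  if (size_str.empty() || sha1.size() != 40) return out;
  for (char c : size_str) {
    if (!std::isdigit(static_cast<unsigned char>(c))) return out;
  }
  for (char c : sha1) {
    if (!std::isxdigit(static_cast<unsigned char>(c))) return out;
  }

  out.type = spec.substr(0, first);
  out.device = spec.substr(first + 1, middle - first - 1);
  out.size = std::strtoull(size_str.c_str(), nullptr, 10);
  out.sha1 = sha1;
  out.valid = true;
  return out;
}
```

Applied to the "--target" argument in the script, this yields the device /dev/block/bootdevice/by-name/recovery, a size of 100663296 bytes (96 MiB), and the 40-character SHA1 that "--check" also compares against.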
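Why is the boot partition the source file? Where do its size, its SHA1, and the patch file come from?
As introduced in the discussion of the software architecture, the content of the boot and recovery partitions is essentially kernel+ramdisk (the recovery image is sometimes kernel+ramdisk+dtb). The kernel is basically identical in both, the ramdisks differ only slightly, and the dtb takes very little space, so most of the data in the boot and recovery partitions is the same. Once recovery has upgraded the system to the new version, there is therefore no need to keep a complete recovery image in the system partition just to upgrade the recovery partition. The boot partition data can be reused instead: at build time a patch between the two images is stored in the system partition, and upgrading recovery only requires loading the boot partition data and applying the patch to reconstruct the recovery partition data. This also saves a considerable amount of space in the system partition.
The core program here, applypatch, is also fairly complex and is not covered in this article. For details, see: (to be continued)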