Android AOSP 6.0.1 Process start Flow Analysis (Part 1)

TYYJ-洪伟

Article link: https://blog.csdn.net/tyyj90/article/details/105466909

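Launching an Android application involves starting a process. The Linux kernel implements process duplication through the fork, vfork, and clone system calls.

fork is a heavyweight call because it creates a complete copy of the parent process, which then runs as the child. To reduce the work this involves, Linux uses copy-on-write.

vfork is similar to fork but does not create a copy of the parent's data; parent and child share the data instead. Because fork uses copy-on-write, vfork no longer has a speed advantage and should be avoided.

clone creates threads and gives precise control over what is shared and what is copied between parent and child.

When launching an Android app requires a new process, that process is ultimately created through the Linux fork system call.

Back in the startProcessLocked method of ActivityManagerService: it calls the static method start of the Process class to launch the new process.

Here is a sequence diagram as a preview; in practice it was drawn after the analysis was finished.
[Figure: sequence diagram of the Process start flow]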

frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

public final class ActivityManagerService extends ActivityManagerNative
        implements Watchdog.Monitor, BatteryStatsImpl.BatteryCallback {
    ......
    private final void startProcessLocked(ProcessRecord app, String hostingType,
            String hostingNameStr, String abiOverride, String entryPoint, String[] entryPointArgs) {
        ......

        try {
            ......
            if (entryPoint == null) entryPoint = "android.app.ActivityThread";
            ......
            Process.ProcessStartResult startResult = Process.start(entryPoint,
                    app.processName, uid, uid, gids, debugFlags, mountExternal,
                    app.info.targetSdkVersion, app.info.seinfo, requiredAbi, instructionSet,
                    app.info.dataDir, entryPointArgs);
            ......
        } catch (RuntimeException e) {
            ......
        }
    }
    ......
}

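What the start function does:

It starts a new process.

If processes are enabled, a new process is created and the static main() function of processClass is executed there. The process keeps running after this function returns.

If processes are not enabled, a new thread is created in the caller's process and main() of processClass is called there.

The niceName parameter, if it is not an empty string, is a custom name to give the process instead of processClass. This lets you create easily identifiable processes even when you start them from the same base processClass.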

frameworks/base/core/java/android/os/Process.java

public class Process {
    ......
    public static final ProcessStartResult start(final String processClass,
                                  final String niceName,
                                  int uid, int gid, int[] gids,
                                  int debugFlags, int mountExternal,
                                  int targetSdkVersion,
                                  String seInfo,
                                  String abi,
                                  String instructionSet,
                                  String appDataDir,
                                  String[] zygoteArgs) {
        try {
            return startViaZygote(processClass, niceName, uid, gid, gids,
                    debugFlags, mountExternal, targetSdkVersion, seInfo,
                    abi, instructionSet, appDataDir, zygoteArgs);
        } catch (ZygoteStartFailedEx ex) {
            Log.e(LOG_TAG,
                    "Starting VM process through Zygote failed");
            throw new RuntimeException(
                    "Starting VM process through Zygote failed", ex);
        }
    }
    ......
}

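start is only a thin relay; the real work happens in startViaZygote. startViaZygote collects its parameters into an ArrayList (their meanings are largely self-explanatory) and finally calls zygoteSendArgsAndGetResult. The ZygoteState argument for that call is obtained by first calling openZygoteSocketIfNeeded. The names of these two functions make their purposes clear.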

frameworks/base/core/java/android/os/Process.java

public class Process {
    ......
    /**
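     * Starts a new process via the zygote mechanism.
     *
     * @param processClass Class name whose static main() to run
     * @param niceName 'nice' process name to appear in ps
     * @param uid a POSIX uid that the new process should setuid() to
     * @param gid a POSIX gid that the new process should setgid() to
     * @param gids null-ok; a list of supplementary group IDs that the
     * new process should setgroups() to
     * @param debugFlags Additional flags
     * @param targetSdkVersion The target SDK version for the app
     * @param seInfo null-ok; SELinux information for the new process
     * @param abi the ABI the process should use
     * @param instructionSet null-ok; the instruction set to use
     * @param appDataDir null-ok; the data directory of the app
     * @param extraArgs Additional arguments to supply to the zygote process
     * @return An object that describes the result of the attempt to start the process
     * @throws ZygoteStartFailedEx if process start failed for any reason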
     */
    private static ProcessStartResult startViaZygote(final String processClass,
                                  final String niceName,
                                  final int uid, final int gid,
                                  final int[] gids,
                                  int debugFlags, int mountExternal,
                                  int targetSdkVersion,
                                  String seInfo,
                                  String abi,
                                  String instructionSet,
                                  String appDataDir,
                                  String[] extraArgs)
                                  throws ZygoteStartFailedEx {
        synchronized(Process.class) {
            ArrayList<String> argsForZygote = new ArrayList<String>();

            // --runtime-args, --setuid=, --setgid=,
            // and --setgroups= must go first
            argsForZygote.add("--runtime-args");
            argsForZygote.add("--setuid=" + uid);
            argsForZygote.add("--setgid=" + gid);
            if ((debugFlags & Zygote.DEBUG_ENABLE_JNI_LOGGING) != 0) {
                argsForZygote.add("--enable-jni-logging");
            }
            if ((debugFlags & Zygote.DEBUG_ENABLE_SAFEMODE) != 0) {
                argsForZygote.add("--enable-safemode");
            }
            if ((debugFlags & Zygote.DEBUG_ENABLE_DEBUGGER) != 0) {
                argsForZygote.add("--enable-debugger");
            }
            if ((debugFlags & Zygote.DEBUG_ENABLE_CHECKJNI) != 0) {
                argsForZygote.add("--enable-checkjni");
            }
            if ((debugFlags & Zygote.DEBUG_ENABLE_JIT) != 0) {
                argsForZygote.add("--enable-jit");
            }
            if ((debugFlags & Zygote.DEBUG_GENERATE_DEBUG_INFO) != 0) {
                argsForZygote.add("--generate-debug-info");
            }
            if ((debugFlags & Zygote.DEBUG_ENABLE_ASSERT) != 0) {
                argsForZygote.add("--enable-assert");
            }
            if (mountExternal == Zygote.MOUNT_EXTERNAL_DEFAULT) {
                argsForZygote.add("--mount-external-default");
            } else if (mountExternal == Zygote.MOUNT_EXTERNAL_READ) {
                argsForZygote.add("--mount-external-read");
            } else if (mountExternal == Zygote.MOUNT_EXTERNAL_WRITE) {
                argsForZygote.add("--mount-external-write");
            }
            argsForZygote.add("--target-sdk-version=" + targetSdkVersion);

            //TODO optionally enable debuger
            //argsForZygote.add("--enable-debugger");

            // --setgroups is a comma-separated list
            if (gids != null && gids.length > 0) {
                StringBuilder sb = new StringBuilder();
                sb.append("--setgroups=");

                int sz = gids.length;
                for (int i = 0; i < sz; i++) {
                    if (i != 0) {
                        sb.append(',');
                    }
                    sb.append(gids[i]);
                }

                argsForZygote.add(sb.toString());
            }

            if (niceName != null) {
                argsForZygote.add("--nice-name=" + niceName);
            }

            if (seInfo != null) {
                argsForZygote.add("--seinfo=" + seInfo);
            }

            if (instructionSet != null) {
                argsForZygote.add("--instruction-set=" + instructionSet);
            }

            if (appDataDir != null) {
                argsForZygote.add("--app-data-dir=" + appDataDir);
            }

            argsForZygote.add(processClass);

            if (extraArgs != null) {
                for (String arg : extraArgs) {
                    argsForZygote.add(arg);
                }
            }

            return zygoteSendArgsAndGetResult(openZygoteSocketIfNeeded(abi), argsForZygote);
        }
    }
    ......
}
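
To make the shape of the argument list concrete, here is a minimal plain-Java sketch of the same assembly logic. It is not the AOSP code: the class name ZygoteArgsSketch, the reduced parameter set (debug flags and mount modes are left out), and all sample values such as the uid and package name are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of how startViaZygote assembles its argument list.
// Debug flags and mount modes are omitted; all input values are hypothetical.
class ZygoteArgsSketch {

    static List<String> buildArgs(String processClass, String niceName,
                                  int uid, int gid, int[] gids,
                                  int targetSdkVersion, String[] extraArgs) {
        List<String> args = new ArrayList<String>();

        // These options go first, as in the original method.
        args.add("--runtime-args");
        args.add("--setuid=" + uid);
        args.add("--setgid=" + gid);
        args.add("--target-sdk-version=" + targetSdkVersion);

        // --setgroups is a comma-separated list of supplementary group IDs.
        if (gids != null && gids.length > 0) {
            StringBuilder sb = new StringBuilder("--setgroups=");
            for (int i = 0; i < gids.length; i++) {
                if (i != 0) {
                    sb.append(',');
                }
                sb.append(gids[i]);
            }
            args.add(sb.toString());
        }

        if (niceName != null) {
            args.add("--nice-name=" + niceName);
        }

        // The class whose main() will run, followed by any extra arguments.
        args.add(processClass);
        if (extraArgs != null) {
            for (String arg : extraArgs) {
                args.add(arg);
            }
        }
        return args;
    }

    public static void main(String[] argv) {
        List<String> args = buildArgs("android.app.ActivityThread", "com.example.app",
                10058, 10058, new int[] {3003, 1028}, 23, null);
        for (String arg : args) {
            System.out.println(arg);
        }
    }
}
```

The point to notice is the ordering: the option strings come first, the class name comes after them, and anything after the class name is treated as an argument for that class.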

Let us analyze openZygoteSocketIfNeeded first. It tries to open a socket to the Zygote process; if one is already open, it does nothing. It may block and retry.

frameworks/base/core/java/android/os/Process.java

public class Process {
    ......
    public static final String ZYGOTE_SOCKET = "zygote";
    ......
    /**
     * The state of the connection to the primary zygote.
     */
    static ZygoteState primaryZygoteState;
    /**
     * The state of the connection to the secondary zygote.
     */
    static ZygoteState secondaryZygoteState;
    ......
    private static ZygoteState openZygoteSocketIfNeeded(String abi) throws ZygoteStartFailedEx {
        if (primaryZygoteState == null || primaryZygoteState.isClosed()) {
            try {
                primaryZygoteState = ZygoteState.connect(ZYGOTE_SOCKET);
            } catch (IOException ioe) {
                throw new ZygoteStartFailedEx("Error connecting to primary zygote", ioe);
            }
        }

        if (primaryZygoteState.matches(abi)) {
            return primaryZygoteState;
        }

        // The primary zygote didn't match. Try the secondary.
        if (secondaryZygoteState == null || secondaryZygoteState.isClosed()) {
            try {
                secondaryZygoteState = ZygoteState.connect(SECONDARY_ZYGOTE_SOCKET);
            } catch (IOException ioe) {
                throw new ZygoteStartFailedEx("Error connecting to secondary zygote", ioe);
            }
        }

        if (secondaryZygoteState.matches(abi)) {
            return secondaryZygoteState;
        }

        throw new ZygoteStartFailedEx("Unsupported zygote ABI: " + abi);
    }
    ......
}
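
The selection rule (try the primary zygote, fall back to the secondary, otherwise fail) can be tried out without any sockets. The sketch below is only an illustration: the class name ZygoteSelectSketch and the ABI lists are hypothetical, and the string "zygote_secondary" is assumed here as the value of SECONDARY_ZYGOTE_SOCKET, which the excerpt above does not show.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the primary/secondary fallback in openZygoteSocketIfNeeded.
// The ABI lists are hypothetical values for a 64-bit device with two zygotes.
class ZygoteSelectSketch {
    static final List<String> PRIMARY_ABIS = Arrays.asList("arm64-v8a");
    static final List<String> SECONDARY_ABIS = Arrays.asList("armeabi-v7a", "armeabi");

    // Returns the name of the socket whose zygote supports the requested ABI.
    static String selectZygote(String abi) {
        if (PRIMARY_ABIS.contains(abi)) {
            return "zygote";
        }
        // The primary zygote didn't match. Try the secondary.
        if (SECONDARY_ABIS.contains(abi)) {
            return "zygote_secondary"; // assumed socket name
        }
        throw new IllegalArgumentException("Unsupported zygote ABI: " + abi);
    }

    public static void main(String[] args) {
        System.out.println(selectZygote("arm64-v8a"));
        System.out.println(selectZygote("armeabi-v7a"));
    }
}
```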

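Next comes the static connect method of the ZygoteState class, which represents the state of communication with a Zygote process. Suppose the argument passed to connect is ZYGOTE_SOCKET (the string "zygote"); it names the remote address to connect to. The method first creates a LocalSocket object and calls its connect method. Once connected, it obtains the socket's input and output streams, wrapping the input stream in a DataInputStream and the output stream in a BufferedWriter. From then on, messages from the Zygote process are read through the DataInputStream, and messages to the Zygote process are sent through the BufferedWriter.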

public class Process {
    ......
    public static class ZygoteState {
        final LocalSocket socket;
        final DataInputStream inputStream;
        final BufferedWriter writer;
        final List<String> abiList;

        boolean mClosed;
        
        private ZygoteState(LocalSocket socket, DataInputStream inputStream,
                BufferedWriter writer, List<String> abiList) {
            this.socket = socket;
            this.inputStream = inputStream;
            this.writer = writer;
            this.abiList = abiList;
        }

        public static ZygoteState connect(String socketAddress) throws IOException {
            DataInputStream zygoteInputStream = null;
            BufferedWriter zygoteWriter = null;
            final LocalSocket zygoteSocket = new LocalSocket();

            try {
                zygoteSocket.connect(new LocalSocketAddress(socketAddress,
                        LocalSocketAddress.Namespace.RESERVED));

                zygoteInputStream = new DataInputStream(zygoteSocket.getInputStream());

                zygoteWriter = new BufferedWriter(new OutputStreamWriter(
                        zygoteSocket.getOutputStream()), 256);
            } catch (IOException ex) {
                try {
                    zygoteSocket.close();
                } catch (IOException ignore) {
                }

                throw ex;
            }

            String abiListString = getAbiList(zygoteWriter, zygoteInputStream);
            Log.i("Zygote", "Process: zygote socket opened, supported ABIS: " + abiListString);

            return new ZygoteState(zygoteSocket, zygoteInputStream, zygoteWriter,
                    Arrays.asList(abiListString.split(",")));
        }

        boolean matches(String abi) {
            return abiList.contains(abi);
        }

        public void close() {
            try {
                socket.close();
            } catch (IOException ex) {
                Log.e(LOG_TAG,"I/O exception on routine close", ex);
            }

            mClosed = true;
        }

        boolean isClosed() {
            return mClosed;
        }
    }
    ......
}
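
The stream wrapping in connect can be reproduced with in-memory streams. This sketch replaces the socket with a ByteArrayOutputStream and a ByteArrayInputStream, so it only shows how the two wrappers behave; the class name StreamWrapSketch and the message text are made up for the example.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// In-memory stand-in for the socket streams wrapped in ZygoteState.connect.
class StreamWrapSketch {

    // Sends one line through a BufferedWriter (256-char buffer, as in connect)
    // and returns the bytes that reached the underlying stream.
    static byte[] sendLine(String line) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(sink, StandardCharsets.US_ASCII), 256)) {
            writer.write(line);
            writer.newLine();
            writer.flush(); // buffered text only reaches the stream on flush
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Reads all bytes back through a DataInputStream.
    static String receive(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte[] buffer = new byte[data.length];
            in.readFully(buffer);
            return new String(buffer, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(receive(sendLine("hello")).trim());
    }
}
```

With a small buffer in front of a socket, forgetting flush() is the usual reason a peer never sees a short message, which is worth keeping in mind when reading the code that writes to the zygote.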

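We continue with the connect method of LocalSocket. The LocalSocket class creates a (non-server) socket in the UNIX-domain namespace.

connect takes a LocalSocketAddress as its parameter. If the socket is already connected, it throws an IOException. implCreateIfNeeded creates the underlying socket on demand through the LocalSocketImpl object. After that, the connect method of the LocalSocketImpl object is called.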

frameworks/base/core/java/android/net/LocalSocket.java

public class LocalSocket implements Closeable {
    private final LocalSocketImpl impl;
    ......
    /**
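     * Connects this socket to an endpoint. May only be called on an instance
     * that has not yet been connected.
     *
     * @param endpoint endpoint address
     * @throws IOException if the socket is in an invalid state or the address does not exist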
     */
    public void connect(LocalSocketAddress endpoint) throws IOException {
        synchronized (this) {
            if (isConnected) {
                throw new IOException("already connected");
            }

            implCreateIfNeeded();
            impl.connect(endpoint, 0);
            isConnected = true;
            isBound = true;
        }
    }
    ......
}
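
The "connect only once" guard is easy to isolate. The following sketch keeps just the state check from the method above; ConnectOnceSketch and its helper are hypothetical names, and the actual connection step is replaced by a comment.

```java
import java.io.IOException;

// Sketch of the state guard in LocalSocket.connect: a second call fails.
class ConnectOnceSketch {
    private boolean isConnected;

    public synchronized void connect(String endpoint) throws IOException {
        if (isConnected) {
            throw new IOException("already connected");
        }
        // The real method would create the socket and connect it here.
        isConnected = true;
    }

    // Returns true when the first connect succeeds and the second one is rejected.
    static boolean secondConnectFails() {
        ConnectOnceSketch socket = new ConnectOnceSketch();
        try {
            socket.connect("zygote");
        } catch (IOException e) {
            return false;
        }
        try {
            socket.connect("zygote");
            return false;
        } catch (IOException e) {
            return "already connected".equals(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println(secondConnectFails());
    }
}
```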

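LocalSocketImpl is the socket implementation used by android.net.LocalSocket and android.net.LocalServerSocket. It supports only AF_LOCAL sockets.

Following the call stack above, the timeout passed in is 0. The connect method of LocalSocketImpl first checks whether the file descriptor is null and throws an IOException if it is; it then calls the internal native function connectLocal.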

frameworks/base/core/java/android/net/LocalSocketImpl.java

class LocalSocketImpl
{
    ......
    private native void connectLocal(FileDescriptor fd, String name,
            int namespace) throws IOException;
    ......
    /** note timeout presently ignored */
    protected void connect(LocalSocketAddress address, int timeout)
                        throws IOException
    {        
        if (fd == null) {
            throw new IOException("socket not created");
        }

        connectLocal(fd, address.getName(), address.getNamespace().getId());
    }
    ......
}

Through JNI, the connectLocal method of LocalSocketImpl is ultimately implemented by the native function socket_connect_local. That function first obtains the file descriptor fd and then calls socket_local_client_connect.

frameworks/base/core/jni/android_net_LocalSocketImpl.cpp

static void
socket_connect_local(JNIEnv *env, jobject object,
                        jobject fileDescriptor, jstring name, jint namespaceId)
{
    int ret;
    int fd;

    fd = jniGetFDFromFileDescriptor(env, fileDescriptor);

    if (env->ExceptionCheck()) {
        return;
    }

    ScopedUtfChars nameUtf8(env, name);

    ret = socket_local_client_connect(
                fd,
                nameUtf8.c_str(),
                namespaceId,
                SOCK_STREAM);

    if (ret < 0) {
        jniThrowIOException(env, errno);
        return;
    }
}

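socket_local_client_connect is declared in the header cutils/sockets.h and implemented in socket_local_client.c. It first calls socket_make_sockaddr_un to build a sockaddr_un structure and then calls connect to establish the actual connection. connect is declared in the sys/socket.h header and is a standard Linux system call.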

system/core/libcutils/socket_local_client.c

/**
 * Connects fd to the peer named "name"; returns the same fd on success, or -1 on error.
 */
int socket_local_client_connect(int fd, const char *name, int namespaceId, 
        int type UNUSED)
{
    struct sockaddr_un addr;
    socklen_t alen;
    int err;

    err = socket_make_sockaddr_un(name, namespaceId, &addr, &alen);

    if (err < 0) {
        goto error;
    }

    if(connect(fd, (struct sockaddr *) &addr, alen) < 0) {
        goto error;
    }

    return fd;

error:
    return -1;
}

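What type of socket is this exactly? To answer that we have to start from the socket's creation. The implCreateIfNeeded function of LocalSocket, which creates the underlying socket on demand, holds the answer.

implCreateIfNeeded actually calls the create method of the LocalSocketImpl object. Note its argument: sockType ends up as SOCKET_STREAM, so a stream socket is created. In the UNIX domain, SOCK_STREAM means a connection-oriented, reliable byte stream between local processes; despite the stream semantics, this is not TCP, and no network protocol stack is involved.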

frameworks/base/core/java/android/net/LocalSocket.java

public class LocalSocket implements Closeable {
    private final LocalSocketImpl impl;
    ......
    private final int sockType;
    ......
    public static final int SOCKET_STREAM = 2;
    ......
    /**
     * Creates an AF_LOCAL/UNIX domain stream socket.
     */
    public LocalSocket() {
        this(SOCKET_STREAM);
    }

    /**
     * Creates an AF_LOCAL/UNIX domain stream socket with the given socket type.
     */
    public LocalSocket(int sockType) {
        this(new LocalSocketImpl(), sockType);
        isBound = false;
        isConnected = false;
    }
    ......
    /*package*/ LocalSocket(LocalSocketImpl impl, int sockType) {
        this.impl = impl;
        this.sockType = sockType;
        this.isConnected = false;
        this.isBound = false;
    }
    ......
    private void implCreateIfNeeded() throws IOException {
        if (!implCreated) {
            synchronized (this) {
                if (!implCreated) {
                    try {
                        impl.create(sockType);
                    } finally {
                        implCreated = true;
                    }
                }
            }
        }
    }
    ......
}
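
implCreateIfNeeded follows the double-checked locking pattern: check a flag, take the lock, check again, then do the one-time work. A minimal sketch of that pattern, assuming a hypothetical class LazyCreateSketch with a counter in place of the real socket creation, could look like this. The sketch declares the flag volatile, which double-checked locking needs in Java; the excerpt above does not show how implCreated is declared.

```java
// Sketch of the create-once pattern used by implCreateIfNeeded.
class LazyCreateSketch {
    private volatile boolean created; // volatile so other threads see the update
    private int createCalls;          // counts how often create() really ran

    // Stand-in for impl.create(sockType).
    private void create() {
        createCalls++;
    }

    void createIfNeeded() {
        if (!created) {                 // first check, without the lock
            synchronized (this) {
                if (!created) {         // second check, with the lock held
                    try {
                        create();
                    } finally {
                        created = true; // set even if create() throws, as in the original
                    }
                }
            }
        }
    }

    int getCreateCalls() {
        return createCalls;
    }

    public static void main(String[] args) {
        LazyCreateSketch sketch = new LazyCreateSketch();
        sketch.createIfNeeded();
        sketch.createIfNeeded();
        System.out.println(sketch.getCreateCalls());
    }
}
```

Note the finally block: as in the original, the flag is set even when creation fails, so a failed create is not retried on the next call.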

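Next is the create method of LocalSocketImpl. create makes a socket in the underlying operating system, ultimately by calling the static method socket of the Os class. The point of interest here is AF_UNIX.

The UNIX (AF_UNIX) domain allows communication between applications on the same host. (POSIX.1g uses the name AF_LOCAL as a synonym for AF_UNIX.)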

frameworks/base/core/java/android/net/LocalSocketImpl.java

class LocalSocketImpl
{
    ......
    public void create (int sockType) throws IOException {
        if (fd == null) {
            int osType;
            switch (sockType) {
                case LocalSocket.SOCKET_DGRAM:
                    osType = OsConstants.SOCK_DGRAM;
                    break;
                case LocalSocket.SOCKET_STREAM:
                    osType = OsConstants.SOCK_STREAM;
                    break;
                case LocalSocket.SOCKET_SEQPACKET:
                    osType = OsConstants.SOCK_SEQPACKET;
                    break;
                default:
                    throw new IllegalStateException("unknown sockType");
            }
            try {
                fd = Os.socket(OsConstants.AF_UNIX, osType, 0);
                mFdCreatedInternally = true;
            } catch (ErrnoException e) {
                e.rethrowAsIOException();
            }
        }
    }

    ......
}
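
The switch in create exists because the Java-level LocalSocket constants are not the same numbers as the kernel-level socket types. The sketch below illustrates this with plain integers. Only SOCKET_STREAM = 2 appears in the listings above; the other two LocalSocket values and the SOCK_* values (typical for Linux on ARM and x86) are assumptions for illustration, since on Android the latter come from OsConstants at runtime.

```java
// Sketch of the sockType-to-OS-type mapping in LocalSocketImpl.create.
class SockTypeSketch {
    // Java-level constants; SOCKET_STREAM = 2 is shown above, the others are assumed.
    static final int SOCKET_DGRAM = 1;
    static final int SOCKET_STREAM = 2;
    static final int SOCKET_SEQPACKET = 3;

    // Typical Linux values, hard-coded here only for the example.
    static final int SOCK_STREAM = 1;
    static final int SOCK_DGRAM = 2;
    static final int SOCK_SEQPACKET = 5;

    static int toOsType(int sockType) {
        switch (sockType) {
            case SOCKET_DGRAM:
                return SOCK_DGRAM;
            case SOCKET_STREAM:
                return SOCK_STREAM;
            case SOCKET_SEQPACKET:
                return SOCK_SEQPACKET;
            default:
                throw new IllegalStateException("unknown sockType");
        }
    }

    public static void main(String[] args) {
        System.out.println(toOsType(SOCKET_STREAM));
    }
}
```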

The static socket method of the Os class receives three arguments: OsConstants.AF_UNIX, OsConstants.SOCK_STREAM, and 0. It is only a relay; it calls the socket method of the static member os of the Libcore class.

libcore/luni/src/main/java/android/system/Os.java

public final class Os {
  ......
  /**
   * See <a href="http://man7.org/linux/man-pages/man2/socket.2.html">socket(2)</a>.
   */
  public static FileDescriptor socket(int domain, int type, int protocol) throws ErrnoException { return Libcore.os.socket(domain, type, protocol); }
  ......
}

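The Libcore class is very simple: its constructor is private, and it defines a static member os, which is actually a BlockGuardOs object. That object implements the Os interface (libcore.io.Os, not to be confused with the android.system.Os class above).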

libcore/luni/src/main/java/libcore/io/Libcore.java

package libcore.io;

public final class Libcore {
    private Libcore() { }

    public static Os os = new BlockGuardOs(new Posix());
}

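Now for how the BlockGuardOs object is constructed. Its constructor calls the superclass constructor through super(os). BlockGuardOs does not implement the Os interface directly, which suggests that its superclass ForwardingOs does.

BlockGuardOs overrides the socket method. The override calls socket on the Posix object that was passed to the BlockGuardOs constructor when Libcore initialized its static member os.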

libcore/luni/src/main/java/libcore/io/BlockGuardOs.java

public class BlockGuardOs extends ForwardingOs {
    public BlockGuardOs(Os os) {
        super(os);
    }
    ......
    @Override public FileDescriptor socket(int domain, int type, int protocol) throws ErrnoException {
        return tagSocket(os.socket(domain, type, protocol));
    }
    ......
}

The ForwardingOs source confirms the guess: it implements the Os interface.

libcore/luni/src/main/java/libcore/io/ForwardingOs.java

public class ForwardingOs implements Os {
    protected final Os os;

    public ForwardingOs(Os os) {
        this.os = os;
    }
    ......
    public FileDescriptor socket(int domain, int type, int protocol) throws ErrnoException { return os.socket(domain, type, protocol); }
    ......
}

The Os interface declares a series of functions that mirror POSIX calls, among them the socket function.

libcore/luni/src/main/java/libcore/io/Os.java

public interface Os {
    ......
    public FileDescriptor socket(int domain, int type, int protocol) throws ErrnoException;
    ......
}

Now the socket method of the Posix class. From the analysis above we know that the socket call is ultimately delegated to Posix.socket, which is a native (JNI) method.

libcore/luni/src/main/java/libcore/io/Posix.java

public final class Posix implements Os {
    Posix() { }
    ......
    public native FileDescriptor socket(int domain, int type, int protocol) throws ErrnoException;
    ......
}
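
The chain Libcore.os -> BlockGuardOs -> ForwardingOs -> Posix is a plain decorator (forwarding) arrangement. Here is a compact sketch of the same structure with made-up names: MiniOs, FakePosix, Forwarding, and Guard are all hypothetical, the fake implementation returns a fixed number instead of a real file descriptor, and a counter stands in for tagSocket.

```java
// Sketch of the forwarding chain behind Libcore.os.socket(...).
class ForwardingSketch {

    // Stand-in for the libcore.io.Os interface, reduced to one method.
    interface MiniOs {
        int socket(int domain, int type, int protocol);
    }

    // Stand-in for Posix: the real class calls into native code.
    static class FakePosix implements MiniOs {
        public int socket(int domain, int type, int protocol) {
            return 42; // hypothetical file descriptor
        }
    }

    // Stand-in for ForwardingOs: implements the interface by delegating.
    static class Forwarding implements MiniOs {
        protected final MiniOs os;

        Forwarding(MiniOs os) {
            this.os = os;
        }

        public int socket(int domain, int type, int protocol) {
            return os.socket(domain, type, protocol);
        }
    }

    // Stand-in for BlockGuardOs: overrides socket to add work around the call.
    static class Guard extends Forwarding {
        int tagged; // counts sockets that passed through, in place of tagSocket

        Guard(MiniOs os) {
            super(os);
        }

        @Override
        public int socket(int domain, int type, int protocol) {
            int fd = os.socket(domain, type, protocol);
            tagged++;
            return fd;
        }
    }

    public static void main(String[] args) {
        Guard os = new Guard(new FakePosix());
        System.out.println(os.socket(1, 1, 0));
    }
}
```

The benefit of this layout is that a subclass only overrides the calls it wants to intercept; every other method falls through to the wrapped object unchanged.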

Next is the native implementation behind the socket method of Posix. throwIfMinusOne is a helper: when socket(domain, type, protocol) returns -1, it throws a Java exception from native code.

libcore/luni/src/main/native/libcore_io_Posix.cpp

static jobject Posix_socket(JNIEnv* env, jobject, jint domain, jint type, jint protocol) {
    if (domain == AF_PACKET) {
        protocol = htons(protocol);  // Packet sockets specify the protocol in host byte order.
    }
    int fd = throwIfMinusOne(env, "socket", TEMP_FAILURE_RETRY(socket(domain, type, protocol)));
    return fd != -1 ? jniCreateFileDescriptor(env, fd) : NULL;
}

socket(domain, type, protocol) is declared in the header <sys/socket.h>.

int socket (int domain, int type, int protocol);

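domain: the nature of the communication. Each domain has its own address format; the constants start with AF, which stands for address family.

type: the socket type, which further determines the communication characteristics.

protocol: usually 0, which selects the default protocol for the given domain and socket type. When several protocols are supported for the same domain and socket type, this field can select a specific one.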

Within the startViaZygote method of Process, the flow above covered openZygoteSocketIfNeeded. The next part continues with zygoteSendArgsAndGetResult, whose name already hints at what it does.