Linux kernel filp_open解析

/**
 * filp_open - open file and return file pointer
 *
 * @filename:    path to open
 * @flags:    open flags as per the open(2) second argument
 * @mode:    mode for the new file if O_CREAT is set, else ignored
 *
 * This is the helper to open a file from kernelspace if you really
 * have to.  But in generally you should not do this, so please move
 * along, nothing to see here..
 */
struct file *filp_open(const char *filename, int flags, umode_t mode)

在Linux内核中,可以使用flip_open来打开文件,跟应用层打开文件的方式基本一样,简要说一下两个参数flags和mode

和应用层的API 

int open(const char *pathname, int flags, mode_t mode);

类似,

访问http://man7.org/linux/man-pages/man2/open.2.html

获取详细信息。

O_APPEND
              The file is opened in append mode.  Before each write(2), the
              file offset is positioned at the end of the file, as if with
              lseek(2).  The modification of the file offset and the write
              operation are performed as a single atomic step.

              O_APPEND may lead to corrupted files on NFS filesystems if
              more than one process appends data to a file at once.  This is
              because NFS does not support appending to a file, so the
              client kernel has to simulate it, which can't be done without
              a race condition.

       O_ASYNC
              Enable signal-driven I/O: generate a signal (SIGIO by default,
              but this can be changed via fcntl(2)) when input or output
              becomes possible on this file descriptor.  This feature is
              available only for terminals, pseudoterminals, sockets, and
              (since Linux 2.6) pipes and FIFOs.  See fcntl(2) for further
              details.  See also BUGS, below.

       O_CLOEXEC (since Linux 2.6.23)
              Enable the close-on-exec flag for the new file descriptor.
              Specifying this flag permits a program to avoid additional
              fcntl(2) F_SETFD operations to set the FD_CLOEXEC flag.

              Note that the use of this flag is essential in some
              multithreaded programs, because using a separate fcntl(2)
              F_SETFD operation to set the FD_CLOEXEC flag does not suffice
              to avoid race conditions where one thread opens a file
              descriptor and attempts to set its close-on-exec flag using
              fcntl(2) at the same time as another thread does a fork(2)
              plus execve(2).  Depending on the order of execution, the race
              may lead to the file descriptor returned by open() being
              unintentionally leaked to the program executed by the child
              process created by fork(2).  (This kind of race is in
              principle possible for any system call that creates a file
              descriptor whose close-on-exec flag should be set, and various
              other Linux system calls provide an equivalent of the
              O_CLOEXEC flag to deal with this problem.)

       O_CREAT
              If pathname does not exist, create it as a regular file.

              The owner (user ID) of the new file is set to the effective
              user ID of the process.

              The group ownership (group ID) of the new file is set either
              to the effective group ID of the process (System V semantics)
              or to the group ID of the parent directory (BSD semantics).
              On Linux, the behavior depends on whether the set-group-ID
              mode bit is set on the parent directory: if that bit is set,
              then BSD semantics apply; otherwise, System V semantics apply.
              For some filesystems, the behavior also depends on the
              bsdgroups and sysvgroups mount options described in mount(8)).

              The mode argument specifies the file mode bits be applied when
              a new file is created.  This argument must be supplied when
              O_CREAT or O_TMPFILE is specified in flags; if neither O_CREAT
              nor O_TMPFILE is specified, then mode is ignored.  The
              effective mode is modified by the process's umask in the usual
              way: in the absence of a default ACL, the mode of the created
              file is (mode & ~umask).  Note that this mode applies only to
              future accesses of the newly created file; the open() call
              that creates a read-only file may well return a read/write
              file descriptor.

              The following symbolic constants are provided for mode:

              S_IRWXU  00700 user (file owner) has read, write, and execute
                       permission

              S_IRUSR  00400 user has read permission

              S_IWUSR  00200 user has write permission

              S_IXUSR  00100 user has execute permission

              S_IRWXG  00070 group has read, write, and execute permission

              S_IRGRP  00040 group has read permission

              S_IWGRP  00020 group has write permission

              S_IXGRP  00010 group has execute permission

              S_IRWXO  00007 others have read, write, and execute permission

              S_IROTH  00004 others have read permission

              S_IWOTH  00002 others have write permission

              S_IXOTH  00001 others have execute permission

              According to POSIX, the effect when other bits are set in mode
              is unspecified.  On Linux, the following bits are also honored
              in mode:

              S_ISUID  0004000 set-user-ID bit

              S_ISGID  0002000 set-group-ID bit (see inode(7)).

              S_ISVTX  0001000 sticky bit (see inode(7)).

       O_DIRECT (since Linux 2.4.10)
              Try to minimize cache effects of the I/O to and from this
              file.  In general this will degrade performance, but it is
              useful in special situations, such as when applications do
              their own caching.  File I/O is done directly to/from user-
              space buffers.  The O_DIRECT flag on its own makes an effort
              to transfer data synchronously, but does not give the
              guarantees of the O_SYNC flag that data and necessary metadata
              are transferred.  To guarantee synchronous I/O, O_SYNC must be
              used in addition to O_DIRECT.  See NOTES below for further
              discussion.

              A semantically similar (but deprecated) interface for block
              devices is described in raw(8).

       O_DIRECTORY
              If pathname is not a directory, cause the open to fail.  This
              flag was added in kernel version 2.1.126, to avoid denial-of-
              service problems if opendir(3) is called on a FIFO or tape
              device.

       O_DSYNC
              Write operations on the file will complete according to the
              requirements of synchronized I/O data integrity completion.

              By the time write(2) (and similar) return, the output data has
              been transferred to the underlying hardware, along with any
              file metadata that would be required to retrieve that data
              (i.e., as though each write(2) was followed by a call to
              fdatasync(2)).  See NOTES below.

       O_EXCL Ensure that this call creates the file: if this flag is
              specified in conjunction with O_CREAT, and pathname already
              exists, then open() fails with the error EEXIST.

              When these two flags are specified, symbolic links are not
              followed: if pathname is a symbolic link, then open() fails
              regardless of where the symbolic link points.

              In general, the behavior of O_EXCL is undefined if it is used
              without O_CREAT.  There is one exception: on Linux 2.6 and
              later, O_EXCL can be used without O_CREAT if pathname refers
              to a block device.  If the block device is in use by the
              system (e.g., mounted), open() fails with the error EBUSY.

              On NFS, O_EXCL is supported only when using NFSv3 or later on
              kernel 2.6 or later.  In NFS environments where O_EXCL support
              is not provided, programs that rely on it for performing
              locking tasks will contain a race condition.  Portable
              programs that want to perform atomic file locking using a
              lockfile, and need to avoid reliance on NFS support for
              O_EXCL, can create a unique file on the same filesystem (e.g.,
              incorporating hostname and PID), and use link(2) to make a
              link to the lockfile.  If link(2) returns 0, the lock is
              successful.  Otherwise, use stat(2) on the unique file to
              check if its link count has increased to 2, in which case the
              lock is also successful.

       O_LARGEFILE
              (LFS) Allow files whose sizes cannot be represented in an
              off_t (but can be represented in an off64_t) to be opened.
              The _LARGEFILE64_SOURCE macro must be defined (before
              including any header files) in order to obtain this
              definition.  Setting the _FILE_OFFSET_BITS feature test macro
              to 64 (rather than using O_LARGEFILE) is the preferred method
              of accessing large files on 32-bit systems (see
              feature_test_macros(7)).

       O_NOATIME (since Linux 2.6.8)
              Do not update the file last access time (st_atime in the
              inode) when the file is read(2).

              This flag can be employed only if one of the following
              conditions is true:

              *  The effective UID of the process matches the owner UID of
                 the file.

              *  The calling process has the CAP_FOWNER capability in its
                 user namespace and the owner UID of the file has a mapping
                 in the namespace.

              This flag is intended for use by indexing or backup programs,
              where its use can significantly reduce the amount of disk
              activity.  This flag may not be effective on all filesystems.
              One example is NFS, where the server maintains the access
              time.

       O_NOCTTY
              If pathname refers to a terminal device—see tty(4)—it will not
              become the process's controlling terminal even if the process
              does not have one.

       O_NOFOLLOW
              If pathname is a symbolic link, then the open fails, with the
              error ELOOP.  Symbolic links in earlier components of the
              pathname will still be followed.  (Note that the ELOOP error
              that can occur in this case is indistinguishable from the case
              where an open fails because there are too many symbolic links
              found while resolving components in the prefix part of the
              pathname.)

              This flag is a FreeBSD extension, which was added to Linux in
              version 2.1.126, and has subsequently been standardized in
              POSIX.1-2008.

              See also O_PATH below.

       O_NONBLOCK or O_NDELAY
              When possible, the file is opened in nonblocking mode.
              Neither the open() nor any subsequent operations on the file
              descriptor which is returned will cause the calling process to
              wait.

              Note that this flag has no effect for regular files and block
              devices; that is, I/O operations will (briefly) block when
              device activity is required, regardless of whether O_NONBLOCK
              is set.  Since O_NONBLOCK semantics might eventually be
              implemented, applications should not depend upon blocking
              behavior when specifying this flag for regular files and block
              devices.

              For the handling of FIFOs (named pipes), see also fifo(7).
              For a discussion of the effect of O_NONBLOCK in conjunction
              with mandatory file locks and with file leases, see fcntl(2).

       O_PATH (since Linux 2.6.39)
              Obtain a file descriptor that can be used for two purposes: to
              indicate a location in the filesystem tree and to perform
              operations that act purely at the file descriptor level.  The
              file itself is not opened, and other file operations (e.g.,
              read(2), write(2), fchmod(2), fchown(2), fgetxattr(2),
              ioctl(2), mmap(2)) fail with the error EBADF.

              The following operations can be performed on the resulting
              file descriptor:

              *  close(2).

              *  fchdir(2), if the file descriptor refers to a directory
                 (since Linux 3.5).

              *  fstat(2) (since Linux 3.6).

              *  fstatfs(2) (since Linux 3.12).

              *  Duplicating the file descriptor (dup(2), fcntl(2) F_DUPFD,
                 etc.).

              *  Getting and setting file descriptor flags (fcntl(2) F_GETFD
                 and F_SETFD).

              *  Retrieving open file status flags using the fcntl(2)
                 F_GETFL operation: the returned flags will include the bit
                 O_PATH.

              *  Passing the file descriptor as the dirfd argument of
                 openat() and the other "*at()" system calls.  This includes
                 linkat(2) with AT_EMPTY_PATH (or via procfs using
                 AT_SYMLINK_FOLLOW) even if the file is not a directory.

              *  Passing the file descriptor to another process via a UNIX
                 domain socket (see SCM_RIGHTS in unix(7)).

              When O_PATH is specified in flags, flag bits other than
              O_CLOEXEC, O_DIRECTORY, and O_NOFOLLOW are ignored.

              Opening a file or directory with the O_PATH flag requires no
              permissions on the object itself (but does require execute
              permission on the directories in the path prefix).  Depending
              on the subsequent operation, a check for suitable file
              permissions may be performed (e.g., fchdir(2) requires execute
              permission on the directory referred to by its file descriptor
              argument).  By contrast, obtaining a reference to a filesystem
              object by opening it with the O_RDONLY flag requires that the
              caller have read permission on the object, even when the
              subsequent operation (e.g., fchdir(2), fstat(2)) does not
              require read permission on the object.

              If pathname is a symbolic link and the O_NOFOLLOW flag is also
              specified, then the call returns a file descriptor referring
              to the symbolic link.  This file descriptor can be used as the
              dirfd argument in calls to fchownat(2), fstatat(2), linkat(2),
              and readlinkat(2) with an empty pathname to have the calls
              operate on the symbolic link.

              If pathname refers to an automount point that has not yet been
              triggered, so no other filesystem is mounted on it, then the
              call returns a file descriptor referring to the automount
              directory without triggering a mount.  fstatfs(2) can then be
              used to determine if it is, in fact, an untriggered automount
              point (.f_type == AUTOFS_SUPER_MAGIC).

              One use of O_PATH for regular files is to provide the
              equivalent of POSIX.1's O_EXEC functionality.  This permits us
              to open a file for which we have execute permission but not
              read permission, and then execute that file, with steps
              something like the following:

                  char buf[PATH_MAX];
                  fd = open("some_prog", O_PATH);
                  snprintf(buf, PATH_MAX, "/proc/self/fd/%d", fd);
                  execl(buf, "some_prog", (char *) NULL);

              An O_PATH file descriptor can also be passed as the argument
              of fexecve(3).

       O_SYNC Write operations on the file will complete according to the
              requirements of synchronized I/O file integrity completion (by
              contrast with the synchronized I/O data integrity completion
              provided by O_DSYNC.)

              By the time write(2) (or similar) returns, the output data and
              associated file metadata have been transferred to the underly‐
              ing hardware (i.e., as though each write(2) was followed by a
              call to fsync(2)).  See NOTES below.

       O_TMPFILE (since Linux 3.11)
              Create an unnamed temporary regular file.  The pathname argu‐
              ment specifies a directory; an unnamed inode will be created
              in that directory's filesystem.  Anything written to the
              resulting file will be lost when the last file descriptor is
              closed, unless the file is given a name.

              O_TMPFILE must be specified with one of O_RDWR or O_WRONLY
              and, optionally, O_EXCL.  If O_EXCL is not specified, then
              linkat(2) can be used to link the temporary file into the
              filesystem, making it permanent, using code like the follow‐
              ing:

                  char path[PATH_MAX];
                  fd = open("/path/to/dir", O_TMPFILE | O_RDWR,
                                          S_IRUSR | S_IWUSR);

                  /* File I/O on 'fd'... */

                  snprintf(path, PATH_MAX,  "/proc/self/fd/%d", fd);
                  linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file",
                                          AT_SYMLINK_FOLLOW);

              In this case, the open() mode argument determines the file
              permission mode, as with O_CREAT.

              Specifying O_EXCL in conjunction with O_TMPFILE prevents a
              temporary file from being linked into the filesystem in the
              above manner.  (Note that the meaning of O_EXCL in this case
              is different from the meaning of O_EXCL otherwise.)

              There are two main use cases for O_TMPFILE:

              *  Improved tmpfile(3) functionality: race-free creation of
                 temporary files that (1) are automatically deleted when
                 closed; (2) can never be reached via any pathname; (3) are
                 not subject to symlink attacks; and (4) do not require the
                 caller to devise unique names.

              *  Creating a file that is initially invisible, which is then
                 populated with data and adjusted to have appropriate
                 filesystem attributes (fchown(2), fchmod(2), fsetxattr(2),
                 etc.)  before being atomically linked into the filesystem
                 in a fully formed state (using linkat(2) as described
                 above).

              O_TMPFILE requires support by the underlying filesystem; only
              a subset of Linux filesystems provide that support.  In the
              initial implementation, support was provided in the ext2,
              ext3, ext4, UDF, Minix, and shmem filesystems.  Support for
              other filesystems has subsequently been added as follows: XFS
              (Linux 3.15); Btrfs (Linux 3.16); F2FS (Linux 3.16); and ubifs
              (Linux 4.9)

       O_TRUNC
              If the file already exists and is a regular file and the
              access mode allows writing (i.e., is O_RDWR or O_WRONLY) it
              will be truncated to length 0.  If the file is a FIFO or ter‐
              minal device file, the O_TRUNC flag is ignored.  Otherwise,
              the effect of O_TRUNC is unspecified.
  • 0
    点赞
  • 4
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
内容: 一. Bootloader 二.Kernel引导入口 三.核心数据结构初始化--内核引导第一部分 四.外设初始化--内核引导第二部分 五.init进程和inittab引导指令 六.rc启动脚本 七.getty和login 八.bash 附:XDM方式登录 本文以Redhat 6.0 Linux 2.2.19 for Alpha/AXP为平台,描述了从开机到登录的 Linux 启动全过程。该文对i386平台同样适用。 一. Bootloader 在Alpha/AXP 平台上引导Linux通常有两种方法,一种是由MILO及其他类似的引导程序引导,另一种是由Firmware直接引导。MILO功能与i386平台的LILO相近,但内置有基本的磁盘驱动程序(如IDE、SCSI等),以及常见的文件系统驱动程序(如ext2,iso9660等), firmware有ARC、SRM两种形式,ARC具有类BIOS界面,甚至还有多重引导的设置;而SRM则具有功能强大的命令行界面,用户可以在控制台上使用boot等命令引导系统。ARC有分区(Partition)的概念,因此可以访问到分区的首扇区;而SRM只能将控制转给磁盘的首扇区。两种firmware都可以通过引导MILO来引导Linux,也可以直接引导Linux的引导代码。 “arch/alpha/boot” 下就是制作Linux Bootloader的文件。“head.S”文件提供了对 OSF PAL/1的调用入口,它将被编译后置于引导扇区(ARC的分区首扇区或SRM的磁盘0扇区),得到控制后初始化一些数据结构,再将控制转给“main.c”中的start_kernel(), start_kernel()向控制台输出一些提示,调用pal_init()初始化PAL代码,调用openboot() 打开引导设备(通过读取Firmware环境),调用load()将核心代码加载到START_ADDR(见 “include/asm-alpha/system.h”),再将Firmware中的核心引导参数加载到ZERO_PAGE(0) 中,最后调用runkernel()将控制转给0x100000的kernel,bootloader部分结束。 “arch/alpha/boot/bootp.c”以“main.c”为基础,可代替“main.c”与“head.S” 生成用于BOOTP协议网络引导的Bootloader。 Bootloader中使用的所有“srm_”函数在“arch/alpha/lib/”中定义。 以上这种Boot方式是一种最简单的方式,即不需其他工具就能引导Kernel,前提是按照 Makefile的指导,生成bootimage文件,内含以上提到的bootloader以及vmlinux,然后将 bootimage写入自磁盘引导扇区始的位置中。 当采用MILO这样的引导程序来引导Linux时,不需要上面所说的Bootloader,而只需要 vmlinux或vmlinux.gz,引导程序会主动解压加载内核到0x1000(小内核)或0x100000(大内核),并直接进入内核引导部分,即本文的第二节。 对于I386平台 i386系统中一般都有BIOS做最初的引导工作,那就是将四个主分区表中的第一个可引导 分区的第一个扇区加载到实模式地址0x7c00上,然后将控制转交给它。 在“arch/i386/boot” 目录下,bootsect.S是生成引导扇区的汇编源码,它首先将自己拷贝到0x90000上,然后将紧接其后的setup部分(第二扇区)拷贝到0x90200,将真正的内核代码拷贝到0x100000。以上这些拷贝动作都是以bootsect.S、setup.S以及vmlinux在磁盘上连续存放为前提的,也就是说,我们的bzImage文件或者zImage文件是按照bootsect,setup, vmlinux这样的顺序组织,并存放于始于引导分区的首扇区的连续磁盘扇区之中。 bootsect.S完成加载动作后,就直接跳转到0x90200,这里正是setup.S的程序入口。 setup.S的主要功能就是将系统参数(包括内存、磁盘等,由BIOS返回)拷贝到 0x90000-0x901FF内存中,这个地方正是bootsect.S存放的地方,这时它将被系统参数覆盖。以后这些参数将由保护模式下的代码来读取。 除此之外,setup.S还将video.S中的代码包含进来,检测和设置显示器和显示模式。最 后,setup.S将系统转换到保护模式,并跳转到0x100000(对于bzImage格式的大内核是 0x100000,对于zImage格式的是0x1000)的内核引导代码,Bootloader过程结束。 对于2.4.x版内核 没有什么变化。 二.Kernel引导入口 在arch/alpha/vmlinux.lds 的链接脚本控制下,链接程序将vmlinux的入口置于 "arch/alpha/kernel/head.S"中的__start上,因此当Bootloader跳转到0x100000时, __start处的代码开始执行。__start的代码很简单,只需要设置一下全局变量,然后就跳转到start_kernel去了。start_kernel()是"init/main.c"中的asmlinkage函数,至此,启动过程转入体系结构无关的通用C代码中。 对于I386平台 在i386体系结构中,因为i386本身的问题,在 "arch/alpha/kernel/head.S"中需要更多的设置,但最终也是通过call SYMBOL_NAME(start_kernel)转到start_kernel()这个体系结构无关的函数中去执行了。 所不同的是,在i386系统中,当内核以bzImage的形式压缩,即大内核方式(__BIG_KERNEL__)压缩时就需要预先处理bootsect.S和setup.S,按照大核模式使用$(CPP) 处理生成bbootsect.S和bsetup.S,然后再编译生成相应的.o文件,并使用 "arch/i386/boot/compressed/build.c"生成的build工具,将实际的内核(未压缩的,含 kernel中的head.S代码)与"arch/i386/boot/compressed"下的head.S和misc.c合成到一起,其中的 head.S代替了"arch/i386/kernel/head.S"的位置,由Bootloader引导执行(startup_32入口),然后它调用misc.c中定义的decompress_kernel()函数,使用 "lib/inflate.c"中定义的gunzip()将内核解压到0x100000,再转到其上执行 "arch/i386/kernel/head.S"中的startup_32代码。 对于2.4.x版内核 没有变化。 三.核心数据结构初始化--内核引导第一部分 start_kernel()中调用了一系列初始化函数,以完成kernel本身的设置。 这些动作有的是公共的,有的则是需要配置的才会执行的。 在start_kernel()函数中, 输出Linux版本信息(printk(linux_banner)) 设置与体系结构相关的环境(setup_arch()) 页表结构初始化(paging_init()) 使用"arch/alpha/kernel/entry.S"中的入口点设置系统自陷入口(trap_init()) 使用alpha_mv结构和entry.S入口初始化系统IRQ(init_IRQ()) 核心进程调度器初始化(包括初始化几个缺省的Bottom-half,sched_init()) 时间、定时器初始化(包括读取CMOS时钟、估测主频、初始化定时器中断等,time_init()) 提取并分析核心启动参数(从环境变量中读取参数,设置相应标志位等待处理,(parse_options()) 控制台初始化(为输出信息而先于PCI初始化,console_init()) 剖析器数据结构初始化(prof_buffer和prof_len变量) 核心Cache初始化(描述Cache信息的Cache,kmem_cache_init()) 延迟校准(获得时钟jiffies与CPU主频ticks的延迟,calibrate_delay()) 内存初始化(设置内存上下界和页表项初始值,mem_init()) 创建和设置内部及通用cache("slab_cache",kmem_cache_sizes_init()) 创建uid taskcount SLAB cache("uid_cache",uidcache_init()) 创建文件cache("files_cache",filescache_init()) 创建目录cache("dentry_cache",dcache_init()) 创建与虚存相关的cache("vm_area_struct","mm_struct",vma_init()) 块设备读写缓冲区初始化(同时创建"buffer_head"cache用户加速访问,buffer_init()) 创建页cache(内存页hash表初始化,page_cache_init()) 创建信号队列cache("signal_queue",signals_init()) 初始化内存inode表(inode_init()) 创建内存文件描述符表("filp_cache",file_table_init()) 检查体系结构漏洞(对于alpha,此函数为空,check_bugs()) SMP机器其余CPU(除当前引导CPU)初始化(对于没有配置SMP的内核,此函数为空,smp_init()) 启动init过程(创建第一个核心线程,调用init()函数,原执行序列调用cpu_idle() 等待调度,init()) 至此start_kernel()结束,基本的核心环境已经建立起来了。 对于I386平台 i386平台上的内核启动过程与此基本相同,所不同的主要是实现方式。 对于2.4.x版内核 2.4.x中变化比较大,但基本过程没变,变动的是各个数据结构的具体实现,比如Cache。 四.外设初始化--内核引导第二部分 init()函数作为核心线程,首先锁定内核(仅对SMP机器有效),然后调用 do_basic_setup()完成外设及其驱动程序的加载和初始化。过程如下: 总线初始化(比如pci_init()) 网络初始化(初始化网络数据结构,包括sk_init()、skb_init()和proto_init()三部分,在proto_init()中,将调用protocols结构中包含的所有协议的初始化过程,sock_init()) 创建bdflush核心线程(bdflush()过程常驻核心空间,由核心唤醒来清理被写过的内存缓冲区,当bdflush()由kernel_thread()启动后,它将自己命名为kflushd) 创建kupdate核心线程(kupdate()过程常驻核心空间,由核心按时调度执行,将内存缓冲区中的信息更新到磁盘中,更新的内容包括超级块和inode表) 设置并启动核心调页线程kswapd(为了防止kswapd启动时将版本信息输出到其他信息中间,核心线调用kswapd_setup()设置kswapd运行所要求的环境,然后再创建 kswapd核心线程) 创建事件管理核心线程(start_context_thread()函数启动context_thread()过程,并重命名为keventd) 设备初始化(包括并口parport_init()、字符设备chr_dev_init()、块设备 blk_dev_init()、SCSI设备scsi_dev_init()、网络设备net_dev_init()、磁盘初始化及分区检查等等, device_setup()) 执行文件格式设置(binfmt_setup()) 启动任何使用__initcall标识的函数(方便核心开发者添加启动函数,do_initcalls()) 文件系统初始化(filesystem_setup()) 安装root文件系统(mount_root()) 至此do_basic_setup()函数返回init(),在释放启动内存段(free_initmem())并给内核解锁以后,init()打开 /dev/console设备,重定向stdin、stdout和stderr到控制台,最后,搜索文件系统中的init程序(或者由init=命令行参数指定的程序),并使用 execve()系统调用加载执行init程序。 init()函数到此结束,内核的引导部分也到此结束了,这个由start_kernel()创建的第一个线程已经成为一个用户模式下的进程了。此时系统中存在着六个运行实体: start_kernel()本身所在的执行体,这其实是一个"手工"创建的线程,它在创建了init()线程以后就进入cpu_idle()循环了,它不会在进程(线程)列表中出现 init线程,由start_kernel()创建,当前处于用户态,加载了init程序 kflushd核心线程,由init线程创建,在核心态运行bdflush()函数 kupdate核心线程,由init线程创建,在核心态运行kupdate()函数 kswapd核心线程,由init线程创建,在核心态运行kswapd()函数 keventd核心线程,由init线程创建,在核心态运行context_thread()函数 对于I386平台 基本相同。 对于2.4.x版内核 这一部分的启动过程在2.4.x内核中简化了不少,缺省的独立初始化过程只剩下网络 (sock_init())和创建事件管理核心线程,而其他所需要的初始化都使用__initcall()宏 包含在do_initcalls()函数中启动执行。 五.init进程和inittab引导指令 init进程是系统所有进程的起点,内核在完成核内引导以后,即在本线程(进程)空 间内加载init程序,它的进程号是1。 init程序需要读取/etc/inittab文件作为其行为指针,inittab是以行为单位的描述性(非执行性)文本,每一个指令行都具有以下格式: id:runlevel:action:process其中id为入口标识符,runlevel为运行级别,action为动作代号,process为具体的执行程序。 id一般要求4个字符以内,对于getty或其他login程序项,要求id与tty的编号相同,否则getty程序将不能正常工作。 runlevel 是init所处于的运行级别的标识,一般使用0-6以及S或s。0、1、6运行级别被系统保留,0作为shutdown动作,1作为重启至单用户模式,6 为重启;S和s意义相同,表示单用户模式,且无需inittab文件,因此也不在inittab中出现,实际上,进入单用户模式时,init直接在控制台(/dev/console)上运行/sbin/sulogin。 在一般的系统实现中,都使用了2、3、4、5几个级别,在 Redhat系统中,2表示无NFS支持的多用户模式,3表示完全多用户模式(也是最常用的级别),4保留给用户自定义,5表示XDM图形登录方式。7- 9级别也是可以使用的,传统的Unix系统没有定义这几个级别。runlevel可以是并列的多个值,以匹配多个运行级别,对大多数action来说,仅当runlevel与当前运行级别匹配成功才会执行。 initdefault是一个特殊的action值,用于标识缺省的启动级别;当init由核心激活 以后,它将读取inittab中的initdefault项,取得其中的runlevel,并作为当前的运行级别。如果没有inittab文件,或者其中没有initdefault项,init将在控制台上请求输入 runlevel。 sysinit、 boot、bootwait等action将在系统启动时无条件运行,而忽略其中的runlevel,其余的action(不含initdefault)都与某个runlevel相关。各个action的定义在inittab的man手册中有详细的描述。 在Redhat系统中,一般情况下inittab都会有如下几项: id:3:initdefault: #表示当前缺省运行级别为3--完全多任务模式; si::sysinit:/etc/rc.d/rc.sysinit #启动时自动执行/etc/rc.d/rc.sysinit脚本 l3:3:wait:/etc/rc.d/rc 3 #当运行级别为3时,以3为参数运行/etc/rc.d/rc脚本,init将等待其返回 0:12345:respawn:/sbin/mingetty tty0 #在1-5各个级别上以tty0为参数执行/sbin/mingetty程序,打开tty0终端用于 #用户登录,如果进程退出则再次运行mingetty程序 x:5:respawn:/usr/bin/X11/xdm -nodaemon #在5级别上运行xdm程序,提供xdm图形方式登录界面,并在退出时重新执行. 六.rc启动脚本 上一节已经提到init进程将启动运行rc脚本,这一节将介绍rc脚本具体的工作。 一般情况下,rc启动脚本都位于/etc/rc.d目录下,rc.sysinit中最常见的动作就是激活交换分区,检查磁盘,加载硬件模块,这些动作无论哪个运行级别都是需要优先执行的。仅当rc.sysinit执行完以后init才会执行其他的boot或bootwait动作。 如果没有其他boot、bootwait动作,在运行级别3下,/etc/rc.d/rc将会得到执行,命令行参数为3,即执行 /etc/rc.d/rc3.d/目录下的所有文件。rc3.d下的文件都是指向/etc/rc.d/init.d/目录下各个Shell脚本的符号连接,而这些脚本一般能接受start、stop、restart、status等参数。rc脚本以start参数启动所有以S开头的脚本,在此之前,如果相应的脚本也存在K打头的链接,而且已经处于运行态了(以/var/lock/subsys/下的文件作为标志),则将首先启动K开头的脚本,以stop 作为参数停止这些已经启动了的服务,然后再重新运行。显然,这样做的直接目的就是当init改变运行级别时,所有相关的服务都将重启,即使是同一个级别。 rc程序执行完毕后,系统环境已经设置好了,下面就该用户登录系统了。 七.getty和login 在rc返回后,init将得到控制,并启动mingetty(见第五节)。mingetty是getty的简化,不能处理串口操作。getty的功能一般包括: 打开终端线,并设置模式 输出登录界面及提示,接受用户名的输入 以该用户名作为login的参数,加载login程序 缺省的登录提示记录在/etc/issue文件中,但每次启动,一般都会由rc.local脚本根据系统环境重新生成。 注:用于远程登录的提示信息位于/etc/issue.net中。 login程序在getty的同一个进程空间中运行,接受getty传来的用户名参数作为登录的用户名。 如果用户名不是root,且存在/etc/nologin文件,login将输出nologin文件的内容,然后退出。这通常用来系统维护时防止非root用户登录。 只有/etc/securetty中登记了的终端才允许root用户登录,如果不存在这个文件,则root可以在任何终端上登录。/etc/usertty文件用于对用户作出附加访问限制,如果不存在这个文件,则没有其他限制。 当用户登录通过了这些检查后,login将搜索/etc/passwd文件(必要时搜索 /etc/shadow文件)用于匹配密码、设置主目录和加载shell。如果没有指定主目录,将默认为根目录;如果没有指定shell,将默认为 /bin/sh。在将控制转交给shell以前, getty将输出/var/log/lastlog中记录的上次登录系统的信息,然后检查用户是否有新邮件(/usr/spool/mail/ {username})。在设置好shell的uid、gid,以及TERM,PATH 等环境变量以后,进程加载shell,login的任务也就完成了。 八.bash 运行级别3下的用户login以后,将启动一个用户指定的shell,以下以/bin/bash为例继续我们的启动过程。 bash 是Bourne Shell的GNU扩展,除了继承了sh的所有特点以外,还增加了很多特性和功能。由login启动的bash是作为一个登录shell启动的,它继承了 getty设置的TERM、PATH等环境变量,其中PATH对于普通用户为"/bin:/usr/bin:/usr/local/bin",对于 root 为"/sbin:/bin:/usr/sbin:/usr/bin"。作为登录shell,它将首先寻找/etc/profile 脚本文件,并执行它;然后如果存在~/.bash_profile,则执行它,否则执行 ~/.bash_login,如果该文件也不存在,则执行~/.profile文件。然后bash将作为一个交互式shell执行~/.bashrc文件(如果存在的话),很多系统中,~/.bashrc都将启动 /etc/bashrc作为系统范围内的配置文件。 当显示出命令行提示符的时候,整个启动过程就结束了。此时的系统,运行着内核,运行着几个核心线程,运行着init进程,运行着一批由rc启动脚本激活的守护进程(如 inetd等),运行着一个bash作为用户的命令解释器。 附:XDM方式登录 如果缺省运行级别设为5,则系统中不光有1-6个getty监听着文本终端,还有启动了一个XDM的图形登录窗口。登录过程和文本方式差不多,也需要提供用户名和口令,XDM 的配置文件缺省为/usr/X11R6/lib/X11/xdm/xdm-config文件,其中指定了 /usr/X11R6/lib/X11/xdm/xsession作为XDM的会话描述脚本。登录成功后,XDM将执行这个脚本以运行一个会话管理器,比如gnome-session等。 除了XDM以外,不同的窗口管理系统(如KDE和GNOME)都提供了一个XDM的替代品,如gdm和kdm,这些程序的功能和XDM都差不多。

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

Simple-Soft

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值