3.1 Processes, Lightweight Processes, and Threads
A process is an instance of a program in execution.
From the kernel's point of view, the purpose of a process is to act as an entity to which system resources (CPU time, memory, etc.) are allocated.
Linux uses lightweight processes to offer better support for multithreaded applications.
A straightforward way to implement multithreaded applications is to associate a lightweight process with each thread.
3.2 Process Descriptor
3.2.1 Process State
As its name implies, the state field of the process descriptor describes what is currently happening to the process. It consists of an array of flags, each of which describes a possible process state. The following are the possible process states: TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE, TASK_STOPPED, TASK_TRACED.
Two additional states of the process can be stored both in the state field and in the exit_state field of the process descriptor; as the field name suggests, a process reaches one of these two states only when its execution is terminated: EXIT_ZOMBIE, EXIT_DEAD.
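The states listed above can be pictured with a small user-space sketch. The numeric values are an assumption taken from memory of the Linux 2.6.11 headers, and struct task_state_sketch and both helper functions are hypothetical names introduced only for illustration; this is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Values assumed from the Linux 2.6.11 headers; treat them as illustrative. */
#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define TASK_STOPPED          4
#define TASK_TRACED           8
#define EXIT_ZOMBIE          16
#define EXIT_DEAD            32

/* Hypothetical, cut-down stand-in for the two state fields of task_struct. */
struct task_state_sketch {
    long state;
    long exit_state;
};

/* A process is sleeping if it is in one of the two wait states. */
static int is_sleeping(const struct task_state_sketch *t)
{
    assert(t != NULL);
    return t->state == TASK_INTERRUPTIBLE ||
           t->state == TASK_UNINTERRUPTIBLE;
}

/* A process has terminated once exit_state holds one of the EXIT_ values. */
static int is_terminated(const struct task_state_sketch *t)
{
    assert(t != NULL);
    return t->exit_state == EXIT_ZOMBIE || t->exit_state == EXIT_DEAD;
}
```

In this sketch a descriptor initialized with state TASK_INTERRUPTIBLE reports as sleeping, and it reports as terminated only after exit_state is set to EXIT_ZOMBIE or EXIT_DEAD.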
3.2.2 Identifying a Process
The strict one-to-one correspondence between the process and process descriptor makes the 32-bit address of the task_struct structure a useful means for the kernel to identify processes. These addresses are referred to as process descriptor pointers.
On the other hand, Unix-like operating systems allow users to identify processes by means of a number called the Process ID (or PID), which is stored in the pid field of the process descriptor.
Linux associates a different PID with each process or lightweight process in the system.
The identifier shared by the threads is the PID of the thread group leader, that is, the PID of the first lightweight process in the group; it is stored in the tgid field of the process descriptors.
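A minimal sketch of the pid/tgid relationship follows. The struct and function names are hypothetical; the only kernel names used are the pid and tgid fields mentioned above. It assumes that a getpid()-style call reports the tgid value, so that all threads of a group appear to share one PID.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two identifier fields of task_struct. */
struct task_id_sketch {
    int pid;   /* unique per lightweight process */
    int tgid;  /* PID of the thread group leader */
};

/* Assumption: what a getpid()-style call would report for this task. */
static int visible_pid(const struct task_id_sketch *t)
{
    assert(t != NULL);
    return t->tgid;
}

/* The leader is the one task whose pid equals its tgid. */
static int is_group_leader(const struct task_id_sketch *t)
{
    assert(t != NULL);
    return t->pid == t->tgid;
}

/* Build a new thread that joins the leader's group under its own PID. */
static struct task_id_sketch make_thread(const struct task_id_sketch *leader,
                                         int new_pid)
{
    assert(leader != NULL);
    struct task_id_sketch t = { new_pid, leader->tgid };
    return t;
}
```

With a leader whose pid and tgid are both 1200 and a worker created with PID 1201, the two descriptors have different pid values but visible_pid() returns 1200 for both.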
Process descriptors handling
For each process, Linux packs two different data structures in a single per-process memory area: a small data structure linked to the process descriptor, namely the thread_info structure, and the Kernel Mode process stack.
The kernel uses the alloc_thread_info and free_thread_info macros to allocate and release the memory area storing a thread_info structure and a kernel stack.
Identifying the current process
The close association between the thread_info structure and the Kernel Mode stack just described offers a key benefit in terms of efficiency: the kernel can easily obtain the address of the thread_info structure of the process currently running on a CPU from the value of the esp register.
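The address computation can be imitated in user space. The sketch below assumes the per-process area is 8 KB and aligned to its own size, with thread_info at the lowest address and the stack growing down from the top; under that assumption, masking out the low bits of any stack address yields the thread_info address. All names ending in _sketch, and the helper functions, are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192u  /* assumed size and alignment of the area */

struct task_sketch { int pid; };

/* Hypothetical thread_info: only the back pointer to the descriptor. */
struct thread_info_sketch { struct task_sketch *task; };

/* thread_info and the kernel stack share one aligned memory area. */
union thread_union_sketch {
    struct thread_info_sketch thread_info;
    unsigned char stack[THREAD_SIZE];
};

/* Allocate an area aligned to THREAD_SIZE and link it to its task. */
static union thread_union_sketch *alloc_thread_area(struct task_sketch *task)
{
    union thread_union_sketch *u = aligned_alloc(THREAD_SIZE, sizeof *u);
    if (u == NULL)
        return NULL;
    assert(((uintptr_t)u & (THREAD_SIZE - 1)) == 0);
    u->thread_info.task = task;
    return u;
}

/* Mask out the low bits of a stack pointer to find thread_info. */
static struct thread_info_sketch *thread_info_from_sp(uintptr_t sp)
{
    return (struct thread_info_sketch *)(sp & ~(uintptr_t)(THREAD_SIZE - 1));
}

/* Follow the task pointer, which is what a "current" lookup amounts to. */
static struct task_sketch *current_from_sp(uintptr_t sp)
{
    return thread_info_from_sp(sp)->task;
}
```

Any address inside the stack array of an allocated area can play the role of esp here; the lookup needs no global variable and no memory access beyond the final pointer dereference.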
Doubly linked lists
The Linux kernel defines the list_head data structure, whose only fields next and prev represent the forward and back pointers of a generic doubly linked list element, respectively. It is important to note, however, that the pointers in a list_head field store the addresses of other list_head fields rather than the addresses of the whole data structures in which the list_head structure is included.
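Because the pointers refer to embedded list_head fields, code that walks a list has to step back from the field to the enclosing structure. The following user-space sketch shows one way to do that with offsetof; struct item and the _sketch helpers are hypothetical names that only mimic the idea, not the kernel's own macros.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* An empty list is a head that points to itself in both directions. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert n just before head, i.e. at the tail of the circular list. */
static void list_add_tail_sketch(struct list_head *n, struct list_head *head)
{
    assert(n != NULL && head != NULL);
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Recover the enclosing structure from a pointer to its list_head field. */
#define list_entry_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical payload structure with an embedded list_head. */
struct item {
    int value;
    struct list_head node;
};

/* Walk the list through the embedded nodes and add up the payloads. */
static int sum_items(struct list_head *head)
{
    int sum = 0;
    for (struct list_head *p = head->next; p != head; p = p->next)
        sum += list_entry_sketch(p, struct item, node)->value;
    return sum;
}
```

The same list_head type can therefore link structures of any kind, since the list code never needs to know the type of the enclosing structure.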
The process list
Each task_struct structure includes a tasks field of type list_head whose prev and next fields point, respectively, to the previous and to the next task_struct element.
The list of TASK_RUNNING processes
The trick used to achieve the scheduler speedup consists of splitting the runqueue into many lists of runnable processes, one list per process priority. Each task_struct descriptor includes a run_list field of type list_head.
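The per-priority split can be sketched as an array of circular lists. The code below assumes 140 priority levels with lower numbers meaning higher priority, and it finds the next task with a plain linear scan over the lists; the real scheduler is understood to use a faster lookup, which is not reproduced here. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NPRIO 140  /* assumed number of priority levels */

struct rq_node { struct rq_node *next, *prev; };

/* Hypothetical descriptor: only the fields this sketch needs. */
struct rq_task {
    int pid;
    int prio;                /* 0 is assumed to be the highest priority */
    struct rq_node run_list; /* links the task into its priority list */
};

struct rq_sketch {
    int nr_active;
    struct rq_node queue[NPRIO]; /* one list head per priority */
};

static void rq_init(struct rq_sketch *rq)
{
    rq->nr_active = 0;
    for (int i = 0; i < NPRIO; i++)
        rq->queue[i].next = rq->queue[i].prev = &rq->queue[i];
}

/* Append the task to the tail of the list for its priority. */
static void rq_enqueue(struct rq_sketch *rq, struct rq_task *t)
{
    assert(t->prio >= 0 && t->prio < NPRIO);
    struct rq_node *head = &rq->queue[t->prio];
    t->run_list.prev = head->prev;
    t->run_list.next = head;
    head->prev->next = &t->run_list;
    head->prev = &t->run_list;
    rq->nr_active++;
}

/* Return the first task of the highest-priority non-empty list. */
static struct rq_task *rq_pick_next(struct rq_sketch *rq)
{
    for (int i = 0; i < NPRIO; i++) {
        struct rq_node *head = &rq->queue[i];
        if (head->next != head)
            return (struct rq_task *)((char *)head->next -
                                      offsetof(struct rq_task, run_list));
    }
    return NULL; /* no runnable task */
}
```

The cost of picking the next task is bounded by the number of priority levels rather than by the number of runnable processes, which is the point of the split.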
3.2.3 Relationships Among Processes
Processes created by a program have a parent/child relationship. When a process creates multiple children, these children have sibling relationships.
Furthermore, there exist other relationships among processes: a process can be a leader of a process group or of a login session, it can be a leader of a thread group, and it can also trace the execution of other processes.
3.2.4 How Processes Are Organized
Processes in a TASK_STOPPED, EXIT_ZOMBIE, or EXIT_DEAD state are not linked in specific lists.
Processes in a TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE state are subdivided into many classes, each of which corresponds to a specific event.
Wait queues
Wait queues implement conditional waits on events: a process wishing to wait for a specific event places itself in the proper wait queue and relinquishes control. Therefore, a wait queue represents a set of sleeping processes, which are woken up by the kernel when some condition becomes true.
Wait queues are implemented as doubly linked lists whose elements include pointers to process descriptors.
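A much simplified user-space model of this arrangement is sketched below: each queue element holds a pointer to a task, going to sleep means marking the task as waiting and linking an element into the queue, and waking up means marking every queued task runnable again. The names are hypothetical, the state values are assumptions, and details such as locking and actually giving up the CPU are left out.

```c
#include <assert.h>
#include <stddef.h>

enum { WQ_TASK_RUNNING = 0, WQ_TASK_INTERRUPTIBLE = 1 }; /* assumed values */

struct wq_task { int pid; long state; };

/* Queue element: a pointer to the sleeping task plus the list links. */
struct wq_entry {
    struct wq_task *task;
    struct wq_entry *next, *prev;
};

/* Queue head: a sentinel element that never refers to a task. */
struct wq_head { struct wq_entry sentinel; };

static void wq_init(struct wq_head *q)
{
    q->sentinel.task = NULL;
    q->sentinel.next = q->sentinel.prev = &q->sentinel;
}

/* Mark the task as waiting and append its element to the queue. */
static void wq_sleep_on(struct wq_head *q, struct wq_entry *e,
                        struct wq_task *t)
{
    assert(q != NULL && e != NULL && t != NULL);
    t->state = WQ_TASK_INTERRUPTIBLE;
    e->task = t;
    e->prev = q->sentinel.prev;
    e->next = &q->sentinel;
    q->sentinel.prev->next = e;
    q->sentinel.prev = e;
}

/* Wake every queued task, empty the queue, and return the count. */
static int wq_wake_up_all(struct wq_head *q)
{
    int woken = 0;
    struct wq_entry *e = q->sentinel.next;
    while (e != &q->sentinel) {
        struct wq_entry *next = e->next;
        e->task->state = WQ_TASK_RUNNING;
        e->next = e->prev = e;
        woken++;
        e = next;
    }
    q->sentinel.next = q->sentinel.prev = &q->sentinel;
    return woken;
}
```

Removing every element on wake-up is a simplification chosen for this sketch; it keeps the example short but is not claimed to match the kernel's behavior.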
Handling wait queues
3.2.5 Process Resource Limits
Each process has an associated set of resource limits, which specify the amount of system resources it can use. These limits keep a user from overwhelming the system (its CPU, disk space, and so on).
The resource limits for the current process are stored in the current->signal->rlim field, that is, in a field of the process's signal descriptor.
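From user space, these limits can be inspected with the POSIX getrlimit() call. The sketch below reads the limit on open files; the wrapper name open_file_limits is hypothetical, and the choice of RLIMIT_NOFILE is only an example of one resource.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Store the soft (current) and hard (maximum) limits on the number of
 * open files for the calling process. Returns 0 on success, -1 on error.
 */
static int open_file_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    assert(soft != NULL && hard != NULL);
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;
    *hard = rl.rlim_max;
    return 0;
}
```

Each resource has a soft limit, which the process may raise up to the hard limit, and a hard limit that acts as the ceiling; the values returned depend on the system and the shell that started the process.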
3.3 Process Switch
3.3.1 Hardware Context
The set of data that must be loaded into the registers before the process resumes its execution on the CPU is called the hardware context. The hardware context is a subset of the process execution context, which includes all information needed for the process execution. In Linux, a part of the hardware context of a process is stored in the process descriptor, while the remaining part is saved in the Kernel Mode stack.
Linux 2.6 uses software to perform a process switch for the following reasons:
- Step-by-step switching performed through a sequence of mov instructions allows better control over the validity of the data being loaded.
- The amount of time required by the old approach and the new approach is about the same.
3.3.2 Task State Segment
The 80x86 architecture includes a specific segment type called the Task State Segment (TSS) to store hardware contexts. Although Linux doesn't use hardware context switches, it is nonetheless forced to set up a TSS for each distinct CPU in the system. This is done for two main reasons:
- When an 80x86 CPU switches from User Mode to Kernel Mode, it fetches the address of the Kernel Mode stack from the TSS.
- When a User Mode process attempts to access an I/O port by means of an in or out instruction, the CPU may need to access an I/O Permission Bitmap stored in the TSS to verify whether the process is allowed to address the port.
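The bitmap lookup in the second point can be modeled in a few lines. The sketch assumes one bit per port over a 64 K port space and that a set bit denies access while a cleared bit allows it; it covers single-byte accesses only and ignores the privilege-level check that decides whether the bitmap is consulted at all. The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IO_PORTS        65536u          /* assumed size of the port space */
#define IO_BITMAP_BYTES (IO_PORTS / 8)  /* one bit per port */

struct io_bitmap_sketch { uint8_t bits[IO_BITMAP_BYTES]; };

/* Set every bit: under the assumption above, no port is accessible. */
static void io_deny_all(struct io_bitmap_sketch *b)
{
    memset(b->bits, 0xff, sizeof b->bits);
}

/* Clear the bit for one port to grant access to it. */
static void io_allow(struct io_bitmap_sketch *b, unsigned port)
{
    assert(port < IO_PORTS);
    b->bits[port / 8] &= (uint8_t)~(1u << (port % 8));
}

/* A port is accessible when its bit is cleared. */
static int io_allowed(const struct io_bitmap_sketch *b, unsigned port)
{
    assert(port < IO_PORTS);
    return ((b->bits[port / 8] >> (port % 8)) & 1u) == 0;
}
```

Starting from a deny-all bitmap and clearing the bit for a single port leaves that port accessible while its neighbors stay blocked.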
3.3.3 Performing the Process Switch
Every process switch consists of two steps:
- Switching the Page Global Directory to install a new address space.
- Switching the Kernel Mode stack and the hardware context, which provides all the information needed by the kernel to execute the new process, including the CPU registers.
3.3.4 Saving and Loading the FPU, MMX, and XMM Registers
3.4 Creating Processes
3.4.1 The clone(), fork(), and vfork() System Calls
3.4.2 Kernel Threads
3.5 Destroying Processes
3.5.1 Process Termination
3.5.2 Process Removal