Overview
What is a thread?
OS view: an independent stream of instructions that can be scheduled to run by the OS
Software developer view: a "procedure" that runs independently from the main program.
• Sequential program: a single stream of instructions in a program.
• Multi-threaded program: a program with multiple streams of instructions.
A program needs multiple threads to make use of multiple cores/CPUs.
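To make the contrast between a sequential and a multi-threaded program concrete, here is a minimal sketch using Python's standard `threading` module. The names `worker`, `results`, and `sequential` are made up for this illustration and do not come from the notes.

```python
import threading

results = []                      # filled in by the worker threads
results_lock = threading.Lock()   # protects the shared list

def worker(name, n):
    """The "procedure" each thread runs independently of the main program."""
    total = sum(range(n))
    with results_lock:
        results.append((name, total))

# Sequential version: one stream of instructions, one computation after another.
sequential = [(f"t{i}", sum(range(1000 * (i + 1)))) for i in range(3)]

# Multi-threaded version: three extra streams of instructions.
threads = [
    threading.Thread(target=worker, args=(f"t{i}", 1000 * (i + 1)))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()                      # wait for every thread to finish

# Both versions compute the same values; only the order of completion may differ.
print(sorted(results) == sorted(sequential))  # True
```

The threads may finish in any order, which is why the results are sorted before comparing.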
Benefits of threads
• Creating a new thread takes less time than creating a new process
• Terminating a thread takes less time than terminating a process
• Switching between two threads takes less time than switching between processes
• Threads enhance the efficiency of communication between executing programs, because threads in the same process share memory and files
From the textbook: responsiveness, resource sharing, economy, scalability
Thread
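Threads are scheduled on a processor, and each thread can execute a set of instructions independently of other processes and threads.
e.g. In the Word editor, opening a file and typing text is one thread, automatically formatting the text is another thread, automatically flagging spelling errors is another thread, and automatically saving the file to disk is another thread. Editing the file as a whole is one process.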
Thread Control Block
Stores the information about a thread.
A thread shares with the other threads belonging to the same process its code section, data section, and other operating-system resources, such as open files and signals.
A traditional (or heavyweight) process has a single thread of control.
If a process has multiple threads of control, it can perform more than one task at a time.
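The sharing of the data section between threads of one process can be sketched as follows. This is an illustrative example only: `counter`, `counter_lock`, and `add_many` are hypothetical names, and the lock is there because unsynchronized updates to shared data from several threads are not safe in general.

```python
import threading

counter = 0                        # global data: shared by all threads of the process
counter_lock = threading.Lock()    # serializes updates to the shared counter

def add_many(times):
    """Each thread has its own stack (local variables) but shares the globals."""
    global counter
    for _ in range(times):         # 'times' and the loop variable are thread-private
        with counter_lock:
            counter += 1           # update to the shared data section

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

All four threads see and modify the same `counter`, which would not be the case for four separate processes.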
Multicore Programming
Concurrency means multiple tasks that start, run, and complete in overlapping time periods, in no specific order
A system is parallel if it can perform more than one task simultaneously
In most instances, applications use a hybrid of these two strategies.
Multithreading models
Threading support
Multithreading can be supported by:
• User threads: supported above the kernel and managed without kernel support
• Kernel threads: supported and managed directly by the operating system
The kernel can create one (or more) thread(s) for the process.
• Even if a kernel does not support threading, it can create one thread per process (i.e., it can create a process which is a single thread of execution).
Virtually all contemporary operating systems support kernel threads.
User-level threads (ULT)
The unit of execution that is implemented by users; the kernel is not aware of the existence of these threads.
User-level threads are much faster to create and switch than kernel-level threads.
All thread management is done by the thread library in user space (efficient).
e.g. the threading facilities of programming languages such as Java, C#, and Python (note that most current implementations of these map each thread onto a kernel thread).
Advantages
• Thread switching does not involve the kernel: no mode switching, therefore fast
• Scheduling can be application specific: choose the best algorithm for the situation.
• Can run on any OS. We only need a thread library
Disadvantages
• Most system calls are blocking for processes. So, when one thread makes a blocking call, all threads within the process are implicitly blocked
• The kernel can only assign processors to processes. Two threads within the same process cannot run simultaneously on two processors
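The idea of threads managed entirely in user space can be imitated with Python generators. The sketch below is only an analogy, not how any real thread library is implemented: `task` and `run_round_robin` are hypothetical names, each generator plays the role of a user-level thread, and the scheduler is an ordinary function, so a switch is a plain function call with no kernel involvement.

```python
from collections import deque

def task(name, steps, log):
    """A user-level "thread": does one step, then voluntarily gives up control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                       # hand control back to the user-space scheduler

def run_round_robin(tasks):
    """Tiny user-space scheduler; the kernel sees only one thread of execution."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)           # run the task until its next yield
            ready.append(current)   # still runnable: back of the ready queue
        except StopIteration:
            pass                    # the task has finished

log = []
run_round_robin([task("A", 2, log), task("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

The scheduling policy (round robin here) is chosen by the application, which mirrors the "application specific" advantage. The sketch also shows the main disadvantage: if one task made a blocking call instead of yielding, the scheduler and every other task would stall with it, and the tasks can never run on two processors at once.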
Kernel-level threads (KLT)
The unit of execution that is scheduled by the kernel to execute on the CPU.
Kernel-level threads are handled by the operating system directly, and thread management is done by the kernel.
e.g. Windows XP/2000, Solaris
Advantages
• The kernel can schedule multiple threads of the same process on multiple processors
• Blocking is at thread level, not process level: if a thread blocks, the CPU can be assigned to another thread in the same process
• Even the kernel routines can be multithreaded
Disadvantages
• Thread switching always involves the kernel. This means 2 mode switches per thread switch
• So, it is slower compared to user-level threads (but faster than a full process switch)
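Blocking at thread level can be observed with a small sketch, assuming a mainstream CPython build where `threading.Thread` is backed by a kernel-scheduled thread. The names `blocked_worker`, `other_worker`, `waiting`, and `release` are made up for this illustration.

```python
import threading

order = []                       # records what happened, in order
waiting = threading.Event()      # set once the first thread is about to block
release = threading.Event()      # set by the second thread to unblock the first

def blocked_worker():
    order.append("blocked: waiting")
    waiting.set()
    release.wait()               # only this thread blocks here
    order.append("blocked: resumed")

def other_worker():
    order.append("other: running")
    release.set()                # wake up the blocked thread

blocked = threading.Thread(target=blocked_worker)
blocked.start()
waiting.wait()                   # make sure the first thread has reached its wait

other = threading.Thread(target=other_worker)
other.start()                    # runs even though a sibling thread is blocked
other.join()
blocked.join()

print(order)  # ['blocked: waiting', 'other: running', 'blocked: resumed']
```

If blocking applied to the whole process, `other_worker` could never run and the program would hang; here the scheduler simply gives the CPU to another thread of the same process.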
Example: Solaris
Process includes the user's address space, stack, and process control block
User-level threads (threads library)
• invisible to the OS
• are the interface for application parallelism
Kernel threads
• the unit that can be dispatched on a processor
Lightweight processes (LWP) - layer between kernel threads and user threads
• each LWP supports one or more ULTs and maps to exactly one KLT
Task 2 is equivalent to a pure ULT approach
Tasks 1 and 3 map one or more ULTs onto a fixed number of LWPs (and KLTs)
Note how task 3 maps a single ULT to a single LWP bound to a CPU
Multithreading Models
A relationship must exist between user threads and kernel thread(s) -> mapping user-level threads to kernel-level threads
In a combined system, multiple threads within the same application can run in parallel on multiple processors.
There are three types of multithreading models:
» Many-to-One
» One-to-One
» Many-to-Many
Many-to-One Model
Maps many user-level threads to one kernel thread
The process can only run one user-level thread at a time because there is only one kernel-level thread associated with the process.
Thread management done at user space, by a thread library
e.g. Solaris Green Threads / GNU Portable Threads
One-to-One Model
Each user thread mapped to one kernel thread
The kernel implements threading and can manage and schedule the threads.
Kernel is aware of threads
Provides more concurrency; when a thread blocks, another can run
e.g. Windows NT/XP/2000 / Linux / Solaris 9 and later
Many-to-Many Model
Allows many user-level threads to be mapped to many kernel threads
Allows the operating system to create a sufficient number of kernel threads
The number of kernel threads may be specific to either a particular application or a particular machine.
The user can create any number of threads, and the corresponding kernel-level threads can run in parallel on a multiprocessor.
e.g. Solaris prior to version 9 (two-level multithreading model) / Windows NT/2000 with the ThreadFiber package
Thread Libraries
No matter which kind of thread is implemented, threads can be created, used, and terminated via a set of functions that are part of a thread API (a thread library)
A thread library provides the programmer with an API (Application Programming Interface) for creating and managing threads
The programmer just has to know the thread library interface (API)
Threads may be implemented in user space or kernel space; the library may be entirely in user space or may get kernel support for threading
3 primary thread libraries: POSIX threads, Java threads, Win32 threads
Two approaches for implementing a thread library:
• Provide a library entirely in user space with no kernel support
all code and data structures for the library exist in user space
invoking a function in the library results in a local function call in user space and not a system call
• Implement a kernel-level library supported directly by the operating system
code and data structures for the library exist in kernel space
invoking a function in the API for the library typically results in a system call to the kernel
Implicit threading
Managing threads
There are 2 categories: explicit and implicit threading.
Explicit threading - the programmer creates and manages threads
Implicit threading - the compilers and run-time libraries create and manage threads
Implicit threading
3 alternative approaches for designing multithreaded programs:
• Thread pool - create a number of threads at process startup and place them into a pool, where they sit and wait for work.
• OpenMP - a set of compiler directives available for C, C++, and Fortran programs that instruct the compiler to automatically generate parallel code where appropriate.
• Grand Central Dispatch (GCD) - an extension to C and C++ available on Apple's Mac OS X and iOS operating systems to support parallelism.
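As one illustration of implicit threading, the sketch below uses the thread pool from Python's standard `concurrent.futures` module. The function `square` and the pool size of 4 are arbitrary choices for this example. One difference from the description above: this particular pool starts its worker threads on demand, up to `max_workers`, rather than all at startup.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    """A unit of work handed to whichever pool thread is free."""
    return x * x

# The library creates, reuses, and shuts down the threads; the program only submits work.
with ThreadPoolExecutor(max_workers=4) as pool:
    squares = list(pool.map(square, range(8)))   # results come back in input order

print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The program never calls a thread-creation function itself, which is the point of implicit threading: thread management is delegated to the run-time library.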
Threading issues/Designing multithreaded programs
issues to consider:
- Semantics of fork() and exec()
- Signal handling
- Thread cancellation
fork() system call
A new process is created with the fork() system call.
The newly created process is called the child process, and the process that calls fork() to create it is the parent process.
exec() system call
The exec() family of system calls replaces the program running in the current process with a new program.
The process identifier remains the same, but the internal details, such as stack, data, and instructions, are replaced.
The new program replaces the old executable image, including all of the process's threads.
Semantics of fork() and exec()
1) If exec() will be called immediately after fork(), there is no need to duplicate all the threads. They will be replaced anyway.
2) If exec() will not be called, then it is logical to duplicate the threads as well, so that the child will have as many threads as the parent has.
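The fork()-then-exec() pattern of case 1 can be sketched with Python's `os` module on a Unix-like system (it does not run on Windows). The helper `run_child` and the choice of the Python interpreter as the program to exec are assumptions made for this example. On POSIX systems, only the thread that calls fork() exists in the child, which matches case 1.

```python
import os
import sys

def run_child(exit_code):
    """fork() a child, exec() a new program in it, and return the child's exit code."""
    pid = os.fork()                          # the child is a copy of the calling process
    if pid == 0:
        # Child: replace this process image with a fresh Python interpreter.
        try:
            os.execv(sys.executable,
                     [sys.executable, "-c", f"import sys; sys.exit({exit_code})"])
        finally:
            os._exit(127)                    # reached only if exec() failed
    # Parent: pid is the child's process ID; wait for the child to terminate.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else None

child_code = run_child(7)
print(child_code)  # 7
```

The child keeps the process ID it received from fork() across the exec() call, which is why the parent can wait on that same `pid`.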
Signal Handling
A signal is a software interrupt, or an event generated by a Unix/Linux system in response to a condition or an action.
The signal is handled by a signal handler (all signals are handled exactly once).
- An asynchronous signal is generated from outside the process that receives it
- A synchronous signal is delivered to the same process that caused the signal to occur
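Registering a signal handler can be sketched with Python's standard `signal` module on a Unix-like system. The handler name `on_sigusr1` and the use of SIGUSR1 are choices made for this example, and for simplicity the process sends the signal to itself. Note that in Python, handlers can only be installed from the main thread, and the interpreter runs them in the main thread.

```python
import os
import signal
import time

received = []                                # signal numbers seen by the handler

def on_sigusr1(signum, frame):
    """Signal handler: just record which signal arrived."""
    received.append(signum)

signal.signal(signal.SIGUSR1, on_sigusr1)    # register the handler
os.kill(os.getpid(), signal.SIGUSR1)         # send SIGUSR1 to this process

# The handler runs shortly after delivery; poll briefly instead of assuming when.
deadline = time.monotonic() + 1.0
while not received and time.monotonic() < deadline:
    time.sleep(0.01)

print(received == [signal.SIGUSR1])  # True
```

In a multithreaded process the additional design question is which thread should receive a signal; this sketch sidesteps it because Python always runs handlers in the main thread.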
Thread Cancellation
Two general approaches:
Asynchronous cancellation terminates the target thread immediately
Deferred cancellation allows the target thread to periodically check if it should be cancelled
The thread to be cancelled (the target thread) is the one that receives the cancellation request
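Deferred cancellation can be sketched in Python with a shared flag that the target thread checks between units of work. Python's `threading` module offers no call for terminating another thread immediately, so only the deferred style is shown. The names `cancel_requested`, `target_body`, and `progress` are made up for this illustration.

```python
import threading
import time

cancel_requested = threading.Event()   # set by whoever wants the target cancelled
progress = []                          # what the target thread did, in order

def target_body():
    """Target thread: works in small steps and checks for cancellation in between."""
    step = 0
    while not cancel_requested.is_set():   # the cancellation point
        progress.append(step)
        step += 1
        time.sleep(0.01)
    progress.append("cleaned up")          # orderly exit: release resources here

target = threading.Thread(target=target_body)
target.start()
time.sleep(0.05)                # let the target do some work
cancel_requested.set()          # send the cancellation request
target.join(timeout=2)          # the target stops at its next check

print(target.is_alive(), progress[-1])  # False cleaned up
```

Because the target decides when to stop, it can finish its current step and clean up first, which is the safety argument for deferred over asynchronous cancellation.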
From Single-thread to Multi-threaded
------------------------------------>
This is a single threaded program