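An operating system can provide thread support in two different ways: user threads at the user level, or kernel threads at the kernel level.
User threads are supported above the kernel and are implemented by a thread library at the user level. Switching between them needs no user-mode/kernel-mode transition, so it is fast. The operating system kernel does not know that multiple threads exist, so when one thread blocks, the whole process (including all of its threads) blocks. Because processor time slices are allocated per process in this model, each thread gets correspondingly less execution time.
Kernel threads are supported directly by the operating system: the kernel creates, schedules and manages them. The kernel maintains the context information of processes and threads and performs the thread switches. When one kernel thread blocks on an I/O operation, the other threads keep running.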
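The last point can be checked with a small sketch. It assumes a POSIX system on which std::thread is backed by kernel threads (true for common Linux toolchains); the function name worker_finishes_while_peer_blocks is made up for this illustration. One thread blocks in read() on an empty pipe while a second thread runs to completion.

```cpp
#include <unistd.h>

#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns true if a worker thread finished its job
// while another thread of the same process was blocked in a read() call.
bool worker_finishes_while_peer_blocks() {
    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);
    (void)rc;

    std::atomic<bool> reader_returned{false};
    std::thread reader([&] {
        char byte = 0;
        // The pipe is empty, so this call blocks in the kernel until data arrives.
        ssize_t n = read(fds[0], &byte, 1);
        (void)n;
        reader_returned = true;
    });

    int sum = 0;
    std::thread worker([&] {
        for (int i = 1; i <= 100; ++i) sum += i;
    });
    worker.join();  // completes although the reader cannot have returned yet

    bool reader_still_pending = !reader_returned.load();

    // Release the reader so that it can be joined.
    char byte = 'x';
    ssize_t written = write(fds[1], &byte, 1);
    (void)written;
    reader.join();
    close(fds[0]);
    close(fds[1]);

    return reader_still_pending && sum == 5050 && reader_returned.load();
}
```

With a pure user-level thread library and a blocking read(), the whole process would stall at the read and the worker would never get to run.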
So how are Java threads implemented? Let's read the SUN Java 6 source code to see how it is done on Windows and on Linux.
The Windows implementation, in os_win32.cpp:
bool os::create_thread(Thread* thread, ThreadType thr_type, size_t stack_size) {
  unsigned thread_id;
  // Allocate the OSThread object
  OSThread* osthread = new OSThread(NULL, NULL);
  if (osthread == NULL) {
    return false;
  }
  // Initial state is ALLOCATED but not INITIALIZED
  {
    MutexLockerEx ml(thread->SR_lock(), Mutex::_no_safepoint_check_flag);
    osthread->set_state(ALLOCATED);
  }
  // Initialize support for Java interrupts
  HANDLE interrupt_event = CreateEvent(NULL, true, false, NULL);
  if (interrupt_event == NULL) {
    delete osthread;
    return NULL;
  }
  osthread->set_interrupt_event(interrupt_event);
  osthread->set_interrupted(false);
  thread->set_osthread(osthread);
  if (stack_size == 0) {
    switch (thr_type) {
    case os::java_thread:
      // Java threads use ThreadStackSize which default value can be changed with the flag -Xss
      if (JavaThread::stack_size_at_create() > 0)
        stack_size = JavaThread::stack_size_at_create();
      break;
    case os::compiler_thread:
      if (CompilerThreadStackSize > 0) {
        stack_size = (size_t)(CompilerThreadStackSize * K);
        break;
      } // else fall through:
        // use VMThreadStackSize if CompilerThreadStackSize is not defined
    case os::vm_thread:
    case os::pgc_thread:
    case os::cgc_thread:
    case os::watcher_thread:
      if (VMThreadStackSize > 0) stack_size = (size_t)(VMThreadStackSize * K);
      break;
    }
  }
  // Create the Win32 thread
  //
  // Contrary to what MSDN document says, "stack_size" in _beginthreadex()
  // does not specify stack size. Instead, it specifies the size of
  // initially committed space. The stack size is determined by
  // PE header in the executable. If the committed "stack_size" is larger
  // than default value in the PE header, the stack is rounded up to the
  // nearest multiple of 1MB. For example if the launcher has default
  // stack size of 320k, specifying any size less than 320k does not
  // affect the actual stack size at all, it only affects the initial
  // commitment. On the other hand, specifying 'stack_size' larger than
  // default value may cause significant increase in memory usage, because
  // not only the stack space will be rounded up to MB, but also the
  // entire space is committed upfront.
  //
  // Finally Windows XP added a new flag 'STACK_SIZE_PARAM_IS_A_RESERVATION'
  // for CreateThread() that can treat 'stack_size' as stack size. However we
  // are not supposed to call CreateThread() directly according to MSDN
  // document because JVM uses C runtime library. The good news is that the
  // flag appears to work with _beginthredex() as well.
#ifndef STACK_SIZE_PARAM_IS_A_RESERVATION
#define STACK_SIZE_PARAM_IS_A_RESERVATION (0x10000)
#endif
  HANDLE thread_handle =
    (HANDLE)_beginthreadex(NULL,
                           (unsigned)stack_size,
                           (unsigned (__stdcall *)(void*)) java_start,
                           thread,
                           CREATE_SUSPENDED | STACK_SIZE_PARAM_IS_A_RESERVATION,
                           &thread_id);
  if (thread_handle == NULL) {
    // perhaps STACK_SIZE_PARAM_IS_A_RESERVATION is not supported, try again
    // without the flag.
    thread_handle =
      (HANDLE)_beginthreadex(NULL,
                             (unsigned)stack_size,
                             (unsigned (__stdcall *)(void*)) java_start,
                             thread,
                             CREATE_SUSPENDED,
                             &thread_id);
  }
  if (thread_handle == NULL) {
    // Need to clean up stuff we've allocated so far
    CloseHandle(osthread->interrupt_event());
    thread->set_osthread(NULL);
    delete osthread;
    return NULL;
  }
  Atomic::inc_ptr((intptr_t*)&os::win32::_os_thread_count);
  // Store info on the Win32 thread into the OSThread
  osthread->set_thread_handle(thread_handle);
  osthread->set_thread_id(thread_id);
  // Initial thread state is INITIALIZED, not SUSPENDED
  {
    MutexLockerEx ml(thread->SR_lock(), Mutex::_no_safepoint_check_flag);
    osthread->set_state(INITIALIZED);
  }
  // The thread is returned suspended (in state INITIALIZED), and is started higher up in the call chain
  return true;
}
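As the listing shows, the SUN JVM on Windows creates threads with _beginthreadex, the C runtime library's wrapper for creating a native Win32 thread. The comments also explain why it does not call CreateThread directly, the function that Windows programming books recommend. From this we can see that Java threads on Windows are implemented with kernel threads.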
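Excerpt from 《Windows操作系统原理》 (Principles of the Windows Operating System):
Kernel threads: created and destroyed by the operating system kernel. The kernel maintains the context information of processes and threads and performs the thread switches. When one kernel thread blocks on an I/O operation, the other threads keep running. Windows NT and Windows 2000 support kernel threads.
User threads: created and managed by the application process through a thread library, without depending on the operating system kernel. No user-mode/kernel-mode switch is needed, so they are fast. The operating system kernel does not know that multiple threads exist, so when one thread blocks, the whole process (including all of its threads) blocks. Because processor time slices are allocated per process in this model, each thread gets correspondingly less execution time.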
And what about Linux?
An excerpt from os_linux.cpp:
bool os::create_thread(Thread* thread, ThreadType thr_type, size_t stack_size) {
  assert(thread->osthread() == NULL, "caller responsible");
  // Allocate the OSThread object
  OSThread* osthread = new OSThread(NULL, NULL);
  if (osthread == NULL) {
    return false;
  }
  // set the correct thread state
  osthread->set_thread_type(thr_type);
  // Initial state is ALLOCATED but not INITIALIZED
  osthread->set_state(ALLOCATED);
  thread->set_osthread(osthread);
  // init thread attributes
  pthread_attr_t attr;
  pthread_attr_init(&attr);
  pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
  // stack size
  if (os::Linux::supports_variable_stack_size()) {
    // calculate stack size if it's not specified by caller
    if (stack_size == 0) {
      stack_size = os::Linux::default_stack_size(thr_type);
      switch (thr_type) {
      case os::java_thread:
        // Java threads use ThreadStackSize which default value can be changed with the flag -Xss
        if (JavaThread::stack_size_at_create() > 0) stack_size = JavaThread::stack_size_at_create();
        break;
      case os::compiler_thread:
        if (CompilerThreadStackSize > 0) {
          stack_size = (size_t)(CompilerThreadStackSize * K);
          break;
        } // else fall through:
          // use VMThreadStackSize if CompilerThreadStackSize is not defined
      case os::vm_thread:
      case os::pgc_thread:
      case os::cgc_thread:
      case os::watcher_thread:
        if (VMThreadStackSize > 0) stack_size = (size_t)(VMThreadStackSize * K);
        break;
      }
    }
    stack_size = MAX2(stack_size, os::Linux::min_stack_allowed);
    pthread_attr_setstacksize(&attr, stack_size);
  } else {
    // let pthread_create() pick the default value.
  }
  // glibc guard page
  pthread_attr_setguardsize(&attr, os::Linux::default_guard_size(thr_type));
  ThreadState state;
  {
    // Serialize thread creation if we are running with fixed stack LinuxThreads
    bool lock = os::Linux::is_LinuxThreads() && !os::Linux::is_floating_stack();
    if (lock) {
      os::Linux::createThread_lock()->lock_without_safepoint_check();
    }
    pthread_t tid;
    int ret = pthread_create(&tid, &attr, (void* (*)(void*)) java_start, thread);
    pthread_attr_destroy(&attr);
    if (ret != 0) {
      if (PrintMiscellaneous && (Verbose || WizardMode)) {
        perror("pthread_create()");
      }
      // Need to clean up stuff we've allocated so far
      thread->set_osthread(NULL);
      delete osthread;
      if (lock) os::Linux::createThread_lock()->unlock();
      return false;
    }
    // Store pthread info into the OSThread
    osthread->set_pthread_id(tid);
    // Wait until child thread is either initialized or aborted
    {
      Monitor* sync_with_child = osthread->startThread_lock();
      MutexLockerEx ml(sync_with_child, Mutex::_no_safepoint_check_flag);
      while ((state = osthread->get_state()) == ALLOCATED) {
        sync_with_child->wait(Mutex::_no_safepoint_check_flag);
      }
    }
    if (lock) {
      os::Linux::createThread_lock()->unlock();
    }
  }
  // Aborted due to thread limit being reached
  if (state == ZOMBIE) {
    thread->set_osthread(NULL);
    delete osthread;
    return false;
  }
  // The thread is returned suspended (in state INITIALIZED),
  // and is started higher up in the call chain
  assert(state == INITIALIZED, "race condition");
  return true;
}
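The "wait until child thread is either initialized or aborted" step in the listing above is a start-up handshake between parent and child. Below is a minimal sketch of that pattern using the C++ standard library instead of HotSpot's Monitor class. The names StartSync and create_and_wait are invented for this illustration, and the child side is an assumption about what java_start does, since that function is not part of the excerpt.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

enum class State { Allocated, Initialized, Zombie };

// Hypothetical stand-in for the state and startThread_lock() held by OSThread.
struct StartSync {
    std::mutex lock;
    std::condition_variable changed;
    State state = State::Allocated;
};

// Creates a child thread and blocks until it has left the Allocated state,
// mirroring the while/wait loop in the listing.
State create_and_wait(bool child_succeeds) {
    StartSync sync;

    std::thread child([&] {
        // Assumed child behaviour: publish the new state, then notify the parent.
        std::lock_guard<std::mutex> guard(sync.lock);
        sync.state = child_succeeds ? State::Initialized : State::Zombie;
        sync.changed.notify_all();
    });

    State observed;
    {
        std::unique_lock<std::mutex> guard(sync.lock);
        // The predicate form re-checks the state after every wake-up,
        // like the explicit while loop around wait() in the listing.
        sync.changed.wait(guard, [&] { return sync.state != State::Allocated; });
        observed = sync.state;
    }
    child.join();

    assert(observed != State::Allocated);
    return observed;
}
```

The state is checked under the lock before waiting, so a child that finishes before the parent starts waiting is not missed.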
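On Linux, Java threads are created through the pthread library. pthread is an API provided by a user-space library, but that alone does not make Java threads user threads: what matters is how the library maps its threads onto the kernel. On Linux 2.6 the pthread implementation is NPTL, which maps each thread one-to-one onto a kernel scheduling entity, and the earlier LinuxThreads did the same. So Java threads on Linux are also backed by kernel-scheduled threads.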
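One way to observe the one-to-one mapping is to compare process and kernel thread IDs. This is a minimal sketch, assuming Linux with glibc; ThreadIds and ids_of_new_pthread are names made up for this example, and the raw gettid system call is used because older glibc versions have no gettid() wrapper.

```cpp
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>

// Hypothetical result type: what a freshly created pthread sees.
struct ThreadIds {
    pid_t pid;  // process ID, shared by all threads of the process
    pid_t tid;  // kernel thread ID, unique per scheduling entity
};

static void* report_ids(void* arg) {
    ThreadIds* out = static_cast<ThreadIds*>(arg);
    out->pid = getpid();
    out->tid = static_cast<pid_t>(syscall(SYS_gettid));
    return nullptr;
}

// Creates one pthread and returns the IDs it observed.
ThreadIds ids_of_new_pthread() {
    ThreadIds ids{0, 0};
    pthread_t thread;
    int rc = pthread_create(&thread, nullptr, report_ids, &ids);
    assert(rc == 0);
    (void)rc;
    pthread_join(thread, nullptr);
    return ids;
}
```

The new thread reports the same PID as its creator but its own kernel thread ID, which is what a one-to-one mapping predicts.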
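A short introduction to NPTL:
The Native POSIX Thread Library (NPTL) lets the Linux kernel run programs written against the POSIX threads standard very efficiently. One reported test result: on a 32-bit machine, NPTL started 100,000 threads in about 2 seconds, whereas without NPTL this took roughly 15 minutes.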
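History
Before kernel 2.6, the only scheduling entity was the process, and the kernel had no real support for threads. Threads were implemented through the clone() system call, which creates a copy of the calling process; unlike fork(), this copy fully shares the address space of the caller. LinuxThreads used this system call to give threads kernel-level support (many earlier thread implementations lived entirely in user space, and the kernel did not know that threads existed). Unfortunately, this approach departed from the POSIX standard in quite a few places, especially in signal handling, scheduling, and inter-process communication primitives.
Clearly, improving on LinuxThreads required kernel support and a rewrite of the thread library. Two competing projects set out to do this: NGPT (Next Generation POSIX Threads), started by IBM, and NPTL from Red Hat. IBM abandoned NGPT in mid-2003, at about the same time that Red Hat released the first version of NPTL.
NPTL first shipped in Red Hat Linux 9. It is now supported from RHEL 3 and kernel 2.6 onward, and has become a full part of the GNU C Library.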
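Design
NPTL takes the same approach as LinuxThreads: inside the kernel a thread is still treated as a process, and the clone() system call is still used (it is called from within the NPTL library). However, NPTL needs special kernel-level support, for example futex, the synchronization primitive used to put threads to sleep and wake them up again.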
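To make the futex idea concrete, here is a minimal sketch of one thread sleeping on a 32-bit word until another thread changes it and wakes it up. It assumes Linux on a platform where the SYS_futex system call is available (for example x86-64) and where std::atomic<int> has the same layout as int; the helper names futex and wait_for_wakeup are made up for this example. Real code would normally use a mutex or condition variable rather than call futex directly.

```cpp
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

#include <atomic>
#include <cassert>
#include <thread>

static_assert(sizeof(std::atomic<int>) == sizeof(int),
              "the futex word must be a plain 32-bit integer");

// Thin hypothetical wrapper around the raw system call (glibc exports none).
static long futex(std::atomic<int>* word, int op, int value) {
    return syscall(SYS_futex, reinterpret_cast<int*>(word), op, value,
                   nullptr, nullptr, 0);
}

// Blocks the calling thread until a second thread sets the word to 1.
int wait_for_wakeup() {
    std::atomic<int> ready{0};

    std::thread waker([&] {
        ready.store(1);
        futex(&ready, FUTEX_WAKE_PRIVATE, 1);  // wake at most one waiter
    });

    while (ready.load() == 0) {
        // The kernel puts the caller to sleep only if the word still holds 0;
        // otherwise the call returns immediately and the loop re-checks.
        futex(&ready, FUTEX_WAIT_PRIVATE, 0);
    }
    waker.join();

    int result = ready.load();
    assert(result == 1);
    return result;
}
```

The value check and the sleep happen atomically inside the kernel, which is why a wake-up issued between the load and the wait call is not lost.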
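NPTL is also a 1:1 thread library: when you create a thread with pthread_create(), a corresponding scheduling entity is created in the kernel, which on Linux is a new process. This approach simplifies the thread implementation as much as possible.
Besides the 1:1 model of NPTL there is also an M:N model, in which there are usually more user threads than kernel scheduling entities. In such an implementation the thread library itself has to handle the scheduling. Context switches inside the library are usually quite fast, because they avoid a system call into kernel mode. However, this model makes the thread implementation more complex and can lead to problems such as priority inversion. In addition, coordinating user-level scheduling with kernel-level scheduling is hard to do satisfactorily.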
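The library-side scheduling that an M:N model needs can be sketched with the ucontext functions from glibc. Below, two user-level tasks take turns on a single kernel thread under a round-robin scheduler written entirely in user code. This is only a toy illustration under stated assumptions: all names are invented, the tasks yield cooperatively, there is just one kernel thread (so it is really N:1), and glibc's swapcontext may itself enter the kernel to save the signal mask, so it does not show the full speed advantage.

```cpp
#include <ucontext.h>

#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace {

constexpr int kTasks = 2;
constexpr int kSteps = 3;
constexpr std::size_t kStackBytes = 64 * 1024;

ucontext_t scheduler_ctx;     // context of the user-level scheduler
ucontext_t task_ctx[kTasks];  // one saved context per user-level task
bool finished[kTasks];
std::string trace;            // records the order in which tasks ran

void task_body(int id) {
    for (int step = 0; step < kSteps; ++step) {
        trace += static_cast<char>('A' + id);
        trace += static_cast<char>('0' + step);
        trace += ' ';
        // Yield: save this task and switch back to the scheduler.
        swapcontext(&task_ctx[id], &scheduler_ctx);
    }
    finished[id] = true;
    // Returning resumes uc_link, which is the scheduler context.
}

}  // namespace

// Runs the tasks round-robin on the calling kernel thread and returns the trace.
std::string run_user_level_threads() {
    trace.clear();
    std::vector<std::vector<char>> stacks(kTasks, std::vector<char>(kStackBytes));

    for (int id = 0; id < kTasks; ++id) {
        finished[id] = false;
        int rc = getcontext(&task_ctx[id]);
        assert(rc == 0);
        (void)rc;
        task_ctx[id].uc_stack.ss_sp = stacks[id].data();
        task_ctx[id].uc_stack.ss_size = stacks[id].size();
        task_ctx[id].uc_link = &scheduler_ctx;
        makecontext(&task_ctx[id], reinterpret_cast<void (*)()>(task_body), 1, id);
    }

    bool any_ran = true;
    while (any_ran) {
        any_ran = false;
        for (int id = 0; id < kTasks; ++id) {
            if (finished[id]) continue;
            swapcontext(&scheduler_ctx, &task_ctx[id]);  // resume task id
            any_ran = true;
        }
    }
    return trace;
}
```

The kernel sees only one thread here, so if a task made a blocking system call, every task would stall with it. That is the coordination problem between user-level and kernel-level scheduling mentioned above.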
For more on pthread, see: http://archive.cnblogs.com/a/1930707/