快乐虾
http://blog.csdn.net/lights_joy/
lights@hb165.com
This article applies to:
Cygwin checkout-2008-09-28
vs2008
Reposting is welcome, but please keep the author information.
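1.1 What the documentation says
To support multithreading, cygwin uses thread-local storage (TLS), but its implementation differs from the TLS that Windows provides: cygwin does not use any of the Windows TLS APIs.
The cygwin documentation explains this part of the implementation as follows: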
All cygwin threads have separate context in an object of class _cygtls. The
storage for this object is kept on the stack in the bottom CYGTLS_PADSIZE
bytes. Each thread references the storage via the Thread Environment Block
(aka Thread Information Block), which Windows maintains for each user thread
in the system, with the address in the FS segment register. The memory
is laid out as in the NT_TIB structure from <w32api/winnt.h>:
typedef struct _NT_TIB {
struct _EXCEPTION_REGISTRATION_RECORD *ExceptionList;
PVOID StackBase;
PVOID StackLimit;
PVOID SubSystemTib;
_ANONYMOUS_UNION union {
PVOID FiberData;
DWORD Version;
} DUMMYUNIONNAME;
PVOID ArbitraryUserPointer;
struct _NT_TIB *Self;
} NT_TIB,*PNT_TIB;
Cygwin sees it like this:
extern exception_list *_except_list asm ("%fs:0"); // exceptions.cc
extern char *_tlsbase __asm__ ("%fs:4"); // cygtls.h
extern char *_tlstop __asm__ ("%fs:8"); // cygtls.h
And accesses cygtls like this:
#define _my_tls (((_cygtls *) _tlsbase)[-1]) // cygtls.h
Initialization always goes through _cygtls::init_thread(). It works
in the following ways:
* In the main thread, _dll_crt0() provides CYGTLS_PADSIZE bytes on the stack
and passes them to initialize_main_tls(), which calls _cygtls::init_thread().
It then calls dll_crt0_1(), which terminates with cygwin_exit() rather than
by returning, so the storage never goes out of scope.
If you load cygwin1.dll dynamically from a non-cygwin application, it is
vital that the bottom CYGTLS_PADSIZE bytes of the stack are not in use
before you call cygwin_dll_init(). See winsup/testsuite/cygload for
more information.
* Threads other than the main thread receive DLL_THREAD_ATTACH messages
to dll_entry() (in init.cc).
- dll_entry() calls munge_threadfunc(), which grabs the function pointer
for the thread from the stack frame and substitutes threadfunc_fe(),
- which then passes the original function pointer to _cygtls::call(),
- which then allocates CYGTLS_PADSIZE bytes on the stack and hands them
to call2(),
- which allocates an exception_list object on the stack and hands it to
init_exceptions() (in exceptions.cc), which attaches it to the end of
the list of exception handlers, changing _except_list (aka
tib->ExceptionList), then passes the cygtls storage to init_thread().
call2() calls ExitThread() instead of returning, so the storage never
goes out of scope.
Note that the padding isn't necessarily going to be just where the _cygtls
structure lives; it just makes sure there's enough room on the stack when the
CYGTLS_PADSIZE bytes down from there are overwritten.
Every global variable that depends on TLS must be located relative to the pointer _tlsbase, which is defined in cygtls.h:
extern char *_tlsbase __asm__ ("%fs:4");
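VS2008 does not support this kind of definition, which uses GCC's asm-label syntax to bind a variable to an fs-relative address, so another approach has to be found.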
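The correspondence between the %fs offsets in the quoted documentation and the members of NT_TIB can be checked with a small sketch. The struct NtTib32 and the helper tlsbase_of() below are hypothetical names introduced only for illustration; the struct models every pointer-sized member as a 32-bit integer so that the offsets match 32-bit Windows no matter where the sketch is compiled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the 32-bit NT_TIB layout (not the real winnt.h type).
// Each pointer-sized member is modelled as a 32-bit integer.
struct NtTib32 {
    std::uint32_t ExceptionList;         // %fs:0 -> _except_list
    std::uint32_t StackBase;             // %fs:4 -> _tlsbase
    std::uint32_t StackLimit;            // %fs:8 -> _tlstop
    std::uint32_t SubSystemTib;
    std::uint32_t FiberDataOrVersion;    // stands in for the anonymous union
    std::uint32_t ArbitraryUserPointer;
    std::uint32_t Self;
};

// The offsets cygwin hard-codes in its asm labels.
static_assert(offsetof(NtTib32, ExceptionList) == 0, "fs:0 is ExceptionList");
static_assert(offsetof(NtTib32, StackBase) == 4, "fs:4 is StackBase");
static_assert(offsetof(NtTib32, StackLimit) == 8, "fs:8 is StackLimit");

// Given a copy of the TIB, the value cygwin calls _tlsbase is StackBase.
inline std::uint32_t tlsbase_of(const NtTib32& tib) {
    return tib.StackBase;
}
```

Any method that yields a copy of the thread's NT_TIB is therefore enough to recover _tlsbase, even without the asm-label syntax.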
1.2 _tlsbase
A Windows thread can reach the same information through a structure called NT_TIB, which is defined in WinNT.h (line 7220):
typedef struct _NT_TIB {
struct _EXCEPTION_REGISTRATION_RECORD *ExceptionList;
PVOID StackBase;
PVOID StackLimit;
PVOID SubSystemTib;
union {
PVOID FiberData;
DWORD Version;
};
PVOID ArbitraryUserPointer;
struct _NT_TIB *Self;
} NT_TIB;
typedef NT_TIB *PNT_TIB;
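The fs register holds the selector of the segment that contains the NT_TIB structure, so with a suitable conversion the cygwin definition can be rewritten as follows: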
CONTEXT context;
LDT_ENTRY ldt;
HANDLE hThread = ::GetCurrentThread();
memset(&ldt, 0, sizeof(LDT_ENTRY));
memset(&context, 0, sizeof(CONTEXT));
// Request a register set that includes the segment registers (SegFs).
context.ContextFlags = CONTEXT_FULL | CONTEXT_DEBUG_REGISTERS;
GetThreadContext(hThread, &context);
// Look up the descriptor that the fs selector refers to.
GetThreadSelectorEntry(hThread, context.SegFs, &ldt);
// Reassemble the 32-bit segment base from the three descriptor fields.
DWORD dwFSBase = (ldt.HighWord.Bits.BaseHi << 24) |
                 (ldt.HighWord.Bits.BaseMid << 16) |
                 ldt.BaseLow;
// Copy the NT_TIB that sits at the start of the fs segment.
NT_TIB tib;
memset(&tib, 0, sizeof(NT_TIB));
DWORD dwBytes;
HANDLE hProcess = ::GetCurrentProcess();
ReadProcessMemory(hProcess, (LPCVOID)dwFSBase, &tib, sizeof(NT_TIB), &dwBytes);
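This code obtains the value of the fs segment register, converts it to a linear address, and then reads the contents of the NT_TIB structure. Once NT_TIB is available, its StackBase member is exactly the value of cygwin's _tlsbase pointer.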
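The step that turns the descriptor into a linear address is pure bit arithmetic and can be sketched in isolation. DescriptorBase and segment_base() are hypothetical names; the struct only stands in for the three base fields of LDT_ENTRY (BaseLow, BaseMid, BaseHi), and the sample value in the usage note is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the base fields of LDT_ENTRY.
struct DescriptorBase {
    std::uint16_t base_low;  // bits 0..15 of the segment base
    std::uint8_t  base_mid;  // bits 16..23
    std::uint8_t  base_hi;   // bits 24..31
};

// Reassemble the 32-bit linear base address of the segment.
// The casts widen each field before shifting so no bits are lost.
inline std::uint32_t segment_base(const DescriptorBase& d) {
    return (static_cast<std::uint32_t>(d.base_hi) << 24) |
           (static_cast<std::uint32_t>(d.base_mid) << 16) |
           static_cast<std::uint32_t>(d.base_low);
}
```

For example, a descriptor with base_hi = 0x7F, base_mid = 0xFD and base_low = 0xD000 describes a segment that starts at linear address 0x7FFDD000.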
1.3 _my_tls
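Cygwin uses a structure called _cygtls to hold each thread's private data. To give every thread its own copy, the documentation says, the structure is placed directly at the bottom of each thread's stack: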
#define _my_tls (((_cygtls *) _tlsbase)[-1]) // cygtls.h
Through _my_tls, each thread can then access its own private data.
In the actual code, however, _my_tls is defined like this:
#define _my_tls (*((_cygtls *) (_tlsbase - CYGTLS_PADSIZE)))
In other words, the storage is taken from the middle of the thread's stack, CYGTLS_PADSIZE bytes below the stack base:
const int CYGTLS_PADSIZE = 18000; /* FIXME: Find some way to autogenerate
this value */
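The difference between the documented form of _my_tls and the form found in the source comes down to address arithmetic, which the following sketch makes explicit. FakeCygtls is a hypothetical stand-in whose 4000-byte size is an assumption chosen for illustration (the real size of _cygtls is not given here), and both helper functions are illustrative names rather than cygwin code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const int CYGTLS_PADSIZE = 18000;  // value quoted from the cygwin source

// Hypothetical stand-in for _cygtls; only its size matters here.
struct FakeCygtls {
    char data[4000];  // assumed size, for illustration only
};

// Documented form: ((_cygtls *) _tlsbase)[-1]
// Indexing with -1 steps back by exactly one structure.
inline std::uintptr_t documented_address(std::uintptr_t tlsbase) {
    return tlsbase - sizeof(FakeCygtls);
}

// Form in the source: *((_cygtls *) (_tlsbase - CYGTLS_PADSIZE))
// The structure starts a fixed number of bytes below the stack base.
inline std::uintptr_t actual_address(std::uintptr_t tlsbase) {
    return tlsbase - CYGTLS_PADSIZE;
}
```

With a stack base of 0x130000, the documented form would place the structure 4000 bytes below the base under this assumed size, while the form in the source places it 18000 bytes below, regardless of the structure's size.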
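This gap is needed because, although Windows points ESP at the bottom of the stack when it creates a thread, ESP necessarily moves downward once the thread starts running. If the _cygtls structure were placed directly at the bottom of the stack, some of its members would inevitably be overwritten.
For the same reason, cygwin requires that ESP be moved down manually, to below _my_tls, before cygwin1.dll is used, so that later stack use does not overwrite the contents of _my_tls.
This is easy to achieve: simply define an array before calling into cygwin1.dll: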
char padding[CYGTLS_PADSIZE];
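The stack pointer then moves down automatically. The array only needs to be at least CYGTLS_PADSIZE bytes in size. The program must of course never write to this array, otherwise it would overwrite the contents of _my_tls.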
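The size requirement can be reasoned about with a simplified model of the stack. The function padding_protects_tls() and its parameters are hypothetical: it assumes a stack that grows downward from stack_base, that bytes_in_use bytes are already occupied when the array is declared, and that the array is laid out directly below them. Real compilers may arrange a frame differently, so this is a sketch of the arithmetic, not a guarantee.

```cpp
#include <cassert>
#include <cstdint>

const std::uintptr_t CYGTLS_PADSIZE = 18000;  // value quoted from the cygwin source

// Simplified model: returns true when, after reserving padding_size bytes,
// the stack pointer is at or below the start of _my_tls
// (stack_base - CYGTLS_PADSIZE), so later pushes cannot land inside it.
inline bool padding_protects_tls(std::uintptr_t stack_base,
                                 std::uintptr_t bytes_in_use,
                                 std::uintptr_t padding_size) {
    std::uintptr_t esp_after_padding = stack_base - bytes_in_use - padding_size;
    std::uintptr_t my_tls_start = stack_base - CYGTLS_PADSIZE;
    return esp_after_padding <= my_tls_start;
}
```

In this model an array of exactly CYGTLS_PADSIZE bytes is always sufficient, while a smaller array is only sufficient if enough of the stack is already in use, which is why the array should not be smaller than CYGTLS_PADSIZE.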