Translation of the State Threads official documentation: programming notes

酷咪哥

Last modified 2023-02-11 10:00:32


First published 2016-11-14 20:31:17

Article link: https://blog.csdn.net/weixin_35804181/article/details/53164056


##Porting
The State Threads library uses OS concepts that are available in some form on most UNIX platforms, which makes the library very portable across many flavors of UNIX. However, several parts of the library rely on platform-specific features. These parts are:

  *Thread context initialization: Two ingredients of the jmp_buf data structure (the program counter and the stack pointer) have to be set manually in the thread creation routine. The jmp_buf data structure is defined in the setjmp.h header file and differs from platform to platform. Usually the program counter is a structure member with PC in the name and the stack pointer is a structure member with SP in the name. One can also look at the source of Netscape's NSPR library, which already has this code for many UNIX-like platforms (mozilla/nsprpub/pr/include/md/*.h files).
  Note that on some BSD-derived platforms the _setjmp(3)/_longjmp(3) calls should be used instead of setjmp(3)/longjmp(3); these are the calls that manipulate only the stack and registers and do not save and restore the process's signal mask.
  Starting with glibc 2.4 on Linux, the opacity of the jmp_buf data structure is enforced by setjmp(3)/longjmp(3), so the jmp_buf ingredients can no longer be accessed directly (unless the special environment variable LD_POINTER_GUARD is set before the application is executed). To avoid depending on a custom environment, the State Threads library provides setjmp/longjmp replacement functions for all Intel CPU architectures. Other CPU architectures can also be supported easily (setjmp/longjmp source code is widely available for many CPU architectures).

  *High resolution time function: Some platforms (IRIX, Solaris) provide a high resolution time function based on a free-running hardware counter. This function returns the time counted since some arbitrary moment in the past (usually machine power-up time). It is not correlated in any way with the time of day, and thus is not subject to resetting, drifting, etc. This type of time is ideal for tasks that require cheap, accurate interval timing. If such a function is not available on a particular platform, the gettimeofday(3) function can be used instead (though on some platforms it involves a system call).

  *The stack growth direction: The library needs to know whether the stack grows toward lower (down) or higher (up) memory addresses. One can write a simple test program that detects the stack growth direction on a particular platform.

  *Non-blocking attribute inheritance: On some platforms (e.g. IRIX) the socket created as a result of the accept(2) call inherits the non-blocking attribute of the listening socket. One needs to consult the manual pages or write a simple test program to see whether this applies to a specific platform.

  *Anonymous memory mapping: The library allocates memory segments for thread stacks by doing anonymous memory mapping (mmap(2)). This mapping is somewhat different on SVR4 and BSD4.3 derived platforms. The memory mapping can be avoided altogether by using malloc(3) for stack allocation. In this case the MALLOC_STACK macro should be defined.
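The stack growth test mentioned in the list above could look roughly like the following. This is a minimal sketch, not code from the library: the function names are made up for illustration, and the `noinline` attribute is a GCC/Clang extension used to make sure the callee really gets its own stack frame.

```c
#include <assert.h>
#include <stdint.h>

/* Prevent inlining so that the callee gets a separate stack frame.
 * The attribute is a GCC/Clang extension (assumption); elsewhere the
 * macro expands to nothing. */
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Compare the address of a local in the caller with one in the callee.
 * Addresses are compared as integers to avoid comparing unrelated pointers. */
NOINLINE int compare_frames(volatile char *caller_local)
{
    volatile char callee_local = 0;
    uintptr_t caller_addr = (uintptr_t)caller_local;
    uintptr_t callee_addr = (uintptr_t)&callee_local;

    return (callee_addr < caller_addr) ? -1 : 1;
}

/* Returns -1 if the stack grows toward lower addresses (down),
 * 1 if it grows toward higher addresses (up). */
int stack_growth_direction(void)
{
    volatile char caller_local = 0;

    return compare_frames(&caller_local);
}
```

Calling `stack_growth_direction()` once from a small test program and printing the result is enough to decide which direction to configure for a given platform.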

All machine-dependent feature test macros should be defined in the md.h header file. The assembly code for the setjmp/longjmp replacement functions for all CPU architectures should be placed in the md.S file.

The current version of the library has been ported to:

  *IRIX 6.x (both 32 and 64 bit)
  *Linux (kernel 2.x and glibc 2.x) on x86, Alpha, MIPS and MIPSEL, SPARC, ARM, PowerPC, 68k, HPPA, S390, IA-64, and Opteron (AMD-64)
  *Solaris 2.x (SunOS 5.x) on x86, AMD64, SPARC, and SPARC-64
  *AIX 4.x
  *HP-UX 11 (both 32 and 64 bit)
  *Tru64/OSF1
  *FreeBSD on x86, AMD64, and Alpha
  *OpenBSD on x86, AMD64, Alpha, and SPARC
  *NetBSD on x86, Alpha, SPARC, and VAX
  *MacOS X (Darwin/Tiger) on PowerPC and 32-bit and 64-bit x86
  *Cygwin
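When porting to a platform that is not on this list, the non-blocking attribute inheritance question from the porting checklist can be answered with a small test. The sketch below is an illustration under assumptions, not part of the library: the function name is hypothetical, and it uses a loopback TCP socket with only standard POSIX calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if a socket returned by accept(2) inherits O_NONBLOCK from the
 * listening socket, 0 if it does not, and -1 if the check itself failed. */
int accept_inherits_nonblock(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    struct pollfd pfd;
    int lfd = -1, cfd = -1, afd = -1, flags, result = -1;

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        goto done;

    /* Bind to an ephemeral port on the loopback interface. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
        goto done;

    /* Make the listening socket non-blocking. */
    flags = fcntl(lfd, F_GETFL, 0);
    if (flags < 0 || fcntl(lfd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto done;

    /* Connect a client so that accept() has something to return. */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0 || connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;

    /* Wait up to one second until the connection is ready to be accepted. */
    pfd.fd = lfd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, 1000) != 1)
        goto done;

    afd = accept(lfd, NULL, NULL);
    if (afd < 0)
        goto done;

    /* Inspect the file status flags of the accepted socket. */
    flags = fcntl(afd, F_GETFL, 0);
    if (flags >= 0)
        result = (flags & O_NONBLOCK) ? 1 : 0;

done:
    if (afd >= 0) close(afd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return result;
}
```

The result is platform-dependent by design, so the function reports what it observed rather than assuming an answer.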

##Signals
Signal handling in an application using State Threads should be treated the same way as in a classical UNIX process application. There is no such thing as a per-thread signal mask, all threads share the same signal handlers, and only async-signal-safe functions can be used in signal handlers. However, there is a way to process signals synchronously by converting a signal event to an I/O event: a signal catching function writes to a pipe, and a dedicated signal handling thread reads from that pipe and processes the signal synchronously. The following code demonstrates this technique (error handling is omitted for clarity):

#include <errno.h>
#include <signal.h>
#include <unistd.h>
#include <st.h>

/* Per-process pipe which is used as a signal queue. */
/* Up to PIPE_BUF/sizeof(int) signals can be queued up. */
int sig_pipe[2];

/* Signal catching function. */
/* Converts signal event to I/O event. */
void sig_catcher(int signo)
{
    int err;
    /* Save errno to restore it after the write() */
    err = errno;
    /* write() is reentrant/async-safe */
    write(sig_pipe[1], &signo, sizeof(int));
    errno = err;
}

/* Signal processing function. */
/* This is the "main" function of the signal processing thread. */
void *sig_process(void *arg)
{
    st_netfd_t nfd;
    int signo;
    nfd = st_netfd_open(sig_pipe[0]);

    for ( ; ; ) {
        /* Read the next signal from the pipe */
        st_read(nfd, &signo, sizeof(int), ST_UTIME_NO_TIMEOUT);
        /* Process signal synchronously */
        switch (signo) {
            case SIGHUP:
                /* do something here - reread config files, etc. */
                break;
            case SIGTERM:
                /* do something here - cleanup, etc. */
                break;
                /*..Other signals..*/
        }
    }
    return NULL;
}

int main(int argc, char *argv[])
{
    struct sigaction sa;
    .
    .
    .

    /* Create signal pipe */
    pipe(sig_pipe);
    /* Create signal processing thread */
    st_thread_create(sig_process, NULL, 0, 0);
    /* Install sig_catcher() as a signal handler */
    sa.sa_handler = sig_catcher;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGHUP, &sa, NULL);

    sa.sa_handler = sig_catcher;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGTERM, &sa, NULL);
    .
    .
    .
}

Note that if multiple processes are used (see below), the signal pipe should be initialized after the fork(2) call so that each process has its own private pipe.
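The conversion of a signal event into an I/O event can also be tried without the State Threads library. The sketch below is a simplified variant under assumptions: a blocking read(2) stands in for st_read(), and the `sigq_*` names are hypothetical. In a multi-process application, `sigq_install()` would be called in each process after fork(2).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Per-process signal queue (hypothetical name). */
static int sigq_pipe[2] = { -1, -1 };

/* Signal handler: forward the signal number into the pipe.
 * write() is async-signal-safe; errno is saved and restored. */
static void sigq_catcher(int signo)
{
    int saved_errno = errno;
    ssize_t n = write(sigq_pipe[1], &signo, sizeof(signo));

    (void)n;  /* nothing safe can be done about a failure here */
    errno = saved_errno;
}

/* Create the pipe on first use and install the handler for one signal.
 * Returns 0 on success, -1 on error. */
int sigq_install(int signo)
{
    struct sigaction sa;

    if (sigq_pipe[0] < 0 && pipe(sigq_pipe) < 0)
        return -1;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigq_catcher;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}

/* Block until a queued signal number is available and return it.
 * Returns -1 on error. A State Threads application would use st_read()
 * here so that only the signal handling thread blocks. */
int sigq_next(void)
{
    int signo;
    ssize_t n;

    do {
        n = read(sigq_pipe[0], &signo, sizeof(signo));
    } while (n < 0 && errno == EINTR);
    return (n == (ssize_t)sizeof(signo)) ? signo : -1;
}
```

Signals are delivered to the reader in the order in which the handler wrote them, so the processing loop sees them as an ordinary byte stream.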

##Intra-Process Synchronization
Due to the event-driven nature of the library scheduler, the thread context switch (process state change) can only happen in a well-known set of library functions. This set includes functions in which a thread may “block”: I/O functions (st_read(), st_write(), etc.), sleep functions (st_sleep(), etc.), and thread synchronization functions (st_thread_join(), st_cond_wait(), etc.). As a result, process-specific global data need not be protected by locks, since a thread cannot be rescheduled while in a critical section (and only one thread at a time can access the same memory location). By the same token, functions that are not thread-safe in the traditional sense can be used safely with State Threads. The library’s mutex facilities are practically useless for a correctly written application (one with no blocking functions in critical sections) and are provided mostly for completeness. This absence of locking greatly simplifies application design and provides a foundation for scalability.

##Inter-Process Synchronization
The State Threads library makes it possible to multiplex a large number of simultaneous connections onto a much smaller number of separate processes, where each process uses a many-to-one user-level threading implementation (N of M:1 mappings rather than one M:N mapping used in native threading libraries on some platforms). This design is key to the application’s scalability. One can think about it as if a set of all threads is partitioned into separate groups (processes) where each group has a separate pool of resources (virtual address space, file descriptors, etc.). An application designer has full control of how many groups (processes) an application creates and what resources, if any, are shared among different groups via standard UNIX inter-process communication (IPC) facilities.

There are several reasons for creating multiple processes:

  *To take advantage of multiple hardware entities (CPUs, disks, etc.) available in the system (hardware parallelism).
  *To reduce the risk of losing a large number of user connections when one of the processes crashes. For example, if C user connections (threads) are multiplexed onto P processes and one of the processes crashes, only a fraction (C/P) of all connections will be lost.
  *To overcome per-process resource limitations imposed by the OS. For example, if select(2) is used for event polling, the number of simultaneous connections (threads) per process is limited by the FD_SETSIZE parameter (see select(2)). If FD_SETSIZE is equal to 1024 and each connection needs one file descriptor, then an application should create 10 processes to support 10,000 simultaneous connections.
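The arithmetic behind the last two reasons can be written down as a small helper. This is only an illustration: the function names are hypothetical, and the calculation ignores descriptors a real process also needs (listening sockets, log files, and so on), so actual limits are somewhat lower.

```c
#include <assert.h>

/* Smallest number of processes such that every connection gets its
 * descriptors within the per-process limit. Returns 0 on invalid input. */
long processes_needed(long connections, long fds_per_connection, long fd_limit)
{
    long per_process;

    if (connections < 0 || fds_per_connection <= 0 ||
        fd_limit < fds_per_connection)
        return 0;
    /* Connections a single process can hold. */
    per_process = fd_limit / fds_per_connection;
    /* Integer division rounded up. */
    return (connections + per_process - 1) / per_process;
}

/* Connections lost when one process crashes, assuming connections are
 * spread evenly over 'processes' (C/P, rounded up for the worst case). */
long connections_lost_per_crash(long connections, long processes)
{
    if (connections < 0 || processes <= 0)
        return 0;
    return (connections + processes - 1) / processes;
}
```

With the numbers from the text, `processes_needed(10000, 1, 1024)` gives 10, and with 10 processes a single crash loses `connections_lost_per_crash(10000, 10)`, that is 1000 connections.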

Ideally all user sessions are completely independent, so there is no need for inter-process communication. It is always better to have several separate smaller process-specific resources (e.g., data caches) than to have one large resource shared (and modified) by all processes. Sometimes, however, there is a need to share a common resource among different processes. In that case, standard UNIX IPC facilities can be used. In addition to that, there is a way to synchronize different processes so that only the thread accessing the shared resource will be suspended (but not the entire process) if that resource is unavailable. In the following code fragment a pipe is used as a counting semaphore for inter-process synchronization:

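#include <limits.h>  /* PIPE_BUF, where the platform defines it */
#include <stdlib.h>
#include <unistd.h>
#include <st.h>

#ifndef PIPE_BUF
#define PIPE_BUF 512  /* POSIX */
#endif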

/* Semaphore data structure */
typedef struct ipc_sem {
    st_netfd_t rdfd;  /* read descriptor */
    st_netfd_t wrfd;  /* write descriptor */
} ipc_sem_t;

/* Create and initialize the semaphore. Should be called before fork(2). */
/* 'value' must be less than PIPE_BUF. */
/* If 'value' is 1, the semaphore works as mutex. */
ipc_sem_t *ipc_sem_create(int value)
{
    ipc_sem_t *sem;
    int p[2];
    char b[PIPE_BUF] = {0};  /* token bytes; their values are never inspected */

    /* Error checking is omitted for clarity */
    sem = malloc(sizeof(ipc_sem_t));

    /* Create the pipe */
    pipe(p);
    sem->rdfd = st_netfd_open(p[0]);
    sem->wrfd = st_netfd_open(p[1]);

    /* Initialize the semaphore: put 'value' bytes into the pipe */
    write(p[1], b, value);

    return sem;
}

/* Try to decrement the "value" of the semaphore. */
/* If "value" is 0, the calling thread blocks on the semaphore. */
int ipc_sem_wait(ipc_sem_t *sem)
{
    char c;

    /* Read one byte from the pipe */
    if (st_read(sem->rdfd, &c, 1, ST_UTIME_NO_TIMEOUT) != 1)
        return -1;
    return 0;
}

/* Increment the "value" of the semaphore. */
int ipc_sem_post(ipc_sem_t *sem)
{
    char c = 0;  /* token byte; its value is never inspected */

    if (st_write(sem->wrfd, &c, 1, ST_UTIME_NO_TIMEOUT) != 1)
        return -1;
    return 0;
}

Generally, the following steps should be followed when writing an application using the State Threads library:

  *Initialize the library (st_init()).
  *Create resources that will be shared among different processes: create and bind listening sockets, create shared memory segments, IPC channels, synchronization primitives, etc.
  *Create several processes (fork(2)). The parent process should either exit or become a “watchdog” (e.g., it starts a new process when an existing one crashes, does a cleanup upon application termination, etc.).
  *In each child process create a pool of threads (st_thread_create()) to handle user connections.
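The process-creation step in the list above can be sketched with plain POSIX calls. This is a simplified illustration, not the library's own code: `spawn_workers`, `reap_workers`, and `demo_worker` are hypothetical names, and the State Threads calls are only indicated in comments so that the fragment compiles without the library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical worker entry point; its return value is the exit status. */
typedef int (*worker_fn)(int worker_index);

/* Placeholder worker. A real one would create its thread pool with
 * st_thread_create() and serve connections until shutdown. */
int demo_worker(int worker_index)
{
    (void)worker_index;
    return 0;
}

/* Fork up to 'count' worker processes and record their pids.
 * Shared resources (listening sockets, IPC channels) must already exist.
 * Returns the number of children actually started. */
int spawn_workers(int count, worker_fn worker, pid_t *pids)
{
    int i;

    for (i = 0; i < count; i++) {
        pid_t pid = fork();

        if (pid < 0)
            break;               /* report how many were started */
        if (pid == 0)
            _exit(worker(i));    /* child never returns to the caller */
        pids[i] = pid;
    }
    return i;
}

/* Parent side: wait for all workers.
 * Returns how many of them exited normally with status 0. */
int reap_workers(int count, const pid_t *pids)
{
    int i, ok = 0;

    for (i = 0; i < count; i++) {
        int status;

        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

A watchdog parent would extend `reap_workers()` to start a replacement whenever a worker terminates abnormally, instead of only counting clean exits.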

##Non-Network I/O
The State Threads architecture uses non-blocking I/O on st_netfd_t objects for concurrent processing of multiple user connections. This architecture has a drawback: the entire process and all its threads may block for the duration of a disk or other non-network I/O operation, whether through State Threads I/O functions, direct system calls, or standard I/O functions. (This is applicable mostly to disk reads; disk writes are usually performed asynchronously – data goes to the buffer cache to be written to disk later.) Fortunately, disk I/O (unlike network I/O) usually takes a finite and predictable amount of time, but this may not be true for special devices or user input devices (including stdin). Nevertheless, such I/O reduces throughput of the system and increases response times. There are several ways to design an application to overcome this drawback:

  *Create several identical main processes as described above (symmetric architecture). This will improve CPU utilization and thus improve the overall throughput of the system.
  *Create multiple “helper” processes in addition to the main process that will handle blocking I/O operations (asymmetric architecture). This approach was suggested for Web servers in a paper by Peter Druschel et al. In this architecture the main process communicates with a helper process via an IPC channel (pipe(2), socketpair(2)). The main process instructs a helper to perform the potentially blocking operation. Once the operation completes, the helper returns a notification via IPC.
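The asymmetric architecture can be illustrated with a single helper process connected through socketpair(2). The sketch below rests on several assumptions: all names and the fixed-size request format are hypothetical, the blocking operation is simply reading a file and counting its bytes, and the main process uses plain read(2)/write(2). In a State Threads application the main process's end of the channel would instead be wrapped with st_netfd_open() and accessed with st_read()/st_write(), so that only the requesting thread waits.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define HELPER_PATH_MAX 256  /* hypothetical fixed request size */

/* Read exactly 'len' bytes; returns 0 on success, -1 on EOF or error. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* The potentially blocking work: read a file and count its bytes. */
static long count_file_bytes(const char *path)
{
    char buf[4096];
    long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;
    close(fd);
    return (n < 0) ? -1 : total;
}

/* Helper loop: one fixed-size request in, one result out, until EOF. */
static void helper_loop(int chan)
{
    char path[HELPER_PATH_MAX];
    long result;

    while (read_full(chan, path, sizeof(path)) == 0) {
        path[sizeof(path) - 1] = '\0';
        result = count_file_bytes(path);
        if (write(chan, &result, sizeof(result)) != (ssize_t)sizeof(result))
            break;
    }
    _exit(0);
}

/* Start a helper; returns the main process's end of the channel or -1. */
int helper_start(pid_t *helper_pid)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    *helper_pid = fork();
    if (*helper_pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (*helper_pid == 0) {
        close(sv[0]);
        helper_loop(sv[1]);  /* does not return */
    }
    close(sv[1]);
    return sv[0];
}

/* Send one request and wait for the helper's notification. */
long helper_request(int chan, const char *path)
{
    char req[HELPER_PATH_MAX];
    long result;

    if (strlen(path) >= sizeof(req))
        return -1;
    memset(req, 0, sizeof(req));
    strcpy(req, path);
    if (write(chan, req, sizeof(req)) != (ssize_t)sizeof(req) ||
        read_full(chan, &result, sizeof(result)) != 0)
        return -1;
    return result;
}
```

Closing the main process's end of the channel makes the helper see end-of-file and exit, which keeps shutdown simple.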

##Timeouts
The timeout parameter to st_cond_timedwait() and the I/O functions, and the arguments to st_sleep() and st_usleep(), specify a maximum time to wait since the last context switch, not since the beginning of the function call.

The State Threads’ time resolution is actually the time interval between context switches. That time interval may be large in some situations, for example, when a single thread does a lot of work continuously. Note that a steady, uninterrupted stream of network I/O qualifies for this description; a context switch occurs only when a thread blocks.

If a specified I/O timeout is less than the time interval between context switches, the function may return with a timeout error before that amount of time has elapsed since the beginning of the function call. For example, if eight milliseconds have passed since the last context switch and an I/O function with a timeout of 10 milliseconds blocks, causing a switch, the call may return with a timeout error as little as two milliseconds after it was called. (On Linux, select()'s timeout is an upper bound on the amount of time elapsed before select() returns.) Similarly, if 12 ms have passed already, the function may return immediately.
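The two cases in this example follow from one subtraction. The helper below only restates the arithmetic from the text; its name is hypothetical and it does not call into the library.

```c
#include <assert.h>

/* Earliest moment (in ms after the call) at which a timeout error may be
 * reported, given that the timeout is measured from the last context
 * switch rather than from the beginning of the call. */
long earliest_timeout_after_call(long timeout_ms, long elapsed_since_switch_ms)
{
    long remaining = timeout_ms - elapsed_since_switch_ms;

    return remaining > 0 ? remaining : 0;
}
```

For the cases above, `earliest_timeout_after_call(10, 8)` is 2 and `earliest_timeout_after_call(10, 12)` is 0, which is why timeouts close to the interval between context switches are unreliable.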

In almost all cases I/O timeouts should be used only for detecting a broken network connection or for preventing a peer from holding an idle connection for too long. Therefore, for most applications realistic I/O timeouts should be on the order of seconds. Furthermore, there’s probably no point in retrying operations that time out. Rather than retrying, simply use a larger timeout in the first place.

The largest valid timeout value is platform-dependent and may be significantly less than INT_MAX seconds for select() or INT_MAX milliseconds for poll(). Generally, you should not use timeouts exceeding several hours. Use ST_UTIME_NO_TIMEOUT (-1) as a special value to indicate infinite timeout or indefinite sleep. Use ST_UTIME_NO_WAIT (0) to indicate no waiting at all.