Threads Explained: Concepts, Creation, Scheduling, and Synchronization - CSDN Blog

Article link: https://blog.csdn.net/STX19980909/article/details/120485256

1. What is a thread?
Source code -compile and link-> Program -load into memory-> Process
                                   |                           |
                                  File                       Memory
                                                            /      \
                                                        Code  <-  executed
                                                        Data  <-  processed
                                                          |          |  <- CPU
                                                       Static     Dynamic
                                                          |          |
                                                      Resources   Threads
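A thread is the execution of a process: a control sequence inside the process, or in other words, one task within it.
A process can have several threads at once, that is, several execution paths that the system schedules concurrently, but it always has at least one: the main thread, which runs the main function and every function that main calls.
All threads of a process share its code segment, data segment, BSS segment, heap, environment variables and command-line arguments, file descriptor table, signal handlers, current working directory, user and group IDs, and so on. The stack is not shared: every thread of a process has its own separate stack.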

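Thread scheduling:
1) The part of the kernel responsible for scheduling threads is called the scheduler.
2) The scheduler lines up all threads in the ready state (not blocked in any system call) in a queue, the ready queue.
3) The scheduler takes the thread at the head of the ready queue, assigns it a time slice, and has the processor run it. After a while, one of two things happens:
A. The thread uses up its time slice. The scheduler immediately suspends it, moves it to the tail of the ready queue, and takes the next thread from the head.
B. The thread still has time left in its slice but must block in a system call, for example to wait for I/O or to sleep. The scheduler suspends it and moves it from the ready queue to a wait queue. Once the condition it is waiting for is met, it is moved back to the ready queue.
4) If a higher-priority thread becomes ready while a lower-priority thread is running, the higher-priority thread preempts it and takes over its time slice.
5) If the ready queue is empty, the kernel idles until the queue is non-empty again.
6) On a multitasking time-sharing system such as Linux, the basic unit of scheduling is the thread.
7) The time slice should not be too long: threads that have not been given the processor would wait too long, the system would be less concurrent, and users would notice the delay in response. Nor should it be too short: a very short slice means more frequent context switches between threads, which lowers overall performance.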

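2. Basic characteristics of threads
1) A thread is an independent entity within a process. It can own resources of its own and can be identified independently by its thread ID. It is also the basic unit of scheduling and takes part in time-slice allocation.
2) A thread passes through different states and operations, such as creation, running, termination, suspension, resumption, and cancellation.
3) Most of the resources a thread uses still belong to the process, so a thread, being part of a process, cannot exist apart from it.
4) A process can run several threads at once. They may run the same code to do the same job, or different code to do different jobs.
5) Creating a thread costs far less than creating a process, which is why threads are also called lightweight processes. For problems such as concurrency, consider multithreading first and multiprocessing second.
6) The trouble with multithreading is that so many resources are shared that conflicts arise easily, and resolving them may add overhead of its own. Multiprocessing therefore still has its advantages.
Processes: isolated memory, so they need communication.
Threads: shared memory, so they need synchronization.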

3. POSIX threads
#include <pthread.h>
Link with -lpthread (-lpthread -> libpthread.so).

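4. Creating a thread
Thread procedure: the function that the system calls to run a thread. The execution of that function is the execution of the thread, and returning from it ends the thread. In that sense, main is the thread procedure of a process's main thread. Every thread you create yourself must have a corresponding thread procedure.
void* thread_procedure(void* arg) {
    /* the work of the thread */
}
arg - pointer to the thread's argument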

int pthread_create(pthread_t* tid, const pthread_attr_t* attr, void* (*start_routine)(void*), void* arg);
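Returns 0 on success, or an error code on failure (the code is the return value itself, not errno).
tid - output: the identifier (TID) of the new thread.
attr - thread attributes; NULL selects the defaults.
start_routine - pointer to the thread procedure.
arg - pointer to the thread's argument.
pthread_create
-> 1. creates a new thread
-> 2. calls the thread procedure (start_routine) in it, passing the argument pointer (arg)
The new child thread and the parent thread that created it run concurrently, and the order in which they are scheduled cannot be predicted. When pthread_create returns, there is no telling how far the child has got: its thread procedure may not have been called yet, may be running, or may even have returned already. The object passed to the thread as its argument may be released only once the thread procedure no longer uses it.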
Code: create.c

/* create.c */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
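void* thread_proc(void* arg) {
    // pthread_t is an opaque type; the cast to unsigned long works on Linux
    printf("thread %lu: %s\n", (unsigned long)pthread_self(), (char*)arg);
    return NULL;
}
int main(void) {
    pthread_t tid;
    int error = pthread_create(&tid, NULL, thread_proc, "I am the child thread!");
    if (error) {
        fprintf(stderr, "pthread_create: %s\n", strerror(error));
        return -1;
    }
    printf("thread %lu: I am the main thread; I created thread %lu.\n",
           (unsigned long)pthread_self(), (unsigned long)tid);
    sleep(1); // crude wait so the child can run before the process exits
    return 0;
}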

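pthread_self() returns the calling thread's own TID.

The main thread and the child threads created with pthread_create run "simultaneously" in time. Unless some synchronization is imposed, the order of their individual steps is completely unpredictable. This is called free concurrency.
To make a thread procedure more flexible, specific information can be passed in through the thread argument, so that the same mechanism can carry out different tasks.
Code: arg.c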

/* arg.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <unistd.h>
#include <pthread.h>
#define PI 3.14159
void* thread_area(void* arg) {
    double r = *(double*)arg;
    *(double*)arg = PI * r * r;
    return NULL;
}
struct Pyth {
    double a, b, c;
};
void* thread_pyth(void* arg) {
    struct Pyth* pyth = (struct Pyth*)arg;
    pyth->c = sqrt((pyth->a * pyth->a + pyth->b * pyth->b));
    return NULL;
}
void* thread_aver(void* arg) {
    double* d = (double*)arg;
    d[2] = (d[0] + d[1]) / 2;
    return NULL;
}
void* thread_show(void* arg) {
    printf("%d\n", *(int*)arg);
    return NULL;
}
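int main(void) {
    pthread_t tid;
    // Each usleep below is only a crude wait for the thread to finish;
    // pthread_join (section 5) is the reliable way to wait.
    double r = 10; // in: radius, out: area
    pthread_create(&tid, NULL, thread_area, &r);
    usleep(100000);
    printf("%g\n", r);
    struct Pyth pyth = {3, 4}; // in: a and b, out: c
    pthread_create(&tid, NULL, thread_pyth, &pyth);
    usleep(100000);
    printf("%g\n", pyth.c);
    double d[3] = {123, 456}; // in: d[0] and d[1], out: d[2]
    pthread_create(&tid, NULL, thread_aver, d);
    usleep(100000);
    printf("%g\n", d[2]);
    int* n = malloc(sizeof(int)); // heap-allocated argument
    *n = 1234;
    pthread_create(&tid, NULL, thread_show, n);
    usleep(100000);
    free(n); // freed only after the thread has had time to use it
    return 0;
}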

gcc -o arg arg.c -lpthread -lm

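5. Joining a thread
      create     join
--------+---------+-------> main thread
         \_______/          child thread
int pthread_join(pthread_t tid, void** retval);
Returns 0 on success, or an error code on failure.
tid - thread identifier
retval - output: the thread's exit code (the pointer it returned)
When pthread_join is called:
If thread tid has already terminated, it returns at once with the thread's exit code.
If thread tid has not yet terminated, it blocks until that thread terminates.
void* thread_procedure(void* arg) {
    /* the work of the thread */
    return p;
}
Purpose: wait for the child thread to terminate, release its resources, and obtain the return value of its thread procedure.
Code: join.c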

/* join.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <unistd.h>
#include <pthread.h>
#define PI 3.14159
void* thread_area(void* arg) {
    double r = *(double*)arg;
    double* s = malloc(sizeof(double));
    *s = PI * r * r;
    return s;
}
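int main(void) {
    pthread_t tid;
    double r = 10;
    int error = pthread_create(&tid, NULL, thread_area, &r);
    if (error) {
        fprintf(stderr, "pthread_create: %s\n", strerror(error));
        return -1;
    }
    void* retval; // receives the pointer returned by thread_area
    pthread_join(tid, &retval);
    double* s = retval;
    printf("%g\n", *s);
    free(s); // allocated in the thread, released by the joiner
    return 0;
}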

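6. Detaching a thread
Sometimes the parent thread that creates a child thread does not care when the child terminates and does not need its return value. In that case the child can be made a detached thread. When a detached thread terminates, the system reclaims its resources automatically, and the parent does not need to call pthread_join.
int pthread_detach(pthread_t tid);
Returns 0 on success, or an error code on failure.
tid - thread identifier
Code: detach.c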

/* detach.c */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
void* thread_proc(void* arg) {
    for (int i = 0; i < 200; ++i) {
        putchar('-');
        usleep(50000);
    }
    return NULL;
}
int main(void) {
    setbuf(stdout, NULL); // unbuffered, so each character appears at once
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    pthread_t tid;
    pthread_create(&tid, /*NULL*/&attr, thread_proc, NULL);
    pthread_attr_destroy(&attr);
    //pthread_detach(tid); // alternative: detach after creation
    // A detached thread cannot be joined; this call is here to show it fail.
    int error = pthread_join(tid, NULL);
    if (error)
        fprintf(stderr, "pthread_join: %s\n", strerror(error));
    for (int i = 0; i < 200; ++i) {
        putchar('+');
        usleep(100000);
    }
    printf("\n");
    return 0;
}

pthread_attr_t attr;
pthread_attr_init(&attr); // initialize the thread attributes with defaults
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
pthread_create(..., &attr, ...);
pthread_attr_destroy(&attr);

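7. Thread IDs
pthread_t tid;
pthread_create(&tid, ...); -> outputs the child thread's TID
pthread_self(); -> returns the calling thread's TID
if (tid1 == tid2) // not portable: pthread_t may be a struct
...
int pthread_equal(pthread_t tid1, pthread_t tid2);
Returns nonzero if the two TIDs are equal, and 0 otherwise.
if (pthread_equal(tid1, tid2)) // portable
Code: equal.c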

/* equal.c */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <syscall.h>
#include <pthread.h>
pthread_t g_main; // TID of the main thread
void foo(void) {
    if (pthread_equal(pthread_self(), g_main))
        printf("foo was called in the main thread.\n");
    else
        printf("foo was called in a child thread.\n");
}
void* thread_proc(void* arg) {
    printf("Kernel TID of the child thread: %ld\n", syscall(SYS_gettid));
    foo();
    return NULL;
}
int main(void) {
    g_main = pthread_self();
    printf("POSIX library TID of the main thread: %lu\n", (unsigned long)g_main);
    printf("Kernel TID of the main thread: %ld\n", syscall(SYS_gettid));
    printf("PID of the process: %d\n", getpid());
    foo();
    pthread_t tid;
    pthread_create(&tid, NULL, thread_proc, NULL);
    pthread_join(tid, NULL);
    return 0;
}

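The thread ID returned by pthread_self and the one output by pthread_create are both virtual (pseudo) thread IDs maintained inside the pthread library. They can be passed to any other pthread function that takes a thread ID. The real thread ID maintained by the kernel can be obtained with syscall(SYS_gettid):
#include <unistd.h> // declares the syscall function
#include <syscall.h> // defines the SYS_gettid macro
On Linux, the PID of a process is in fact the TID of its main thread.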

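8. Terminating a thread (the calling thread)
1) Returning from the thread procedure terminates the thread that runs it. The return value can be delivered to the caller of pthread_join through its second parameter.
2) The thread procedure, or any function it calls, can terminate the current thread by calling pthread_exit:
void pthread_exit(void* retval);
The retval parameter plays the role of the thread procedure's return value, and it too can be delivered to the caller through the second parameter of pthread_join.
Code: exit.c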

/* exit.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <unistd.h>
#include <pthread.h>
#define PI 3.14159
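void calc_area(double r) {
    double* s = malloc(sizeof(double));
    *s = PI * r * r;
    pthread_exit(s); // ends the calling thread; s becomes its exit code
}
void* thread_area(void* arg) {
    printf("Calling calc_area...\n");
    calc_area(*(double*)arg);
    // never reached: calc_area ends the thread with pthread_exit
    printf("Returned from calc_area.\n");
    return NULL;
}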
int main(void) {
    pthread_t tid;
    double r = 10;
    pthread_create(&tid, NULL, thread_area, &r);
    //pthread_exit(NULL);
    double* s;
    pthread_join(tid, (void**)&s);
    printf("%g\n", *s);
    free(s);
    return 0;
}

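Note: calling pthread_exit in a child thread terminates only the calling thread. Its sibling threads and the main thread are not affected. Calling pthread_exit in the main thread likewise terminates only the main thread: the other threads keep running, and the process ends when its last thread terminates. It is returning from main (or calling exit) that terminates the whole process together with all of its threads.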

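9. Canceling (another) thread
int pthread_cancel(pthread_t tid);
Returns 0 on success, or an error code on failure.
tid - TID of the thread to cancel.
This function only sends a cancellation request to the given thread. It does not wait for the thread to stop. By default, a thread that receives a cancellation request does not terminate right away. It keeps running until it reaches a cancellation point, where it checks whether it has been canceled and, if so, terminates immediately. Cancellation points are usually found inside certain system calls.
Set the calling thread's cancelability state to accept or ignore cancellation requests:
int pthread_setcancelstate(int state, int* oldstate);
state - cancelability state, one of:
PTHREAD_CANCEL_ENABLE - accept cancellation requests (default)
PTHREAD_CANCEL_DISABLE - ignore cancellation requests
oldstate - output: the previous cancelability state; may be NULL.
Set the calling thread's cancelability type to deferred or immediate:
int pthread_setcanceltype(int type, int* oldtype);
type - cancelability type, one of:
PTHREAD_CANCEL_DEFERRED - deferred cancellation (default): on receiving a request that is not being ignored, the thread keeps running until it reaches a cancellation point, and terminates there.
PTHREAD_CANCEL_ASYNCHRONOUS - immediate cancellation: on receiving a request that is not being ignored, the thread terminates at once.
oldtype - output: the previous cancelability type; may be NULL.
Code: cancel.c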

/* cancel.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
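void elapse(void) {
    // busy-wait; volatile keeps the compiler from optimizing the loop away
    for (volatile unsigned int i = 0; i < 800000000; ++i);
}
void* thread_proc(void* arg) {
    /*
    // refuse cancellation requests
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, NULL);
    */
    // cancelability type: immediate (asynchronous) cancellation
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
    for (;;) {
        printf("Thread: time flows on like this river, never stopping.\n");
        elapse();
    }
    return NULL;
}
int main(void) {
    setbuf(stdout, NULL);
    pthread_t tid;
    pthread_create(&tid, NULL, thread_proc, NULL);
    getchar(); // press Enter to cancel the thread
    printf("Sending a cancellation request to the thread...\n");
    pthread_cancel(tid);
    printf("Waiting for the thread to terminate...\n");
    pthread_join(tid, NULL);
    printf("The thread has terminated.\n");
    return 0;
}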

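10. Thread conflicts
g = 0
Case 1: the two threads run one after the other (final g = 2)
Thread 1                                   Thread 2
load g from memory into eax (eax=0)
add 1 to the register (eax=0->1)
store the register to memory (g=1)
                                           load g from memory into eax (eax=1)
                                           add 1 to the register (eax=1->2)
                                           store the register to memory (g=2)
Case 2: the two threads interleave (final g = 1)
Thread 1                                   Thread 2
load g from memory into eax (eax=0)
                                           load g from memory into eax (eax=0)
add 1 to the register (eax=0->1)
                                           add 1 to the register (eax=0->1)
store the register to memory (g=1)
                                           store the register to memory (g=1)
When two or more threads access the same object at the same time in a non-atomic way, the final state of the object is very likely to be unpredictable. This is what is called a thread conflict. The basic principle for resolving conflicts is to make the sensitive operations atomic: one thread must complete the whole group of sensitive operations before another thread is allowed to perform similar ones, so that the code operating on a shared resource is executed by only one thread at any time.
Code: vie.c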

/* vie.c */
#include <stdio.h>
#include <string.h>
#include <pthread.h>
unsigned int g = 0;
void* thread_proc(void* arg) {
    for (unsigned int i = 0; i < 100000000; ++i)
        ++g;
    return NULL;
}
int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, thread_proc, NULL);
    pthread_create(&t2, NULL, thread_proc, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("g = %u\n", g);
    return 0;
}