Linux: Thread Concepts and Thread Control

Latest recommended article published on 2021-04-21 16:51:20

黑米姐姐

Category: Linux Operating System

Article link: https://blog.csdn.net/qq_41809901/article/details/102859955

Basic Concepts of Linux Threads and Processes

Concepts
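In a traditional operating system, a process is described by a PCB, which controls how the program runs, while a thread is described by a separate TCB. Linux, however, implements threads with the same PCB structure it uses for processes, so a PCB under Linux actually describes a thread. Because the threads of one process share the same virtual address space, a Linux thread is called a lightweight process: it is lighter than a traditional PCB-backed process.
In other words, a process is a running program. The operating system creates a PCB to describe it, allocates resources to it, and schedules it through that PCB. A thread is an execution flow inside a process, and since Linux implements these execution flows with PCBs, a Linux thread is a lightweight process. The threads of one process share the resources allocated to that process, and the process itself is a thread group: when the system runs a program, resources are allocated to the thread group. A process is like a factory, the basic unit of resource allocation; threads are like the workers in the factory, the basic unit of CPU scheduling.
A thread runs inside a process, that is, inside the process's address space.
As shown in the figure:

The CPU still sees a PCB, but one that is lighter than a traditional process. All threads point to the same virtual address space. When a virtual address is translated, part of it is used as an index into the page table and the rest as an offset within the page. If the target page is not in memory during an access, a page fault is triggered; the page is then loaded and the mapping is established (note: page faults reduce efficiency).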
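Advantages and Disadvantages of Multiple Processes vs. Multiple Threads
Advantages of multithreading:
(1) Because threads share one virtual address space, communication is more flexible (any inter-process communication mechanism also works between threads, and global variables or thread-function arguments can be used as well)
(2) Creating and destroying a thread costs less
(3) Switching between and scheduling threads costs less
Advantages of multiple processes:
(1) Stability and robustness: suited to scenarios that demand a safe and stable main program, such as a shell or a server
Disadvantages of multithreading:
Threads have no access control between them, and some system calls and exceptions take effect on the whole process (for example, exit)
Where do multiple processes and multiple threads help with multitasking?
1. IO-bound programs (programs that perform a large amount of IO; an IO operation consists of waiting for IO plus copying data)
2. CPU-bound programs (programs that compute on data continuously)
As a rule of thumb, create about (number of CPU cores + 1) threads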
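What Threads Own and What They Share
Threads share the process's data, but each thread also owns some data of its own.
Data private to each thread:
(1) Thread ID
(2) Hardware context
(3) Stack
(4) errno
(5) Signal mask
(6) Scheduling priority (each thread is a PCB)
A thread must be schedulable, and scheduling requires a hardware context. While running, a thread may create temporary variables and make function calls, so each thread needs its own stack; otherwise threads would interfere with each other.
The private stack and hardware context are what make a thread schedulable.
Shared data:
(1) File descriptor table
(2) The disposition of each signal (default, ignore, or custom handler)
(3) Current working directory
(4) User ID and group ID
Note: a single-threaded process is a process with exactly one execution flow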
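Shared advantages of multiple processes/threads for concurrent multitasking:
(1) IO-bound programs (most of the work is IO): each execution flow can issue its own IO, so one IO no longer has to finish before the next can start, as it would with a single execution flow. The IO waits overlap, which improves IO efficiency and shortens the total waiting time;
(2) CPU-bound programs (most of the work is computation): more execution flows is not always better, since too many raise the CPU's scheduling cost. For CPU-bound programs, the number of threads is best kept at about the number of CPU cores + 1;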

Thread Control

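Linux does not provide a system call dedicated to creating a thread, so thread control goes through the POSIX thread library (pthread), which wraps the kernel's lightweight-process support in user space (threads created this way are often called user-level threads). Most of its function names begin with "pthread_".
To use the library, include the header <pthread.h> and pass the "-lpthread" option to the compiler when linking.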

Creating a Thread
The pthread_create function has the following prototype:

 #include <pthread.h>
 int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine) (void *), void *arg);

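The thread parameter returns the thread ID; attr sets the thread's attributes, and NULL is normally passed for the defaults; start_routine is a function address, the thread's entry function; arg is the argument passed to start_routine.
It returns 0 on success and an error number on failure.
For example: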

#include<stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>
void* rout(void* arg)
{
    while(1)
    {
        printf("I am thread 1:%d\n",getpid());
        sleep(1);
    }
}
int main()
{
    pthread_t tid;
    int ret;
    char* ptr="thread 1";
    ret=pthread_create(&tid,NULL,rout,(void*)ptr);
    if(ret!=0)
    {
        //pthread_create returns the error number directly instead of setting errno
        printf("pthread_create error :%s\n",strerror(ret));
        return 1;
    }
    while(1)
    {
        printf("I am main thread:%d\n",getpid());
        sleep(1);
    }
    return 0;
}

The output is:

[Daisy@localhost test_2019_11_2_1]$ ./pthread 
I am main thread:4336
I am thread 1:4336
I am main thread:4336
I am thread 1:4336
I am main thread:4336
I am thread 1:4336
^C

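Two threads are now running: the main execution flow and thread 1. Both print the same PID, which shows they belong to the same process.
Now suppose the code is changed to: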

#include<stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>
#include <time.h>
void* rout(void* arg)
{
    while(1)
    {
        printf("I am thread 1:%d\n",getpid());
        sleep(1);
        volatile int zero=0;//volatile keeps the division from being folded at compile time
        int a=10/zero;//integer division by zero; on x86 this raises SIGFPE
        (void)a;
    }
}
int main()
{
    pthread_t tid;
    pthread_create(&tid,NULL,rout,(void*)"thread 1");
    pthread_create(&tid,NULL,rout,(void*)"thread 2");
    pthread_create(&tid,NULL,rout,(void*)"thread 3");
    pthread_create(&tid,NULL,rout,(void*)"thread 4");
    pthread_create(&tid,NULL,rout,(void*)"thread 5");
    while(1)
    {
        printf("I am main thread:%d\n",getpid());
        sleep(1);
    }
    return 0;
}

This should raise an exception:

[Daisy@localhost test_2019_11_2_1]$ ./pthread 
I am main thread:4552
I am thread 1:4552
I am thread 1:4552
I am thread 1:4552
I am thread 1:4552
I am thread 1:4552
I am main thread:4552
Floating point exception (core dumped)

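Only one of the threads hit the exception and the main thread did nothing wrong, yet the whole process exited. One thread's exception takes down the entire process, so threads are not robust.
Use ps -efL to view lightweight process information.
For example: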

[Daisy@localhost ~]$ ps -efL | head -n1 && ps -efL | grep pthread
UID         PID   PPID    LWP  C NLWP STIME TTY          TIME CMD
Daisy      3050   2803   3050  0    6 19:49 pts/1    00:00:00 ./pthread
Daisy      3050   2803   3051  0    6 19:49 pts/1    00:00:00 ./pthread
Daisy      3050   2803   3052  0    6 19:49 pts/1    00:00:00 ./pthread
Daisy      3050   2803   3053  0    6 19:49 pts/1    00:00:00 ./pthread
Daisy      3050   2803   3054  0    6 19:49 pts/1    00:00:00 ./pthread
Daisy      3050   2803   3055  0    6 19:49 pts/1    00:00:00 ./pthread
Daisy      3134   2945   3134  0    1 19:52 pts/2    00:00:00 grep --color=auto pthread

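The output includes the LWP column. LWP is the lightweight process ID, the most important ID for CPU scheduling, and it differs for every thread; the main thread's LWP equals the PID. The PID column is in fact the thread group ID (tgid), the process ID seen from outside, and it is stored in the PCB; all threads of the process share it. Killing any one of these threads makes all of them exit. The tid returned by pthread_create, by contrast, is the start address of the thread's region in the virtual address space.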

tid
Running the following code prints the tid:

#include<stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>
#include <time.h>
void* rout(void* arg)
{
    while(1)
    {
        printf("I am thread 1:%d\n",getpid());
        sleep(1);
    }
}
int main()
{
    pthread_t tid;
    pthread_create(&tid,NULL,rout,(void*)"thread 1");
    while(1)
    {
        printf("I am main thread:%lu\n",tid);
        sleep(1);
    }
    return 0;
}

The result after running it is:

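The tid printed here is a very long number, not a process ID. pthread_create is a function of the user-level thread library, not a system call. One way to picture it: the kernel's LWP is like a national ID number, while this tid is like an employee number at work. Each tid corresponds one-to-one to a kernel LWP, and the tid itself is an address in the process's address space where the thread's region lives.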

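Thread Waiting
After creating a new thread, the main thread needs to wait for it (unless the thread is detached). If it does not, the space of the exited thread is not released and stays in the process address space, and newly created threads will not reuse it. The result is a problem similar to a zombie process, and a memory leak.
The function for waiting on a thread is pthread_join, with the following prototype: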

#include <pthread.h>
int pthread_join(pthread_t thread, void **retval);

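The thread parameter is the tid of the thread to wait for, and retval receives the exit code. retval is a pointer to a pointer: the thread function returns void*, so a void** must be passed to receive that value. The function returns 0 on success and an error number on failure. The calling thread blocks until the thread whose ID is thread terminates.
The status obtained through pthread_join depends on how the thread terminated. In summary:
(1) If the thread returned with return, the location retval points to holds the thread function's return value
(2) If the thread was terminated abnormally by another thread calling pthread_cancel, the location retval points to holds the constant PTHREAD_CANCELED (meaning the thread was canceled)
(3) If the thread terminated itself by calling pthread_exit, the location retval points to holds the argument passed to pthread_exit
(4) If the thread's termination status is of no interest, pass NULL for retval
For example: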

#include<stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>
#include <time.h>
void* rout(void* arg)
{
    while(1)
    {
        printf("I am thread 1:%s\n",(char*)arg);
        sleep(1);
    }
}
int main()
{
    pthread_t tid;
    pthread_create(&tid,NULL,rout,(void*)"thread 1");
    printf("I am main thread:%lu\n",tid);
    pthread_join(tid,NULL);//wait for the thread; the termination status is not needed
    return 0;
}

A new thread has been created and the main thread blocks waiting for it to exit, but the new thread never exits, so the output is:

[Daisy@localhost test_2019_11_2_1]$ ./pthread 
I am main thread:139972940277504
I am thread 1:thread 1
I am thread 1:thread 1
I am thread 1:thread 1
I am thread 1:thread 1
I am thread 1:thread 1
^C

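After creating the new thread, the main thread keeps waiting for it to exit while the new thread keeps running. POSIX offers no non-blocking form of pthread_join.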

Thread Termination
1. Terminating with return
Example code:

#include<stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>
#include <time.h>
void* rout(void* arg)
{
    while(1)
    {
        printf("I am thread 1:%s\n",(char*)arg);
        sleep(1);
        break;
    }
    return (void*)11;
}
int main()
{
    pthread_t tid;
    pthread_create(&tid,NULL,rout,(void*)"thread 1");
    printf("I am main thread:%p\n",(void*)tid);
    void* ret;
    pthread_join(tid,&ret);//wait for the thread; ret receives the thread function's return value
    printf("ret:%ld\n",(long)ret);
    return 0;
}

The output is:

[Daisy@localhost test_2019_11_2_1]$ ./pthread 
I am main thread:0x7f773fd9a700
I am thread 1:thread 1
ret:11

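The exit code 11 was retrieved. Why does pthread_join report only an exit code and not an exit signal? If the new thread terminates abnormally, the signal kills the whole process, main thread included, so the main thread never gets the chance to return from pthread_join. A successful join therefore implies that the thread ran to completion without an exception, and the only thing left to check is whether its result is correct.
2. The pthread_exit function
Its prototype is: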

#include <pthread.h> 
void pthread_exit(void *retval);

It is used the same way as return to terminate a thread. For example:

#include<stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>
#include <time.h>
void* rout(void* arg)
{
    while(1)
    {
        printf("I am thread 1:%s\n",(char*)arg);
        sleep(1);
        break;
    }
//    return (void*)11;
    pthread_exit((void*)3);//terminate the calling thread
}
int main()
{
    pthread_t tid;
    pthread_create(&tid,NULL,rout,(void*)"thread 1");
    printf("I am main thread:%p\n",(void*)tid);
    void* ret;
    pthread_join(tid,&ret);//wait for the thread; this call blocks
    printf("ret:%ld\n",(long)ret);
    return 0;
}

The output is:

[Daisy@localhost test_2019_11_2_1]$ ./pthread 
I am main thread:0x7f3df62d7700
I am thread 1:thread 1
ret:3

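The exit code received from the new thread is 3. Calling return in the main function ends the whole process.
Note: the memory pointed to by the pointer passed to pthread_exit or returned with return must be global or allocated with malloc. It must not be allocated on the thread function's stack, because by the time another thread receives the pointer, the thread function has already exited.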

Thread Cancellation
The pthread_cancel function cancels a running thread. Its prototype is:

#include <pthread.h>
int pthread_cancel(pthread_t thread);

The thread parameter is the thread ID (tid) of the thread to cancel. It returns 0 on success and an error number on failure. For example:

#include<stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
void* thread_run(void* arg)
{
    int count=5;
    while(count--)
    {
        printf("I am a new thread,tid is %p\n",(void*)pthread_self());//pthread_self returns the calling thread's ID
        sleep(1);
    }
    pthread_exit((void*)111);
}

int main()
{
    pthread_t tid;
    int num=1;
    pthread_create(&tid,NULL,thread_run,(void*)&num);
    pthread_cancel(tid);//cancel the new thread
    printf("main thread get a new thread tid:%p\n",(void*)tid);
    void *ret;
    pthread_join(tid,&ret);
    printf("ret:%ld\n",(long)ret);
    return 0;
}

The output is:

[Daisy@localhost test_2019_11_3_1]$ ./mythread 
main thread get a new thread tid:0x7f42eae86700
I am a new thread,tid is 0x7f42eae86700
ret:-1

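The exit code is -1, which is the value of the PTHREAD_CANCELED macro and means the thread was canceled. pthread_cancel can also be called from inside the new thread's function, but this is generally not done. (Note: pthread_self returns the ID of the calling thread.)
For example: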

#include<stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
pthread_t main_id;
void* thread_run(void* arg)
{
    int count=5;
    while(count--)
    {
        printf("I am a new thread,tid is %p\n",(void*)pthread_self());//pthread_self returns the calling thread's ID
        sleep(1);
        int ret=pthread_cancel(main_id);//can the new thread cancel the main thread?
        printf("ret:%d,string:%s\n",ret,strerror(ret));//ret holds the error number; strerror turns it into a message
    }
    pthread_exit((void*)111);
}

int main()
{
    main_id=pthread_self();//record the main thread's ID
    pthread_t tid;
    int num=1;
    pthread_create(&tid,NULL,thread_run,(void*)&num);
    printf("main thread get a new thread tid:%p\n",(void*)tid);
    void *ret;
    pthread_join(tid,&ret);
    printf("ret:%ld\n",(long)ret);
    return 0;
}

The output is:

[Daisy@localhost test_2019_11_3_1]$ ./mythread 
main thread get a new thread tid:0x7f2a6e0f7700
I am a new thread,tid is 0x7f2a6e0f7700
ret:0,string:Success
I am a new thread,tid is 0x7f2a6e0f7700
ret:3,string:No such process
I am a new thread,tid is 0x7f2a6e0f7700
ret:3,string:No such process
I am a new thread,tid is 0x7f2a6e0f7700
ret:3,string:No such process
I am a new thread,tid is 0x7f2a6e0f7700
ret:3,string:No such process

While it runs, open another terminal and execute the following command:

[Daisy@localhost ~]$ while :; do ps -aL | grep mythread; sleep 1;echo "######################";done
######################
######################
######################
  4066   4066 pts/1    00:00:00 mythread
  4066   4067 pts/1    00:00:00 mythread
######################
  4066   4066 pts/1    00:00:00 mythread <defunct>
  4066   4067 pts/1    00:00:00 mythread
######################
  4066   4066 pts/1    00:00:00 mythread <defunct>
  4066   4067 pts/1    00:00:00 mythread
######################
  4066   4066 pts/1    00:00:00 mythread <defunct>
  4066   4067 pts/1    00:00:00 mythread
######################
  4066   4066 pts/1    00:00:00 mythread <defunct>
  4066   4067 pts/1    00:00:00 mythread

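The new thread can indeed cancel the main thread, yet the process does not exit. The main thread stops running and is shown as <defunct>, a state similar to a zombie process, because nobody joins it. The later pthread_cancel calls return error 3 (No such process) since the main thread is already gone. When the new thread finishes, the process ends and is reaped by bash. If the main thread had called return instead, the process would have exited and every thread with it.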

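Thread Detach
Suppose the main thread blocks waiting on a newly created thread but would rather do other work. Since there is no non-blocking, polling form of join, this is where thread detach comes in.
When the result of the new thread is of no interest and its resources should be reclaimed by the system automatically once it exits, the thread can be detached with pthread_detach. Its prototype is: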

#include <pthread.h>
int pthread_detach(pthread_t thread);

The thread parameter is the thread ID (tid)
For example:

#include<stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
void* thread_run(void* arg)
{
    pthread_detach(pthread_self());//detach the calling thread
    int count=5;
    while(count--)
    {
        printf("I am a new thread,tid is %p\n",(void*)pthread_self());//pthread_self returns the calling thread's ID
        sleep(1);
    }
    return NULL;
}
int main()
{
    pthread_t tid;
    int num=1;
    pthread_create(&tid,NULL,thread_run,(void*)&num);
    sleep(1);
    printf("main thread get a new thread tid:%p\n",(void*)tid);
    int ret=pthread_join(tid,NULL);
    printf("ret:%d,%s\n",ret,strerror(ret));
    return 0;
}

The output is:

[Daisy@localhost test_2019_11_3_1]$ ./mythread 
I am a new thread,tid is 0x7fbff10e6700
main thread get a new thread tid:0x7fbff10e6700
ret:22,Invalid argument

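The main thread's join returns error code 22 (Invalid argument) because the thread it waits for is already detached. But if the sleep(1) before the join is removed, the output becomes: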

[Daisy@localhost test_2019_11_3_1]$ ./mythread 
main thread get a new thread tid:0x7f5a5ab33700
I am a new thread,tid is 0x7f5a5ab33700
I am a new thread,tid is 0x7f5a5ab33700
I am a new thread,tid is 0x7f5a5ab33700
I am a new thread,tid is 0x7f5a5ab33700
I am a new thread,tid is 0x7f5a5ab33700
^C

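Now the main thread waits until the new thread exits, as if the thread had never been detached. The cause is the missing sleep(1): after pthread_create there are two execution flows, and if the main thread runs first it starts waiting immediately, before the new thread has detached itself. So there is another way to detach, for example: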

#include<stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
void* thread_run(void* arg)
{
    int count=5;
    while(count--)
    {
        printf("I am a new thread,tid is %p\n",(void*)pthread_self());//pthread_self returns the calling thread's ID
        sleep(1);
    }
    return NULL;
}

int main()
{
    pthread_t tid;
    int num=1;
    pthread_create(&tid,NULL,thread_run,(void*)&num);
    pthread_detach(tid);//the main thread detaches the new thread
    printf("main thread get a new thread tid:%p\n",(void*)tid);
    int ret=pthread_join(tid,NULL);
    printf("ret:%d,%s\n",ret,strerror(ret));
    return 0;
}

With the new thread detached from the main thread, the output is:

[Daisy@localhost test_2019_11_3_1]$ ./mythread 
main thread get a new thread tid:0x7fbd5df41700
ret:22,Invalid argument

In other words, a thread can be detached by another thread in the same thread group, or it can detach itself.

Thread ID and Process Address Space Layout

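pthread_create produces a thread ID and stores it at the address its first parameter points to. This thread ID is not the same thing as the thread ID discussed earlier. The earlier one belongs to process scheduling: a thread is a lightweight process, the smallest unit handled by the operating system's scheduler, so a numeric value is needed to identify it uniquely (think of that ID as the LWP, the ID the operating system uses to mark a lightweight process).
The first parameter of pthread_create points to a memory location that receives the new thread's ID, whose value is an address in the process's virtual memory. This ID belongs to the NPTL thread library, and the library's subsequent operations use it to manipulate the thread. NPTL provides the pthread_self function to obtain the calling thread's own ID (the thread ID meant here is the one of type pthread_t).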

In the NPTL implementation on Linux, a thread ID of type pthread_t is essentially an address in the process address space.

As shown in the figure:
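The main thread stands apart. When the dynamic library is loaded, all of its contents are mapped into the shared region (mmap) between the heap and the main thread's stack, in the same way shared memory is mapped. The dynamic library contains not only code but also the data structures created to maintain threads.
The user-level dynamic library takes on the work of organizing and managing threads: each thread's region holds a struct pthread (the thread structure), thread-local storage, and the thread's stack, following the principle "describe first, then organize". Every thread therefore has its own private stack, and struct pthread, thread-local storage, and the thread stack are packaged as one unit that describes the thread at user level.
Source code (GitHub):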
https://github.com/wangbiy/Linux2/commit/c38f0190d25f555abcbaa9f1cfeaae5acbdc17fe