Unix Network Programming: Thread-Based Concurrent Programming (1)


zyhmz


Original article: https://blog.csdn.net/zyhmz/article/details/61925299


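So far we have seen two ways to create concurrent logical flows. In the first, we use a separate process for each flow, and the kernel schedules each process automatically. Each process has its own private address space, which makes it hard for flows to share data. In the second, we create our own logical flows and use I/O multiplexing to schedule them explicitly. Because there is only one process, all flows share a single address space. This article introduces a third approach: threads.
A thread is a logical flow that runs in the context of a process, and modern systems let us write programs in which multiple threads run concurrently within a single process. Each thread has its own thread context, which includes a unique integer thread ID (TID), a stack, a stack pointer, a program counter, general-purpose registers, and condition codes. All threads running in a process share that process's entire virtual address space.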

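Thread Execution Model
The execution model for multiple threads is similar in some ways to the execution model for multiple processes. Every process begins life as a single thread, called the main thread. At some point the main thread creates a peer thread, and from then on the two threads run concurrently. The threads associated with a process form a pool of peer threads.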

Creating Threads
A thread creates other threads by calling the pthread_create function:

#include <pthread.h>  // POSIX threads header, portable across many platforms
typedef void *(func)(void *);
int pthread_create(pthread_t *tid, const pthread_attr_t *attr, func *f, void *arg);
// Returns: 0 on success, nonzero on error

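The pthread_create function creates a new thread and runs the thread routine f in the context of the new thread, with arg as its input argument. For example, a "Hello world" thread routine: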

void *thread(void *vargp){
     printf("Hello world!\n");
     return NULL;    
}

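The attr argument can be used to change the default attributes of the newly created thread; in the first examples of this article we call pthread_create with attr set to NULL. When pthread_create returns, tid contains the ID of the newly created thread. The new thread can obtain its own thread ID by calling pthread_self.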

#include <pthread.h>
pthread_t pthread_self(void); // returns the thread ID of the caller
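
To make the relationship between tid and pthread_self concrete, here is a minimal sketch. The names record_self and ids_match are made up for this illustration, and assert stands in for real error handling. It checks that the ID stored by pthread_create is the same ID the new thread sees for itself:

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical thread routine: stores its own ID in the pthread_t slot
// that vargp points to.
static void *record_self(void *vargp) {
    assert(vargp != NULL);
    pthread_t *slot = static_cast<pthread_t *>(vargp);
    *slot = pthread_self();
    return NULL;
}

// Returns true if the ID reported by pthread_create matches the ID the
// new thread obtained from pthread_self.
bool ids_match() {
    pthread_t tid;    // filled in by pthread_create
    pthread_t seen;   // filled in by the new thread
    if (pthread_create(&tid, NULL, record_self, &seen) != 0) {
        return false;
    }
    if (pthread_join(tid, NULL) != 0) {   // wait until `seen` has been written
        return false;
    }
    // pthread_t is an opaque type, so compare with pthread_equal, not ==.
    return pthread_equal(tid, seen) != 0;
}
```

Calling ids_match() from main should yield true on a conforming system.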

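Terminating Threads
A thread terminates in one of the following ways. It terminates implicitly when its top-level thread routine returns. It terminates explicitly by calling pthread_exit, passing thread_return as its exit value. If the main thread calls pthread_exit, only the main thread terminates; the process stays alive until all other peer threads have terminated.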

#include <pthread.h>
void pthread_exit(void *thread_return);
// Does not return to the caller; thread_return is made available to pthread_join

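A peer thread may call the Unix exit function, which terminates the process and every thread associated with it (so a single thread can kill the entire process). Finally, another peer thread can terminate the current thread by calling pthread_cancel with the current thread's ID as the argument.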
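
The cancellation case can be sketched as follows. The names spin_forever and cancel_peer are hypothetical, and assert is used only to keep the sketch short. The worker blocks in sleep, which POSIX lists as a cancellation point, so the cancel request takes effect there:

```cpp
#include <pthread.h>
#include <unistd.h>
#include <cassert>

// Hypothetical worker: loops forever. sleep() is a cancellation point,
// so a pending cancel request is acted on inside it.
static void *spin_forever(void *) {
    for (;;) {
        sleep(1);
    }
    return NULL;  // not reached
}

// Creates a peer thread, cancels it, and returns the exit value that
// pthread_join collected for it.
void *cancel_peer() {
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, spin_forever, NULL);
    assert(rc == 0);
    rc = pthread_cancel(tid);         // request cancellation of the peer
    assert(rc == 0);
    void *result = NULL;
    rc = pthread_join(tid, &result);  // reap the cancelled thread
    assert(rc == 0);
    (void)rc;
    return result;
}
```

For a cancelled thread, pthread_join reports the special value PTHREAD_CANCELED, so cancel_peer() == PTHREAD_CANCELED is the expected outcome.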
Reaping Terminated Threads

#include<pthread.h>
int pthread_join(pthread_t tid,void **thread_return);

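The pthread_join function blocks until thread tid terminates, stores the (void *) pointer returned by the thread routine in the location pointed to by thread_return, and then reaps any memory resources held by the terminated thread. Unlike the wait function, pthread_join can only wait for one specific thread to terminate; there is no way to make it wait for an arbitrary thread.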
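
As a small sketch of how the exit value travels, the two hypothetical routines below hand back the same number, one with a plain return and one through pthread_exit, and the helper run_and_join (also a made-up name) collects it with pthread_join. The integer is packed into the pointer value itself, which assumes it fits in an intptr_t:

```cpp
#include <pthread.h>
#include <cassert>
#include <cstdint>

// Hypothetical routine: hands back n + 10 by returning it, packed into
// the void* result.
static void *by_return(void *vargp) {
    intptr_t n = reinterpret_cast<intptr_t>(vargp);
    return reinterpret_cast<void *>(n + 10);
}

// Same value, delivered through pthread_exit instead of return.
static void *by_exit(void *vargp) {
    intptr_t n = reinterpret_cast<intptr_t>(vargp);
    pthread_exit(reinterpret_cast<void *>(n + 10));
    return NULL;  // not reached
}

// Runs `routine` with argument n in a new thread and returns the integer
// that pthread_join collects from it.
long run_and_join(void *(*routine)(void *), long n) {
    pthread_t tid;
    void *arg = reinterpret_cast<void *>(static_cast<intptr_t>(n));
    int rc = pthread_create(&tid, NULL, routine, arg);
    assert(rc == 0);
    void *result = NULL;
    rc = pthread_join(tid, &result);  // result receives the exit value
    assert(rc == 0);
    (void)rc;
    return static_cast<long>(reinterpret_cast<intptr_t>(result));
}
```

Both run_and_join(by_return, 1) and run_and_join(by_exit, 1) should give 11: from the joiner's point of view, returning a value from the thread routine and passing it to pthread_exit are equivalent.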

A First Multithreaded Program

#include <iostream>
#include <pthread.h>

#define num_pthread 5

void* thread(void* vargp){
      std::cout<<"hello test"<<std::endl;
      return NULL;
}

int main(){
    pthread_t tids[num_pthread];
    for(auto i=0;i!=num_pthread;i++){
        int ret=pthread_create(&tids[i],NULL,thread,NULL);
        if(ret!=0)
           std::cout<<"pthread_create error"<<std::endl;
    }
    pthread_exit(NULL);
}

As we can see, when the threads write to standard output through std::cout, their output is interleaved:

hhhheheeelellllllllolooo o   t ttteteeesesssttstt

t

If we replace std::cout with the formatted-output function printf, however, each line comes out intact:

#include <cstdio>
#include <pthread.h>

#define num_pthread 5

void* thread(void* vargp){
    printf("hello test\n");
    return NULL;
}

int main(){
    pthread_t tids[num_pthread];
    for(auto i=0;i!=num_pthread;i++){
        int ret=pthread_create(&tids[i],NULL,thread,NULL);
        if(ret!=0)
            printf("pthread_create error\n");
    }
    pthread_exit(NULL);
}

Output:

hello test
hello test
hello test
hello test
hello test

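This does not mean that std::cout and printf cannot be used together in a C++ program: by default (std::ios_base::sync_with_stdio(true)) the two are synchronized, and mixing them does not by itself cause a crash. The difference seen above comes from the granularity at which each one locks standard output:
1. printf: POSIX requires every stdio function to behave as if it locked the FILE stream for the duration of the call (flockfile/funlockfile), so the text produced by a single printf call is not broken up by stdio output from other threads. This says nothing about the order in which the threads run. Note also that stdout is buffered: it is normally line-buffered when attached to a terminal, so the trailing \n flushes the line.
2. std::cout: the statement std::cout<<"hello test"<<std::endl is a chain of separate operator<< calls, and nothing holds a lock across the whole chain. Since C++11, concurrent use of a synchronized std::cout does not cause a data race, but the standard explicitly allows characters written by different threads to interleave, which is what the output above shows. std::endl is a function template that inserts a newline and then calls flush(); it does not make the statement atomic. How fine-grained the interleaving is depends on the standard library implementation.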
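
One way to keep a multi-part stream statement in one piece is to hold a lock of our own across the whole statement. The sketch below is only an illustration: out_lock, locked_hello and run_locked are hypothetical names, and the cap of 16 threads is arbitrary. The stream is passed in as a parameter so the same routine works with std::cout or with a string stream:

```cpp
#include <pthread.h>
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// One lock shared by every thread that writes to the stream.
static pthread_mutex_t out_lock = PTHREAD_MUTEX_INITIALIZER;

// Hypothetical thread routine: vargp points to the std::ostream to use.
// Holding the lock across the whole statement keeps its pieces together.
static void *locked_hello(void *vargp) {
    std::ostream *out = static_cast<std::ostream *>(vargp);
    pthread_mutex_lock(&out_lock);
    *out << "hello" << ' ' << "test" << '\n';
    pthread_mutex_unlock(&out_lock);
    return NULL;
}

// Starts n threads that all write to `out`, then waits for all of them.
void run_locked(std::ostream &out, int n) {
    pthread_t tids[16];
    assert(n >= 0 && n <= 16);
    for (int i = 0; i < n; ++i) {
        int rc = pthread_create(&tids[i], NULL, locked_hello, &out);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < n; ++i) {
        pthread_join(tids[i], NULL);
    }
}
```

Calling run_locked(std::cout, 5) prints five complete lines. The order in which threads take the lock is still up to the scheduler, but every line is identical here, so the output looks the same either way.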

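Putting the Thread Routine in a Class
The routine must be declared as a static member function. A non-static member function takes an implicit this pointer, so its type does not match the void *(*)(void *) that pthread_create expects; a static member function has no this pointer and can be passed just like an ordinary function.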

#include <iostream>
#include <pthread.h>

#define num_pthread 5

class pthread1{
public:
      static void* thread(void* vargp){      // must not share the class's name: a constructor cannot be static
            std::cout<<"hello test"<<std::endl;
            return NULL;
      }
};

int main(){
    pthread_t tids[num_pthread];
    for(auto i=0;i!=num_pthread;i++){
        int ret=pthread_create(&tids[i],NULL,pthread1::thread,NULL);
        if(ret!=0)
           std::cout<<"pthread_create error"<<std::endl;
    }
    pthread_exit(NULL);
}

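The output is interleaved again. This has nothing to do with the function being static: the cause is still the unsynchronized std::cout statement, and switching to printf makes each line come out whole.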

hhhheheeelellllllllolooo o   t ttteteeesessststtt
t
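
If the thread needs to work on a particular object, a common pattern is a static "trampoline" that receives the object pointer through the void* argument and forwards to a non-static member. The sketch below assumes a made-up Worker class and run_workers helper; it is one possible arrangement, not the only one:

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical class: each object counts how often its own run() was called.
class Worker {
public:
    Worker() : runs_(0) {}

    // Static trampoline: has the plain void*(void*) shape pthread_create
    // expects and forwards to the non-static member through the pointer.
    static void *entry(void *vargp) {
        static_cast<Worker *>(vargp)->run();
        return NULL;
    }

    int runs() const { return runs_; }

private:
    void run() { ++runs_; }  // touches only this object's state
    int runs_;
};

// Starts one thread per object and returns the total number of run() calls.
int run_workers() {
    Worker workers[3];
    pthread_t tids[3];
    for (int i = 0; i < 3; ++i) {
        int rc = pthread_create(&tids[i], NULL, Worker::entry, &workers[i]);
        assert(rc == 0);
        (void)rc;
    }
    int total = 0;
    for (int i = 0; i < 3; ++i) {
        pthread_join(tids[i], NULL);  // join before reading the counter
        total += workers[i].runs();
    }
    return total;
}
```

Each thread touches a different Worker, so no locking is needed here, and run_workers() should return 3.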

Passing an Argument to the Thread Routine

#include <iostream>
#include <pthread.h>
#define num_pthread 5

void* thread(void* vargp){
    int i=*((int*)vargp); // cast the void* argument to int* and dereference it to read the value it points to
    std::cout<<"hello in thread"<<i<<std::endl;
    return NULL;
}

int main(){
    pthread_t tids[num_pthread];
    std::cout<<"thread start"<<std::endl;
    for(auto i=0;i!=num_pthread;++i){
        int ret=pthread_create(&tids[i],NULL,thread,(void*)&i); // the argument must be passed as void*; &i is the address of the loop variable i
        std::cout<<"Current pthread id ="<<tids[i]<<std::endl;
        if(ret!=0)
           std::cout<<"pthread_create error"<<std::endl;
    }
    pthread_exit(NULL); // terminate only the main thread so the peer threads can finish; returning from main would end the whole process
}

The result:

thread start
Current pthread id =0x700000081000
Current pthread id =0x700000104000
Current pthread id =0x700000187000
hello in thread2
hChheueelrlllrlloeoo n  itiin nn p  tttthhhhrrrreeeeaaaadddd3 33
i

d =0x70000020a000
Current pthread id =hello in thread04x
70000028d000

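The output is a mess. Apart from the std::cout interleaving, every thread receives a pointer to the same loop variable i, so the ++i in the main thread races with the reads in the peer threads: each routine prints whatever value i holds at the moment it reads it, not the value i had when the thread was created. A thread may even read i after the loop has ended and i no longer exists.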
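
One way around the shared loop variable is to pass the value itself rather than its address. The sketch below packs the index into the pointer value; seen, mark_index and count_distinct_indexes are hypothetical names, and the approach assumes the value fits in an intptr_t:

```cpp
#include <pthread.h>
#include <cassert>
#include <cstdint>

static const int kThreads = 5;
static int seen[kThreads];  // seen[k] is set to 1 by the thread that received k

// The argument is the index itself, packed into the pointer value, so
// there is no shared variable for the main thread to change.
static void *mark_index(void *vargp) {
    intptr_t i = reinterpret_cast<intptr_t>(vargp);
    seen[i] = 1;  // each thread writes a different element
    return NULL;
}

// Returns how many distinct indexes the threads received.
int count_distinct_indexes() {
    pthread_t tids[kThreads];
    for (int i = 0; i < kThreads; ++i) {
        seen[i] = 0;
    }
    for (int i = 0; i < kThreads; ++i) {
        void *arg = reinterpret_cast<void *>(static_cast<intptr_t>(i));
        int rc = pthread_create(&tids[i], NULL, mark_index, arg);
        assert(rc == 0);
        (void)rc;
    }
    int distinct = 0;
    for (int i = 0; i < kThreads; ++i) {
        pthread_join(tids[i], NULL);  // join before reading seen[i]
        distinct += seen[i];
    }
    return distinct;
}
```

Because every thread gets its own copy of the index, count_distinct_indexes() should return 5 no matter how the threads are scheduled.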

Adding pthread_join

#include <iostream>
#include <pthread.h>

#define num_pthread 5

void* thread(void* vargp){
    int i=*((int*)vargp);
    std::cout<<"hello in thread"<<i<<std::endl;
    return NULL;
}

int main(){
    pthread_t tids[num_pthread];
    std::cout<<"thread start"<<std::endl;
    for(auto i=0;i!=num_pthread;++i){
        int ret=pthread_create(&tids[i],NULL,thread,(void*)&i);
        if(ret!=0){
            std::cout<<"pthread_create error"<<std::endl;
            continue; // no thread was created, so there is nothing to join
        }
        pthread_join(tids[i],NULL); // pthread_join blocks until the given thread terminates
    }
    pthread_exit(NULL);
}

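Now the output is tidy: pthread_join blocks the main thread until the peer thread it just created has finished, so only one peer thread runs at a time, and i cannot change while that thread is reading it: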

thread start
hello in thread0
hello in thread1
hello in thread2
hello in thread3
hello in thread4

Setting Thread Attributes with pthread_attr_t and Using Join

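First, the notion of a detached thread. At any point in time, a thread is either joinable or detached. A joinable thread can be reaped by other threads, and its memory resources (such as its stack) are not freed until another thread reaps it. A detached thread, in contrast, cannot be reaped by other threads; its memory resources are freed automatically by the system when it terminates. By default, threads are created joinable. To avoid memory leaks, every joinable thread should either be reaped explicitly by another thread or be detached by a call to pthread_detach: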

#include <pthread.h>
int pthread_detach(pthread_t tid);
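
Since a detached thread cannot be joined, the creator needs some other way to learn that it has finished. The following sketch uses a flag protected by a mutex and a condition variable; done_lock, done_cond, done, detached_work and run_detached are all names invented for this illustration:

```cpp
#include <pthread.h>
#include <cassert>

static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cond = PTHREAD_COND_INITIALIZER;
static bool done = false;

// Hypothetical routine for a detached thread: nobody can join it, so it
// announces completion through a flag guarded by a mutex.
static void *detached_work(void *) {
    pthread_mutex_lock(&done_lock);
    done = true;
    pthread_cond_signal(&done_cond);
    pthread_mutex_unlock(&done_lock);
    return NULL;
}

// Creates a thread, detaches it, and waits for its completion signal.
bool run_detached() {
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, detached_work, NULL);
    assert(rc == 0);
    rc = pthread_detach(tid);  // the system now frees the thread's resources
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&done_lock);
    while (!done) {            // loop guards against spurious wakeups
        pthread_cond_wait(&done_cond, &done_lock);
    }
    bool finished = done;
    pthread_mutex_unlock(&done_lock);
    return finished;
}
```

After pthread_detach succeeds, calling pthread_join on tid is no longer valid, which is why the sketch waits on the condition variable instead.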

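Thread attributes are held in a pthread_attr_t object. The type is opaque and should only be read or modified through the pthread_attr_* functions; the layout below shows the kind of fields an implementation might keep: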

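typedef struct
{
    int detachstate;                // detach state (joinable or detached)
    int schedpolicy;                // scheduling policy
    struct sched_param schedparam;  // scheduling parameters
    int inheritsched;               // whether scheduling attributes are inherited
    int scope;                      // contention scope
    size_t guardsize;               // size of the guard area at the end of the stack
    int stackaddr_set;              // whether stackaddr has been set
    void * stackaddr;               // location of the stack
    size_t stacksize;               // size of the stack
}pthread_attr_t;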

#include <iostream>
#include <pthread.h>

#define num_pthread 5

void* thread(void* vargp){
    int i=*((int*)vargp);
    std::cout<<"hello in thread "<<i<<std::endl;
    long status=10+i;
    pthread_exit((void*)status); // exit value; the main thread retrieves it through pthread_join
    return NULL; // not reached
}

int main(){
    pthread_t tids[num_pthread];
    int indexes[num_pthread];

    pthread_attr_t attr;
    pthread_attr_init(&attr);
    // Request joinable threads explicitly (this is also the default). A joinable
    // thread must be reaped with pthread_join, which is also how the main thread
    // waits for it to finish.
    pthread_attr_setdetachstate(&attr,PTHREAD_CREATE_JOINABLE);

    std::cout<<"thread start"<<std::endl;
    for(auto i=0;i!=num_pthread;++i){
        indexes[i]=i; // each thread gets its own slot, so ++i cannot change its argument
        int ret=pthread_create(&tids[i],&attr,thread,(void*)&indexes[i]);
        if(ret!=0)
            std::cout<<"pthread_create error"<<std::endl;
    }
    pthread_attr_destroy(&attr); // the attribute object is no longer needed
    void *status;
    for(auto i=0;i!=num_pthread;++i){
        int ret=pthread_join(tids[i],&status); // status receives the value passed to pthread_exit
        if(ret!=0){
            std::cout<<"pthread_join error"<<std::endl;
        }
        else{
            std::cout<<"pthread_join get status:"<<(long)status<<std::endl;
        }
    }
    pthread_exit(NULL);
}

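This program is admittedly still a bit hazy to me, and it is the first time I have felt the urge to go and read the source. In particular, the thread routine calls pthread_exit, while the main thread calls pthread_join, which changes the value of status. For the first time, Computer Systems: A Programmer's Perspective was not enough to carry me through a topic. The road ahead is long, and I will keep searching high and low. Once I fully understand this point, I will come back and finish this blog post.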
