Threads are normally created by calling pthread_create.
Creating a thread
The following interface can be used to create a POSIX thread:
#include <pthread.h>
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg);
Here, pthread_create creates a new thread and makes it runnable. The parameters are described below:
Parameter | Description |
---|---|
thread | Pointer to the thread identifier. |
attr | An opaque attribute object that can be used to set thread attributes. You can specify a thread attribute object, or pass NULL for the defaults. |
start_routine | The start address of the function the thread executes; it runs as soon as the thread is created. |
arg | The argument passed to the start routine. It must be passed by reference, cast to a pointer to void. Pass NULL if there is no argument. |
On success the function returns 0; a non-zero return value means thread creation failed.
Terminating a thread
The following interface can be used to terminate a POSIX thread:
#include <pthread.h>
void pthread_exit(void *status);
Here, pthread_exit is used to explicitly exit a thread. It is typically called once a thread has finished its work and no longer needs to exist.
Creating threads this way is relatively complex and verbose; below we introduce the simpler and more efficient #pragma omp parallel approach.
Creating threads with #pragma omp parallel
#pragma omp parallel creates multiple threads over a defined code block; the following shows how to mark which part of the code runs in multiple threads:
#include <stdio.h>
#include <omp.h>
int main() {
    printf("The output:\n");
    #pragma omp parallel /* define multi-thread section */
    {
        printf("Hello World\n");
    }
    /* Resume serial section */
    printf("Done\n");
    return 0;
}
Below is an example that creates multiple threads (note that #pragma omp parallel for must be followed directly by the for loop itself, not by a brace-enclosed block):
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
    int width = 1280;
    int height = 1280;
    float *imageBuffer = new float[3 * width * height];
    #pragma omp parallel for num_threads(3)
    for (int i = 0; i < width * height; i++) {
        imageBuffer[i] = 0;
        imageBuffer[width * height + i] = 255;
        imageBuffer[width * height * 2 + i] = 0;
    }
    delete[] imageBuffer;
    return 0;
}
This way of creating threads is simple and efficient, but one thing must be noted: code that uses the #pragma omp parallel directive must be compiled with the -fopenmp option, otherwise no parallelism takes effect:
g++ a.cc -fopenmp
First, how do we make a piece of code run in parallel? OpenMP marks a parallel region with the parallel directive, in the form:
#pragma omp parallel
{
    /* every thread executes the code inside the braces */
}
To have a for loop executed by multiple threads, use the for directive.
The for directive distributes the iterations of a for loop among the threads; this requires that the iterations have no data dependences.
It can be used in two forms:
(1) #pragma omp parallel for
    for(...)
(2) #pragma omp parallel
    { // note: the brace must start on a new line
        #pragma omp for
        for(...)
    }
The sections directive splits the code into blocks, each executed by one thread, for example:
#pragma omp parallel sections // starts a new team
{
{ Work1(); }
#pragma omp section
{ Work2();
Work3(); }
#pragma omp section
{ Work4(); }
}
or
#pragma omp parallel // starts a new team
{
//Work0(); // this function would be run by all threads.
#pragma omp sections // divides the team into sections
{
// everything herein is run only once.
{ Work1(); }
#pragma omp section
{ Work2();
Work3(); }
#pragma omp section
{ Work4(); }
}
//Work5(); // this function would be run by all threads.
}
Take the shared and private clauses as an example:
#include <stdlib.h> //malloc and free
#include <stdio.h> //printf
#include <omp.h> //OpenMP
// Very small values for this simple illustrative example
#define ARRAY_SIZE 8 //Size of arrays whose elements will be added together.
#define NUM_THREADS 4 //Number of threads to use for vector addition.
/*
* Classic vector addition using openMP default data decomposition.
*
* Compile using gcc like this:
* gcc -o va-omp-simple VA-OMP-simple.c -fopenmp
*
* Execute:
* ./va-omp-simple
*/
int main (int argc, char *argv[])
{
// elements of arrays a and b will be added
// and placed in array c
int * a;
int * b;
int * c;
int n = ARRAY_SIZE; // number of array elements
int n_per_thread; // elements per thread
int total_threads = NUM_THREADS; // number of threads to use
int i; // loop index
// allocate space for the arrays
a = (int *) malloc(sizeof(int)*n);
b = (int *) malloc(sizeof(int)*n);
c = (int *) malloc(sizeof(int)*n);
// initialize arrays a and b with consecutive integer values
// as a simple example
for(i=0; i<n; i++) {
a[i] = i;
}
for(i=0; i<n; i++) {
b[i] = i;
}
// Additional work to set the number of threads.
// We hard-code to 4 for illustration purposes only.
omp_set_num_threads(total_threads);
// determine how many elements each process will work on
n_per_thread = n/total_threads;
// Compute the vector addition
// Here is where the 4 threads are specifically 'forked' to
// execute in parallel. This is directed by the pragma and
// thread forking is compiled into the resulting executable.
// Here we use a 'static schedule' so each thread works on
// a 2-element chunk of the original 8-element arrays.
#pragma omp parallel for shared(a, b, c) private(i) schedule(static, n_per_thread)
for(i=0; i<n; i++) {
c[i] = a[i]+b[i];
// Which thread am I? Show who works on what for this small example
printf("Thread %d works on element %d\n", omp_get_thread_num(), i);
}
// Check for correctness (only plausible for small vector size)
// A test we would eventually leave out
printf("i\ta[i]\t+\tb[i]\t=\tc[i]\n");
for(i=0; i<n; i++) {
printf("%d\t%d\t\t%d\t\t%d\n", i, a[i], b[i], c[i]);
}
// clean up memory
free(a); free(b); free(c);
return 0;
}
Task parallelism with the task directive can also be applied to recursive functions:
References:
http://akira.ruc.dk/~keld/teaching/IPDC_f10/Slides/pdf/4_Performance.pdf
https://www.cnblogs.com/mfryf/p/12744547.html
https://scc.ustc.edu.cn/zlsc/cxyy/200910/W020121113517997951933.pdf
https://blog.csdn.net/zhongkejingwang/article/details/40350027
https://stackoverflow.com/questions/24417145/pragma-omp-parallel-num-threads-is-not-working
https://people.cs.pitt.edu/~melhem/courses/xx45p/OpenMp.pdf