Barrier synchronization

Latest recommended article published on 2024-11-06 22:01:55

安二柴

Barrier synchronization is used when N tasks must all complete before some later task can run. Typically N threads each complete one of the tasks.

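Related functions:
int pthread_barrier_init(pthread_barrier_t *restrict barrier,
                         const pthread_barrierattr_t *restrict attr,
                         unsigned count);
The count argument must be greater than 0. It specifies the number of threads to synchronize: no thread returns from pthread_barrier_wait until count threads have called it on this barrier.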
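
The constraint on count can be checked directly. The following is a minimal sketch, not part of the original program: try_barrier_init is a hypothetical helper name, and it assumes a platform that implements POSIX barriers (Linux, for example), where a count of 0 is rejected with EINVAL.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialize a barrier for `count` threads and
   destroy it again right away. Returns the error number reported by
   pthread_barrier_init (0 on success, EINVAL when count is 0). */
int try_barrier_init(unsigned count)
{
    pthread_barrier_t b;
    int rc = pthread_barrier_init(&b, NULL, count);
    if (rc == 0) {
        /* No thread is waiting on b, so destroying it is safe. */
        int drc = pthread_barrier_destroy(&b);
        assert(drc == 0);
        (void)drc;
    }
    return rc;
}
```

Under these assumptions try_barrier_init(4) returns 0 and try_barrier_init(0) returns EINVAL. Note that pthread functions return the error number directly rather than setting errno.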

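int pthread_barrier_wait(pthread_barrier_t *barrier); blocks the calling thread at the barrier object. Once the number of threads waiting at the barrier reaches the count given at initialization, all of them are released: exactly one of them (which one is unspecified) receives the return value PTHREAD_BARRIER_SERIAL_THREAD, and the others receive 0. The barrier is then reset to the state set by the most recent pthread_barrier_init call, so it can be reused.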
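
The return values and the automatic reset can be observed with a small experiment. This is a sketch under stated assumptions: NTHREADS, ROUNDS, serial_worker and count_serial_returns are illustrative names that do not appear in the original, and thread-creation failures are only asserted, not recovered from.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <pthread.h>

#define NTHREADS 4  /* illustrative thread count */
#define ROUNDS   3  /* how many times the same barrier is reused */

static pthread_barrier_t barrier;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int serial_count;  /* times PTHREAD_BARRIER_SERIAL_THREAD was returned */

static void *serial_worker(void *arg)
{
    (void)arg;
    for (int r = 0; r < ROUNDS; ++r) {
        /* The barrier resets itself after each release, so the same
           object can be waited on again in the next round. */
        int rc = pthread_barrier_wait(&barrier);
        if (rc == PTHREAD_BARRIER_SERIAL_THREAD) {
            pthread_mutex_lock(&count_lock);
            ++serial_count;
            pthread_mutex_unlock(&count_lock);
        } else {
            assert(rc == 0);  /* every other thread gets 0 */
        }
    }
    return NULL;
}

/* Returns how many waits returned PTHREAD_BARRIER_SERIAL_THREAD,
   or -1 if the barrier could not be initialized. */
int count_serial_returns(void)
{
    pthread_t t[NTHREADS];
    serial_count = 0;
    if (pthread_barrier_init(&barrier, NULL, NTHREADS) != 0)
        return -1;
    for (int i = 0; i < NTHREADS; ++i) {
        int rc = pthread_create(&t[i], NULL, serial_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NTHREADS; ++i)
        pthread_join(t[i], NULL);
    pthread_barrier_destroy(&barrier);
    return serial_count;
}
```

Since exactly one thread per release is designated the serial thread, count_serial_returns() should return ROUNDS. That return value is a convenient hook for work that must happen once per phase, such as swapping buffers.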

Barriers

Some parallel computations need to "meet up" at certain points before continuing. This can, of course, be accomplished with semaphores, but another construct is often more convenient: the barrier, which the pthreads library provides as pthread_barrier_t. As a motivating example, take this program:

#define _XOPEN_SOURCE 600

#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>

#define ROWS 10000
#define COLS 10000
#define THREADS 10

double initial_matrix[ROWS][COLS];
double final_matrix[ROWS][COLS];
// Barrier variable
pthread_barrier_t barr;

// Defined elsewhere (not shown)
extern void DotProduct(int row, int col,
                       double source[ROWS][COLS],
                       double destination[ROWS][COLS]);
extern double determinant(double matrix[ROWS][COLS]);

void * entry_point(void *arg)
{
    // The rank is passed by value inside the pointer argument
    int rank = (int)(intptr_t)arg;
    // Each thread owns the block of rows [first_row, last_row)
    int first_row = rank * ROWS / THREADS;
    int last_row = (rank + 1) * ROWS / THREADS;

    for(int row = first_row; row < last_row; ++row)
        for(int col = 0; col < COLS; ++col)
            DotProduct(row, col, initial_matrix, final_matrix);

    // Synchronization point
    int rc = pthread_barrier_wait(&barr);
    if(rc != 0 && rc != PTHREAD_BARRIER_SERIAL_THREAD)
    {
        printf("Could not wait on barrier\n");
        exit(-1);
    }

    for(int row = first_row; row < last_row; ++row)
        for(int col = 0; col < COLS; ++col)
            DotProduct(row, col, final_matrix, initial_matrix);

    return NULL;
}

int main(int argc, char **argv)
{
    pthread_t thr[THREADS];

    // Barrier initialization
    if(pthread_barrier_init(&barr, NULL, THREADS))
    {
        printf("Could not create a barrier\n");
        return -1;
    }

    for(int i = 0; i < THREADS; ++i)
    {
        if(pthread_create(&thr[i], NULL, &entry_point, (void*)(intptr_t)i))
        {
            printf("Could not create thread %d\n", i);
            return -1;
        }
    }

    for(int i = 0; i < THREADS; ++i)
    {
        if(pthread_join(thr[i], NULL))
        {
            printf("Could not join thread %d\n", i);
            return -1;
        }
    }

    // All threads have been joined, so nobody is waiting on the barrier
    pthread_barrier_destroy(&barr);

    double det = determinant(initial_matrix);
    printf("The determinant of M^4 = %f\n", det);

    return 0;
}
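
This program spawns THREADS threads and assigns each one a block of rows of a matrix multiplication. Each thread then uses the result of that computation in the next phase: another matrix multiplication. Because the second phase reads entries of final_matrix that other threads wrote, no thread may start it until every thread has finished the first phase, and the barrier enforces exactly that.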

There are a few things to note here:

The barrier declaration at the top (pthread_barrier_t barr)
The barrier initialization in main (pthread_barrier_init)
The point where each thread waits for its peers to finish (pthread_barrier_wait)
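
Because DotProduct and determinant are not shown, the program above cannot be linked as it stands. The following scaled-down sketch keeps the same three elements (declaration, initialization, wait) but swaps in a simpler computation that can be checked by hand: phase one fills each thread's slice of an array with squares, and phase two has every thread sum the whole array. The names N, squares, totals, phase_worker and sum_of_squares_two_phase are illustrative assumptions, not part of the original.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define N       8  /* illustrative problem size */
#define THREADS 4

static int squares[N];
static long totals[THREADS];
static pthread_barrier_t barr;  /* 1. barrier declaration */

static void *phase_worker(void *arg)
{
    int rank = (int)(intptr_t)arg;
    int begin = rank * N / THREADS;
    int end = (rank + 1) * N / THREADS;

    /* Phase 1: write only this thread's slice. */
    for (int i = begin; i < end; ++i)
        squares[i] = i * i;

    /* 3. Wait until every thread has finished phase 1. */
    int rc = pthread_barrier_wait(&barr);
    assert(rc == 0 || rc == PTHREAD_BARRIER_SERIAL_THREAD);
    (void)rc;

    /* Phase 2: read the slices written by all threads. */
    long sum = 0;
    for (int i = 0; i < N; ++i)
        sum += squares[i];
    totals[rank] = sum;
    return NULL;
}

/* Returns the sum every thread computed, or -1 on failure or disagreement. */
long sum_of_squares_two_phase(void)
{
    pthread_t thr[THREADS];
    if (pthread_barrier_init(&barr, NULL, THREADS) != 0)  /* 2. initialization */
        return -1;
    for (int i = 0; i < THREADS; ++i) {
        int rc = pthread_create(&thr[i], NULL, phase_worker, (void *)(intptr_t)i);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < THREADS; ++i)
        pthread_join(thr[i], NULL);
    pthread_barrier_destroy(&barr);
    for (int r = 1; r < THREADS; ++r)
        if (totals[r] != totals[0])
            return -1;
    return totals[0];
}
```

With N set to 8 the squares 0, 1, 4, ..., 49 add up to 140, so sum_of_squares_two_phase() should return 140. Without the pthread_barrier_wait call, a fast thread could start summing before a slower one had written its slice, and the totals could differ between runs.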