Parallel Programming with pthreads

Reposted 2012-03-25 00:20:18

Reposted from:
http://pages.cs.wisc.edu/~travitch/pthreads_primer.html


Introduction

POSIX threads (pthreads) are a standardized interface to operating system threads.

Compiling a pthreads Program

Relevant headers aside (they are discussed below), a program that wishes to use pthreads must link against the pthreads library. Here is an example invocation of gcc demonstrating this:

gcc -pedantic -Wall -o threaded_program src.c -lpthread

The -l flag specifies the name of a library to link against (pthread, in our case); since pthreads is a system library, gcc knows where to find it.

Creating Threads

Any program using pthreads will need to include pthread.h. Below is the Hello World of pthreads programs:

#include <pthread.h>
#include <stdio.h>

void * entry_point(void *arg)
{
    printf("Hello world!\n");

    return NULL;
}

int main(int argc, char **argv)
{
    pthread_t thr;
    if(pthread_create(&thr, NULL, &entry_point, NULL))
    {
        printf("Could not create thread\n");
        return -1;
    }

    if(pthread_join(thr, NULL))
    {
        printf("Could not join thread\n");
        return -1;
    }
    return 0;
}

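A thread can also terminate itself by calling pthread_exit from any function it runs. Suppose entry_point calls a function other_function, which calls pthread_exit, and entry_point would print "Hello again?" only after that call returns. In that case "Hello again?" is never printed: the thread exits in other_function and entry_point never returns. The argument to pthread_exit is a value to be returned to the joining thread.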
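The listing for this scenario is not reproduced here, so the following is only a minimal sketch of what the paragraph describes. The function bodies, the wrapper run_exit_demo, and the exit value 42 are assumptions made for illustration; errors are handled with assert to keep the sketch short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: terminates the calling thread.
 * The value 42 is arbitrary and only illustrates the return path. */
static void other_function(void)
{
    pthread_exit((void *)(intptr_t)42);
}

static void *entry_point(void *arg)
{
    (void)arg;
    printf("Hello world!\n");
    other_function();
    /* Never reached: the thread already exited in other_function. */
    printf("Hello again?\n");
    return NULL;
}

/* Spawns the thread, joins it, and returns the value that the
 * thread passed to pthread_exit. */
int run_exit_demo(void)
{
    pthread_t thr;
    void *result = NULL;

    int rc = pthread_create(&thr, NULL, entry_point, NULL);
    assert(rc == 0);

    /* The second argument of pthread_join receives the exit value. */
    rc = pthread_join(thr, &result);
    assert(rc == 0);

    return (int)(intptr_t)result;
}
```

Calling run_exit_demo() from main prints only "Hello world!" and hands the joining thread the value that was given to pthread_exit.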

Synchronization

The pthreads specification provides many synchronizationprimitives; we will cover three in this primer:

  • Barriers
  • Mutexes
  • Semaphores


Barriers

Some parallel computations need to "meet up" at certain points before continuing. This can, of course, be accomplished with semaphores, but another construct is often more convenient: the barrier (the pthreads library pthread_barrier_t). As a motivating example, take this program:

#define _XOPEN_SOURCE 600

#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>


#define ROWS 10000
#define COLS 10000
#define THREADS 10

double initial_matrix[ROWS][COLS];
double final_matrix[ROWS][COLS];
// Barrier variable
pthread_barrier_t barr;

// Defined elsewhere
extern void DotProduct(int row, int col,
                       double source[ROWS][COLS],
                       double destination[ROWS][COLS]);
extern double Determinant(double matrix[ROWS][COLS]);

void * entry_point(void *arg)
{
    // The thread's rank arrives packed into the pointer argument
    int rank = (int)(intptr_t)arg;
    // This thread owns the rows [first_row, last_row)
    int first_row = rank * ROWS / THREADS;
    int last_row = (rank + 1) * ROWS / THREADS;

    for(int row = first_row; row < last_row; ++row)
        for(int col = 0; col < COLS; ++col)
            DotProduct(row, col, initial_matrix, final_matrix);

    // Synchronization point
    int rc = pthread_barrier_wait(&barr);
    if(rc != 0 && rc != PTHREAD_BARRIER_SERIAL_THREAD)
    {
        printf("Could not wait on barrier\n");
        exit(-1);
    }

    for(int row = first_row; row < last_row; ++row)
        for(int col = 0; col < COLS; ++col)
            DotProduct(row, col, final_matrix, initial_matrix);

    return NULL;
}

int main(int argc, char **argv)
{
    pthread_t thr[THREADS];

    // Barrier initialization
    if(pthread_barrier_init(&barr, NULL, THREADS))
    {
        printf("Could not create a barrier\n");
        return -1;
    }

    for(int i = 0; i < THREADS; ++i)
    {
        // Pass the rank by value inside the pointer argument
        if(pthread_create(&thr[i], NULL, &entry_point, (void*)(intptr_t)i))
        {
            printf("Could not create thread %d\n", i);
            return -1;
        }
    }

    for(int i = 0; i < THREADS; ++i)
    {
        if(pthread_join(thr[i], NULL))
        {
            printf("Could not join thread %d\n", i);
            return -1;
        }
    }

    // Clean up the barrier
    pthread_barrier_destroy(&barr);

    double det = Determinant(initial_matrix);
    printf("The determinant of M^4 = %f\n", det);

    return 0;
}

This program spawns a number of threads, assigning each to compute part of a matrix multiplication. Each thread then uses the result of that computation in the next phase: another matrix multiplication.

There are a few things to note here:

  1. The barrier declaration at the top
  2. The barrier initialization in main
  3. The point where each thread waits for its peers to finish.

NOTE

The preprocessor definition of _XOPEN_SOURCE at the top of the program is important; without it, the barrier prototypes are not defined in pthread.h. The definition must come before any headers are included.
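The matrix program above depends on externally defined functions, so it cannot be run on its own. As a smaller illustration of the same two-phase pattern, here is a hedged sketch in which each thread writes one slot of an array, waits at the barrier, and then reads every slot. The names phase_barrier, phase1, phase2, worker, and run_barrier_demo are invented for this sketch, and errors are handled with assert for brevity.

```c
#define _XOPEN_SOURCE 600

#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NTHREADS 4

static pthread_barrier_t phase_barrier;
static int phase1[NTHREADS];
static int phase2[NTHREADS];

static void *worker(void *arg)
{
    int rank = (int)(intptr_t)arg;

    /* Phase 1: each thread fills in its own slot. */
    phase1[rank] = rank + 1;

    /* Wait until every thread has finished phase 1. Exactly one
     * thread receives PTHREAD_BARRIER_SERIAL_THREAD; the rest get 0. */
    int rc = pthread_barrier_wait(&phase_barrier);
    assert(rc == 0 || rc == PTHREAD_BARRIER_SERIAL_THREAD);

    /* Phase 2: it is now safe to read every slot written in phase 1. */
    int sum = 0;
    for (int i = 0; i < NTHREADS; ++i)
        sum += phase1[i];
    phase2[rank] = sum;

    return NULL;
}

/* Runs both phases; returns 1 if every thread saw the complete
 * phase-1 data, 0 otherwise. */
int run_barrier_demo(void)
{
    pthread_t thr[NTHREADS];

    int rc = pthread_barrier_init(&phase_barrier, NULL, NTHREADS);
    assert(rc == 0);

    for (int i = 0; i < NTHREADS; ++i) {
        rc = pthread_create(&thr[i], NULL, worker, (void *)(intptr_t)i);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; ++i) {
        rc = pthread_join(thr[i], NULL);
        assert(rc == 0);
    }
    pthread_barrier_destroy(&phase_barrier);

    int expected = NTHREADS * (NTHREADS + 1) / 2;
    for (int i = 0; i < NTHREADS; ++i)
        if (phase2[i] != expected)
            return 0;
    return 1;
}
```

Without the barrier, a fast thread could start summing before a slower thread had written its slot, and the sums in phase2 could differ between threads.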

Mutexes

The pthreads library provides a basic synchronization primitive: pthread_mutex_t. The declarations required to use pthreads mutexes are included in pthread.h. This is a standard mutex with lock and unlock operations; see the example below. Because the example calls tan from math.h, it must also be linked with -lm.

#include <pthread.h>
#include <stdio.h>
#include <math.h>

#define ITERATIONS 10000

// A shared mutex
pthread_mutex_t mutex;
double target;

void* opponent(void *arg)
{
    for(int i = 0; i < ITERATIONS; ++i)
    {
        // Lock the mutex
        pthread_mutex_lock(&mutex);
        target -= target * 2 + tan(target);
        // Unlock the mutex
        pthread_mutex_unlock(&mutex);
    }

    return NULL;
}

int main(int argc, char **argv)
{
    pthread_t other;

    target = 5.0;

    // Initialize the mutex
    if(pthread_mutex_init(&mutex, NULL))
    {
        printf("Unable to initialize a mutex\n");
        return -1;
    }

    if(pthread_create(&other, NULL, &opponent, NULL))
    {
        printf("Unable to spawn thread\n");
        return -1;
    }


    for(int i = 0; i < ITERATIONS; ++i)
    {
        pthread_mutex_lock(&mutex);
        target += target * 2 + tan(target);
        pthread_mutex_unlock(&mutex);
    }

    if(pthread_join(other, NULL))
    {
        printf("Could not join thread\n");
        return -1;
    }

    // Clean up the mutex
    pthread_mutex_destroy(&mutex);

    printf("Result: %f\n", target);

    return 0;
}

The important functions for managing mutexes are:
  • pthread_mutex_init: Initialize a new mutex.
  • pthread_mutex_destroy: Clean up a mutex that is no longer needed.
  • pthread_mutex_lock: Acquire a mutex (blocking if it is not available).
  • pthread_mutex_unlock: Release a mutex that you previously locked.
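
To see the lock and unlock calls protect a result that can be checked exactly, here is a minimal sketch of a shared counter. It uses the POSIX static initializer PTHREAD_MUTEX_INITIALIZER, an alternative to pthread_mutex_init that the example above does not use. The names counter_lock, counter, incrementer, and run_counter_demo are invented for this sketch.

```c
#include <assert.h>
#include <pthread.h>

#define NTHREADS 4
#define INCREMENTS 100000

/* Statically initialized mutex: no pthread_mutex_init call needed. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

static void *incrementer(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; ++i) {
        pthread_mutex_lock(&counter_lock);
        ++counter;  /* read-modify-write, protected by the mutex */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs NTHREADS incrementing threads and returns the final count. */
long run_counter_demo(void)
{
    pthread_t thr[NTHREADS];
    counter = 0;

    for (int i = 0; i < NTHREADS; ++i) {
        int rc = pthread_create(&thr[i], NULL, incrementer, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; ++i) {
        int rc = pthread_join(thr[i], NULL);
        assert(rc == 0);
    }
    return counter;
}
```

With the mutex in place the result is always NTHREADS * INCREMENTS. If the lock and unlock calls are removed, increments from different threads can overwrite each other and the total may come out lower.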

Semaphores

The pthreads library itself does not provide a semaphore; however, a separate POSIX standard does define them. The necessary declarations to use these semaphores are contained in semaphore.h.

NOTE: Do not confuse these with System V semaphores, which are in sys/sem.h.

#include <semaphore.h>
#include <pthread.h>
#include <stdio.h>

#define THREADS 20

sem_t OKToBuyMilk;
int milkAvailable;

void* buyer(void *arg)
{
    // P()
    sem_wait(&OKToBuyMilk);
    if(!milkAvailable)
    {
        // Buy some milk
        ++milkAvailable;
    }
    // V()
    sem_post(&OKToBuyMilk);

    return NULL;
}

int main(int argc, char **argv)
{
    pthread_t threads[THREADS];

    milkAvailable = 0;

    // Initialize the semaphore with a value of 1.
    // Note the second argument: passing zero denotes
    // that the semaphore is shared between threads (and
    // not processes).
    if(sem_init(&OKToBuyMilk, 0, 1))
    {
        printf("Could not initialize a semaphore\n");
        return -1;
    }

    for(int i = 0; i < THREADS; ++i)
    {
        if(pthread_create(&threads[i], NULL, &buyer, NULL))
        {
            printf("Could not create thread %d\n", i);
            return -1;
        }
    }

    for(int i = 0; i < THREADS; ++i)
    {
        if(pthread_join(threads[i], NULL))
        {
            printf("Could not join thread %d\n", i);
            return -1;
        }
    }

    sem_destroy(&OKToBuyMilk);

    // Make sure we don't have too much milk.
    printf("Total milk: %d\n", milkAvailable);

    return 0;
}

The semaphore API has several functions of note:
  • sem_init: Initialize a new semaphore. Note that the second argument denotes how the semaphore will be shared: passing zero denotes that it will be shared among threads rather than processes. The final argument is the initial value of the semaphore.
  • sem_destroy: Deallocate an existing semaphore.
  • sem_wait: This is the P() operation. It decrements the semaphore, blocking while the value is zero.
  • sem_post: This is the V() operation. It increments the semaphore, waking a waiting thread if there is one.
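
The milk example uses a semaphore with an initial value of 1, which makes it behave like a lock. As a hedged sketch of the other common use, the code below initializes a semaphore to 0 so that one thread can wait for a signal from another. The names data_ready, shared_value, producer, consumer, and run_signal_demo are invented for this sketch, and it assumes a Linux system where sem_init is supported for unnamed semaphores.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t data_ready;
static int shared_value;

static void *producer(void *arg)
{
    (void)arg;
    shared_value = 7;       /* publish the data ... */
    sem_post(&data_ready);  /* ... then signal it: V() */
    return NULL;
}

static void *consumer(void *arg)
{
    int *out = arg;
    sem_wait(&data_ready);  /* P(): blocks until the producer posts */
    *out = shared_value;
    return NULL;
}

/* Returns the value that the consumer observed. */
int run_signal_demo(void)
{
    pthread_t prod, cons;
    int seen = -1;

    /* Initial value 0: the first sem_wait blocks until a sem_post. */
    int rc = sem_init(&data_ready, 0, 0);
    assert(rc == 0);

    /* Start the consumer first so that it usually has to wait. */
    rc = pthread_create(&cons, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&prod, NULL, producer, NULL);
    assert(rc == 0);

    rc = pthread_join(cons, NULL);
    assert(rc == 0);
    rc = pthread_join(prod, NULL);
    assert(rc == 0);

    sem_destroy(&data_ready);
    return seen;
}
```

Whichever thread runs first, the consumer cannot pass sem_wait until the producer has stored the value and posted, so it always reads the published data.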

Relevant Man Pages

Man pages for all of the necessary library functions should be available on every CSL Linux system:
             Basic Management  Barriers                 Mutexes                Semaphores
Creation     pthread_create    pthread_barrier_init     pthread_mutex_init     sem_init
Destroy      pthread_exit      pthread_barrier_destroy  pthread_mutex_destroy  sem_destroy
Waiting      pthread_join      pthread_barrier_wait     -                      -
Acquisition  -                 -                        pthread_mutex_lock     sem_wait
Release      -                 -                        pthread_mutex_unlock   sem_post

Resources

Below are some additional resources:
