Using Threads for Parallelism

Thus far in our study of concurrency, we have assumed concurrent threads executing on uniprocessor systems. However, many modern machines have multi-core processors. Concurrent programs often run faster on such machines because the operating system kernel schedules the concurrent threads in parallel on multiple cores, rather than sequentially on a single core. 
      A sequential program is written as a single logical flow. A concurrent program is written as multiple concurrent flows. A parallel program is a concurrent program running on multiple processors. To illustrate some important aspects of parallel programming, consider the simple program below, which sums the integers 0, ..., n-1 in parallel by dividing the sequence evenly among several threads.
#include <iostream>
#include <stdlib.h>
#include <pthread.h>
using namespace std;
const int MAXTHREADS = 32;

void *sum(void *vargp);	// Thread routine

// Global shared variables
long psum[MAXTHREADS];	// Partial sum computed by each thread
long nelems_per_thread;	// Number of elements summed by each thread

int main(int argc, char **argv)
{
		// Get input arguments
		if (argc != 3) {
				cout << "Usage: " << argv[0] 
				     << " <nthreads> <log_nelem>" << endl;
				return 0;
		}	
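		long nthreads = atoi(argv[1]);
		long log_nelems = atoi(argv[2]);
		// tid[], myid[] and psum[] each hold at most MAXTHREADS entries
		if (nthreads < 1 || nthreads > MAXTHREADS) {
				cout << "nthreads must be between 1 and " << MAXTHREADS << endl;
				return 1;
		}
		long nelems = (1L << log_nelems);
		// Assumes nthreads divides nelems evenly (e.g. a power of two)
		nelems_per_thread = nelems / nthreads;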
		
		// Create peer threads and wait for them to finish
		pthread_t tid[MAXTHREADS];
		int myid[MAXTHREADS];		
		for (long i = 0; i != nthreads; ++i) {
				myid[i] = i;
				pthread_create(&tid[i], NULL, sum, &myid[i]);
		}
		for (long i = 0; i != nthreads; ++i)
				pthread_join(tid[i], NULL);
		
		// Add up the partial sums computed by each thread
		long result = 0;
		for (long i = 0; i != nthreads; ++i)
				result += psum[i];
		
		// Check final answer
		if (result != (nelems*(nelems-1))/2)
				cout << "Error: result=" << result << endl;
		return 0;
}
The code above shows how we might implement this simple parallel sum algorithm. Notice that the main thread passes a small integer to each peer thread that serves as a unique thread ID. Each peer thread will use its thread ID to determine which portion of the sequence it should work on. This idea of passing a small unique thread ID to the peer thread is a general technique that is used in many parallel applications.
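The ID-passing technique can be sketched in isolation as follows. This is a minimal sketch, not the program above: `NTHREADS`, `seen`, `record_id`, and `run_with_ids` are hypothetical names introduced only for illustration. Each thread receives a pointer to its own slot in an `ids` array, so the value it reads cannot change underneath it. Passing the address of the loop counter instead would let the main thread modify the counter before the peer has read it.

```cpp
#include <cassert>
#include <pthread.h>

const int NTHREADS = 4;          // hypothetical thread count for this sketch
int seen[NTHREADS];              // seen[k] is set to 1 by the thread with ID k

// Thread routine: read the ID from the private slot passed in by the creator
void *record_id(void *vargp)
{
    int myid = *((int *)vargp);
    seen[myid] = 1;              // each thread writes only its own entry
    return NULL;
}

// Create NTHREADS peers, give each a distinct ID, and wait for all of them.
// Returns the number of distinct IDs that were observed by the peers.
int run_with_ids()
{
    pthread_t tid[NTHREADS];
    int ids[NTHREADS];           // one slot per thread; must outlive the threads
    for (int i = 0; i < NTHREADS; ++i) {
        seen[i] = 0;
        ids[i] = i;              // stable copy of the ID for thread i
        int rc = pthread_create(&tid[i], NULL, record_id, &ids[i]);
        assert(rc == 0);         // pthread_create returns 0 on success
        (void)rc;
    }
    for (int i = 0; i < NTHREADS; ++i)
        pthread_join(tid[i], NULL);

    int count = 0;
    for (int i = 0; i < NTHREADS; ++i)
        count += seen[i];
    return count;
}
```

Calling `run_with_ids()` should return `NTHREADS`, since every peer sees a different ID. The `ids` array lives in the caller's stack frame, which is safe here only because the caller joins all threads before returning.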
      The thread function that each peer thread executes is shown below.
void *sum(void *vargp)
{
		int myid = *((int *)vargp);			// Extract the thread ID 
		long start = myid * nelems_per_thread;	// Start element index
		long end = start + nelems_per_thread;		// End element index
		
		long sum = 0;
		for (long i = start; i != end; ++i)
				sum += i;
		psum[myid] = sum;
		return NULL;
}
Notice that we are careful to give each peer thread a unique memory location to update, so it is not necessary to synchronize access to the psum array with a mutex or semaphore. The only synchronization needed in this case is that the main thread must wait for each peer thread to finish (via pthread_join), so that it knows every entry in psum is valid.
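To see why the per-thread slots matter, here is a hedged sketch of the alternative design, in which every thread adds into one shared variable. The names `gsum`, `gsum_lock`, `sum_shared`, and `parallel_sum_shared` are hypothetical and not part of the program above. Because all threads now update the same location, the update has to be protected, here with a `pthread_mutex_t`.

```cpp
#include <cassert>
#include <pthread.h>

const int MAXTHREADS = 32;
long gsum;                       // single accumulator shared by all threads
pthread_mutex_t gsum_lock = PTHREAD_MUTEX_INITIALIZER;
long nelems_per_thread;          // number of elements summed by each thread

// Thread routine: sum a private range, then add it to the shared total
void *sum_shared(void *vargp)
{
    int myid = *((int *)vargp);
    long start = myid * nelems_per_thread;
    long end = start + nelems_per_thread;

    long local = 0;              // private to this thread, no locking needed
    for (long i = start; i != end; ++i)
        local += i;

    pthread_mutex_lock(&gsum_lock);    // gsum is shared, so guard the update
    gsum += local;
    pthread_mutex_unlock(&gsum_lock);
    return NULL;
}

// Sum 0..nelems-1 with nthreads peers; assumes nthreads divides nelems
long parallel_sum_shared(long nthreads, long nelems)
{
    assert(nthreads >= 1 && nthreads <= MAXTHREADS && nelems % nthreads == 0);
    pthread_t tid[MAXTHREADS];
    int myid[MAXTHREADS];
    gsum = 0;
    nelems_per_thread = nelems / nthreads;
    for (long i = 0; i != nthreads; ++i) {
        myid[i] = (int)i;
        pthread_create(&tid[i], NULL, sum_shared, &myid[i]);
    }
    for (long i = 0; i != nthreads; ++i)
        pthread_join(tid[i], NULL);
    return gsum;                 // safe to read: all peers have been joined
}
```

Each thread takes the lock only once, after accumulating into a local variable, which keeps the critical section short. The psum design in the text avoids even that one lock, at the cost of a final pass in the main thread to add up the partial sums.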