The answer is simple: understanding the difference between concurrent and parallel. These two terms are often used interchangeably while, in my opinion, they represent two different concepts.
Let’s start with concurrency. A concurrent program or algorithm is one where operations can occur at the same time. For instance, consider a simple numerical integration, where values are summed over an interval. The interval can be broken into many concurrent sums over smaller sub-intervals. As I like to say, concurrency is a property of the program. Parallel execution is when the concurrent parts are executed at the same time on separate processors. The distinction is subtle, but important. And parallel execution is a property of the machine, not the program.
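To make the idea concrete, here is a minimal sketch (my own illustration, not code from the original column; the integrand and step counts are placeholders) of the integration example in plain C. Each partial sum over a sub-interval depends only on its own slice of the interval, so the program is concurrent whether or not the partial sums ever run at the same time:

    #include <stdio.h>

    /* Function to integrate; f(x) = x*x is just a placeholder. */
    static double f(double x) { return x * x; }

    int main(void)
    {
        const double a = 0.0, b = 1.0;   /* integration interval        */
        const int nsub = 4;              /* number of sub-intervals     */
        const int steps = 1000000;       /* rectangles per sub-interval */
        double total = 0.0;

        /* Each iteration of this loop uses only its own sub-interval,
           so the partial sums are concurrent.  Here they run one after
           another; a parallel machine could run them at the same time. */
        for (int i = 0; i < nsub; i++) {
            double lo = a + (b - a) * i / nsub;
            double hi = a + (b - a) * (i + 1) / nsub;
            double h  = (hi - lo) / steps;
            double partial = 0.0;
            for (int j = 0; j < steps; j++)
                partial += f(lo + (j + 0.5) * h) * h;
            total += partial;
        }

        printf("integral = %f\n", total);
        return 0;
    }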
If execution efficiency is important (i.e., you want things to go faster by adding more cores), then the question you need to ask is, “If I run everything that is concurrent in parallel, will my code run faster?” If the answer were “yes,” we would not be having this discussion. And since the answer is “no,” the question becomes “What should run in parallel?” The answer, obviously, is the portions of code that lower execution time.
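A rough back-of-the-envelope test (my notation, not from the original column): let T_s be the serial time of a concurrent section, P the number of processors, and T_o the parallel overhead (communication, synchronization, startup). Running that section in parallel only pays off when

    T_s / P + T_o  <  T_s

If T_o is comparable to T_s / P, the section is still concurrent, but executing it in parallel on that particular machine buys you nothing.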
This decision is one of the reasons cluster parallel computing is hard. It really does depend on the machine. Take our integration case. If the integration interval is small, then breaking it up into small sub-intervals and sending them out to other nodes will extend the execution time of the program due to parallel overhead. If the integration interval is huge, then parallel execution may make sense. Because parallel overhead can vary from cluster to cluster, there is no easy way to predict it beforehand. (For example, the parallel overhead for sending small packets is larger on GigE than on InfiniBand.)
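On a cluster, the same decomposition might look like the following sketch using MPI (again my own illustration, not code from the original column; the integrand and step count are placeholders). Each rank computes one slice of the interval, and MPI_Reduce gathers the partial sums. That reduction, plus process startup and message latency, is exactly the overhead that can swamp a small integration:

    #include <mpi.h>
    #include <stdio.h>

    static double f(double x) { return x * x; }   /* placeholder integrand */

    int main(int argc, char **argv)
    {
        int rank, size;
        const double a = 0.0, b = 1.0;
        const int steps_per_rank = 1000000;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each rank integrates its own slice of [a,b]. */
        double lo = a + (b - a) * rank / size;
        double hi = a + (b - a) * (rank + 1) / size;
        double h  = (hi - lo) / steps_per_rank;
        double partial = 0.0;
        for (int j = 0; j < steps_per_rank; j++)
            partial += f(lo + (j + 0.5) * h) * h;

        /* The reduction is communication -- part of the parallel
           overhead that only pays off for large enough intervals. */
        double total = 0.0;
        MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("integral = %f\n", total);

        MPI_Finalize();
        return 0;
    }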
The same applies to multi-core. The overhead for thread communication is lower, but there is still overhead (see my HPC Hopscotch for background on SMP memory). There is no free lunch — everyone has to deal with overhead.
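On a multi-core node, a threaded version of the same loop might look like this sketch using OpenMP (my example, not from the original column). The reduction clause hides the thread communication, but it is still there, along with the cost of creating the thread team:

    #include <stdio.h>
    #include <omp.h>

    static double f(double x) { return x * x; }   /* placeholder integrand */

    int main(void)
    {
        const double a = 0.0, b = 1.0;
        const int steps = 4000000;
        const double h = (b - a) / steps;
        double total = 0.0;

        /* The reduction clause makes the partial sums thread-private and
           combines them at the end -- that combine step, plus thread
           startup, is the (smaller) overhead on a multi-core machine. */
        #pragma omp parallel for reduction(+:total)
        for (int j = 0; j < steps; j++)
            total += f(a + (j + 0.5) * h) * h;

        printf("integral = %f (max threads = %d)\n", total, omp_get_max_threads());
        return 0;
    }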
In summary, the point I want to make is this: concurrency is a property of the program, and parallel execution is a property of the machine. What concurrent parts should and should not be executed in parallel can only be answered when the exact hardware is known. This, I might add, leads to the most unhappy conclusion about explicit parallel programming: there is no guarantee of both efficiency and portability with explicit parallel programs. Yes, I know, a sad state of affairs. I’ll let you wrestle with that for a while; in the meantime, I’m going to the beach.
Original article: http://www.linux-mag.com/id/7411/