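Parallelizing a for loop with OpenMP requires that the iterations of the loop be independent of one another. Consider the following examples.
1) Replacing each image pixel with the weighted average of the pixels around it (including the pixel itself) softens the image by blurring it. The following pseudocode describes a 3x3 blur stencil: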
sum = value of pixel
// compute the average of 9 pixels from imageIn
for each neighbor of (pixel)
    sum += value of neighbor
// store the resulting value in imageOut
pixelOut = sum / 9
2) Another common example is a pointer that is advanced inside the loop body:
ptr = &someArray[0]
for (i = 0; i < N; i++)
{
    Compute (ptr);
    ptr++;
}
3) The following test code, written with OpenCV, joins two images side by side:
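IplImage *pImgOne = cvLoadImage("Result45678.jpg");
IplImage *pImgTwo = cvLoadImage("R009_9.jpg");
if (pImgOne==NULL || pImgTwo==NULL)
{
    printf("Load Pic failed!\r\n");
    return;
}
// Both images are assumed to have the same height, depth and channel count.
int iWidthResult = pImgOne->width + pImgTwo->width;
int iHightResult = pImgOne->height;
IplImage *pImgResult = cvCreateImage(cvSize(iWidthResult,iHightResult), pImgOne->depth, pImgOne->nChannels);
char *pResult = pImgResult->imageData;
char *pOne = pImgOne->imageData;
char *pTwo = pImgTwo->imageData;
// Pixel bytes per row. cvLoadImage with its default flag yields 8-bit images,
// so a row holds width * nChannels bytes; widthStep may add alignment padding.
int iRowBytesOne = pImgOne->width * pImgOne->nChannels;
int iRowBytesTwo = pImgTwo->width * pImgTwo->nChannels;
for(int i=0; i<iHightResult; i++)
{
    memcpy(pResult, pOne, iRowBytesOne);
    memcpy(pResult + iRowBytesOne, pTwo, iRowBytesTwo);
    // Each pointer moves to the next row of its own image.
    pResult += pImgResult->widthStep;
    pOne += pImgOne->widthStep;
    pTwo += pImgTwo->widthStep;
}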
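As written, none of these three examples can be parallelized simply by adding an OpenMP parallel for. In 2) and 3) in particular, every iteration advances a pointer that the next iteration relies on. We can, however, remove these dependences by hand.
For 1) and 3) we can process the data in blocks and take full advantage of the machine's multiple cores, letting each processor work on its own block of data.
To thread the blur operation effectively, consider subdividing the image into sub-images, or fixed-size blocks. The blur algorithm allows the blocks to be computed independently. The following pseudocode illustrates the use of image blocking: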
// Decompose the image into non-overlapping blocks.
blockList = Decompose (image, xRes, yRes)
foreach (block in blockList)
{
    BlurBlock (block, imageIn, imageOut)
}
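The pseudocode above could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the image is a tightly packed 8-bit grayscale buffer, the blocks are horizontal strips of `block_rows` rows, and border pixels average only the neighbors that lie inside the image (the pseudocode does not say how borders are handled). The names `blur_block` and `blur_image` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Blurs rows [row_begin, row_end) of a width x height grayscale image.
   Assumption: border pixels average only the neighbors inside the image. */
static void blur_block(const unsigned char *image_in, unsigned char *image_out,
                       int width, int height, int row_begin, int row_end)
{
    for (int y = row_begin; y < row_end; y++) {
        for (int x = 0; x < width; x++) {
            int sum = 0, count = 0;
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int ny = y + dy, nx = x + dx;
                    if (ny >= 0 && ny < height && nx >= 0 && nx < width) {
                        sum += image_in[ny * width + nx];
                        count++;
                    }
                }
            }
            image_out[y * width + x] = (unsigned char)(sum / count);
        }
    }
}

/* Splits the image into strips of block_rows rows and blurs one strip per
   loop iteration. Every strip reads only image_in and writes only its own
   rows of image_out, so the iterations are independent. */
void blur_image(const unsigned char *image_in, unsigned char *image_out,
                int width, int height, int block_rows)
{
    int b;
    int block_count;

    assert(image_in != NULL && image_out != NULL && image_in != image_out);
    assert(width > 0 && height > 0 && block_rows > 0);

    block_count = (height + block_rows - 1) / block_rows; /* round up */

    #pragma omp parallel for
    for (b = 0; b < block_count; b++) {
        int row_begin = b * block_rows;
        int row_end = row_begin + block_rows;
        if (row_end > height)
            row_end = height; /* the last strip may be shorter */
        blur_block(image_in, image_out, width, height, row_begin, row_end);
    }
}
```

Reading from one buffer and writing to another is what keeps the strips independent; blurring in place would let one strip read pixels that a neighboring strip has already overwritten.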
Likewise, program 3), which copies the images row by row, can be processed in blocks.
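// Rows per block. The right size depends on the number of processors: this machine
// has 4 cores and the image is 2048 rows high, so 512 = 2048/4.
int iBlockSize = 512;
// Number of blocks; assumes the image height is a multiple of iBlockSize.
int iBlockCount = iHightResult/iBlockSize;
#pragma omp parallel for
for (int i=0; i<iBlockCount; i++)
{
    JointPicForMP(pImgOne, pImgTwo, pImgResult, i, iBlockSize);
}
// Copies block iIndex, which is iHight rows high, from both images into the result.
void JointPicForMP(IplImage *pImgOne, IplImage *pImgTwo, IplImage *pImgResult, int iIndex, int iHight)
{
    // Each block starts iIndex * iHight rows into its own image.
    char *pResult = pImgResult->imageData + iIndex * pImgResult->widthStep * iHight;
    char *pOne = pImgOne->imageData + iIndex * pImgOne->widthStep * iHight;
    char *pTwo = pImgTwo->imageData + iIndex * pImgTwo->widthStep * iHight;
    // Pixel bytes per row for 8-bit images; widthStep may add alignment padding.
    int iRowBytesOne = pImgOne->width * pImgOne->nChannels;
    int iRowBytesTwo = pImgTwo->width * pImgTwo->nChannels;
    for (int i=0; i<iHight; i++)
    {
        memcpy(pResult, pOne, iRowBytesOne);
        memcpy(pResult + iRowBytesOne, pTwo, iRowBytesTwo);
        pResult += pImgResult->widthStep;
        pOne += pImgOne->widthStep;
        pTwo += pImgTwo->widthStep;
    }
}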
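To make the block arithmetic easier to try without OpenCV, here is a minimal sketch of the same idea on plain byte buffers. The `Plane` struct and the functions `join_rows` and `join_images` are hypothetical names introduced for this illustration; `step` plays the role of `widthStep` and `row_bytes` is the number of pixel bytes actually used in a row. Unlike the code above, it computes each block's row range with integer arithmetic, so the height does not have to divide evenly by the number of blocks.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the parts of IplImage used here. */
typedef struct {
    char *data;     /* first byte of row 0 */
    int row_bytes;  /* pixel bytes used in each row */
    int step;       /* distance in bytes between rows, padding included */
} Plane;

/* Copies rows [row_begin, row_end) of both inputs side by side into result.
   Every address is computed from the row index, so no pointer is carried
   over from one row to the next. */
static void join_rows(const Plane *one, const Plane *two, const Plane *result,
                      int row_begin, int row_end)
{
    for (int y = row_begin; y < row_end; y++) {
        char *dst = result->data + (size_t)y * result->step;
        memcpy(dst, one->data + (size_t)y * one->step, (size_t)one->row_bytes);
        memcpy(dst + one->row_bytes, two->data + (size_t)y * two->step,
               (size_t)two->row_bytes);
    }
}

/* Splits the rows into block_count blocks and joins one block per iteration. */
void join_images(const Plane *one, const Plane *two, const Plane *result,
                 int height, int block_count)
{
    int b;

    assert(one != NULL && two != NULL && result != NULL);
    assert(height > 0 && block_count > 0);
    assert(result->row_bytes == one->row_bytes + two->row_bytes);

    #pragma omp parallel for
    for (b = 0; b < block_count; b++) {
        /* Any remainder rows are spread across the blocks. */
        int row_begin = (int)((long long)height * b / block_count);
        int row_end = (int)((long long)height * (b + 1) / block_count);
        join_rows(one, two, result, row_begin, row_end);
    }
}
```

With a height of 3 and `block_count` of 2, for example, the first block covers row 0 and the second covers rows 1 and 2.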
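Program 2) can be parallelized once it is rewritten so that each iteration computes its element's address from the loop index instead of advancing the pointer: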
ptr = &someArray[0]
for (i = 0; i < N; i++)
{
    Compute (&ptr[i]);
}
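As a runnable illustration of the rewritten loop, here is a minimal sketch in C. The function `compute` is a hypothetical stand-in for `Compute` (it simply doubles one element), and `compute_all` is likewise a name made up for this example. When the file is compiled without OpenMP support the pragma is ignored and the loop just runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Compute(): doubles one element in place. */
static void compute(int *p)
{
    *p *= 2;
}

/* Applies compute() to every element of some_array. The address is derived
   from the loop index i, so no pointer state is carried from one iteration
   to the next and the iterations can run in any order. */
void compute_all(int *some_array, int n)
{
    int i;

    assert(some_array != NULL || n == 0);

    #pragma omp parallel for
    for (i = 0; i < n; i++)
    {
        compute(some_array + i);
    }
}
```

This only removes the dependence introduced by `ptr++`; the loop is safe to parallelize only if the work done for one element does not read or write any other element.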