Image Scaling with Bilinear Interpolation in CUDA

Principle:
Reference: http://blog.csdn.net/housisong/article/details/1452249 explains this clearly, so only a brief summary is given here.

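Let the source image be SW x SH pixels and the destination image DW x DH pixels. A destination pixel (Dx, Dy) corresponds to the source position (Sx, Sy) by proportion:
(Sx-0)/(SW-0)=(Dx-0)/(DW-0) (Sy-0)/(SH-0)=(Dy-0)/(DH-0)
=> Sx=Dx*SW/DW Sy=Dy*SH/DH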
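The mapping above can be sketched as a small helper. This is only an illustration: the name srcCoord and its parameters are assumptions and do not appear in the original code.

```cpp
#include <cassert>

// Hypothetical helper: maps a destination coordinate to a source coordinate
// using Sx = Dx * SW / DW (the same form gives Sy from Dy, SH and DH).
float srcCoord(int d, int srcSize, int dstSize)
{
    assert(dstSize > 0);
    return static_cast<float>(d) * srcSize / dstSize;
}
```

For example, srcCoord(3, 4, 8) maps destination column 3 of an 8-pixel-wide image to position 1.5 in a 4-pixel-wide source. Note that the CUDA code later in this post uses (w - 1) / dstw as the ratio instead of w / dstw.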
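Now look closely at the neighborhood of the point (Sx, Sy), where Sx and Sy are floating-point values.
Nearest-neighbor scaling simply takes Color0 as the color of the scaled pixel.
Bilinear interpolation considers the four color values Color0/Color1/Color2/Color3 surrounding (Sx, Sy)
and uses the distances from (Sx, Sy) to the points A/B/C/D as coefficients to blend the four colors into the color of the scaled pixel.
( u=Sx-floor(Sx); v=Sy-floor(Sy); note: floor returns the largest integer less than or equal to its argument )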
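As a rough sketch of the split into an integer pixel index and the fractional offsets u and v, the helper below applies floor to one coordinate. The names SplitCoord and splitCoord are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type: integer base index plus fractional offset.
struct SplitCoord {
    int   base;  // floor(s): index of the neighbor at or before s
    float frac;  // s - floor(s): the u (or v) coefficient, in [0, 1)
};

SplitCoord splitCoord(float s)
{
    assert(s >= 0.0f);            // image coordinates are non-negative
    float f = std::floor(s);
    return { static_cast<int>(f), s - f };
}
```

Nearest-neighbor scaling would use only base for each axis, while bilinear interpolation also uses frac as the blend coefficient.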
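The bilinear interpolation formula is (Color0 is the neighbor at (floor(Sx), floor(Sy)), Color2 is one step further in x, Color1 is one step further in y, and Color3 is one step further in both):
tmpColor0=Color0*(1-u) + Color2*u;
tmpColor1=Color1*(1-u) + Color3*u;
DstColor =tmpColor0*(1-v) + tmpColor1*v;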
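The two-pass form of the formula can be sketched for a single color channel as follows. The function names lerp1 and bilinearTwoPass are assumptions made for this example; the argument order follows the Color0..Color3 naming used in the formulas.

```cpp
#include <cassert>

// Linear interpolation between a and b with weight t in [0, 1].
float lerp1(float a, float b, float t)
{
    return a * (1.0f - t) + b * t;
}

// Two-pass bilinear blend of one channel.
// c0/c2 are neighbors along x on one row, c1/c3 are neighbors along x on the next row.
float bilinearTwoPass(float c0, float c1, float c2, float c3, float u, float v)
{
    assert(u >= 0.0f && u <= 1.0f && v >= 0.0f && v <= 1.0f);
    float tmp0 = lerp1(c0, c2, u);   // blend along x on the first row
    float tmp1 = lerp1(c1, c3, u);   // blend along x on the second row
    return lerp1(tmp0, tmp1, v);     // blend the two rows along y
}
```

With u = v = 0 the result is exactly c0, which matches the nearest-neighbor choice described earlier.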

Expanding the formula gives the four weights:
pm0=(1-u)*(1-v);
pm1=v*(1-u);
pm2=u*(1-v);
pm3=u*v;
The color blending formula then becomes:
DstColor = Color0*pm0 + Color1*pm1 + Color2*pm2 + Color3*pm3;
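The expanded form can be sketched the same way. The weights always sum to 1, so the blended value stays within the range of the four inputs. Weights, bilinearWeights and blend are hypothetical names used only in this example.

```cpp
#include <cassert>

// The four blend weights pm0..pm3 for the fractional offsets u and v.
struct Weights {
    float pm0, pm1, pm2, pm3;
};

Weights bilinearWeights(float u, float v)
{
    assert(u >= 0.0f && u <= 1.0f && v >= 0.0f && v <= 1.0f);
    return { (1.0f - u) * (1.0f - v),   // pm0, weight of Color0
             v * (1.0f - u),            // pm1, weight of Color1
             u * (1.0f - v),            // pm2, weight of Color2
             u * v };                   // pm3, weight of Color3
}

// Weighted sum of the four neighbors for one channel.
float blend(float c0, float c1, float c2, float c3, const Weights& w)
{
    return c0 * w.pm0 + c1 * w.pm1 + c2 * w.pm2 + c3 * w.pm3;
}
```

Since the weights depend only on u and v, they can be computed once per destination pixel and reused for every color channel.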

CUDA implementation:

#include <cstdint>

// Launched with one block per destination pixel: blockIdx.x is the destination column, blockIdx.y the row.
__global__ void cudaTransform(uint8_t *output, const uint8_t *input, uint32_t pitchOutput, uint32_t pitchInput, uint8_t bytesPerPixelOutput, uint8_t bytesPerPixelInput, float xRatio, float yRatio)
{
    // Source position (Sx, Sy) and its integer part
    float sx = xRatio * blockIdx.x;
    float sy = yRatio * blockIdx.y;
    int x = (int)sx;
    int y = (int)sy;

    // Fractional parts (u and v in the formulas)
    float xDist = sx - x;
    float yDist = sy - y;

    // The four neighbors: a = (x, y), b = (x + 1, y), c = (x, y + 1), d = (x + 1, y + 1)
    const uint8_t *a = input + y * pitchInput + x * bytesPerPixelInput;
    const uint8_t *b = input + y * pitchInput + (x + 1) * bytesPerPixelInput;
    const uint8_t *c = input + (y + 1) * pitchInput + x * bytesPerPixelInput;
    const uint8_t *d = input + (y + 1) * pitchInput + (x + 1) * bytesPerPixelInput;

    // Blend weights of a, b, c, d
    float wa = (1 - xDist) * (1 - yDist);
    float wb = xDist * (1 - yDist);
    float wc = yDist * (1 - xDist);
    float wd = xDist * yDist;

    // Channels are named by byte offset; the byte order of the input is preserved in the output
    float red   = a[0] * wa + b[0] * wb + c[0] * wc + d[0] * wd;
    float green = a[1] * wa + b[1] * wb + c[1] * wc + d[1] * wd;
    float blue  = a[2] * wa + b[2] * wb + c[2] * wc + d[2] * wd;

    // Write one 32-bit pixel with the top byte (alpha) set to 0xff
    uint8_t *p = output + blockIdx.y * pitchOutput + blockIdx.x * bytesPerPixelOutput;
    *(uint32_t *)p = 0xff000000u | ((uint32_t)blue << 16) | ((uint32_t)green << 8) | (uint32_t)red;
}
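To check the kernel's output, a CPU reference for a single destination pixel can be handy. The sketch below mirrors the kernel's arithmetic in plain C++; the function name referencePixel and the explicit dx/dy parameters (standing in for blockIdx.x and blockIdx.y) are assumptions for this example, not part of the original code.

```cpp
#include <cassert>
#include <cstdint>

// CPU reference (a sketch) of what the kernel computes for destination pixel (dx, dy).
// Returns the packed 32-bit value the kernel would store for that pixel.
uint32_t referencePixel(const uint8_t* input, uint32_t pitchInput, int bytesPerPixelInput,
                        float xRatio, float yRatio, int dx, int dy)
{
    assert(input != nullptr && dx >= 0 && dy >= 0);
    float sx = xRatio * dx;
    float sy = yRatio * dy;
    int x = static_cast<int>(sx);
    int y = static_cast<int>(sy);
    float u = sx - x;   // xDist in the kernel
    float v = sy - y;   // yDist in the kernel

    // Same four neighbors as in the kernel
    const uint8_t* a = input + y * pitchInput + x * bytesPerPixelInput;
    const uint8_t* b = a + bytesPerPixelInput;
    const uint8_t* c = a + pitchInput;
    const uint8_t* d = c + bytesPerPixelInput;

    uint32_t out = 0xff000000u;   // alpha byte
    for (int ch = 0; ch < 3; ++ch) {
        float val = a[ch] * (1 - u) * (1 - v) + b[ch] * u * (1 - v)
                  + c[ch] * v * (1 - u) + d[ch] * u * v;
        out |= static_cast<uint32_t>(val) << (8 * ch);   // byte offset ch goes to bits 8*ch
    }
    return out;
}
```

Comparing a few pixels of the GPU result against this function is a cheap sanity test; small differences of one unit per channel are possible if the two sides round intermediate values differently.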

// Resizes a 24-bit image (w x h) into a 32-bit image (dstw x dsth).
// Rows of src are read with a stride rounded up to a multiple of 4 bytes; dst must hold dstw * 4 * dsth bytes.
// cudasafe is an error-checking helper that is not shown in this post.
void RGB24_resize32(uint8_t* src, uint8_t* dst, int w, int h, int dstw, int dsth)
{
    uint32_t src_row_bytes;
    uint32_t dst_row_bytes;
    int src_nb_component;
    int dst_nb_component;
    uint32_t src_size;
    uint32_t dst_size;
    uint8_t* device_src;
    uint8_t* device_dst;

    if (dstw <= 0 || dsth <= 0)
        return;
    // Using (w - 1) and (h - 1) keeps the x + 1 and y + 1 neighbors inside the source image
    float x_ratio = ((float)(w - 1)) / dstw;
    float y_ratio = ((float)(h - 1)) / dsth;

    // One block per destination pixel
    dim3 grid(dstw, dsth);

    // Row strides in bytes, rounded up to a multiple of 4
    src_row_bytes = (w * 3 + 3) & ~3;
    dst_row_bytes = (dstw * 4 + 3) & ~3;

    src_nb_component = 3;
    dst_nb_component = 4;

    src_size = src_row_bytes * h;
    dst_size = dst_row_bytes * dsth;

    // Copy original image
    cudasafe(cudaMalloc((void **)&device_src, src_size), "Original image allocation ", __FILE__, __LINE__);
    cudasafe(cudaMemcpy(device_src, src, src_size, cudaMemcpyHostToDevice), "Copy original image to device ", __FILE__, __LINE__);

    cudasafe(cudaMalloc((void **)&device_dst, dst_size), "New image allocation ", __FILE__, __LINE__);

    cudaTransform<<<grid, 1>>>(device_dst, device_src, dst_row_bytes, src_row_bytes, dst_nb_component, src_nb_component, x_ratio, y_ratio);
    cudasafe(cudaGetLastError(), "Kernel launch ", __FILE__, __LINE__);
    // Copy scaled image to host
    cudasafe(cudaMemcpy(dst, device_dst, dst_size, cudaMemcpyDeviceToHost), "from device to host", __FILE__, __LINE__);
    cudaFree(device_src);
    cudaFree(device_dst);
}
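The host function computes the source row stride as (w * 3 + 3) & ~3, so it reads src as if every row were padded to a multiple of 4 bytes. If the caller's pixels are tightly packed, they would need to be copied into a padded buffer first. The following is a minimal sketch of that step; paddedRowBytes and padRGB24 are hypothetical helpers, not part of the original code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Row stride in bytes, rounded up to a multiple of 4
// (the same expression the host function uses).
uint32_t paddedRowBytes(int width, int bytesPerPixel)
{
    assert(width > 0 && bytesPerPixel > 0);
    return (static_cast<uint32_t>(width) * bytesPerPixel + 3u) & ~3u;
}

// Hypothetical helper: copies a tightly packed 24-bit image into a buffer
// whose rows are padded to a multiple of 4 bytes (padding bytes are zero).
std::vector<uint8_t> padRGB24(const uint8_t* packed, int w, int h)
{
    assert(packed != nullptr && h > 0);
    const uint32_t stride = paddedRowBytes(w, 3);
    std::vector<uint8_t> out(static_cast<size_t>(stride) * h, 0);
    for (int row = 0; row < h; ++row) {
        std::memcpy(out.data() + static_cast<size_t>(row) * stride,
                    packed + static_cast<size_t>(row) * w * 3,
                    static_cast<size_t>(w) * 3);
    }
    return out;
}
```

For a 2-pixel-wide image the packed row is 6 bytes and the padded row is 8 bytes, so passing a packed buffer directly would shift every row after the first.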

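For convenience, the result is converted to 32-bit bitmap data; producing 24-bit output instead would not require major changes.
The CUDA code is based on: http://mkaczanowski.com/bilinear-interpolation-with-nvidia-cuda-c/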
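One possible shape of the 24-bit change mentioned above is to replace the single 32-bit store with three byte stores. The sketch below shows this in plain C++; storePixel24 is a hypothetical name, and inside the kernel it would presumably be a __device__ function, with the host passing 3 as the output bytes per pixel and (dstw * 3 + 3) & ~3 as the output row stride.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical replacement for the 32-bit store: writes three channel bytes
// and no alpha byte, truncating each float value as the 32-bit version does.
void storePixel24(uint8_t* p, float red, float green, float blue)
{
    assert(p != nullptr);
    p[0] = static_cast<uint8_t>(red);
    p[1] = static_cast<uint8_t>(green);
    p[2] = static_cast<uint8_t>(blue);
}
```

Byte stores also avoid the alignment requirement of a 32-bit write, which matters because 3-byte pixels do not start on 4-byte boundaries.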
