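Learn CUDA Programming, Chapter 2: the image scaling example fails with "invalid argument"

The example I downloaded from the Learn CUDA Programming GitHub repository fails when I run it on my Ubuntu machine.

The example code is as follows: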

#include<stdio.h>
#include"scrImagePgmPpmPackage.h"

//Kernel which calculates the resized image
__global__ void createResizedImage(unsigned char *imageScaledData, int scaled_width, float scale_factor, cudaTextureObject_t texObj)
{
	const unsigned int tidX = blockIdx.x*blockDim.x + threadIdx.x;
	const unsigned int tidY = blockIdx.y*blockDim.y + threadIdx.y;
	const unsigned index = tidY*scaled_width+tidX;
       	
	// Step 4: Read the texture memory from your texture reference in CUDA Kernel
	imageScaledData[index] = tex2D<unsigned char>(texObj,(float)(tidX*scale_factor),(float)(tidY*scale_factor));
}

int main(int argc, char*argv[])
{
	int height=0, width =0, scaled_height=0,scaled_width=0;
	//Define the scaling ratio	
	float scaling_ratio=0.5;
	unsigned char*data;
	unsigned char*scaled_data,*d_scaled_data;

	char inputStr[1024] = {"aerosmith-double.pgm"};
	char outputStr[1024] = {"aerosmith-double-scaled.pgm"};
	cudaError_t returnValue;

	//Create a channel description to be used while linking to the texture
	cudaArray* cu_array;
	cudaChannelFormatKind kind = cudaChannelFormatKindUnsigned;
	cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc(8, 0, 0, 0, kind);

	get_PgmPpmParams(inputStr, &height, &width);	//getting height and width of the current image
	data = (unsigned char*)malloc(height*width*sizeof(unsigned char));
	printf("\n Reading image width height and width [%d][%d]", height, width);
	scr_read_pgm( inputStr , data, height, width );//loading an image to "inputimage"

	scaled_height = (int)(height*scaling_ratio);
	scaled_width = (int)(width*scaling_ratio);
	scaled_data = (unsigned char*)malloc(scaled_height*scaled_width*sizeof(unsigned char));
	printf("\n scaled image width height and width [%d][%d]", scaled_height, scaled_width);

	//Allocate CUDA Array
 	returnValue = cudaMallocArray( &cu_array, &channelDesc, width, height);
	returnValue = (cudaError_t)(returnValue | cudaMemcpy( cu_array, data, height * width * sizeof(unsigned char), cudaMemcpyHostToDevice));

	if(returnValue != cudaSuccess)
		printf("\n Got error while running CUDA API Array Copy");

	// Step 1. Specify texture
	struct cudaResourceDesc resDesc;
	memset(&resDesc, 0, sizeof(resDesc));
	resDesc.resType = cudaResourceTypeArray;
	resDesc.res.array.array = cu_array;
	// Step 2. Specify texture object parameters
	struct cudaTextureDesc texDesc;
	memset(&texDesc, 0, sizeof(texDesc));
	texDesc.addressMode[0] = cudaAddressModeClamp;
	texDesc.addressMode[1] = cudaAddressModeClamp;
	texDesc.filterMode = cudaFilterModePoint;
	texDesc.readMode = cudaReadModeElementType;
	texDesc.normalizedCoords = 0;

	// Step 3: Create texture object
	cudaTextureObject_t texObj = 0;
	cudaCreateTextureObject(&texObj, &resDesc, &texDesc, NULL);

	if(returnValue != cudaSuccess) 
		printf("\n Got error while running CUDA API Bind Texture");
	
	cudaMalloc(&d_scaled_data, scaled_height*scaled_width*sizeof(unsigned char) );

	dim3 dimBlock(32, 32,1);
	dim3 dimGrid(scaled_width/dimBlock.x,scaled_height/dimBlock.y,1);
	printf("\n Launching grid with blocks [%d][%d] ", dimGrid.x,dimGrid.y);

	createResizedImage<<<dimGrid, dimBlock>>>(d_scaled_data,scaled_width,1/scaling_ratio, texObj);

	returnValue = (cudaError_t)(returnValue | cudaDeviceSynchronize());

	returnValue = (cudaError_t)(returnValue |cudaMemcpy (scaled_data , d_scaled_data, scaled_height*scaled_width*sizeof(unsigned char), cudaMemcpyDeviceToHost ));
	if(returnValue != cudaSuccess) 
		printf("\n Got error while running CUDA API kernel");

	// Step 5: Destroy texture object
	cudaDestroyTextureObject(texObj);
	
	scr_write_pgm( outputStr, scaled_data, scaled_height, scaled_width, "####" ); //storing the image with the detections
		
	if(data != NULL)
		free(data);
	if(cu_array !=NULL)
		cudaFreeArray(cu_array);
	if(scaled_data != NULL)
		free(scaled_data);
	if(d_scaled_data!=NULL)
		cudaFree(d_scaled_data);
	
	return 0;
}

Judging by the output of cudaGetErrorString(returnValue), the error comes from these two statements:

returnValue = cudaMemcpy( cu_array, data, height * width * sizeof(unsigned char), cudaMemcpyHostToDevice)

cudaCreateTextureObject(&texObj, &resDesc, &texDesc, NULL);

Clearly, the failure of the first statement is what causes the texture object creation to fail.
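Pinning the failure down to individual statements is easier when every call is checked on its own. The listing ORs return codes together, and since cudaSuccess is 0 that only stays meaningful as long as a single call fails; two different error codes ORed together give a value that matches neither. A minimal sketch of a per-call check, where checkCuda is a hypothetical helper name and not something from the book's code:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the book's example): prints which call
// failed together with the runtime's error string, and returns true only
// when the call succeeded.
static bool checkCuda(cudaError_t status, const char *what)
{
    if (status == cudaSuccess)
        return true;
    fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(status));
    return false;
}

// Possible use inside main(), one check per call:
//   if (!checkCuda(cudaCreateTextureObject(&texObj, &resDesc, &texDesc, NULL),
//                  "cudaCreateTextureObject"))
//       return 1;
```

Checking each call separately also covers cudaCreateTextureObject, whose return value the listing never stores.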

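At first I suspected the author's PGM-reading interface, but after switching to stb_image, a library that has been validated countless times in graphics work, the same two statements still failed.

So I began to suspect that the CUDA API had changed between versions. After all, the author wrote the book in a CUDA 10.x environment, while my CUDA version is already 12.0.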

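In the end I found that, with the newer version, copying host data into a cudaArray on the device has to go through the cudaMemcpy2DToArray interface. A cudaArray is an opaque handle rather than a linear device pointer, so plain cudaMemcpy rejects it with "invalid argument".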
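The changed line itself is not shown above, so here is a minimal sketch of what the replacement might look like. It assumes a tightly packed 8-bit grayscale image (row pitch equal to the width in bytes), and uploadGrayImage is a hypothetical wrapper name chosen for illustration:

```cuda
#include <cuda_runtime.h>

// Hypothetical wrapper (not from the book): copies a tightly packed 8-bit
// grayscale host image into an already allocated cudaArray.
// cudaMemcpy2DToArray takes the source pitch and the copy width in BYTES,
// and the height in rows.
static cudaError_t uploadGrayImage(cudaArray_t dst, const unsigned char *src,
                                   int width, int height)
{
    const size_t rowBytes = (size_t)width * sizeof(unsigned char);
    return cudaMemcpy2DToArray(dst,
                               0, 0,            // x/y offset inside the array
                               src, rowBytes,   // host pointer and its pitch
                               rowBytes,        // bytes to copy per row
                               (size_t)height,  // number of rows
                               cudaMemcpyHostToDevice);
}

// In the listing this would stand in for the cudaMemcpy call on cu_array:
//   returnValue = uploadGrayImage(cu_array, data, width, height);
```

With the array populated correctly, the later cudaCreateTextureObject call has valid data behind it.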

Recompile:

nvcc -c image_scaling.cu

nvcc -o image_scaling image_scaling.o scrImagePgmPpmPackage.o

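The error is gone and the scaled image is generated.

However, my code reads the image with stb_image, so the data is already flipped vertically after loading, before the CPU hands it to the GPU. As a result, the scaled image saved after the kernel runs is flipped vertically as well.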
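If the output really does come out upside down, one option is to reverse the row order on the host before the upload (or after the download). A minimal sketch, assuming a tightly packed 8-bit single-channel buffer; flipRowsInPlace is a hypothetical helper name:

```cuda
#include <algorithm>
#include <cstddef>

// Hypothetical host-side helper: reverses the row order of a tightly packed
// 8-bit single-channel image in place by swapping row i with row height-1-i.
static void flipRowsInPlace(unsigned char *pixels, int width, int height)
{
    for (int top = 0, bottom = height - 1; top < bottom; ++top, --bottom) {
        unsigned char *topRow = pixels + (size_t)top * width;
        unsigned char *bottomRow = pixels + (size_t)bottom * width;
        std::swap_ranges(topRow, topRow + width, bottomRow);
    }
}
```

stb_image also provides stbi_set_flip_vertically_on_load, which may be the simpler route when the loader is where the orientation changes.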
