GPU High-Performance Computing with CUDA: Reading and Writing BMP Files

Author: 阿里猫的编程小辈

Last modified: 2024-09-09 00:36:40

First published: 2024-09-07 23:56:41

Article link: https://blog.csdn.net/m0_51165837/article/details/142006783

Disclaimer: this article is not for commercial use.

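The sample code below shows two functions, one for reading and one for writing a BMP image. They differ from the versions in the sample code of earlier articles in that they read the image into linear memory, which is why the suffix lin is appended to the function names. Linear memory means that a pixel is addressed with a single index rather than with the two indices x and y. This makes reading the image from disk very simple, because the image is stored on disk in exactly this linear fashion. When we process the image, however, we think in terms of (x, y) coordinates, so we need a very simple formula to convert between the two representations: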

Pixel coordinate index = (x, y) → Linear index = x + (y × ip.Hpixels)            (6.2)
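As a minimal sketch of Equation (6.2), the helpers below convert between (x, y) coordinates and a linear index. The names PixelIndex, PixelCoords, and ByteOffset are hypothetical and do not appear in the original program. Note that Equation (6.2) counts pixels; ByteOffset additionally assumes a 24-bit image stored as 3 bytes per pixel, with rows that are RowBytes wide (padding included), which matches how ReadBMPlin fills its buffer.

```c
#include <stddef.h>

/* Linear pixel index of (x, y), following Equation (6.2):
   index = x + y * Hpixels. Hypothetical helper, not part of the article's code. */
size_t PixelIndex(size_t x, size_t y, size_t Hpixels)
{
	return x + y * Hpixels;
}

/* Inverse mapping: recover (x, y) from a linear pixel index. */
void PixelCoords(size_t idx, size_t Hpixels, size_t *x, size_t *y)
{
	*x = idx % Hpixels;   /* column within the row */
	*y = idx / Hpixels;   /* row number */
}

/* Byte offset of pixel (x, y) inside the raw buffer of a 24-bit image.
   ASSUMPTION: 3 bytes per pixel and rows of RowBytes bytes (padding included). */
size_t ByteOffset(size_t x, size_t y, size_t RowBytes)
{
	return y * RowBytes + 3 * x;
}
```

For a 10-pixel-wide image, pixel (3, 2) has linear pixel index 3 + 2 × 10 = 23. The byte offset differs from the pixel index because every pixel occupies three bytes and every row may end in padding.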

// Read a 24-bit/pixel BMP file into a 1D linear array.
// Allocate memory to store the 1D image and return its pointer.
uch *ReadBMPlin(char* fn)
{
	static uch *Img;
	FILE* f = fopen(fn, "rb");
	if (f == NULL){	printf("\n\n%s NOT FOUND\n\n", fn);	exit(EXIT_FAILURE); }

	uch HeaderInfo[54];
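	// read the 54-byte header and stop if the file is too short to contain one
	if (fread(HeaderInfo, sizeof(uch), 54, f) != 54){
		printf("\n\n%s: CANNOT READ BMP HEADER\n\n", fn);	fclose(f);	exit(EXIT_FAILURE); }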
	// extract image height and width from header
	int width = *(int*)&HeaderInfo[18];			ip.Hpixels = width;
	int height = *(int*)&HeaderInfo[22];		ip.Vpixels = height;
	int RowBytes = (width * 3 + 3) & (~3);		ip.Hbytes = RowBytes;
	//save header for re-use
	memcpy(ip.HeaderInfo, HeaderInfo,54);
	printf("\n Input File name: %17s  (%u x %u)   File Size=%u", fn, 
			ip.Hpixels, ip.Vpixels, IMAGESIZE);
	// allocate memory to store the main image (1 Dimensional array)
	Img  = (uch *)malloc(IMAGESIZE);
	if (Img == NULL) { fclose(f); return Img; }   // Cannot allocate memory; close the file first
	// read the image from disk
	fread(Img, sizeof(uch), IMAGESIZE, f);
	fclose(f);
	return Img;
}


// Write the 1D linear-memory stored image into file.
void WriteBMPlin(uch *Img, char* fn)
{
	FILE* f = fopen(fn, "wb");
	if (f == NULL){ printf("\n\nFILE CREATION ERROR: %s\n\n", fn); exit(1); }
	//write header
	fwrite(ip.HeaderInfo, sizeof(uch), 54, f);
	//write data
	fwrite(Img, sizeof(uch), IMAGESIZE, f);
	printf("\nOutput File name: %17s  (%u x %u)   File Size=%u", fn, ip.Hpixels, ip.Vpixels, IMAGESIZE);
	fclose(f);
}
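Two details of ReadBMPlin are worth isolating: how the width and height are pulled out of the header, and how the expression (width * 3 + 3) & (~3) rounds each row up to a multiple of 4 bytes. The sketch below illustrates both. The helper names ReadLE32, RowBytes24, and ImageBytes24 are hypothetical, and ReadLE32 is a swapped-in alternative to the pointer cast used above: it assembles the value byte by byte, so it does not depend on alignment or on the host being little-endian.

```c
#include <stdint.h>

/* Read a 32-bit little-endian integer from a byte buffer.
   Hypothetical helper; the article's code uses *(int*)&HeaderInfo[18] instead. */
int32_t ReadLE32(const unsigned char *p)
{
	uint32_t v = (uint32_t)p[0]
	           | ((uint32_t)p[1] << 8)
	           | ((uint32_t)p[2] << 16)
	           | ((uint32_t)p[3] << 24);
	return (int32_t)v;
}

/* Bytes per row of a 24-bit image, rounded up to the next multiple of 4.
   Same arithmetic as RowBytes in ReadBMPlin. */
uint32_t RowBytes24(uint32_t width)
{
	return (width * 3u + 3u) & ~3u;
}

/* Size in bytes of the pixel data that follows the header. */
uint32_t ImageBytes24(uint32_t width, uint32_t height)
{
	return RowBytes24(width) * height;
}
```

With these helpers, the width would be ReadLE32(&HeaderInfo[18]) and the height ReadLE32(&HeaderInfo[22]). A 5-pixel-wide row needs 15 bytes of pixel data and is padded to 16, while a 4-pixel-wide row needs exactly 12 and gets no padding.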

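Every CUDA program contains several #include <...> directives, namely <cuda_runtime.h>, <cuda.h>, and <device_launch_parameters.h>, which give us access to Nvidia's API. These API functions, such as cudaMalloc(), are the bridge between the CPU side and the GPU side. Nvidia's engineers developed them so that you can transfer data between the CPU and the GPU without worrying about the underlying details.
Note the data types used here. ul, uch, and ui stand for unsigned long, unsigned char, and unsigned int, respectively. These types are used frequently, and defining them as user-defined types keeps the code cleaner and less cluttered. The file names are held in the variables InputFileName and OutputFileName, both of which take their values from the command-line arguments. The variable ProgName is hard-coded into the program and is used when printing reports.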
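The type aliases described above, together with the global state that ReadBMPlin and WriteBMPlin rely on, could be declared roughly as follows. The three typedefs follow directly from the text. The layout of struct ImgProp and the definition of IMAGESIZE are assumptions, inferred only from the fields those two functions touch and from the %u format specifiers they use; the original program may declare them differently.

```c
/* Short aliases for the unsigned types, as described in the text. */
typedef unsigned long ul;
typedef unsigned char uch;
typedef unsigned int  ui;

/* ASSUMPTION: image properties shared by ReadBMPlin and WriteBMPlin.
   Field names come from the code above; the field types are a guess. */
struct ImgProp {
	ui  Hpixels;          /* image width in pixels */
	ui  Vpixels;          /* image height in pixels */
	uch HeaderInfo[54];   /* saved copy of the 54-byte BMP header */
	ui  Hbytes;           /* bytes per row, including padding */
};

struct ImgProp ip;

/* ASSUMPTION: total size of the pixel data in bytes. */
#define IMAGESIZE (ip.Hbytes * ip.Vpixels)
```

Keeping these properties in one global structure lets both functions share the header and the image dimensions without passing extra parameters, at the cost of supporting only one image at a time.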