Using CUDA Acceleration in a VS2019 + Qt Project (the most detailed VS2019 tutorial so far, covering environment setup and testing)

雷达达

Tags: c++ qt dll cuda

Article link: https://blog.csdn.net/sqdlmm/article/details/104728915


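To get this working I read many blog posts by more experienced developers and learned a lot from them. Most of those posts were written for Visual Studio versions from several years ago, so their steps no longer match VS2019 exactly. This post therefore combines several of those sources, and I have tested that the result builds and runs in VS2019. I won't link each post individually, but I am very grateful to their authors.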

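Approach

There are two workable approaches. The first is to create a VS + Qt project, add a .cu file to it, write the CUDA functions there, and call them directly. The second is to create a DLL (dynamic-link library) project, wrap the CUDA code in it, build the resulting .dll and .lib files, and import them into the Qt project to call the CUDA functions.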

Approach 1: Create the CUDA functions directly in the Qt project

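1. Click "Create a new project".
2. Select Qt Gui Application and click Next.

3. Choose a project name and location to suit your preferences.

4. Then click Next -> Next -> Finish. The Qt project is now created in VS.

5. Under Source Files, add a new item.

6. Create a CUDA C/C++ File and name it cudatest.cu.
7. Configure the environment.
(1) Right-click the project -> Build Dependencies -> Build Customizations, then tick CUDA.

(2) Right-click the .cu file -> Properties, then change Item Type to CUDA C/C++.

(3) Right-click the project -> Properties -> Linker -> Input, then add the CUDA library to Additional Dependencies (for example cudart.lib, the library added in Approach 2).

(4) Tools -> Options -> Text Editor -> File Extension, then add the extension of the CUDA source file (cu).

8. The code of cudatest.cu is as follows. The functions that main.cpp calls are defined with extern "C", which gives them unmangled C linkage. The declarations in main.cpp must use the same extern "C" linkage, otherwise the linker cannot match the names.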

#include "cuda_runtime.h"
#include "device_launch_parameters.h"
//#include <stdio.h>
__global__ void addKernel(int* c, const int* a, const int* b)
{
	int i = threadIdx.x;
	c[i] = a[i] + b[i];
}
// Helper function for using CUDA to add vectors in parallel.
void addWithCuda(int* c, const int* a, const int* b, unsigned int size)
{
	int* dev_a = 0;
	int* dev_b = 0;
	int* dev_c = 0;


	// Choose which GPU to run on, change this on a multi-GPU system.
	cudaSetDevice(0);


	// Allocate GPU buffers for three vectors (two input, one output)    .
	cudaMalloc((void**)&dev_c, size * sizeof(int));
	cudaMalloc((void**)&dev_a, size * sizeof(int));
	cudaMalloc((void**)&dev_b, size * sizeof(int));


	// Copy input vectors from host memory to GPU buffers.
	cudaMemcpy(dev_a, a, size * sizeof(int), cudaMemcpyHostToDevice);
	cudaMemcpy(dev_b, b, size * sizeof(int), cudaMemcpyHostToDevice);

	// Launch a kernel on the GPU with one thread for each element.
	addKernel<<<1, size>>>(dev_c, dev_a, dev_b);

	// cudaDeviceSynchronize waits for the kernel to finish, and returns
	// any errors encountered during the launch.
	cudaDeviceSynchronize();

	// Copy output vector from GPU buffer to host memory.
	cudaMemcpy(c, dev_c, size * sizeof(int), cudaMemcpyDeviceToHost);

	cudaFree(dev_c);
	cudaFree(dev_a);
	cudaFree(dev_b);
}
// Exported with C linkage so main.cpp can declare and call it.
// The returned array is allocated with new[]; the caller must delete[] it.
extern "C" int* test()
{
	const int arraySize = 5;
	const int a[arraySize] = { 1, 2, 3, 4, 5 };
	const int b[arraySize] = { 10, 20, 30, 40, 50 };
	int* c = new int[arraySize];
	// Add vectors in parallel.
	addWithCuda(c, a, b, arraySize);
	return c;
}
// Same as test(), but the first operand is supplied by the caller.
// d must point to at least 5 ints.
extern "C" int* test_1(int d[])
{
	const int arraySize = 5;
	const int *a = d;
	const int b[arraySize] = { 10, 20, 30, 40, 50 };
	int* c = new int[arraySize];
	// Add vectors in parallel.
	addWithCuda(c, a, b, arraySize);
	return c;
}
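The launch `addKernel<<<1, size>>>` uses a single block with one thread per element, so `size` cannot exceed the per-block thread limit (1024 on current NVIDIA GPUs). As a rough sketch of how a launch could be sized for longer vectors, the host-side helper below rounds the block count up. `LaunchConfig`, `makeLaunchConfig`, and the 256-thread default are illustrative assumptions and not part of the original code.

```cpp
#include <cassert>

// Hypothetical helper (not in the original post): sizes for a 1-D kernel launch.
struct LaunchConfig {
    unsigned int blocks;
    unsigned int threadsPerBlock;
};

// Round up so that blocks * threadsPerBlock >= n.
// For n == 0 this returns 0 blocks; the caller should skip the launch in that case.
inline LaunchConfig makeLaunchConfig(unsigned int n, unsigned int threadsPerBlock = 256) {
    assert(threadsPerBlock > 0);
    unsigned int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    return LaunchConfig{blocks, threadsPerBlock};
}
```

With such a configuration the kernel would need to compute its index as `blockIdx.x * blockDim.x + threadIdx.x` and return early when that index is not less than `size`, because the last block may contain spare threads.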

9. The code of main.cpp is as follows.

#include "mainwindow.h"
#include <QtWidgets/QApplication>
#include <QtWidgets/QMessageBox>

// Defined in cudatest.cu with C linkage.
extern "C" int* test();
extern "C" int* test_1(int d[]);
int main(int argc, char *argv[])
{
	int* c = test();
	int d[] = { 1, 2, 3, 4, 6 };
	int* e = test_1(d);

	QApplication a(argc, argv);
	mainwindow w;
	w.show();

	// Show the result of the CUDA call in a message box.
	QMessageBox::about(&w, "CUDA",
	QObject::tr("{1,2,3,4,6} + {10,20,30,40,50}= {%1,%2,%3,%4,%5}")
	.arg(e[0]).arg(e[1]).arg(e[2]).arg(e[3]).arg(e[4]));

	// test() and test_1() allocate with new[], so release the arrays here.
	delete[] c;
	delete[] e;
	return a.exec();
}
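`test()` and `test_1()` return arrays allocated with `new[]`, so the caller is responsible for releasing them. One possible way to make that ownership explicit is to wrap the returned pointer in `std::unique_ptr<int[]>`. In the sketch below, `fakeTest` is a CPU stand-in for `test()` so that the snippet compiles without CUDA, and `ownedResult` is a hypothetical helper; neither is part of the original project.

```cpp
#include <cassert>
#include <memory>

// CPU stand-in for the CUDA-backed test(): returns a new[]-allocated array of 5 sums.
int* fakeTest() {
    const int a[5] = {1, 2, 3, 4, 5};
    const int b[5] = {10, 20, 30, 40, 50};
    int* c = new int[5];
    for (int i = 0; i < 5; ++i) {
        c[i] = a[i] + b[i];
    }
    return c;
}

// Take ownership of the raw array so that delete[] runs automatically.
std::unique_ptr<int[]> ownedResult(int* raw) {
    return std::unique_ptr<int[]>(raw);
}
```

This works in Approach 1 because the allocation and the release happen inside the same executable. Across a DLL boundary it is usually safer to let the caller allocate the buffer, which is what `vectorAdd` in Approach 2 does.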

10. This completes Approach 1. Running the program opens a message box that shows the sums computed on the GPU.
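To test the result beyond reading the message box, the GPU output can be compared against a plain CPU loop. The following is a minimal sketch of that idea; `addOnCpu` and `matchesCpu` are hypothetical helper names, and the pointer passed as `gpu` is assumed to be the array returned by `test()` or `test_1()`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference implementation: element-wise sum computed on the CPU.
std::vector<int> addOnCpu(const int* a, const int* b, std::size_t n) {
    std::vector<int> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
    return out;
}

// Returns true when every GPU result equals the CPU reference.
bool matchesCpu(const int* gpu, const int* a, const int* b, std::size_t n) {
    const std::vector<int> expected = addOnCpu(a, b, n);
    for (std::size_t i = 0; i < n; ++i) {
        if (gpu[i] != expected[i]) {
            return false;
        }
    }
    return true;
}
```

Because the operands are integers, an exact comparison is appropriate here; floating-point kernels would normally need a tolerance instead.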

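Approach 2: Wrap the CUDA functions in a dynamic-link library (DLL) and call them from the Qt project

This is done in two stages. First wrap the CUDA functions and build the .dll and .lib files. Then create the Qt project and import the generated .dll and .lib files into it, so that the Qt project can use CUDA acceleration.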

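Building the dynamic-link library (DLL)

1. Click "Create a new project".
2. Select the Dynamic-Link Library (DLL) template and click Next.

3. Name the DLL project as you like and save it in a folder that is easy to find. My DLL is named cuda_dll.

4. The solution is created.
5. Under Header Files, add a new item and name it cuda_dll.h.

6. Under Source Files, add a new item and name it kernel.cu.

7. The solution now contains cuda_dll.h and kernel.cu.

8. Start configuring the environment. First set the solution configuration to Debug and the platform to x64.

9. Right-click cuda_dll, click Build Dependencies, then click Build Customizations.

Select CUDA 10.2 and click OK.
10. Right-click kernel.cu and click Properties.

Change Item Type to CUDA C/C++ and click OK.

11. Right-click cuda_dll and click Properties.

Go to Linker -> Input -> Additional Dependencies -> Edit.

Add the dependency cudart.lib and click OK. The environment is now configured.

12. Code of kernel.cu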

#include "cuda_runtime.h"  
#include "device_launch_parameters.h"    
#include "cuda_dll.h"
// CUDA kernel: each thread adds one pair of elements.
__global__ void addKernel(int* c, const int* a, const int* b)
{
	int i = threadIdx.x;
	c[i] = a[i] + b[i];
}
// Vector addition. Returns 0 on success, or the number of the step that failed.
int vectorAdd(int c[], int a[], int b[], int size)
{
	int result = -1;
	int* dev_a = 0;
	int* dev_b = 0;
	int* dev_c = 0;
	cudaError_t cudaStatus;

	// Select the GPU to run on.
	cudaStatus = cudaSetDevice(0);
	if (cudaStatus != cudaSuccess) {
		result = 1;
		goto Error;
	}

	// Allocate GPU memory for dev_c, dev_a, and dev_b.
	cudaStatus = cudaMalloc((void**)&dev_c, size * sizeof(int));
	if (cudaStatus != cudaSuccess) {
		result = 2;
		goto Error;
	}
	cudaStatus = cudaMalloc((void**)&dev_a, size * sizeof(int));
	if (cudaStatus != cudaSuccess) {
		result = 3;
		goto Error;
	}
	cudaStatus = cudaMalloc((void**)&dev_b, size * sizeof(int));
	if (cudaStatus != cudaSuccess) {
		result = 4;
		goto Error;
	}

	// Copy the input vectors from host memory to GPU memory.
	cudaStatus = cudaMemcpy(dev_a, a, size * sizeof(int), cudaMemcpyHostToDevice);
	if (cudaStatus != cudaSuccess) {
		result = 5;
		goto Error;
	}
	cudaStatus = cudaMemcpy(dev_b, b, size * sizeof(int), cudaMemcpyHostToDevice);
	if (cudaStatus != cudaSuccess) {
		result = 6;
		goto Error;
	}

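	// Launch the kernel with one thread per element (a single block).
	addKernel<<<1, size>>>(dev_c, dev_a, dev_b);

	// Launch-configuration errors (for example, too many threads per block)
	// are reported by cudaGetLastError, so check it right after the launch.
	cudaStatus = cudaGetLastError();
	if (cudaStatus != cudaSuccess) {
		result = 7;
		goto Error;
	}

	// cudaDeviceSynchronize waits for the kernel to finish and returns any
	// error raised while it was running.
	cudaStatus = cudaDeviceSynchronize();
	if (cudaStatus != cudaSuccess) {
		result = 7;
		goto Error;
	}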

	// Copy the result from GPU memory back to host memory.
	cudaStatus = cudaMemcpy(c, dev_c, size * sizeof(int), cudaMemcpyDeviceToHost);
	if (cudaStatus != cudaSuccess) {
		result = 8;
		goto Error;
	}

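	result = 0;

Error:
	// Free the device buffers. This runs on both the success and the error path.
	cudaFree(dev_c);
	cudaFree(dev_a);
	cudaFree(dev_b);

	// Reset the device only after the buffers are freed. Calling cudaDeviceReset
	// first would destroy the context and invalidate the pointers freed above.
	// Note that the reset affects every CUDA user in the calling process.
	if (result == 0 && cudaDeviceReset() != cudaSuccess) {
		result = 9;
	}

	return result;

}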
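`vectorAdd` reports failures as a bare integer, so the caller has to know which step each value stands for. As an illustration, the codes assigned in kernel.cu above could be turned into readable messages with a small lookup function. `describeVectorAddResult` is a hypothetical name and the message wording is my own; only the numeric codes come from the listing.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps the value returned by vectorAdd to a short message.
// The numbering follows the result codes assigned in kernel.cu.
std::string describeVectorAddResult(int code) {
    switch (code) {
    case 0: return "success";
    case 1: return "cudaSetDevice failed";
    case 2: return "cudaMalloc for dev_c failed";
    case 3: return "cudaMalloc for dev_a failed";
    case 4: return "cudaMalloc for dev_b failed";
    case 5: return "copying a to the device failed";
    case 6: return "copying b to the device failed";
    case 7: return "the kernel failed";
    case 8: return "copying the result back to the host failed";
    case 9: return "cudaDeviceReset failed";
    default: return "unknown result code";
    }
}
```

The default branch also covers -1, the initial value of `result`, which should never be returned in practice.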

13. Code of cuda_dll.h

#pragma once
extern "C" __declspec(dllexport)  int vectorAdd(int c[], int a[], int b[], int size);

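14. Right-click cuda_dll and click Build.
The build succeeds.

15. The folder where the DLL project is saved now contains the three files needed later: cuda_dll.h, cuda_dll.dll, and cuda_dll.lib. This completes the DLL build.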

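Importing the generated DLL into the VS + Qt project and calling the wrapped CUDA function

1. Create a VS + Qt project in the same way as before.
2. Open the QtGUIApplication folder.

Copy the three files mentioned above, cuda_dll.h, cuda_dll.dll, and cuda_dll.lib, into the QtGUIApplication folder.

3. Open the QtGUIApplication solution, right-click the solution, and click Rebuild Solution.
The rebuild succeeds.

4. Right-click Header Files and click Add Existing Item.

Select cuda_dll.h and click Add.

5. In the same way, add cuda_dll.dll and cuda_dll.lib under Source Files.

6. The environment is now set up. Verify it in main.cpp, whose code is as follows.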

#include "QtGuiApplication1.h"
#include <QtWidgets/QApplication>
#include <QtWidgets/QMessageBox>
#include "cuda_dll.h"
int main(int argc, char* argv[])
{
	QApplication a(argc, argv);
	QtGuiApplication1 w;
	w.show();
	const int arraySize = 5;
	int d[arraySize] = { 11, 22, 33, 44, 55 };
	int b[arraySize] = { 10, 20, 30, 40, 50 };
	int e[arraySize] = { 0 };
	// vectorAdd returns 0 on success and a non-zero step code on failure.
	int number = vectorAdd(e, d, b, arraySize);
	if (number != 0) {
		QMessageBox::warning(&w, "CUDA",
			QObject::tr("vectorAdd failed with code %1").arg(number));
		return a.exec();
	}
	QMessageBox::about(&w, "CUDA",
		QObject::tr("{11,22,33,44,55} + {10,20,30,40,50}= {%1,%2,%3,%4,%5}")
		.arg(e[0]).arg(e[1]).arg(e[2]).arg(e[3]).arg(e[4]));
	return a.exec();
}
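Calling `vectorAdd` with raw arrays leaves the size checks to the caller. A thin wrapper around `std::vector` is one possible way to centralise them. The sketch below is an assumption-laden illustration: `addVectors` and `VectorAddFn` are hypothetical names, the 1024-element cap reflects the single-block launch in kernel.cu, and `cpuVectorAdd` is a CPU stand-in with the same signature so that the snippet runs without the DLL.

```cpp
#include <cassert>
#include <vector>

// Same parameter list as vectorAdd in cuda_dll.h.
using VectorAddFn = int (*)(int c[], int a[], int b[], int size);

// Hypothetical wrapper: validates the sizes, calls fn, and reports success.
// a and b are taken by value because the DLL function expects non-const pointers.
bool addVectors(VectorAddFn fn, std::vector<int> a, std::vector<int> b,
                std::vector<int>& out) {
    // The kernel runs in one block, so at most 1024 elements can be processed.
    if (a.size() != b.size() || a.empty() || a.size() > 1024) {
        return false;
    }
    out.assign(a.size(), 0);
    return fn(out.data(), a.data(), b.data(), static_cast<int>(a.size())) == 0;
}

// CPU stand-in used only so that this sketch runs without the DLL.
int cpuVectorAdd(int c[], int a[], int b[], int size) {
    for (int i = 0; i < size; ++i) {
        c[i] = a[i] + b[i];
    }
    return 0;
}
```

In the Qt project the call would pass `vectorAdd` instead of `cpuVectorAdd`, for example `addVectors(vectorAdd, lhs, rhs, out)`.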