TensorRT and CUDA Self-Study Notes 004: Kernel Functions and Two Simple Kernel Demos

A kernel function is a function that many GPU threads can execute concurrently.

Function qualifiers in CUDA programs

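Qualifier | Executes on | Callable from | Notes
__global__ | device | the host, and the device on GPUs of compute capability 3 and later (dynamic parallelism) | must have a void return type
__device__ | device | device only | runs only on the device
__host__ | host | host only | runs only on the host; can be omitted, because __host__ is the default qualifier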
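The three qualifiers can be seen side by side in a small program. This is only a sketch: the function names (square_on_device, square_on_host, square_kernel) are made up for illustration, and it assumes a machine with a CUDA-capable GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// __device__: runs on the GPU and can be called only from device code.
__device__ int square_on_device(int x) { return x * x; }

// __host__: runs on the CPU only. It is the default, so it is usually omitted.
__host__ int square_on_host(int x) { return x * x; }

// __global__: a kernel. Launched from the host, runs on the device, returns void.
__global__ void square_kernel(int x, int *out) {
    *out = square_on_device(x);
}

int main() {
    int *d_out = nullptr;  // pointer to device memory
    int h_out = 0;         // host copy of the result

    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }

    square_kernel<<<1, 1>>>(7, d_out);

    // A blocking cudaMemcpy waits for the kernel before copying the result.
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("device: %d, host: %d\n", h_out, square_on_host(7));
    return h_out == square_on_host(7) ? 0 : 1;
}
```

Calling square_on_device directly from main would be rejected by nvcc, because a __device__ function is not callable from host code.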

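Properties of kernel functions

  1. It can access only GPU (device) memory
  2. It must return void
  3. It cannot take a variable number of arguments, use static variables, or use function pointers
  4. It is asynchronous: a kernel launch returns control to the host right away, so the CPU program can keep running in parallel with the kernel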
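Properties 1, 2 and 4 can be sketched in one short program. The kernel name add_kernel and the variable names are hypothetical, and a CUDA-capable GPU is assumed. Because the kernel returns void, the result goes back through device memory, and because the launch is asynchronous, the host has to synchronize before it reads that result.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// A kernel cannot return a value, so the sum is written to device memory.
__global__ void add_kernel(int a, int b, int *sum) {
    *sum = a + b;
}

int main() {
    int h_sum = 0;         // result on the host
    int *d_sum = nullptr;  // result slot in device memory

    if (cudaMalloc(&d_sum, sizeof(int)) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }

    // The launch only queues the kernel; control returns to the host at once.
    add_kernel<<<1, 1>>>(2, 3, d_sum);
    printf("Host keeps running; the kernel may not have finished yet\n");

    // Block the host until all queued device work is done.
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess) {
        err = cudaMemcpy(&h_sum, d_sum, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_sum);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("sum = %d\n", h_sum);
    return h_sum == 5 ? 0 : 1;
}
```

The explicit cudaDeviceSynchronize is there to make the waiting visible. A blocking cudaMemcpy on its own would also wait for the kernel to finish.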

demo1: a kernel that runs on the GPU and prints a message

Code

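#include <stdio.h>

__global__ void HellofromGPU(){
    printf("Hello from GPU!\n");
}

int main(){
    printf("Hello from CPU!\n");

    HellofromGPU<<<1,6>>>(); // <<<grid, block>>>: 1 block of 6 threads
    cudaDeviceSynchronize(); // wait for the kernel so that its printf output is flushed
    cudaDeviceReset(); // release the GPU resources held by this process
    return 0;
}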

Note: the source file must have the .cu extension.

nvcc ./main.cu -o main.exe

Then run the executable built by nvcc.

Output

luke@ubuntu:~/workspace/TensorRT_course/02_kenel_demo$ ./main.exe
Hello from CPU!
Hello from GPU!
Hello from GPU!
Hello from GPU!
Hello from GPU!
Hello from GPU!
Hello from GPU!

demo2: a kernel that prints each thread's block index, thread index, and thread ID

Code

#include <stdio.h>
#include <iostream>

__global__ void hellowfromGPU(){
    // Linear thread ID within the block: x varies fastest, then y, then z
    int tid = threadIdx.x + threadIdx.y * blockDim.x + threadIdx.z * blockDim.x * blockDim.y;
    printf("Hello from block(%d,%d,%d) thread(%d,%d,%d)\t thread ID is %d \n",
           blockIdx.x, blockIdx.y, blockIdx.z,
           threadIdx.x, threadIdx.y, threadIdx.z, tid);
}

int main(){
    std::cout <<"Hello from CPU!"<<std::endl;
    // A grid contains blocks, and a block contains threads
    dim3 grid1(2,2,1);  // the grid named grid1 holds 2x2x1 = 4 blocks
    dim3 block1(2,2,2); // each block, sized by block1, holds 2x2x2 = 8 threads
    // printf("Launching kernel with gridDim:%d %d %d blockDim:%d %d %d ",gridDim.x,gridDim.y,gridDim.z,blockDim.x,blockDim.y,blockDim.z);
    // In testing, the line above could not be used on the CPU side: gridDim and blockDim
    // are built-in variables that exist only in device code. On the host, read grid1 and block1.
    std::cout <<"Launching kernel..."<<std::endl;
    hellowfromGPU<<<grid1,block1>>>(); // launches 4 x 8 = 32 threads in parallel
    cudaDeviceSynchronize(); // wait for the kernel so that its printf output is flushed
    cudaDeviceReset(); // release the GPU resources held by this process
    std::cout <<"Work done!"<<std::endl;
    return 0;
}

Note: the source file must have the .cu extension.

nvcc ./main.cu -o main.exe

Then run the executable built by nvcc.

Output

luke@ubuntu:~/workspace/TensorRT_course/03_cuda_idx$ ./main.exe
Hello from CPU!
Launching kernel...
Hello from block(1,1,0) thread(0,0,0)    thread ID is 0 
Hello from block(1,1,0) thread(1,0,0)    thread ID is 1 
Hello from block(1,1,0) thread(0,1,0)    thread ID is 2 
Hello from block(1,1,0) thread(1,1,0)    thread ID is 3 
Hello from block(1,1,0) thread(0,0,1)    thread ID is 4 
Hello from block(1,1,0) thread(1,0,1)    thread ID is 5 
Hello from block(1,1,0) thread(0,1,1)    thread ID is 6 
Hello from block(1,1,0) thread(1,1,1)    thread ID is 7 
Hello from block(1,0,0) thread(0,0,0)    thread ID is 0 
Hello from block(1,0,0) thread(1,0,0)    thread ID is 1 
Hello from block(1,0,0) thread(0,1,0)    thread ID is 2 
Hello from block(1,0,0) thread(1,1,0)    thread ID is 3 
Hello from block(1,0,0) thread(0,0,1)    thread ID is 4 
Hello from block(1,0,0) thread(1,0,1)    thread ID is 5 
Hello from block(1,0,0) thread(0,1,1)    thread ID is 6 
Hello from block(1,0,0) thread(1,1,1)    thread ID is 7 
Hello from block(0,1,0) thread(0,0,0)    thread ID is 0 
Hello from block(0,1,0) thread(1,0,0)    thread ID is 1 
Hello from block(0,1,0) thread(0,1,0)    thread ID is 2 
Hello from block(0,1,0) thread(1,1,0)    thread ID is 3 
Hello from block(0,1,0) thread(0,0,1)    thread ID is 4 
Hello from block(0,1,0) thread(1,0,1)    thread ID is 5 
Hello from block(0,1,0) thread(0,1,1)    thread ID is 6 
Hello from block(0,1,0) thread(1,1,1)    thread ID is 7 
Hello from block(0,0,0) thread(0,0,0)    thread ID is 0 
Hello from block(0,0,0) thread(1,0,0)    thread ID is 1 
Hello from block(0,0,0) thread(0,1,0)    thread ID is 2 
Hello from block(0,0,0) thread(1,1,0)    thread ID is 3 
Hello from block(0,0,0) thread(0,0,1)    thread ID is 4 
Hello from block(0,0,0) thread(1,0,1)    thread ID is 5 
Hello from block(0,0,0) thread(0,1,1)    thread ID is 6 
Hello from block(0,0,0) thread(1,1,1)    thread ID is 7 
Work done!
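In the output above the thread ID restarts at 0 in every block, because it is computed from threadIdx and blockDim only. The blocks also appear in an order that CUDA does not guarantee, so a different run or GPU may list them differently. A grid-wide unique ID can be built by combining the block's linear index with the in-block thread ID. The sketch below uses the same grid1 and block1 sizes as demo2; the kernel name write_global_ids and the other variable names are hypothetical, and a CUDA-capable GPU is assumed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread computes a grid-wide linear ID and stores it at that position.
__global__ void write_global_ids(int *out) {
    int threads_per_block = blockDim.x * blockDim.y * blockDim.z;
    // Linear index of this block within the grid (x fastest, then y, then z).
    int block_id = blockIdx.x + blockIdx.y * gridDim.x
                 + blockIdx.z * gridDim.x * gridDim.y;
    // Linear index of this thread within its block, as in demo2.
    int local_id = threadIdx.x + threadIdx.y * blockDim.x
                 + threadIdx.z * blockDim.x * blockDim.y;
    int global_id = block_id * threads_per_block + local_id;
    out[global_id] = global_id;
}

int main() {
    dim3 grid1(2, 2, 1);   // 4 blocks
    dim3 block1(2, 2, 2);  // 8 threads per block
    const int total = 32;  // 4 blocks x 8 threads
    int h_out[total];
    int *d_out = nullptr;

    if (cudaMalloc(&d_out, total * sizeof(int)) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }
    // Fill every byte with 0xFF so that each int starts as -1 (never written).
    cudaMemset(d_out, 0xFF, total * sizeof(int));

    write_global_ids<<<grid1, block1>>>(d_out);

    // A blocking cudaMemcpy waits for the kernel before copying the results.
    cudaError_t err = cudaMemcpy(h_out, d_out, total * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // If the IDs are unique and cover 0..31, slot i holds exactly i.
    int mismatches = 0;
    for (int i = 0; i < total; ++i) {
        if (h_out[i] != i) ++mismatches;
    }
    printf("%d of %d slots hold their own index\n", total - mismatches, total);
    return mismatches == 0 ? 0 : 1;
}
```

Writing the IDs to device memory and checking them on the host avoids relying on the order of device-side printf output.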