OpenCL & Parallel computing



https://www.fixstars.com/en/opencl/book/OpenCLProgrammingBook/contents/

 

  1. Parallel programming

HW:

  • Grid computing
  • Cluster server
  • Symmetric multiprocessor (SMP) system - identical processors sharing one memory, typically in powers of 2
  • Multi-core processor - multiple cores on one chip; the system may be homogeneous or heterogeneous

Flynn's Taxonomy

  • SISD
  • SIMD
  • MISD
  • MIMD
    • Distributed memory
    • Shared memory
      • SMP system
      • NUMA system (non-uniform memory access)

Accelerator

  • GPU
  • DSP
  • AI engine

 

Parallel Computing (Software)

  • Data parallel
    • Vectorization
  • Task parallel
    • Multi-Threading
    • Pipeline Optimization

 

Process of parallelization

1. Analyze the dependencies within the data structures or within the processes, etc., in order to decide which sections can be executed in parallel.

2. Decide on the best algorithm to execute the code over multiple processors.

3. Rewrite the code using frameworks such as Message Passing Interface (MPI), OpenMP, or OpenCL.

 

Parallelism Levels

1. Write parallel code using the operating system's functions.

2. Use a parallelization framework for program porting.

3. Use an automatic-parallelization compiler.

 

Theory

  • Taking the limit as N (the number of processors) goes to infinity, the maximum speedup that can be achieved is S = 1/y, where y is the fraction of the run time that must remain serial. This law is known as Amdahl's Law.

From <https://www.fixstars.com/en/opencl/book/OpenCLProgrammingBook/parallel-computing-software/>
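Spelling the quoted statement out (assuming y denotes the serial fraction of the run time and N the number of processors), Amdahl's Law and its limit read:

```latex
S(N) = \frac{1}{\,y + \frac{1-y}{N}\,}, \qquad
\lim_{N \to \infty} S(N) = \frac{1}{y}
```

For example, if 10% of the program is serial (y = 0.1), no number of processors can speed it up by more than 10x.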

 

  • Gustafson observed that, as the program size grows, the parallel portion of the run time Tp grows in proportion to the problem size while the serial portion Ts does not, so Tp grows faster than Ts.

From <https://www.fixstars.com/en/opencl/book/OpenCLProgrammingBook/parallel-computing-software/>
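Using the same Ts (serial time) and Tp (parallel time per processor) notation, Gustafson's scaled speedup on N processors can be written as follows (a standard formulation, not quoted from the book):

```latex
S(N) = \frac{T_s + N \cdot T_p}{T_s + T_p}
```

Since Tp grows with problem size while Ts stays roughly fixed, S(N) approaches N for large problems, which is why scaling the problem up keeps many processors useful.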

 

 

  2. Programming Concepts

Creating an OpenCL program requires writing code for the host side (CPU) as well as for the device side (GPU). The device is programmed in OpenCL C, as shown in List 3.3 hello.cl. The host is programmed in C/C++ using the OpenCL runtime API (running on the CPU).
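The book's List 3.3 is not reproduced here, but a device-side kernel of the kind it describes looks like the sketch below (OpenCL C; the kernel name and buffer are illustrative). The host passes in a buffer, and the kernel writes a greeting into it:

```c
/* Device-side OpenCL C: writes a short string into a buffer that the
 * host allocated and will read back. Sketch only, not List 3.3. */
__kernel void hello(__global char *out)
{
    out[0] = 'H'; out[1] = 'e'; out[2] = 'l';
    out[3] = 'l'; out[4] = 'o'; out[5] = '\0';
}
```

This file is compiled not by the host compiler but by the OpenCL runtime (or an offline device compiler), which is why it lives in a separate .cl source.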

 

OpenCL is an open, vendor-neutral standard (maintained by the Khronos Group) that fills a role similar to CUDA's, but it is not literally an open-source version of CUDA.

 

We will now take a look at software development in a heterogeneous environment.

First, we will take a look at CUDA, which is a way to write generic code to be run on the GPU. Since no OS is running on the GPU, the CPU must perform tasks such as code execution, file system management, and the user interface, so that the data parallel computations can be performed on the GPU.

In CUDA, the control management side (CPU) is called the "host", and the data parallel side (GPU) is called the "device". The CPU side program is called the host program, and the GPU side program is called the kernel. The main difference between CUDA and the normal development process is that the kernel must be written in the CUDA language, which is an extension of the C language.

 

From <https://www.fixstars.com/en/opencl/book/OpenCLProgrammingBook/historical-background/>

 

OpenCL Model and Terminology

https://www.fixstars.com/en/opencl/book/OpenCLProgrammingBook/applicable-platforms/

 

So what exactly is meant by "using OpenCL"? When developing software using OpenCL, the following two tools are required:

  • OpenCL Compiler
  • OpenCL Runtime Library

 

From <https://www.fixstars.com/en/opencl/book/OpenCLProgrammingBook/an-overview-of-opencl/>

 

OpenCL provides APIs for the following programming models.

  • Data parallel programming model
  • Task parallel programming model

 

From <https://www.fixstars.com/en/opencl/book/OpenCLProgrammingBook/applicable-platforms/>

 

 

The basic difference between the two compilation methods is as follows:

  • Offline: a precompiled kernel binary is read in by the host code
  • Online: the kernel source file is read in by the host code and compiled at run time
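The two paths map onto two runtime calls. The fragment below is a sketch (variables such as `source_str` and `binary_ptr` are assumed to have been loaded from files beforehand, and error checks are omitted):

```c
/* Online: hand the kernel *source* to the runtime; compiled at run time. */
cl_program p1 = clCreateProgramWithSource(context, 1,
                    (const char **)&source_str, &source_size, &err);
err = clBuildProgram(p1, 1, &device_id, NULL, NULL, NULL);

/* Offline: load a *precompiled binary* produced earlier for this device. */
cl_program p2 = clCreateProgramWithBinary(context, 1, &device_id,
                    &binary_size, (const unsigned char **)&binary_ptr,
                    NULL, &err);
err = clBuildProgram(p2, 1, &device_id, NULL, NULL, NULL);
```

Online compilation is more portable (the runtime targets whatever device is present); offline compilation hides the kernel source and avoids compile time at startup.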

 

From <https://www.fixstars.com/en/opencl/book/OpenCLProgrammingBook/online-offline-compilation/>

 

 

  3. Programming Flow

The programming model is similar to OpenGL: the host and device (kernel) programs are written separately, and the host program loads, compiles, launches, and manages the kernel. A difference from OpenGL is that OpenCL can also read data back from the GPU to the CPU - the data path is bi-directional.

 

https://www.fixstars.com/en/opencl/book/OpenCLProgrammingBook/basic-program-flow/

 

  1. Get a list of available platforms
  2. Select device
  3. Create Context
  4. Create command queue
  5. Create memory objects
  6. Read kernel file
  7. Create program object
  8. Compile kernel
  9. Create kernel object
  10. Set kernel arguments
  11. Execute kernel (enqueue task) - the hello() kernel function is called here
  12. Read memory object
  13. Free objects
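The 13 steps above can be sketched as a single host program. This is an illustrative skeleton, not the book's listing: error checking is omitted, the file name `hello.cl`, the kernel name `hello`, and the buffer size are assumptions, and it must be linked against an OpenCL runtime (e.g. `-lOpenCL`) to build.

```c
#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

#define MEM_SIZE 128

int main(void)
{
    cl_platform_id platform;  cl_device_id device;
    cl_context context;       cl_command_queue queue;
    cl_mem memobj;            cl_program program;
    cl_kernel kernel;         cl_int err;
    char result[MEM_SIZE];

    /* 1-2: get a platform, then select a device on it */
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_DEFAULT, 1, &device, NULL);

    /* 3-4: create a context and a command queue for that device */
    context = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
    queue = clCreateCommandQueue(context, device, 0, &err);

    /* 5: create a memory object the kernel will write into */
    memobj = clCreateBuffer(context, CL_MEM_READ_WRITE,
                            MEM_SIZE * sizeof(char), NULL, &err);

    /* 6: read the kernel source file (name is illustrative) */
    FILE *fp = fopen("hello.cl", "r");
    char *src = malloc(0x10000);
    size_t src_size = fread(src, 1, 0x10000, fp);
    fclose(fp);

    /* 7-8: create the program object and compile it (online path) */
    program = clCreateProgramWithSource(context, 1,
                  (const char **)&src, &src_size, &err);
    clBuildProgram(program, 1, &device, NULL, NULL, NULL);

    /* 9-10: create the kernel object and set its arguments */
    kernel = clCreateKernel(program, "hello", &err);
    clSetKernelArg(kernel, 0, sizeof(cl_mem), &memobj);

    /* 11: enqueue the task -- the hello() kernel runs here */
    clEnqueueTask(queue, kernel, 0, NULL, NULL);

    /* 12: read the memory object back (device -> host) */
    clEnqueueReadBuffer(queue, memobj, CL_TRUE, 0,
                        MEM_SIZE * sizeof(char), result, 0, NULL, NULL);
    puts(result);

    /* 13: free all objects */
    clReleaseKernel(kernel);     clReleaseProgram(program);
    clReleaseMemObject(memobj);  clReleaseCommandQueue(queue);
    clReleaseContext(context);
    free(src);
    return 0;
}
```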