OpenMP

OpenMP is an industry-standard API for shared-memory parallel programming in C/C++ and Fortran. The OpenMP Architecture Review Board (ARB) consists of major compiler vendors and many research institutions.

Common architectures include shared memory architecture (multiple CPUs sharing global memory with Uniform Memory Access (UMA), typically programmed with OpenMP or pthreads), distributed memory architecture (each CPU has its own memory with Non-Uniform Memory Access (NUMA), typically programmed with the Message Passing Interface (MPI)), and hybrid architecture (UMA within one node or socket, NUMA across nodes or sockets, typically programmed with hybrid MPI/OpenMP). The current architecture trend calls for a hybrid programming model with three levels of parallelism: MPI between nodes or sockets, shared memory (such as OpenMP) within a node or socket, and increased vectorization for lower-level loop structures.

OpenMP has three components: compiler directives and clauses, runtime library routines, and environment variables. The compiler directives are interpreted only when the compiler's OpenMP option is turned on. OpenMP uses the "fork and join" execution model: the master thread forks new threads at the beginning of a parallel region, the threads share the work in parallel, and they join at the end of the parallel region.

OpenMP fork and join model
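The fork and join model can be sketched in a few lines of C. This is a minimal illustration, not a definitive implementation: the helper names `count_team_members` and `openmp_enabled` are made up for this example, and the `_OPENMP` macro is used to show that the directives take effect only when the compiler's OpenMP option is on.

```c
/* Hypothetical helper: counts how many threads executed one parallel region. */
int count_team_members(void)
{
    int members = 0;

    /* Fork: the master thread creates a team of threads here. */
    #pragma omp parallel
    {
        /* Every thread in the team runs this block once.
           The atomic directive protects the shared counter. */
        #pragma omp atomic
        members += 1;
    }
    /* Join: the team ends here and only the master thread continues. */

    return members;
}

/* Hypothetical helper: reports whether OpenMP was enabled at compile time.
   The _OPENMP macro is defined only when the OpenMP option is turned on. */
int openmp_enabled(void)
{
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}
```

Compiled without the OpenMP option (for example, without `-fopenmp` in GCC), the pragmas are ignored and `count_team_members` returns 1. With the option on, it returns the size of the team, which can be influenced through the `OMP_NUM_THREADS` environment variable.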

In OpenMP, all threads have access to the same shared global memory. Each thread has access to its own private local memory. Threads synchronize implicitly by reading and writing shared variables. No explicit communication is needed between threads.

OpenMP memory model
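As a rough sketch of the shared and private memory described above, the following C function keeps one shared array that all threads read, while each thread works with its own private loop index and temporary. The function name `sum_of_squares` and the size `N` are assumptions made for this illustration.

```c
#define N 1000

/* Hypothetical example: sums the squares of 0..N-1 in parallel. */
long sum_of_squares(void)
{
    int data[N];      /* shared: a single copy visible to every thread   */
    long total = 0;   /* combined across threads by the reduction clause */
    int i;

    for (i = 0; i < N; i++)
        data[i] = i;

    /* The loop index i is private to each thread by default.
       Each thread accumulates into its own private copy of total,
       and the copies are added together when the threads join. */
    #pragma omp parallel for shared(data) reduction(+:total)
    for (i = 0; i < N; i++) {
        long sq = (long)data[i] * data[i];  /* declared inside: private */
        total += sq;
    }

    return total;
}
```

No message passing appears anywhere: the threads simply read `data` from shared memory. The `reduction` clause is used here because unsynchronized writes to a shared `total` would be a data race.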

The thread that executes the implicit parallel region that surrounds the whole program executes on the host device. OpenMP supports other devices (e.g., GPUs) besides the host device (i.e., CPUs). On Perlmutter, GPUs are available to the host device for offloading code and data. Each device has its own threads that are distinct from threads that execute on another device, and threads cannot migrate from one device to another. For information on how to offload code and data, see the Perlmutter Readiness page.
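To give a feel for what offloading code and data looks like, here is a minimal sketch using the `target` construct with `map` clauses. The function name `saxpy_offload` is hypothetical, and nothing here is specific to Perlmutter: compiler flags for GPU offloading vary by compiler and are not shown. If no device is available, implementations generally run the region on the host instead.

```c
/* Hypothetical helper: computes y = a*x + y, offloaded to a device if one
   is available, otherwise executed on the host. */
void saxpy_offload(float a, const float *x, float *y, int n)
{
    /* map(to: ...)     copies x to the device before the region runs.
       map(tofrom: ...) copies y to the device and back to the host. */
    #pragma omp target teams distribute parallel for \
            map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

The `map` clauses make the data movement explicit, which reflects the point above that a device has its own threads and, in general, its own data environment.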

Major features in OpenMP 3.1 include:

  • Thread creation with shared and private memory
  • Loop parallelism and work sharing constructs
  • Dynamic work scheduling
  • Explicit and implicit synchronizations
  • Simple reductions
  • Nested parallelism
  • OpenMP tasking
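A few of the features listed above can be sketched together in C. This is an illustrative example under stated assumptions: the function names `triangular_work`, `fib_task`, and `fib` are invented for the sketch, and the chunk size of 4 is an arbitrary choice.

```c
/* Dynamic work scheduling with a simple reduction.
   Iteration i performs i units of work, so the load is uneven and
   schedule(dynamic) hands out chunks of 4 iterations on demand. */
long triangular_work(int n)
{
    long total = 0;

    #pragma omp parallel for schedule(dynamic, 4) reduction(+:total)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < i; j++)
            total += 1;
    }
    return total;
}

/* OpenMP tasking: each recursive call becomes a task. */
long fib_task(int n)
{
    long a, b;

    if (n < 2)
        return n;

    /* a and b must be shared so the child tasks can write the results. */
    #pragma omp task shared(a)
    a = fib_task(n - 1);
    #pragma omp task shared(b)
    b = fib_task(n - 2);

    /* Explicit synchronization: wait for both child tasks. */
    #pragma omp taskwait
    return a + b;
}

long fib(int n)
{
    long result = 0;

    #pragma omp parallel
    {
        /* One thread creates the root task; the team executes the tasks. */
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```

The recursive Fibonacci is a common teaching example for tasks rather than an efficient way to compute the sequence; a real code would stop creating tasks below some problem size.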

New features in OpenMP 4.0 (released in July 2013) include:

  • Device constructs for accelerators
  • SIMD constructs for vectorization
  • Task groups and dependencies
  • Thread affinity control
  • User defined reductions
  • Cancellation construct
  • Initial support for Fortran 2003
  • OMP_DISPLAY_ENV for all internal variables
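Two of the OpenMP 4.0 additions, SIMD constructs and user-defined reductions, can be sketched as follows. The names `dot_simd`, `range_t`, `range_merge`, `range_merge_op`, and `find_range` are assumptions made for this example, and `find_range` assumes `n` is at least 1.

```c
/* SIMD construct: asks the compiler to vectorize the loop. */
float dot_simd(const float *x, const float *y, int n)
{
    float sum = 0.0f;

    #pragma omp simd reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += x[i] * y[i];

    return sum;
}

/* A small struct that holds the minimum and maximum seen so far. */
typedef struct { int min; int max; } range_t;

/* Combiner: folds the partial result *in into *out. */
static void range_merge(range_t *out, const range_t *in)
{
    if (in->min < out->min) out->min = in->min;
    if (in->max > out->max) out->max = in->max;
}

/* User-defined reduction: tells OpenMP how to combine two range_t values
   and how to initialize each thread's private copy. */
#pragma omp declare reduction(range_merge_op : range_t : \
        range_merge(&omp_out, &omp_in)) initializer(omp_priv = omp_orig)

/* Finds the minimum and maximum of v[0..n-1]; assumes n >= 1. */
range_t find_range(const int *v, int n)
{
    range_t r = { v[0], v[0] };

    #pragma omp parallel for reduction(range_merge_op : r)
    for (int i = 0; i < n; i++) {
        if (v[i] < r.min) r.min = v[i];
        if (v[i] > r.max) r.max = v[i];
    }
    return r;
}
```

Setting the `OMP_DISPLAY_ENV` environment variable to `true` before running an OpenMP program prints the runtime's internal control variables, which can help confirm the OpenMP version in use.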

New features in OpenMP 4.5 (released in November 2015) include:

  • Significantly improved support for devices
  • Support for doacross loops
  • New taskloop construct
  • Reductions for C/C++ arrays
  • New hint mechanisms
  • Thread affinity support
  • Improved support for Fortran 2003
  • SIMD extensions
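The `taskloop` construct and reductions over C arrays from OpenMP 4.5 might be used along these lines. This is a sketch only: the names `histogram`, `scale_taskloop`, and `NBINS`, as well as the grain size of 16, are arbitrary choices for the illustration, and `histogram` assumes non-negative input values.

```c
#define NBINS 4

/* Array reduction: each thread gets a private copy of the whole array
   section, and the copies are summed element by element at the end.
   Assumes every v[i] is non-negative. */
void histogram(const int *v, int n, int hist[NBINS])
{
    int local[NBINS] = { 0 };

    #pragma omp parallel for reduction(+:local[0:NBINS])
    for (int i = 0; i < n; i++)
        local[v[i] % NBINS] += 1;

    for (int b = 0; b < NBINS; b++)
        hist[b] = local[b];
}

/* taskloop: one thread splits the loop into tasks of about 16 iterations,
   and the threads of the team execute those tasks. */
void scale_taskloop(double *a, int n, double factor)
{
    #pragma omp parallel
    {
        #pragma omp single
        {
            #pragma omp taskloop grainsize(16)
            for (int i = 0; i < n; i++)
                a[i] *= factor;
        }
    }
}
```

Both constructs need a compiler that implements OpenMP 4.5 or later. Before 4.5, a C array histogram like this typically required per-thread arrays that were merged by hand.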

 

Reference links

Home - OpenMP

Reference Guides - OpenMP

Specifications - OpenMP
