R | Testing the multithreading package RcppParallel

Article link: https://blog.csdn.net/wangjunliang/article/details/126273684

1. Overview

(1) CRAN

https://cran.microsoft.com/snapshot/2022-05-30/web/packages/RcppParallel/index.html

High-level functions for parallel programming with Rcpp. For example, parallelFor() can be used to convert the work of a standard serial for loop into a parallel one, and parallelReduce() can be used for accumulating sums or other aggregate values.
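
As a rough mental model of what a parallel for loop does, the sketch below splits an index range into chunks and runs the loop body on each chunk in its own thread. It uses only the C++ standard library (std::thread). `chunkedFor` and `doubleAllParallel` are hypothetical names invented for this illustration, and the chunking scheme is an assumption, not how RcppParallel or Intel TBB actually schedule work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not part of RcppParallel): split [begin, end) into
// roughly equal chunks and run body(chunkBegin, chunkEnd) on one thread each.
void chunkedFor(std::size_t begin, std::size_t end, std::size_t numThreads,
                const std::function<void(std::size_t, std::size_t)>& body) {
  assert(numThreads > 0 && begin <= end);
  const std::size_t total = end - begin;
  const std::size_t chunk = (total + numThreads - 1) / numThreads; // ceiling division
  std::vector<std::thread> threads;
  for (std::size_t lo = begin; lo < end; lo += chunk) {
    const std::size_t hi = std::min(lo + chunk, end);
    threads.emplace_back(body, lo, hi); // each thread gets its own sub-range
  }
  for (std::thread& t : threads) t.join(); // wait for every chunk to finish
}

// A serial loop "out[i] = in[i] * 2" rewritten as a chunked parallel loop.
// Each index is written by exactly one thread, so no locking is needed.
std::vector<double> doubleAllParallel(const std::vector<double>& in,
                                      std::size_t numThreads) {
  std::vector<double> out(in.size());
  chunkedFor(0, in.size(), numThreads, [&](std::size_t b, std::size_t e) {
    for (std::size_t i = b; i < e; ++i) out[i] = in[i] * 2.0;
  });
  return out;
}
```

Calling `doubleAllParallel({0, 1, 2, 3, 4, 5, 6}, 3)` would run three threads over the chunks [0,3), [3,6) and [6,7).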

(2) Source

https://rcppcore.github.io/RcppParallel/

RcppParallel provides a complete toolkit for creating portable, high-performance parallel algorithms without requiring direct manipulation of operating system threads. RcppParallel includes:

Intel TBB, a C++ library for task parallelism with a wide variety of parallel algorithms and data structures (Windows, OS X, Linux, and Solaris x86 only).
TinyThread, a C++ library for portable use of operating system threads.
RVector and RMatrix wrapper classes for safe and convenient access to R data structures in a multi-threaded environment.
High-level parallel functions (parallelFor and parallelReduce) that use Intel TBB as a back-end on systems that support it and TinyThread on other platforms.
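
To make the reduce side of that list more concrete, here is a minimal sketch of the idea behind a parallel reduction: each thread accumulates a private partial result over its own sub-range, and the partial results are joined at the end. This is plain standard C++, not the RcppParallel API. `SumWorker` and `chunkedSum` are hypothetical names, and the one-thread-per-chunk scheduling is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical worker, loosely modelled on the accumulate-then-join idea;
// it is not RcppParallel's Worker class.
struct SumWorker {
  const std::vector<double>& input;
  double value = 0.0;

  explicit SumWorker(const std::vector<double>& in) : input(in) {}

  // accumulate one sub-range [begin, end) into this worker's private total
  void operator()(std::size_t begin, std::size_t end) {
    value += std::accumulate(input.begin() + begin, input.begin() + end, 0.0);
  }

  // fold another worker's partial result into this one
  void join(const SumWorker& other) { value += other.value; }
};

double chunkedSum(const std::vector<double>& data, std::size_t numThreads) {
  assert(numThreads > 0);
  const std::size_t n = data.size();
  const std::size_t chunk = (n + numThreads - 1) / numThreads; // ceiling division
  std::vector<SumWorker> workers;
  std::vector<std::thread> threads;
  // Reserve up front so references to workers stay valid while threads run;
  // the loop below creates at most numThreads workers.
  workers.reserve(numThreads);
  for (std::size_t lo = 0; lo < n; lo += chunk) {
    workers.emplace_back(data);
    const std::size_t hi = std::min(lo + chunk, n);
    threads.emplace_back(std::ref(workers.back()), lo, hi);
  }
  for (std::thread& t : threads) t.join();
  SumWorker total(data);
  for (const SumWorker& w : workers) total.join(w); // combine partial sums
  return total.value;
}
```

Because every worker owns its own `value`, the threads never write to shared state, and the only sequential step is the final join.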

2. Example

The task: add the sum of the integers 0 to 100 (that is, 5050) to every element of a matrix.

The host is a 12-thread machine running Ubuntu 20.04.

R 4.1.1
Rcpp v1.0.9
RcppParallel v5.1.5

(1) R code

# https://rcppcore.github.io/RcppParallel/

library(Rcpp)
# Form 1: compile C++ from an inline string
Rcpp::cppFunction("
  int add2(int x, int y){
     return x+y;
  }  ")

add2(2,10)


# Form 2: compile C++ from a file
library("RcppParallel")
Rcpp::sourceCpp("./backup/demo.RCpp.cpp")

mat1=matrix(0:11, nrow=2);mat1
rs=parallelMatrixSqrt(mat1); rs

output:

> rs=parallelMatrixSqrt(mat1); rs
tid=3990719
tid=3990858
tid=3990859
tid=3990863
tid=3990865
tid=3990866
tid=3990864
tid=tid=3990857
tid=3990867
3990861
tid=3990862
tid=3990860
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 5050 5052 5054 5056 5058 5060
[2,] 5051 5053 5055 5057 5059 5061
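
The `tid=tid=3990857` line and the bare `3990861` line in the output above are consistent with several threads writing to std::cout at the same moment: each `<<` is a separate call, so pieces from different threads can interleave. One common remedy, sketched here with standard C++ only (std::mutex, std::thread), is to build the whole line privately and write it under a lock. `printLine` and `reportThreads` are hypothetical names, and the id printed by `std::this_thread::get_id()` is a library-defined value, not the Linux tid shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

std::mutex outMutex; // serializes writes to the shared stream

// Write one complete line while holding the lock, so lines cannot interleave.
void printLine(std::ostream& os, const std::string& line) {
  std::lock_guard<std::mutex> lock(outMutex);
  os << line << '\n';
}

// Start numThreads threads; each reports its own thread id as a single line.
void reportThreads(std::ostream& os, std::size_t numThreads) {
  assert(numThreads > 0);
  std::vector<std::thread> threads;
  for (std::size_t k = 0; k < numThreads; ++k) {
    threads.emplace_back([&os] {
      std::ostringstream msg; // build the whole line privately first
      msg << "tid=" << std::this_thread::get_id();
      printLine(os, msg.str());
    });
  }
  for (std::thread& t : threads) t.join();
}
```

A call such as `reportThreads(std::cout, 12);` should then print twelve intact `tid=...` lines, in an order that can still vary from run to run.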

(2) C++ code

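What does the Worker class's operator() do?
It holds the body of the for loop: the code that actually runs over each sub-range [begin, end) that parallelFor hands to a thread.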
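
To illustrate that contract without any R or RcppParallel types, the following sketch uses a plain struct whose operator() touches only the elements in [begin, end) and writes its results at the matching offset in the output. `AddOffset` and `applyInTwoChunks` are hypothetical names, and the two sub-range calls are made one after the other here, purely to show that covering the full range in pieces gives the same result as one pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Worker: plain C++, no RcppParallel types.
struct AddOffset {
  const std::vector<double>& input;
  std::vector<double>& output;
  double offset;

  AddOffset(const std::vector<double>& in, std::vector<double>& out, double off)
      : input(in), output(out), offset(off) {}

  // process only the elements in [begin, end); note the output iterator is
  // shifted by begin as well, so results land at the matching positions
  void operator()(std::size_t begin, std::size_t end) const {
    const double off = offset;
    std::transform(input.begin() + begin, input.begin() + end,
                   output.begin() + begin,
                   [off](double v) { return v + off; });
  }
};

// Covers [0, n) with two sub-range calls, as a scheduler might hand them out.
std::vector<double> applyInTwoChunks(const std::vector<double>& in, double offset) {
  std::vector<double> out(in.size());
  AddOffset worker(in, out, offset);
  const std::size_t mid = in.size() / 2;
  worker(0, mid);
  worker(mid, in.size());
  return out;
}
```

With the values 0 to 11 as input and an offset of 5050, the result runs from 5050 to 5061, the same values that appear in the matrix printed in the R session.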

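We define a custom function, add10, that transforms each individual element.
To make the simulation more realistic, the function sleeps for 1 s before doing its work.
To check whether the work runs in multiple threads or in multiple processes, the function can print its pid and tid: distinct tids under a single pid mean multiple threads, not multiple processes. (The pid line is commented out in the listing.)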
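
The same check can be made programmatically. The sketch below, which assumes a POSIX system for getpid() and otherwise uses only the C++ standard library, starts several threads and counts how many distinct process ids and thread ids they observe. `countPidsAndTids` is a hypothetical name, and `std::thread::id` is used as a portable substitute for the Linux tid.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <utility>
#include <vector>
#include <sys/types.h> // pid_t
#include <unistd.h>    // getpid (POSIX)

// Returns {number of distinct pids, number of distinct thread ids}
// observed by numThreads concurrently existing threads.
std::pair<std::size_t, std::size_t> countPidsAndTids(std::size_t numThreads) {
  assert(numThreads > 0);
  std::set<pid_t> pids;
  std::set<std::thread::id> tids;
  std::mutex m; // protects both sets
  std::vector<std::thread> threads;
  for (std::size_t k = 0; k < numThreads; ++k) {
    threads.emplace_back([&] {
      std::lock_guard<std::mutex> lock(m);
      pids.insert(getpid());
      tids.insert(std::this_thread::get_id());
    });
  }
  // Threads are joined only after all have been created, so their ids
  // cannot be reused among them.
  for (std::thread& t : threads) t.join();
  return {pids.size(), tids.size()};
}
```

For six threads the expected result is one pid and six thread ids, which is the signature of multithreading rather than multiprocessing.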

$ cat backup/demo.RCpp.cpp

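// [[Rcpp::depends(RcppParallel)]]
#include <Rcpp.h>
#include <RcppParallel.h>

// system headers belong at file scope, not inside a namespace
#include <algorithm>      // std::transform
#include <iostream>       // std::cout
#include <unistd.h>       // usleep, getpid, syscall
#include <sys/syscall.h>  // SYS_gettid (Linux only)

namespace Ad10{

// Linux-specific: kernel thread ID of the calling thread
// (named currentTid to avoid clashing with glibc's own gettid)
inline long currentTid(){ return syscall(SYS_gettid); }

int add10(int x){ // custom function: takes one value and returns the processed result; this is the per-element subtask that runs in parallel
  usleep(1000000); // sleep 1 s to simulate slow work
  for(int i=0; i<=100; i++){
    //std::cout << x << std::endl;
    x += i; // adds 0+1+...+100 = 5050 in total
  }
  //std::cout << "pid=" << getpid() << std::endl;
  std::cout << "tid=" << currentTid() << std::endl; // same process for every call, so print the thread ID
  return x;
}
}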

using namespace Rcpp;
using namespace RcppParallel;



struct SquareRoot : public Worker
{
  // source matrix
  const RMatrix<double> input;

  // destination matrix
  RMatrix<double> output;

  // initialize with source and destination
  SquareRoot(const NumericMatrix input, NumericMatrix output)
  : input(input), output(output) {}

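  // apply the function to the range of elements requested
  // (the SquareRoot name is kept from the square-root example this code was adapted from)
  void operator()(std::size_t begin, std::size_t end) { // loop body: parallelFor calls this on sub-ranges [begin, end) from multiple threads
    std::transform(input.begin() + begin,
                   input.begin() + end,
                   output.begin() + begin,
                   Ad10::add10); // call the custom function
                  //::sqrt); // alternative: the square-root function
                  //::fabs); // alternative: the absolute-value function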
  }
};




// [[Rcpp::export]]
NumericMatrix parallelMatrixSqrt(NumericMatrix x) {

  // allocate the output matrix
  NumericMatrix output(x.nrow(), x.ncol());

  // SquareRoot functor (pass input and output matrices)
  SquareRoot squareRoot(x, output);

  // call parallelFor to do the work
  parallelFor(0, x.length(), squareRoot); // run the parallel form of the for loop, transforming every element of the matrix

  // return the output matrix
  return output;
}

(3) Result

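After changing the for loop in add10 in the C++ code to i<=2e9 and running it again, the CPU is briefly fully loaded, which confirms that multiple threads really are started. (With x declared as int, a sum that large overflows, so the values returned in that run are not meaningful; only the CPU load is.)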
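
Wall-clock time is another way to see the effect without watching a CPU monitor. With the 1 s sleep in add10, a serial pass over 12 elements would need about 12 s, while 12 threads working at once should need roughly 1 s. The sketch below reproduces that reasoning with standard C++ only. `timeParallelSleeps` is a hypothetical name, and one thread per task is an assumption made for illustration, not RcppParallel's scheduling.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Runs numTasks tasks that each sleep for sleepMs milliseconds, one thread
// per task, and returns the elapsed wall-clock time in milliseconds.
long long timeParallelSleeps(std::size_t numTasks, int sleepMs) {
  assert(sleepMs >= 0);
  const auto start = std::chrono::steady_clock::now();
  std::vector<std::thread> threads;
  for (std::size_t k = 0; k < numTasks; ++k) {
    threads.emplace_back([sleepMs] {
      std::this_thread::sleep_for(std::chrono::milliseconds(sleepMs));
    });
  }
  for (std::thread& t : threads) t.join();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

If the tasks overlap, `timeParallelSleeps(8, 100)` should return a value close to 100 rather than 800. Sleeping does not load the CPU, so this measures overlap in time, not CPU saturation.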


3. See the documentation for more usage

Rcpp Quick Reference Guide https://dirk.eddelbuettel.com/code/rcpp/Rcpp-quickref.pdf
https://rcppcore.github.io/RcppParallel/
The author's promotional video on RStudio: RcppParallel provides a complete toolkit for creating safe, portable, high-performance parallel algorithms