Rcpp手册(旧)

最新推荐文章于 2023-03-31 10:10:07 发布

EmissaryDX

最新推荐文章于 2023-03-31 10:10:07 发布

阅读量1k

点赞数 1

分类专栏： r语言文章标签： R Rcpp c++ 科学计算统计

本文链接：https://blog.csdn.net/iamsuperman2/article/details/72860539

版权

r语言专栏收录该内容

3 篇文章 1 订阅

订阅专栏

1. Some prerequisites

Assume you find some R functions become bottlenecks in your R scripts and dramatically reduce the speed of your code, you can consider transforming them into some equivalent C++ function. You can store these C++ equivalents in a “.cpp” file and call them in your R script with sourceCpp().
In R source file, where you want to call the functions in test.cpp file, use the following code

library(Rcpp)
Sys.setenv("PKG_CXXFLAGS"="-std=c++11")
sourceCpp("test.cpp")

The second line above can let us use C++11 with Rcpp.

In test.cpp file, use the following code

#include <Rcpp.h>
using namespace Rcpp;
//[[Rcpp::export]]

After, //[[Rcpp::export]], you can add your customized C++ functions.

2. Some examples of Rcpp functions with different types of input and output

2.1. Scalar input and Scalar output

In the first example, we implement a univariate sign(x) function which return 1 if x is positive, -1 if x is negative and 0 if x is 0.
R code:

signR <- function(x) {
  if (x > 0) {
    1
  } else if (x == 0) {
    0
  } else {
    -1
  }
}

C++ equivalent:

int signC(int x) {
  if (x > 0) {
    return 1;
  } else if (x == 0) {
    return 0;
  } else {
    return -1;
  }
}

2.2. Vector input and Scalar output

Here we want to calculate the summation of a vector:
R code:

sumR <- function(x) {
  total <- 0
  for (i in seq_along(x)) {
    total <- total + x[i]
  }
  total
}

C++ equivalent:

double sumC(NumericVector x) {
  int n = x.size();
  double total = 0;
  for(int i = 0; i < n; ++i) {
    total += x[i];
  }
  return total;
}

sumR() will be several orders of magnitude slower than sumC().
use .size() to calculate the length of a vector.

2.3. Vector input and Vector output

In this example, we computes the Euclidean distance between a value and a vector of values.
R code:

pdistR <- function(x, ys) {
  sqrt((x - ys) ^ 2)
}

C++ equivalent:

NumericVector pdistC(double x, NumericVector ys) {
  int n = ys.size();
  NumericVector out(n);
  for(int i = 0; i < n; ++i) {
    out[i] = sqrt(pow(ys[i] - x, 2.0));
  }
  return out;
}

In C++, use pow() instead of ^.
We create a new numeric vector of length n with a constructor: NumericVector out(n). Another useful way of making a vector is to copy an existing one: NumericVector zs = clone(ys).

2.4. Matrix input and Vector output

In this example, we calculate the row sums of a matrix with rowSums() and a C++ equivalent.

cppFunction('NumericVector rowSumsC(NumericMatrix x) {
  int nrow = x.nrow(), ncol = x.ncol();
  NumericVector out(nrow);
  for (int i = 0; i < nrow; i++) {
    double total = 0;
    for (int j = 0; j < ncol; j++) {
      total += x(i, j);
    }
    out[i] = total;
  }
  return out;
}')

set.seed(1014)
x <- matrix(sample(100), 10)
rowSums(x)

##  [1] 458 558 488 458 536 537 488 491 508 528

rowSumsC(x)

##  [1] 458 558 488 458 536 537 488 491 508 528

Each vector type has a matrix equivalent:
NumericMatrix, IntegerMatrix, CharacterMatrix, and LogicalMatrix.
In C++, you subset a matrix with (), not [].
Use .nrow() and .ncol() methods to get the dimensions of a matrix.

2.5. Matrix input and Matrix output

We have two $p\times q$ matrices and we want to calculate the sum of these two matrices. I do it elementwisely.

NumericMatrix sumC(NumericMatrix x, NumericMatrix y){
  int nrow=x.nrow(), ncol=x.ncol();
  NumericMatrix out(nrow,ncol);
  for (int i=0;i<nrow;++i){
    for (int j=0;j<ncol;++j){
      out(i,j)=x(i,j)+y(i,j);
    }
  }
  return out;
}

We create a $p\times q$ matrix with NumericMatrix out(p,q);.

3. Core data types

3.1. Rcpp Classes

Here are some commonly used Rcpp classes.

Rcpp Classes	Detailes
IntegerVector	for vectors of type integer
NumericVector	for vectors of type numeric
LogicalVector	for vectors of type logical
CharacterVector	for vectors of type character
GenericVector	for generic vectors which implement List types
ExpressionVector	for vectors of expression types
RawVector	for vectors of type raw
IntegerMatrix	for matrix of type integer
NumericMatrix	for matrix of type numeric
List	for lists
DataFrame	for data frames

3.2. Vector Classes

Vector is one of the most commonly used classes in Rcpp. Rcpp functions can directly accept R vectors as input and output a Rcpp Vector classes (will be automatically transformed to R vectors when call this function in R). Moreover, if the Rcpp function receives a list, for example, which contains a R vector, and we want to assign this R vector to a C++ object, we can use as<>() for this conversion. For instance,

X=as<NumericVector>(mod["residuals"]);

where X is C++ vector and mod is R list which contains a R vector mod["residuals"]. Also, we can use wrap() function for the inverse direction, but this is often done implicitly when return a C++ vector.

3.2.1. The NumericVector Class

NumericVector is quite the most commonly used vector type among Rcpp classes. This object corresponds to the numeric R vector which stores real-values floating-point variables. Its storage is double.

Initialization:
```
NumericVector x(n);
```
Here we initialized a NumericVector called x which has length n (n is selective).
Slicing:
We can take the $i$ th element of x as x[i].
Methods:
x.size(): length of the vector.
x.push_back(): add a new element in the end
x.push_front(): add a new element in front

Example:
Let’s calculate the exponential of each element of a vector with a given exponent.

// [[Rcpp::export]]
NumericVector powC(NumericVector x, double y){
NumericVector out(x.size());
for (auto i=x.begin(),j=out.begin();i!=x.end();++i,++j){
    *j=pow((*i),y);
}
for (int i=0;i<out.size();++i){
    std::cout<<out[i]<<" ";
}
std::cout<<std::endl;
return out;
}

/*** R
x=rnorm(5,0,1)
y=2;
C_square=powC(x,y);
R_square=x^2;
cbind(x,R_square,C_square)
*/

> C_square=powC(x,y);
0.547511 0.107676 0.471773 2.08374 0.0868845 

> cbind(x,R_square,C_square)
      x   R_square   C_square
[1,] -0.7399398 0.54751089 0.54751089
[2,] -0.3281404 0.10767615 0.10767615
[3,] -0.6868573 0.47177289 0.47177289
[4,]  1.4435153 2.08373656 2.08373656
[5,]  0.2947617 0.08688446 0.08688446

3.2.2. Multidimensional arrays induced by vector

Array also plays an important roles in modeling, and internally, array is implemented by vector with dimension attributes. Matrix classes we will talk later is a special case of array with two dimensional attributes.
- Initialization:

    NumericVector out=NumericVector(Dimension(2,2,3));

3.2.3. Other Vector Class

LogicalVector

// [[Rcpp::export]]
LogicalVector logicalVector(){
LogicalVector out(5);
out[0]=false;
out[1]=true;
out[2]=R_NaN;
out[3]=R_PosInf;
out[4]=NA_REAL;
return out;
}
/*** R
logicalVector()
*/
> logicalVector()
[1] FALSE  TRUE    NA    NA    NA

CharacterVector

// [[Rcpp::export]]
CharacterVector charVector(){
CharacterVector out(3);
out[0]="Hello";
out[1]="A";
out[2]="World";
return out;
}
/*** R
charVector()
*/
> charVector()
[1] "Hello" "A"     "World"

3.3. Matrix Classes

As we mentioned above, each vector has a matrix equivalent. Here we only talk about NumericMatrix.

3.3.1 NumericMatrix

Initialization:
```
NumericMatrix mat(nrow,ncol);
```
Here we initialized a NumericMatrix called mat which has nrow rows and ncol columns.
Slicing:
We can take the $(i,j)$ th element of mat as mat(i,j). If we want to make some change on a subset of a matrix, we may need other library such as RcppArmadillo.
Method:
mat.ncol: number of columns
mat.nrow: number of rows.

3.4. List aka GenericVector

List is known as a container for data with different type which is the same as the list object in R.

Example 1 (List to retrieve parameters from R)

// [[Rcpp::export]]
double retrieve_list(List mod)
{
  double out=0;
  NumericVector x=as<NumericVector>(mod["a"]);
  NumericVector y=as<NumericVector>(mod["b"]);
  for (auto i=x.begin(),j=y.begin();i!=x.end()&&j!=y.end();++i,++j){
    out+=(*i)/((*i)+(*j));
  }
  return out;
}
/*** R
a=c(1,4,3);
b=c(12,23,24);
mod=list(a=a,b=b);
retrieve_list(mod)
*/

> retrieve_list(mod)
[1] 0.3361823

Example 2 (List to return parameters to R)

// [[Rcpp::export]]
List return_list()
{
  NumericMatrix y1(2,2);
  NumericVector y2(5);
  NumericVector tmp=rnorm(4);
  for (int i=0;i<2;++i){
    for (int j=0;j<2;++j){
      y1(i,j)=tmp[i+j];
    }
  }
  y2=rnorm(5);
  return List::create(Named("y1")=y1,
                      Named("y2")=y2);
}
/*** R
return_list()
*/

> return_list()
$y1
          [,1]       [,2]
[1,]  2.838554 -1.1276921
[2,] -1.127692 -0.5148732

$y2
[1] -0.1708179  0.6451966 -0.5324484 -1.3813581  0.2408375

3.5. Data Frames

Data frames are an essential object type in R and are used by almost all modeling functions, so naturally Rcpp supports this type too. Internally, data frames are represented as lists.

// [[Rcpp::export]]
DataFrame create_DataFrame()
{       
  NumericVector a=NumericVector::create(1,2,3);
  CharacterVector b=CharacterVector::create("a","b","c");
  std::vector<std::string> c(3);
  c[0]="A";c[1]="B";c[2]="C";
  return DataFrame::create(Named("col1")=a,
                           Named("col2")=b,
                           Named("col3")=c);
}
/*** R
create_DataFrame()
*/

> create_DataFrame()
  col1 col2 col3
1  100    a    A
2  200    b    B
3  300    c    C

4. Rcpp Sugar

4.1. Operators

4.1.1. Binary arithmetic operators

Rcpp sugar defines the usual binary arithmetic operators : +, -, *, /.

    // two numeric vectors of the same size
    NumericVector x ;
    NumericVector y ;
    // expressions involving two vectors
    NumericVector res = x + y ;
    NumericVector res = x - y ;
    NumericVector res = x * y ;
    NumericVector res = x / y ;
    // one vector, one single value
    NumericVector res = x + 2.0 ;
    NumericVector res = 2.0 - x;
    NumericVector res = y * 2.0 ;
    NumericVector res = 2.0 / y;
    // two expressions
    NumericVector res = x * y + y / 2.0 ;
    NumericVector res = x * ( y - 2.0 ) ;
    NumericVector res = x / ( y * y ) ;

4.1.2. Binary logical operators

Binary logical operators create a logical sugar expression from either two sugar expressions of the same type or one sugar expression and a primitive value of the associated type.

    // two integer vectors of the same size
    NumericVector x ;
    NumericVector y ;
    // expressions involving two vectors
    LogicalVector res = x < y ;
    LogicalVector res = x > y ;
    LogicalVector res = x <= y ;
    LogicalVector res = x >= y ;
    LogicalVector res = x == y ;
    LogicalVector res = x != y ;
    // one vector, one single value
    LogicalVector res = x < 2 ;
    LogicalVector res = 2 > x;
    LogicalVector res = y <= 2 ;
    LogicalVector res = 2 != y;
    // two expressions
    LogicalVector res = ( x + y ) < ( x*x ) ;
    LogicalVector res = ( x + y ) >= ( x*x ) ;
    LogicalVector res = ( x + y ) == ( x*x ) ;

4.1.3. Unary operators

The unary operator- can be used to negate a (numeric) sugar expression, whereas the unary operator! negates a logical sugar expression:

4.2. Functions

is.na()
Produces a logical sugar expression of the same length. Each element of the result expression evaluates to TRUE if the corresponding input is a missing value, or FALSE otherwise.
seq_len()
seq_len( 10 ) will generate an integer vector from 1 to 10 (Note: not from 0 to 9), which is very useful in conjugation with sapply() and lapply().
pmin(a,b) and pmax(a,b)
a and b are two vectors. pmin()(or pmax()) compares the i <script type="math/tex" id="MathJax-Element-5">i</script>th elements of a and b and return the smaller (larger) one.
ifelse()
ifelse( x > y, x+y, x-y ) means if x>y is true, then do the addition; otherwise do the subtraction.
sapply()
sapply applies a C++ function to each element of the given expression to create a new expression. The type of the resulting expression is deduced by the compiler from the result type of the function.
The function can be a free C++ function such as the overload generated by the template function below:
```
template <typename T>
T square( const T& x){
return x * x ;
}
sapply( seq_len(10), square<int> ) ;
```

Alternatively, the function can be a functor whose type has a nested type called result_type

    template <typename T>
    struct square : std::unary_function<T,T> {
        T operator()(const T& x){
        return x * x ;
        }
    }
    sapply( seq_len(10), square<int>() ) ;

lappy()
lapply is similar to sapply except that the result is allways an list expression (an expression of type VECSXP).
sign()
diff()

other functions

IntegerVector x;
abs( x )
exp( x )
floor( x )
ceil( x )
pow(x, z) # x to the power of z

Reference

Eddelbuettel, D. (2013). Seamless R and C++ Integration with Rcpp. Springer Publishing Company, Incorporated.
Allaire, J.J. (2016). Rcpp Attributes.
Eddelbuettel, D. (2016). Rcpp syntactic sugar.