Rcpp手册(旧)

1. Some prerequisites

Assume you find some R functions become bottlenecks in your R scripts and dramatically reduce the speed of your code, you can consider transforming them into some equivalent C++ function. You can store these C++ equivalents in a “.cpp” file and call them in your R script with sourceCpp().
In R source file, where you want to call the functions in test.cpp file, use the following code

library(Rcpp)
Sys.setenv("PKG_CXXFLAGS"="-std=c++11")
sourceCpp("test.cpp")

The second line above can let us use C++11 with Rcpp.

In test.cpp file, use the following code

#include <Rcpp.h>
using namespace Rcpp;
//[[Rcpp::export]]

After, //[[Rcpp::export]], you can add your customized C++ functions.

2. Some examples of Rcpp functions with different types of input and output

2.1. Scalar input and Scalar output

In the first example, we implement a univariate sign(x) function which return 1 if x is positive, -1 if x is negative and 0 if x is 0.
R code:

signR <- function(x) {
  if (x > 0) {
    1
  } else if (x == 0) {
    0
  } else {
    -1
  }
}

C++ equivalent:

int signC(int x) {
  if (x > 0) {
    return 1;
  } else if (x == 0) {
    return 0;
  } else {
    return -1;
  }
}

2.2. Vector input and Scalar output

Here we want to calculate the summation of a vector:
R code:

sumR <- function(x) {
  total <- 0
  for (i in seq_along(x)) {
    total <- total + x[i]
  }
  total
}

C++ equivalent:

double sumC(NumericVector x) {
  int n = x.size();
  double total = 0;
  for(int i = 0; i < n; ++i) {
    total += x[i];
  }
  return total;
}
  • sumR() will be several orders of magnitude slower than sumC().
  • use .size() to calculate the length of a vector.

2.3. Vector input and Vector output

In this example, we computes the Euclidean distance between a value and a vector of values.
R code:

pdistR <- function(x, ys) {
  sqrt((x - ys) ^ 2)
}

C++ equivalent:

NumericVector pdistC(double x, NumericVector ys) {
  int n = ys.size();
  NumericVector out(n);
  for(int i = 0; i < n; ++i) {
    out[i] = sqrt(pow(ys[i] - x, 2.0));
  }
  return out;
}
  • In C++, use pow() instead of ^.
  • We create a new numeric vector of length n with a constructor: NumericVector out(n). Another useful way of making a vector is to copy an existing one: NumericVector zs = clone(ys).

2.4. Matrix input and Vector output

In this example, we calculate the row sums of a matrix with rowSums() and a C++ equivalent.

cppFunction('NumericVector rowSumsC(NumericMatrix x) {
  int nrow = x.nrow(), ncol = x.ncol();
  NumericVector out(nrow);
  for (int i = 0; i < nrow; i++) {
    double total = 0;
    for (int j = 0; j < ncol; j++) {
      total += x(i, j);
    }
    out[i] = total;
  }
  return out;
}')

set.seed(1014)
x <- matrix(sample(100), 10)
rowSums(x)

##  [1] 458 558 488 458 536 537 488 491 508 528

rowSumsC(x)

##  [1] 458 558 488 458 536 537 488 491 508 528
  • Each vector type has a matrix equivalent:
    NumericMatrix, IntegerMatrix, CharacterMatrix, and LogicalMatrix.
  • In C++, you subset a matrix with (), not [].
  • Use .nrow() and .ncol() methods to get the dimensions of a matrix.

2.5. Matrix input and Matrix output

We have two p×q matrices and we want to calculate the sum of these two matrices. I do it elementwisely.

NumericMatrix sumC(NumericMatrix x, NumericMatrix y){
  int nrow=x.nrow(), ncol=x.ncol();
  NumericMatrix out(nrow,ncol);
  for (int i=0;i<nrow;++i){
    for (int j=0;j<ncol;++j){
      out(i,j)=x(i,j)+y(i,j);
    }
  }
  return out;
}
  • We create a p×q matrix with NumericMatrix out(p,q);.

3. Core data types

3.1. Rcpp Classes

Here are some commonly used Rcpp classes.

Rcpp ClassesDetailes
IntegerVectorfor vectors of type integer
NumericVectorfor vectors of type numeric
LogicalVectorfor vectors of type logical
CharacterVectorfor vectors of type character
GenericVectorfor generic vectors which implement List types
ExpressionVectorfor vectors of expression types
RawVectorfor vectors of type raw
IntegerMatrixfor matrix of type integer
NumericMatrixfor matrix of type numeric
Listfor lists
DataFramefor data frames

3.2. Vector Classes

Vector is one of the most commonly used classes in Rcpp. Rcpp functions can directly accept R vectors as input and output a Rcpp Vector classes (will be automatically transformed to R vectors when call this function in R). Moreover, if the Rcpp function receives a list, for example, which contains a R vector, and we want to assign this R vector to a C++ object, we can use as<>() for this conversion. For instance,

X=as<NumericVector>(mod["residuals"]);

where X is C++ vector and mod is R list which contains a R vector mod["residuals"]. Also, we can use wrap() function for the inverse direction, but this is often done implicitly when return a C++ vector.

3.2.1. The NumericVector Class

NumericVector is quite the most commonly used vector type among Rcpp classes. This object corresponds to the numeric R vector which stores real-values floating-point variables. Its storage is double.

  • Initialization:

    NumericVector x(n);
    

    Here we initialized a NumericVector called x which has length n (n is selective).

  • Slicing:
    We can take the i th element of x as x[i].

  • Methods:
    x.size(): length of the vector.
    x.push_back(): add a new element in the end
    x.push_front(): add a new element in front

  • Example:
    Let’s calculate the exponential of each element of a vector with a given exponent.

    // [[Rcpp::export]]
    NumericVector powC(NumericVector x, double y){
    NumericVector out(x.size());
    for (auto i=x.begin(),j=out.begin();i!=x.end();++i,++j){
        *j=pow((*i),y);
    }
    for (int i=0;i<out.size();++i){
        std::cout<<out[i]<<" ";
    }
    std::cout<<std::endl;
    return out;
    }
    
    /*** R
    x=rnorm(5,0,1)
    y=2;
    C_square=powC(x,y);
    R_square=x^2;
    cbind(x,R_square,C_square)
    */
    
    > C_square=powC(x,y);
    0.547511 0.107676 0.471773 2.08374 0.0868845 
    
    > cbind(x,R_square,C_square)
          x   R_square   C_square
    [1,] -0.7399398 0.54751089 0.54751089
    [2,] -0.3281404 0.10767615 0.10767615
    [3,] -0.6868573 0.47177289 0.47177289
    [4,]  1.4435153 2.08373656 2.08373656
    [5,]  0.2947617 0.08688446 0.08688446
    

3.2.2. Multidimensional arrays induced by vector

Array also plays an important roles in modeling, and internally, array is implemented by vector with dimension attributes. Matrix classes we will talk later is a special case of array with two dimensional attributes.
- Initialization:

    NumericVector out=NumericVector(Dimension(2,2,3));

3.2.3. Other Vector Class

  • LogicalVector

    // [[Rcpp::export]]
    LogicalVector logicalVector(){
    LogicalVector out(5);
    out[0]=false;
    out[1]=true;
    out[2]=R_NaN;
    out[3]=R_PosInf;
    out[4]=NA_REAL;
    return out;
    }
    /*** R
    logicalVector()
    */
    > logicalVector()
    [1] FALSE  TRUE    NA    NA    NA
    
  • CharacterVector

    // [[Rcpp::export]]
    CharacterVector charVector(){
    CharacterVector out(3);
    out[0]="Hello";
    out[1]="A";
    out[2]="World";
    return out;
    }
    /*** R
    charVector()
    */
    > charVector()
    [1] "Hello" "A"     "World"
    

3.3. Matrix Classes

As we mentioned above, each vector has a matrix equivalent. Here we only talk about NumericMatrix.

3.3.1 NumericMatrix

  • Initialization:

    NumericMatrix mat(nrow,ncol);
    

    Here we initialized a NumericMatrix called mat which has nrow rows and ncol columns.

  • Slicing:
    We can take the (i,j)th element of mat as mat(i,j). If we want to make some change on a subset of a matrix, we may need other library such as RcppArmadillo.

  • Method:
    mat.ncol: number of columns
    mat.nrow: number of rows.

3.4. List aka GenericVector

List is known as a container for data with different type which is the same as the list object in R.

  • Example 1 (List to retrieve parameters from R)

    // [[Rcpp::export]]
    double retrieve_list(List mod)
    {
      double out=0;
      NumericVector x=as<NumericVector>(mod["a"]);
      NumericVector y=as<NumericVector>(mod["b"]);
      for (auto i=x.begin(),j=y.begin();i!=x.end()&&j!=y.end();++i,++j){
        out+=(*i)/((*i)+(*j));
      }
      return out;
    }
    /*** R
    a=c(1,4,3);
    b=c(12,23,24);
    mod=list(a=a,b=b);
    retrieve_list(mod)
    */
    
    > retrieve_list(mod)
    [1] 0.3361823
    
  • Example 2 (List to return parameters to R)

    // [[Rcpp::export]]
    List return_list()
    {
      NumericMatrix y1(2,2);
      NumericVector y2(5);
      NumericVector tmp=rnorm(4);
      for (int i=0;i<2;++i){
        for (int j=0;j<2;++j){
          y1(i,j)=tmp[i+j];
        }
      }
      y2=rnorm(5);
      return List::create(Named("y1")=y1,
                          Named("y2")=y2);
    }
    /*** R
    return_list()
    */
    
    > return_list()
    $y1
              [,1]       [,2]
    [1,]  2.838554 -1.1276921
    [2,] -1.127692 -0.5148732
    
    $y2
    [1] -0.1708179  0.6451966 -0.5324484 -1.3813581  0.2408375
    

3.5. Data Frames

Data frames are an essential object type in R and are used by almost all modeling functions, so naturally Rcpp supports this type too. Internally, data frames are represented as lists.

// [[Rcpp::export]]
DataFrame create_DataFrame()
{       
  NumericVector a=NumericVector::create(1,2,3);
  CharacterVector b=CharacterVector::create("a","b","c");
  std::vector<std::string> c(3);
  c[0]="A";c[1]="B";c[2]="C";
  return DataFrame::create(Named("col1")=a,
                           Named("col2")=b,
                           Named("col3")=c);
}
/*** R
create_DataFrame()
*/

> create_DataFrame()
  col1 col2 col3
1  100    a    A
2  200    b    B
3  300    c    C

4. Rcpp Sugar

4.1. Operators

4.1.1. Binary arithmetic operators

Rcpp sugar defines the usual binary arithmetic operators : +, -, *, /.

    // two numeric vectors of the same size
    NumericVector x ;
    NumericVector y ;
    // expressions involving two vectors
    NumericVector res = x + y ;
    NumericVector res = x - y ;
    NumericVector res = x * y ;
    NumericVector res = x / y ;
    // one vector, one single value
    NumericVector res = x + 2.0 ;
    NumericVector res = 2.0 - x;
    NumericVector res = y * 2.0 ;
    NumericVector res = 2.0 / y;
    // two expressions
    NumericVector res = x * y + y / 2.0 ;
    NumericVector res = x * ( y - 2.0 ) ;
    NumericVector res = x / ( y * y ) ;

4.1.2. Binary logical operators

Binary logical operators create a logical sugar expression from either two sugar expressions of the same type or one sugar expression and a primitive value of the associated type.

    // two integer vectors of the same size
    NumericVector x ;
    NumericVector y ;
    // expressions involving two vectors
    LogicalVector res = x < y ;
    LogicalVector res = x > y ;
    LogicalVector res = x <= y ;
    LogicalVector res = x >= y ;
    LogicalVector res = x == y ;
    LogicalVector res = x != y ;
    // one vector, one single value
    LogicalVector res = x < 2 ;
    LogicalVector res = 2 > x;
    LogicalVector res = y <= 2 ;
    LogicalVector res = 2 != y;
    // two expressions
    LogicalVector res = ( x + y ) < ( x*x ) ;
    LogicalVector res = ( x + y ) >= ( x*x ) ;
    LogicalVector res = ( x + y ) == ( x*x ) ;

4.1.3. Unary operators

The unary operator- can be used to negate a (numeric) sugar expression, whereas the unary operator! negates a logical sugar expression:

4.2. Functions

  • is.na()
    Produces a logical sugar expression of the same length. Each element of the result expression evaluates to TRUE if the corresponding input is a missing value, or FALSE otherwise.
  • seq_len()
    seq_len( 10 ) will generate an integer vector from 1 to 10 (Note: not from 0 to 9), which is very useful in conjugation with sapply() and lapply().
  • pmin(a,b) and pmax(a,b)
    a and b are two vectors. pmin()(or pmax()) compares the i <script type="math/tex" id="MathJax-Element-5">i</script>th elements of a and b and return the smaller (larger) one.
  • ifelse()
    ifelse( x > y, x+y, x-y ) means if x>y is true, then do the addition; otherwise do the subtraction.
  • sapply()
    sapply applies a C++ function to each element of the given expression to create a new expression. The type of the resulting expression is deduced by the compiler from the result type of the function.
    The function can be a free C++ function such as the overload generated by the template function below:

    template <typename T>
    T square( const T& x){
    return x * x ;
    }
    sapply( seq_len(10), square<int> ) ;
    

Alternatively, the function can be a functor whose type has a nested type called result_type

    template <typename T>
    struct square : std::unary_function<T,T> {
        T operator()(const T& x){
        return x * x ;
        }
    }
    sapply( seq_len(10), square<int>() ) ;
  • lappy()
    lapply is similar to sapply except that the result is allways an list expression (an expression of type VECSXP).
  • sign()
  • diff()
  • other functions

    IntegerVector x;
    abs( x )
    exp( x )
    floor( x )
    ceil( x )
    pow(x, z) # x to the power of z
    

Reference

  1. Eddelbuettel, D. (2013). Seamless R and C++ Integration with Rcpp. Springer Publishing Company, Incorporated.
  2. Allaire, J.J. (2016). Rcpp Attributes.
  3. Eddelbuettel, D. (2016). Rcpp syntactic sugar.
  • 1
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值