In an introductory post on R APIs to C code, Calling C Code ‘Hello World!’, we explored the .C() function with some ‘Hello World!’ baby steps. In this post we will make a leap forward by implementing the same functionality using the .Call() function.
Is .Call() better than .C()?
A heated but friendly conversation took place on the R-devel mailing list this past March about R’s copying of arguments and the merits of .C() and .Call(). It is perhaps best to just include a highlight from this exchange. Here is Simon Urbanek responding to Hervé Pagès:
The important differences between the two R interfaces to C code are summarized here:
.C()
- allows you to write simple C code that knows nothing about R
- only simple data types can be passed
- all argument type conversion and checking must be done in R
- all memory allocation must be done in R
- all arguments are copied locally before being passed to the C function (memory bloat)
.Call()
- allows you to write simple R code
- allows for complex data types
- allows for a C function return value
- allows C function to allocate memory
- does not require wasteful argument copying
- requires much more knowledge of R internals
- is the recommended, modern approach for serious C programmers
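To make the .C() half of this list concrete, here is a minimal sketch of a function written in the .C() style. The name count_chars and its argument names are hypothetical and are not taken from the earlier post. The point is only that every argument arrives as a pointer to a plain C type, the result is written into a buffer the caller has already allocated, and no R header is needed.

```c
#include <string.h>

/* Hypothetical .C()-style function: no R headers and no return value.
 * greetings : array of *n C strings (an R character vector arrives as char **)
 * n         : pointer to the number of strings (an R integer arrives as int *)
 * counts    : output buffer of length *n, allocated by the caller */
void count_chars(char **greetings, int *n, int *counts)
{
    for (int i = 0; i < *n; i++) {
        /* Write each result into the caller's buffer. */
        counts[i] = (int) strlen(greetings[i]);
    }
}
```

From R, a function like this would be reached with something along the lines of .C("count_chars", as.character(x), as.integer(length(x)), counts = integer(length(x))). That call is where the type conversion, allocation and copying listed above take place.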
To allow readers to compare for themselves how difficult or easy it is to switch from .C() to .Call(), we will re-implement our three “Hello World!” examples using the .Call() interface.
Getting used to SEXP
The first thing you have to embrace when using the .Call() interface is the new way of dealing with R objects inside your C code. Excellent introductory information and example code are available here:
- Calling C code from R (Sigal Blay, 2004)
- Calling other languages from R (R.M. Ripley, 2009)
- R API cheat sheet (Simon Urbanek, 2012)
In preparation for working with .Call() you will want to familiarize yourself with the location of R’s include files. The following Unix shell commands show how to find where R is installed and then look at the contents of the include directory:
Here’s what they contain:
Rconfig.h | various configuration flags |
Rdefines.h | lots of macros of interest, includes Rinternals.h |
Rembedded.h | function declarations for embedding R in C programs |
R_ext | directory of include files for specific data types, etc. |
R.h | includes many of the files found in R_ext |
Rinterface.h | provides hooks for external GUIs |
Rinternals.h | core R data structures |
Rmath.h | math constants and function declarations |
Rversion.h | version string components |
S.h | macros for S/R compatibility |
With the .Call() interface, the C function needs to be declared with return type SEXP, a pointer to a SEXPREC or S-EXPression RECord. We’ll get the definition of SEXP and everything else we need by including both R.h and Rdefines.h in our code. So here is the C code for our first, brain dead C function, helloA1.c:
Note that, even though we are returning R_NilValue (aka NULL), the function is declared to be of type SEXP. The function will always be of type SEXP, as will any arguments. It will be up to the C code to convert other data types into and out of SEXP. As in the previous post, you should compile this code with R CMD SHLIB helloA1.c. Here is the very simple R function we need to add to wrappers.R:
Finally, what does it look like when invoked from R?
Whew! That was a lot of complexity just to run “Hello World!”. However, the value of this complexity will become apparent as we move forward.
PROTECT against garbage collection
One of the things R does well is pick up the garbage we leave lying around. (If you’ve ever lived through a garbage haulers’ strike you know this is a good thing.) Unused objects are disposed of after they are no longer needed (i.e., after there are no more active references to them) to free up memory. As we write C code that uses R functions and structures we need to make sure that R knows when it should not toss something out and, after we are done, when it is again OK. This is done with the PROTECT and UNPROTECT macros.
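The bookkeeping behind this can be pictured with a toy model. The sketch below is plain C and is not R's implementation. The names toy_protect, toy_unprotect, toy_stack_depth and balanced_example are invented for illustration, and the "objects" are just opaque pointers. It only shows why the number of objects released at the end of a function has to match the number that were protected.

```c
#include <stddef.h>

#define TOY_STACK_SIZE 16

/* Toy protection stack: pointers the "collector" must leave alone. */
static void *toy_stack[TOY_STACK_SIZE];
static int toy_top = 0;

/* Push an object and hand it back, so a call can wrap an allocation.
 * Returns NULL if the stack is full. */
void *toy_protect(void *obj)
{
    if (toy_top >= TOY_STACK_SIZE) return NULL;
    toy_stack[toy_top++] = obj;
    return obj;
}

/* Pop the n most recently protected objects.
 * Returns 0 on success, -1 if n exceeds the current depth (unbalanced). */
int toy_unprotect(int n)
{
    if (n < 0 || n > toy_top) return -1;
    toy_top -= n;
    return 0;
}

/* Number of objects currently protected. */
int toy_stack_depth(void)
{
    return toy_top;
}

/* A function that protects two objects must release two before returning.
 * Returns the change in stack depth, which is 0 when balanced. */
int balanced_example(void)
{
    int a = 1, b = 2;
    int before = toy_stack_depth();
    toy_protect(&a);
    toy_protect(&b);
    toy_unprotect(2);
    return toy_stack_depth() - before;
}
```

In this model, releasing fewer objects than were protected leaves stale entries on the stack, and releasing more is reported as an error. That is the same balance the real macros expect you to maintain.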
Here is our next iteration of “Hello World!” where we will allocate space for an R character vector, assign our greeting to the first element and then return the vector:
Note that we allocate memory for a character vector of length # with NEW_CHARACTER(#). It is worth taking a look in the R include files to see how this and similar macros are defined:
So we could have used allocVector(STRSXP,1) instead of NEW_CHARACTER(1) and you will see plenty of the former in R source code and packages. Similarly you can grep for “_ELT” or “mkChar” and learn about those. There really isn’t any definitive source for information and you will have to get comfortable googling, poking around source code examples, examining the R include files and even checking the R-devel mailing list to get a sense of the R functions that are available for getting C code to work with R objects. I would recommend spending some time with Rinternals.h and Rdefines.h.
After R CMD SHLIB’ing we will again create a very simple wrapper and then run the code from R:
Double Whew! So far it still seems like .Call() is a big headache. But we haven’t really tried to do anything in our C code yet. The complexity/benefit balance evens out a little in our final example.
Casting about in the R header files
The title of this section really says it all. As you start to do more in your C code you will need to learn how to cast character strings into SEXP objects, SEXP objects into integers, etc. There is a finite, but large, amount to know before you become expert. The links in the “Getting used to SEXP” section above have excellent examples, as does Programming with Data: Using and Extending R by John Chambers.
Here is our last “Hello World!” example, the one that counts the characters in incoming greetings. This example shows how R macros defined in Rdefines.h are used to extract elements from a vector, how vector elements are cast into char and int and how you need to UNPROTECT the same number of elements that you placed on the PROTECT stack.
After R CMD SHLIB, here is the wrapper and the R session:
Yes, it’s still at the double Whew! level but we did some worthwhile things like allocate space for R objects and correctly harness garbage collection. If there were any halfway decent API docs for all this I would have no hesitation in recommending the .Call() interface to anyone writing C code. As it is, however, there will be a painful learning curve. If all you are doing is processing a vector of numbers and returning a simple scalar or vector result then the .C() interface will certainly be much easier — assuming you can take the memory hit. If, on the other hand, you are doing things like using a C library to convert a bunch of raw data into more complex structures then you are going to have to learn to do things the R way.
But there is hope! In the next post we will investigate using the Rcpp package to simplify this robust but complex interface to C code. Hopefully we won’t have to become C++ wizards to do so.
Example Packages using .Call()
The .Call() interface is heavily used in many R packages. Along with poring over the Writing R Extensions document, it is important to have some example code to work from. Here is a running list of the packages I found with useful example code:
- Rcsdp — R interface to the CSDP semidefinite programming library.
More Information
Hadley Wickham has written an excellent tutorial on using the .Call() interface.