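This article covers setting up the R environment and calling R from Java. It does not go into the R language itself, which is left for later study.
1. Setting up the R environment
1.1 Installing the software
Download the R installer and run it to complete the installation.
1.1.2 Run the software; if the R interface appears, the installation succeeded.
2. Calling R from Java
Java is used to develop the application system and R acts as the computation engine. Combining the application with the analysis engine makes it possible to do what neither can do alone.
Two ways of calling R from Java are introduced here. The first is Rserve, where R is called as a service: Rserve is a TCP/IP-based server that transfers data over a binary protocol and accepts remote connections, so client languages can call R. The second is calling R through its dynamic library.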
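Because Rserve is an ordinary TCP server, a plain socket connection is enough to tell whether anything is listening before involving the Rserve client classes. The following is a minimal sketch that uses only the JDK. The class and method names are hypothetical, and the port 6311 is Rserve's default, which is an assumption about how the server was started.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class RservePortCheck {
    // Assumption: Rserve runs on its default TCP port; change this if it was started on another port.
    static final int DEFAULT_RSERVE_PORT = 6311;

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable
            return false;
        }
    }

    public static void main(String[] args) {
        boolean up = isListening("127.0.0.1", DEFAULT_RSERVE_PORT, 1000);
        System.out.println(up ? "Rserve port is open" : "Nothing is listening on the Rserve port");
    }
}
```

An open port only shows that some process accepts connections there; it does not prove that the process is Rserve.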
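2.1 The first approach: calling R as a service
2.1.1 In RGui, enter install.packages("Rserve") to install the Rserve package. You will be asked to choose a mirror; any of them should work, provided the machine is online.
2.1.2 Installation completes
2.1.3 Packages --> Load package
2.1.4 Select the Rserve package just installed
2.1.5 Enter the command Rserve() to start the server. R is now ready to be called.
2.1.6 For the Java side, download the required JAR files: REngine.jar and RserveEngine.jar
2.1.7 Create a new Java project and a class containing the calling code. An example follows (part of the code was taken from the web)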
/**
 *
 */
package javaCallR;

import java.io.File;
import java.io.IOException;
import org.rosuda.REngine.REXP;
import org.rosuda.REngine.REXPMismatchException;
import org.rosuda.REngine.REngineException;
import org.rosuda.REngine.Rserve.RConnection;
import org.rosuda.REngine.Rserve.RserveException;

/**
 * @author LiHuaibei
 *
 */
public class RserveCall {
    /**
     * @param args
     * @throws RserveException
     */
    public static void main(String[] args) throws REXPMismatchException, REngineException {
        // RConnection c = new RConnection();
        RConnection c = new RConnection("127.0.0.1");
        REXP x = c.eval("R.version.string");
        System.out.println(x.asString());
        // rnorm(n) draws n random values from a normal distribution
        double[] arr = c.eval("rnorm(20)").asDoubles();
        for (double a : arr) {
            System.out.print(a + ",");
        }
        // Save the plot as an image file
        File tempFile = null;
        try {
            c.assign("x", arr);
            // Note: the temp file path is not used below; the plot goes to the fixed path d://test-1.jpg
            tempFile = File.createTempFile("test-", ".jpg");
            String filePath = tempFile.getAbsolutePath();
            c.eval("jpeg('d://test-1.jpg')");
            c.eval("plot(x)");
            c.eval("dev.off()");
        } catch (IOException e) {
            e.printStackTrace();
        } catch (REngineException e) {
            e.printStackTrace();
        } finally {
            c.close();
        }
    }
}
2.1.8 Run the class directly and the call succeeds.
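The example above creates a temporary file but then writes the plot to a hard-coded path. If a Java-side path is to be passed to R instead, it has to be turned into a valid R string first, because backslashes in a Windows path are escape characters inside R string literals. The following is a minimal sketch of such a helper, using only the JDK; the class and method names are hypothetical. Keep in mind that the path is resolved by the R process, so with a remote Rserve it refers to the server's file system.

```java
class RPathSketch {
    /** Converts a file-system path into a form that is safe inside a single-quoted R string. */
    static String toRPath(String path) {
        // R accepts forward slashes on Windows; escape any single quote so the literal stays closed
        return path.replace('\\', '/').replace("'", "\\'");
    }

    /** Builds the R call that opens a JPEG graphics device writing to the given path. */
    static String jpegCommand(String path) {
        return "jpeg('" + toRPath(path) + "')";
    }

    public static void main(String[] args) {
        // prints jpeg('d:/plots/test-1.jpg')
        System.out.println(jpegCommand("d:\\plots\\test-1.jpg"));
    }
}
```

The resulting string could then be passed to c.eval(...) in place of the fixed jpeg('d://test-1.jpg') call.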
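2.2 The second approach: calling the dynamic library through JRI
JRI stands for Java/R Interface. It loads R's dynamic link library so that Java can use R's functions directly:
2.2.1 Run install.packages("rJava") to install rJava (choose a mirror as in the first approach)
2.2.2 After installation, the rJava directory, with its jri subfolder, can be seen under library in the R installation directory:
2.2.3 Next, set the environment variables according to this directory structure: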
R_HOME=C:\Program Files\R\R-3.3.1
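Append C:\Program Files\R\R-3.3.1\bin\x64;C:\Program Files\R\R-3.3.1\library\rJava\jri; to PATH (the %R_HOME% form can be used). Pay special attention to "bin\x64": on a 64-bit system point to the x64 folder, on a 32-bit system point to the i386 folder, otherwise the dependent libraries will not be found. Likewise, the DLL used under rJava\jri must match the machine's bitness.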
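The bitness rule above can be checked from Java before loading JRI. The following is a minimal sketch that derives the two PATH entries from R_HOME and the JVM's os.arch property. The class and method names are hypothetical, and the sketch assumes that the bitness of the JVM running the program is what the native libraries have to match, and that any os.arch value containing "64" corresponds to R's x64 folder.

```java
class RArchPaths {
    /** Maps a JVM os.arch value to R's Windows bin subdirectory ("x64" or "i386"). */
    static String archDir(String osArch) {
        // Assumption: "amd64" and "x86_64" mean a 64-bit JVM, "x86" a 32-bit one
        return osArch.contains("64") ? "x64" : "i386";
    }

    /** Returns the two PATH entries described in step 2.2.3 for the given R_HOME. */
    static String[] pathEntries(String rHome, String osArch) {
        return new String[] {
            rHome + "\\bin\\" + archDir(osArch),
            rHome + "\\library\\rJava\\jri"
        };
    }

    public static void main(String[] args) {
        String rHome = System.getenv("R_HOME");
        if (rHome == null) {
            System.out.println("R_HOME is not set");
            return;
        }
        // Print the entries that PATH is expected to contain for this JVM
        for (String entry : pathEntries(rHome, System.getProperty("os.arch"))) {
            System.out.println(entry);
        }
    }
}
```

Comparing the printed entries with the actual PATH value is a quick way to spot an x64/i386 mix-up before the "cannot find dependent libraries" error shows up.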
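2.2.4 Once the environment variables are configured, copy JRI.jar, REngine.jar and JRIEngine.jar from the jri folder in the installation directory into the Java project and add them to the build path. The bridge from Java to R is now in place.
2.2.5 Java calling example (the code comes from the examples folder in the installation directory); run it directly and the call succeeds.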
package javaCallR;

import java.io.*;
import java.awt.Frame;
import java.awt.FileDialog;
import java.util.Enumeration;
import org.rosuda.JRI.Rengine;
import org.rosuda.JRI.REXP;
import org.rosuda.JRI.RList;
import org.rosuda.JRI.RVector;
import org.rosuda.JRI.RMainLoopCallbacks;

class TextConsole implements RMainLoopCallbacks {
    public void rWriteConsole(Rengine re, String text, int oType) {
        System.out.print(text);
    }

    public void rBusy(Rengine re, int which) {
        System.out.println("rBusy(" + which + ")");
    }

    public String rReadConsole(Rengine re, String prompt, int addToHistory) {
        System.out.print(prompt);
        try {
            BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
            String s = br.readLine();
            return (s == null || s.length() == 0) ? s : s + "\n";
        } catch (Exception e) {
            System.out.println("jriReadConsole exception: " + e.getMessage());
        }
        return null;
    }

    public void rShowMessage(Rengine re, String message) {
        System.out.println("rShowMessage \"" + message + "\"");
    }

    public String rChooseFile(Rengine re, int newFile) {
        FileDialog fd = new FileDialog(new Frame(), (newFile == 0) ? "Select a file" : "Select a new file",
                (newFile == 0) ? FileDialog.LOAD : FileDialog.SAVE);
        // setVisible(true) replaces the deprecated show()
        fd.setVisible(true);
        String res = null;
        if (fd.getDirectory() != null)
            res = fd.getDirectory();
        if (fd.getFile() != null)
            res = (res == null) ? fd.getFile() : (res + fd.getFile());
        return res;
    }

    public void rFlushConsole(Rengine re) {
    }

    public void rLoadHistory(Rengine re, String filename) {
    }

    public void rSaveHistory(Rengine re, String filename) {
    }
}
public class DllCall {
    public static void main(String[] args) {
        // just making sure we have the right version of everything
        if (!Rengine.versionCheck()) {
            System.err.println("** Version mismatch - Java files don't match library version.");
            System.exit(1);
        }
        System.out.println("Creating Rengine (with arguments)");
        // 1) we pass the arguments from the command line
        // 2) we won't use the main loop at first, we'll start it later (that's the "false" as second argument)
        // 3) the callbacks are implemented by the TextConsole class above
        Rengine re = new Rengine(args, false, new TextConsole());
        System.out.println("Rengine created, waiting for R");
        // the engine creates R in a new thread, so we should wait until it's ready
        if (!re.waitForR()) { // failed
            System.out.println("Cannot load R");
            return;
        }
        // at this point the engine has been created and R is loaded
        /*
         * High-level API - do not use RNI methods unless there is no other way to accomplish what you want
         */
        try {
            REXP x;
            re.eval("data(iris)", false); // data(iris) is an R command that loads the iris data set
            System.out.println(x = re.eval("iris")); // evaluating iris returns the data set
            // generic vectors are RVector to accommodate names
            RVector v = x.asVector();
            if (v.getNames() != null) {
                System.out.println("has names:");
                for (Enumeration e = v.getNames().elements(); e.hasMoreElements();) {
                    System.out.println(e.nextElement());
                }
            }
            // for compatibility with Rserve we allow casting of vectors to lists
            RList vl = x.asList();
            String[] k = vl.keys();
            if (k != null) {
                System.out.println("and once again from the list:");
                int i = 0;
                while (i < k.length)
                    System.out.println(k[i++]);
            }
            // get boolean array
            System.out.println(x = re.eval("iris[[1]]>mean(iris[[1]])"));
            // R knows about TRUE/FALSE/NA, so we cannot use boolean[] this way
            // instead, we use int[] which is more convenient (and what R uses
            // internally anyway)
            int[] bi = x.asIntArray();
            {
                int i = 0;
                while (i < bi.length) {
                    System.out.print(bi[i] == 0 ? "F " : (bi[i] == 1 ? "T " : "NA "));
                    i++;
                }
                System.out.println("");
            }
            // push a boolean array
            boolean by[] = { true, false, false };
            re.assign("bool", by);
            System.out.println(x = re.eval("bool"));
            // asBool returns the first element of the array as RBool
            // (mostly useful for boolean arrays of the length 1). it should
            // return true
            System.out.println("isTRUE? " + x.asBool().isTRUE());
            // now for a real dotted-pair list:
            System.out.println(x = re.eval("pairlist(a=1,b='foo',c=1:5)"));
            RList l = x.asList();
            if (l != null) {
                int i = 0;
                String[] a = l.keys();
                System.out.println("Keys:");
                while (i < a.length)
                    System.out.println(a[i++]);
                System.out.println("Contents:");
                i = 0;
                while (i < a.length)
                    System.out.println(l.at(i++));
            }
            System.out.println(re.eval("sqrt(36)"));
        } catch (Exception e) {
            System.out.println("EX:" + e);
            e.printStackTrace();
        }
        // Part 2 - low-level API - for illustration purposes only!
        // System.exit(0);
        // simple assignment like a<-"hello" (env=0 means use R_GlobalEnv)
        long xp1 = re.rniPutString("hello");
        re.rniAssign("a", xp1, 0);
        // Example: how to create a named list or data.frame
        double da[] = { 1.2, 2.3, 4.5 };
        double db[] = { 1.4, 2.6, 4.2 };
        long xp3 = re.rniPutDoubleArray(da);
        long xp4 = re.rniPutDoubleArray(db);
        // now build a list (generic vector is how that's called in R)
        long la[] = { xp3, xp4 };
        long xp5 = re.rniPutVector(la);
        // now let's add names
        String sa[] = { "a", "b" };
        long xp2 = re.rniPutStringArray(sa);
        re.rniSetAttr(xp5, "names", xp2);
        // ok, we have a proper list now
        // we could use assign and then eval "b<-data.frame(b)", but for now
        // let's build it by hand:
        String rn[] = { "1", "2", "3" };
        long xp7 = re.rniPutStringArray(rn);
        re.rniSetAttr(xp5, "row.names", xp7);
        long xp6 = re.rniPutString("data.frame");
        re.rniSetAttr(xp5, "class", xp6);
        // assign the whole thing to the "b" variable
        re.rniAssign("b", xp5, 0);
        {
            System.out.println("Parsing");
            long e = re.rniParse("data(iris)", 1);
            System.out.println("Result = " + e + ", running eval");
            long r = re.rniEval(e, 0);
            System.out.println("Result = " + r + ", building REXP");
            REXP x = new REXP(re, r);
            System.out.println("REXP result = " + x);
        }
        {
            System.out.println("Parsing");
            long e = re.rniParse("iris", 1);
            System.out.println("Result = " + e + ", running eval");
            long r = re.rniEval(e, 0);
            System.out.println("Result = " + r + ", building REXP");
            REXP x = new REXP(re, r);
            System.out.println("REXP result = " + x);
        }
        {
            System.out.println("Parsing");
            long e = re.rniParse("names(iris)", 1);
            System.out.println("Result = " + e + ", running eval");
            long r = re.rniEval(e, 0);
            System.out.println("Result = " + r + ", building REXP");
            REXP x = new REXP(re, r);
            System.out.println("REXP result = " + x);
            String s[] = x.asStringArray();
            if (s != null) {
                int i = 0;
                while (i < s.length) {
                    System.out.println("[" + i + "] \"" + s[i] + "\"");
                    i++;
                }
            }
        }
        {
            System.out.println("Parsing");
            long e = re.rniParse("rnorm(10)", 1);
            System.out.println("Result = " + e + ", running eval");
            long r = re.rniEval(e, 0);
            System.out.println("Result = " + r + ", building REXP");
            REXP x = new REXP(re, r);
            System.out.println("REXP result = " + x);
            double d[] = x.asDoubleArray();
            if (d != null) {
                int i = 0;
                while (i < d.length) {
                    System.out.print(((i == 0) ? "" : ", ") + d[i]);
                    i++;
                }
                System.out.println("");
            }
            System.out.println("");
        }
        {
            REXP x = re.eval("1:10");
            System.out.println("REXP result = " + x);
            int d[] = x.asIntArray();
            if (d != null) {
                int i = 0;
                while (i < d.length) {
                    System.out.print(((i == 0) ? "" : ", ") + d[i]);
                    i++;
                }
                System.out.println("");
            }
        }
        re.eval("print(1:10/3)");
        if (true) {
            // so far we used R as a computational slave without REPL
            // now we start the loop, so the user can use the console
            System.out.println("Now the console is yours ... have fun");
            re.startMainLoop();
        } else {
            re.end();
            System.out.println("end");
        }
    }
}
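The example above reads an R logical vector as int[] because R logicals have three states (TRUE, FALSE, NA) and a Java boolean has only two. The following is a minimal sketch of turning that int encoding into Boolean objects on the Java side, with null standing in for NA. It mirrors the mapping used in the example's print loop (0 is FALSE, 1 is TRUE, anything else is treated as NA); the class and method names are hypothetical and the sketch does not depend on JRI.

```java
import java.util.Arrays;

class RLogicalSketch {
    /** Maps the int encoding from the example to Boolean values, using null for NA. */
    static Boolean[] toBooleans(int[] encoded) {
        Boolean[] out = new Boolean[encoded.length];
        for (int i = 0; i < encoded.length; i++) {
            if (encoded[i] == 0) {
                out[i] = Boolean.FALSE;
            } else if (encoded[i] == 1) {
                out[i] = Boolean.TRUE;
            } else {
                out[i] = null; // any other value is treated as NA, as in the example
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [true, false, null]
        System.out.println(Arrays.toString(toBooleans(new int[] { 1, 0, 2 })));
    }
}
```

In the example this could be applied to the array returned by x.asIntArray() for the iris comparison.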
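Summary:
With the first approach (calling a service), the R service has to cope with load, and its capacity still needs to be tested and verified. With the second approach (calling the dynamic library), past experience calling Microsoft DLLs from Java suggests it can become unstable under heavy processing loads. However, JRI is now a subproject of rJava, so it will probably be fine; this still needs to be verified in real production use.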