R Sys.setenv

 

 

http://stackoverflow.com/questions/17583846/failed-to-remotely-execute-r-script-which-loads-library-rhdfs

Failed to remotely execute R script which loads library “rhdfs”

I'm working on a project using R-Hadoop and ran into this problem.

I'm using JSch in Java to SSH into a remote Hadoop pseudo-cluster. Here is the part of the Java code that creates the connection:

/* Create a connection instance */
Connection conn = new Connection(hostname);
/* Now connect */
conn.connect();
/* Authenticate */
boolean isAuthenticated = conn.authenticateWithPassword(username, password);
if (!isAuthenticated)
    throw new IOException("Authentication failed.");
/* Create a session */
Session sess = conn.openSession();
//sess.execCommand("uname -a && date && uptime && who");
sess.execCommand("Rscript -e 'args1 <- \"Dell\"; args2 <- 1; source(\"/usr/local/R/mytest.R\")'");
//sess.execCommand("ls");
sess.waitForCondition(ChannelCondition.TIMEOUT, 50);

I tried several simple R scripts, and my code worked fine. But when the script uses R-Hadoop, it stops running. If I run Rscript -e 'args1 <- "Dell"; args2 <- 1; source("/usr/local/R/mytest.R")' directly on the remote server, everything works fine.

Here is what I got after taking Hong Ooi's suggestion. Instead of using Rscript, I used the following command:

sess.execCommand("R CMD BATCH --no-save --no-restore '--args args1=\"Dell\" args2=1' /usr/local/R/mytest.R /usr/local/R/whathappened.txt");

In whathappened.txt, I got the following error:

> args=(commandArgs(TRUE))
> for(i in 1:length(args)){
+      eval(parse(text=args[[i]]))
+ }
> source("/usr/local/R/main.R")
> main(args1,args2)
Loading required package: rJava
Error : .onLoad failed in loadNamespace() for 'rhdfs', details:
  call: fun(libname, pkgname)
  error: Environment variable HADOOP_CMD must be set before loading package rhdfs
Error: package/namespace load failed for ‘rhdfs’
Execution halted

Well, now the problem is much clearer. Unfortunately, I'm pretty new to Linux and have no idea how to solve this.

What error message(s) do you get with RHadoop? Are they Java or R errors? - Hong Ooi, Jul 11 '13 at 4:53

@HongOoi The R script runs in the background on the remote server, so the command line there shows nothing and I can't tell what actually happened. Even if I add cat("blabla") to the R script, nothing is printed on the remote server. So I used a workaround: generating txt files with names like "Inside xxx function" to see how far the script gets. It turns out that it stops every time it tries to execute library("whatever"). - Hao Huang, Jul 11 '13 at 17:02

You can use sink to redirect output to a file. That might help you diagnose what's going on. - Hong Ooi, Jul 11 '13 at 17:05

@HongOoi Thanks for your advice! Check my question update, it shows more information. But I'm so new to Linux, and I really don't know how to handle namespace-related problems. - Hao Huang, Jul 11 '13 at 18:05

2 Answers

Well, I solved this problem like this:

sess.execCommand("source /etc/profile; R CMD BATCH --no-save --no-restore '--args args1=\"Dell\" args2=1' /usr/local/R/mytest.R /usr/local/R/whathappened.txt");

The problem was caused by the environment. The SSH session to the remote Hadoop cluster runs in a different environment from an interactive login, so variables like $HADOOP_CMD are not set. There are several ways to make the SSH session pick up the environment variables.

In my method, "source /etc/profile" tells the SSH session where to find the environment variables.
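The fix above amounts to prefixing the remote command string before it is passed to execCommand. A minimal sketch of that idea follows. It assumes the remote shell is bash-compatible (so that `source` works). The `RemoteCommand` class, its method names, and the `/path/to/hadoop` value are hypothetical and not part of the original post. The second method shows one of the "multiple ways" hinted at: exporting only the variables the script needs instead of loading the whole profile.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not from the original post) that builds the remote command string. */
final class RemoteCommand {
    private RemoteCommand() {}

    /** Quote a value for a POSIX shell: wrap it in single quotes and escape embedded single quotes. */
    static String shellQuote(String value) {
        return "'" + value.replace("'", "'\\''") + "'";
    }

    /** Variant used in the answer: load the system-wide profile before running the command. */
    static String withProfile(String command) {
        return "source /etc/profile; " + command;
    }

    /** Alternative: export only the listed variables, then run the command. */
    static String withEnv(Map<String, String> env, String command) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : env.entrySet()) {
            sb.append("export ").append(e.getKey()).append('=')
              .append(shellQuote(e.getValue())).append("; ");
        }
        return sb.append(command).toString();
    }

    public static void main(String[] args) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("HADOOP_CMD", "/path/to/hadoop"); // placeholder path, an assumption
        String command = "Rscript /usr/local/R/mytest.R";
        System.out.println(withProfile(command));
        System.out.println(withEnv(env, command));
    }
}
```

Either resulting string could then be handed to sess.execCommand(...) in place of the bare command.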

Accepted answer (score: 2)

Well, I just found another solution by myself:

Instead of fixing the environment from outside the Hadoop cluster, you can set the variables in the R script itself:

Sys.setenv(HADOOP_HOME="put your HADOOP_HOME path here")
Sys.setenv(HADOOP_CMD="put your HADOOP_CMD path here")

library(rmr2)
library(rhdfs)
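If the R script itself cannot be edited, the same Sys.setenv approach might be applied from the Java side by prepending the call to the expression passed to `Rscript -e`. The sketch below only builds the strings. The `RScriptExpression` class and its method names are hypothetical, and the variable names are assumed to be plain identifiers that R accepts as argument names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not from the original post) that prepends Sys.setenv to an R expression. */
final class RScriptExpression {
    private RScriptExpression() {}

    /** Render a Java string as a double-quoted R string literal. */
    static String rString(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Quote a value for a POSIX shell using single quotes. */
    static String shellQuote(String value) {
        return "'" + value.replace("'", "'\\''") + "'";
    }

    /** Build: Sys.setenv(NAME="value", ...); source("script") */
    static String setenvThenSource(Map<String, String> env, String scriptPath) {
        StringBuilder sb = new StringBuilder();
        if (!env.isEmpty()) {
            sb.append("Sys.setenv(");
            boolean first = true;
            for (Map.Entry<String, String> e : env.entrySet()) {
                if (!first) sb.append(", ");
                sb.append(e.getKey()).append('=').append(rString(e.getValue()));
                first = false;
            }
            sb.append("); ");
        }
        return sb.append("source(").append(rString(scriptPath)).append(")").toString();
    }

    /** Wrap the R expression in a shell-quoted Rscript -e invocation. */
    static String rscriptCommand(Map<String, String> env, String scriptPath) {
        return "Rscript -e " + shellQuote(setenvThenSource(env, scriptPath));
    }

    public static void main(String[] args) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("HADOOP_CMD", "put your HADOOP_CMD path here"); // placeholder, as in the answer
        System.out.println(rscriptCommand(env, "/usr/local/R/mytest.R"));
    }
}
```

Because Sys.setenv runs before source(), the variables are in place by the time the sourced script loads rhdfs.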