CDH3 (Hadoop 0.20) Install -- RedHat6/CentOS6 -- Single Node

1 prepare for installation

1.1 Make sure your host name is valid

For example, if your hostname is vm231.com, edit /etc/hosts and add an entry like:

127.0.0.1  vm231.com
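
To confirm the name resolves locally, a quick check might look like the following (a sketch, using the vm231.com example above):

$ hostname
vm231.com
$ ping -c 1 vm231.com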

1.2 Install the Oracle Java Development Kit

Download a recommended version of the Oracle JDK from http://www.oracle.com/technetwork/java/javasebusiness/downloads/java-archive-downloads-javase6-419409.html. We choose jdk-6u45-linux-x64-rpm.bin (use jdk-6uXX-linux-x64-rpm.bin for 64-bit systems and jdk-6uXX-linux-i586-rpm.bin for 32-bit systems).

Install the JDK:

# chmod a+x jdk-6u45-linux-x64-rpm.bin
# ./jdk-6u45-linux-x64-rpm.bin
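
After the installer finishes, it can help to confirm where the JDK landed and that the binary runs; a quick check (the exact directory name depends on the JDK version you installed) is:

# ls /usr/java/
# /usr/java/jdk1.6.0_45/bin/java -version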

As the root user, set JAVA_HOME to the directory where the JDK is installed; for example:

# export JAVA_HOME="/usr/java/jdk1.6.0_45"
# export PATH=$JAVA_HOME/bin:$PATH

Note 1: The JDK install directory might be something like /usr/java/jdk1.6.0_26 (it contains the executable file "bin/java"), depending on the system configuration and where the JDK is actually installed.

Note 2: I also tried OpenJDK, but couldn't find a way to make it work.
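
The export commands above only last for the current shell session. One way to make them persistent (a sketch, assuming the same jdk1.6.0_45 path as above) is to put them into a profile script:

# cat > /etc/profile.d/java.sh <<'EOF'
export JAVA_HOME=/usr/java/jdk1.6.0_45
export PATH=$JAVA_HOME/bin:$PATH
EOF
# source /etc/profile.d/java.sh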

2 Installing CDH3 on a Single Linux Node in Pseudo-distributed mode

2.1 Download the CDH3 Package

For RedHat/CentOS 6  http://archive.cloudera.com/redhat/6/x86_64/cdh/cdh3-repository-1.0-1.noarch.rpm
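
If you prefer to fetch the package from the command line, wget works as well (assuming the archive URL above is still reachable):

$ wget http://archive.cloudera.com/redhat/6/x86_64/cdh/cdh3-repository-1.0-1.noarch.rpm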

Install the RPM:

# sudo yum --nogpgcheck localinstall cdh3-repository-1.0-1.noarch.rpm

2.2 Install CDH3

(Optional) Add a repository key: add the Cloudera Public GPG Key to your repository by executing the following command:

# sudo rpm --import http://archive.cloudera.com/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera  
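
To confirm the key was imported, you can list the GPG public keys known to rpm (the exact key ID will vary):

$ rpm -qa | grep gpg-pubkey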

Install Hadoop in pseudo-distributed mode:

# sudo yum install hadoop-0.20-conf-pseudo
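
yum pulls in the base hadoop-0.20 packages as dependencies; a quick way to see what was installed (package names may vary slightly between CDH3 updates) is:

$ rpm -qa | grep hadoop-0.20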

3 Starting Hadoop and Verifying it is Working Properly: 

3.1 Start the Daemons:

# for service in /etc/init.d/hadoop-0.20-*; do sudo $service start; done
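
To check that the daemons (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker) actually came up, the init scripts also support a status action, and jps from the Oracle JDK lists the running Java processes (you may need its full path, e.g. $JAVA_HOME/bin/jps):

# for service in /etc/init.d/hadoop-0.20-*; do sudo $service status; done
# sudo jps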

3.2 Confirm Hadoop is working by performing some operations and running a job.

For example, try performing some DFS operations:

$ hadoop fs -mkdir /foo
$ hadoop fs -ls /
Found 2 items
drwxr-xr-x - root supergroup 0 2013-10-22 19:11 /foo
drwxr-xr-x - mapred supergroup 0 2013-10-22 19:11 /var
$ hadoop fs -rmr /foo
Deleted hdfs://localhost/foo
$ hadoop fs -ls /
Found 1 items
drwxr-xr-x - mapred supergroup 0 2013-10-22 19:11 /var
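
To also run a job, you can use the examples jar bundled with CDH3; a small sketch (the jar path is assumed, typically /usr/lib/hadoop-0.20/hadoop-examples.jar) that estimates pi with 2 map tasks and 1000 samples each:

$ hadoop jar /usr/lib/hadoop-0.20/hadoop-examples.jar pi 2 1000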

4 Common Error

When you verify your configuration, for example by running $ hadoop fs -mkdir /foo, you may get an error like:

13/10/22 19:05:38 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s).

13/10/22 19:05:39 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 1 time(s).

……

The client fails to connect to localhost/127.0.0.1:8020.

Look up the Hadoop log file "/usr/lib/hadoop/logs/hadoop-hadoop-datanode-<hostname>.log". It shows errors like:

2013-10-22 10:22:55,517 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: Missing directory /var/lib/hadoop-0.20/cache/hadoop/dfs/name

2013-10-22 10:22:55,518 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException: Missing directory /var/lib/hadoop-0.20/cache/hadoop/dfs/name

It appears that your base directories haven't been created properly, causing FSNamesystem to fail at initialization (it expects a ready, formatted name directory the first time it starts). Try the steps below and it should work afterwards:

1. Stop all Hadoop/Hadoop-related services.

# for service in /etc/init.d/hadoop-0.20-*; do sudo $service stop; done

2. Run the following fix-up commands:

$ sudo rm -rf /var/lib/hadoop-0.20/cache/hadoop/dfs
$ sudo mkdir -p /var/lib/hadoop-0.20/cache/hadoop/dfs/{name,data}
$ sudo chown hdfs:hdfs /var/lib/hadoop-0.20/cache/hadoop/dfs/{name,data}

$ sudo -u hdfs hadoop namenode -format

3. Start the services again, and FSNamesystem should now start up fine.

# for service in /etc/init.d/hadoop-0.20-*; do sudo $service start; done
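
Once the daemons are back up (give the NameNode a few seconds to finish starting), the earlier check should succeed; for example:

$ hadoop fs -ls /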

Note: Do not repeat these steps if the NameNode goes down or other issues come up later on. The rm -rf/format will delete all your HDFS data, and you do not want to do that in a working cluster that has just stopped working because of some other recoverable issue. [Refer to http://grokbase.com/t/cloudera/cdh-user/125p93ggmd/installation-issues-with-with-cdh3]

For more installation details, see the official docs: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH3/CDH3u6/CDH3-Quick-Start/CDH3-Quick-Start.html

 

