CDH3 Install Guide
1 Install Hadoop
1.1 Add user hadoop
[root@gd02 ~]# adduser hadoop
Use vim to edit /etc/group: add the hadoop user to the mapred and hdfs groups,
and add the mapred and hdfs users to the hadoop group.
hadoop:x:105:mapred,hdfs
hdfs:x:106:hadoop
mapred:x:107:hadoop
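The /etc/group entries above can be checked from a script instead of by eye. The following is a minimal sketch, not part of the original procedure: group_has_member is a hypothetical helper name, and it only reads a file in /etc/group format without modifying anything.

```shell
#!/bin/sh
# group_has_member FILE GROUP USER
# Hypothetical helper: succeeds if USER appears in the member list
# (4th colon-separated field) of GROUP in a group-format FILE.
# The same membership could also be set without vim, for example with
#   usermod -a -G mapred,hdfs hadoop
group_has_member() {
    awk -F: -v g="$2" -v u="$3" '
        $1 == g {
            n = split($4, members, ",")
            for (i = 1; i <= n; i++) if (members[i] == u) found = 1
        }
        END { exit !found }
    ' "$1"
}
```

For example, "group_has_member /etc/group hadoop mapred && echo ok" prints ok once the hadoop line lists mapred as a member.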
1.2 Give hadoop ownership of the related directories
chown -R hadoop:hadoop /usr/lib/hadoop-0.20/
chown -R hadoop:hadoop /usr/lib/hadoop-0.20/pids/
chown -R hadoop:hadoop /usr/lib/hadoop-0.20/logs/
chown -R hadoop:hadoop /usr/lib/hadoop-0.20/logs/*
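To confirm that the chown commands above covered everything, one option is to list whatever is still owned by someone else. This is a small sketch under that assumption; not_owned_by is a hypothetical helper name, and it relies only on the standard find -user test.

```shell
#!/bin/sh
# not_owned_by USER DIR
# Hypothetical helper: print every path under DIR that is NOT owned by USER.
# Empty output means the recursive chown reached every file.
not_owned_by() {
    find "$2" ! -user "$1" -print
}
```

For example, "not_owned_by hadoop /usr/lib/hadoop-0.20/" should print nothing after the commands in this section have run.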
1.3 Format HDFS
sudo -u hadoop hadoop namenode -format
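Formatting wipes the NameNode metadata, so a script that runs this step may want to check first whether the name directory has already been formatted. The sketch below assumes the usual layout in which a formatted name directory contains current/VERSION; is_formatted is a hypothetical helper, and the directory passed to it must be whatever dfs.name.dir is set to on the machine (this guide does not state it).

```shell
#!/bin/sh
# is_formatted NAME_DIR
# Hypothetical helper: succeeds if NAME_DIR (the value of dfs.name.dir,
# an assumption to be filled in per cluster) already holds a formatted
# NameNode image, detected by the presence of current/VERSION.
is_formatted() {
    [ -f "$1/current/VERSION" ]
}
```

A guarded format could then look like: is_formatted "$NAME_DIR" || sudo -u hadoop hadoop namenode -format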
1.4 Automated scripts
1.4.1 Init.sh
#!/bin/bash
chown -R hadoop:hadoop /usr/lib/hadoop-0.20/
chown -R hadoop:hadoop /usr/lib/hadoop-0.20/pids/
chown -R hadoop:hadoop /usr/lib/hadoop-0.20/logs/
# Stop if the logs directory is missing, so chown never runs on the wrong directory
cd /usr/lib/hadoop-0.20/logs/ || exit 1
chown -R hadoop:hadoop *
1.4.2 Start-all.sh
#!/bin/sh
/etc/init.d/hadoop-0.20-namenode start
/etc/init.d/hadoop-0.20-secondarynamenode start
/etc/init.d/hadoop-0.20-jobtracker start
/etc/init.d/hadoop-zookeeper start
/etc/init.d/hadoop-hbase-master start
1.4.3 Stop-all.sh
#!/bin/sh
/etc/init.d/hadoop-zookeeper stop
/etc/init.d/hadoop-0.20-secondarynamenode stop
/etc/init.d/hadoop-0.20-jobtracker stop
/etc/init.d/hadoop-0.20-namenode stop
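The two scripts above could also be kept as one script driven by a single service list. This is only a sketch: the service names are the init scripts used in Start-all.sh, while stopping in exact reverse start order (including hadoop-hbase-master) is an assumption and differs from Stop-all.sh as written. service_order, run_all and DRY_RUN are hypothetical names.

```shell
#!/bin/sh
# Services in start order, taken from Start-all.sh.
SERVICES="hadoop-0.20-namenode hadoop-0.20-secondarynamenode hadoop-0.20-jobtracker hadoop-zookeeper hadoop-hbase-master"

# service_order ACTION
# Print the services in the order used for ACTION: the list as written
# for "start", the reversed list for "stop" (assumption, see above).
service_order() {
    if [ "$1" = "stop" ]; then
        reversed=""
        for s in $SERVICES; do
            reversed="$s${reversed:+ $reversed}"
        done
        echo "$reversed"
    else
        echo "$SERVICES"
    fi
}

# run_all ACTION
# Run ACTION (start or stop) on every service in order.
# With DRY_RUN=1 the commands are only printed, not executed.
run_all() {
    for s in $(service_order "$1"); do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "/etc/init.d/$s $1"
        else
            "/etc/init.d/$s" "$1"
        fi
    done
}
```

Running "DRY_RUN=1 run_all stop" prints the commands without touching any service, which makes it easy to review the order before using it for real.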
2 Install HBase
2.1 Give hadoop ownership of the related directories
Change ownership of the HBase directories:
chown -R hadoop:hadoop /usr/lib/hbase/
chown -R hadoop:hadoop /usr/lib/hbase/logs/
Change ownership of the ZooKeeper directory:
chown -R hadoop:hadoop /local/zookeeper/
2.2 Automated scripts
3 Sqoop: Import MySQL to HBase
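#!/bin/bash
MySQL_Server="10.10.97.116"
MySQL_Port="3306"
DataBase="rsearch"
# Import the MySQL table "institute" into the HBase table "institute",
# using the "domain" column as the row key; -P prompts for the password.
sqoop import --connect "jdbc:mysql://${MySQL_Server}:${MySQL_Port}/${DataBase}" --table institute --hbase-table institute --column-family institute --hbase-row-key domain --hbase-create-table --username 'root' -P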
4 Q&A
4.1 Time sync error
2011-06-21 08:41:10,470 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Master rejected startup because clock is out of sync
org.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server gd03,60020,1308616870092 has been rejected; Reported time is too far out of sync with master. Time difference of 50375801ms > max allowed of 30000ms
Fix: synchronize the clocks on all nodes in the cluster.
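Before restarting the region servers it can help to measure how far each node has drifted. The sketch below compares epoch timestamps against the 30000 ms limit reported in the log; skew_ms, within_skew and check_nodes are hypothetical helper names, and check_nodes assumes passwordless ssh to the hosts passed to it.

```shell
#!/bin/sh
MAX_SKEW_MS=30000   # "max allowed" value reported in the log message

# skew_ms A B
# Absolute difference between two epoch timestamps (seconds), in ms.
skew_ms() {
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=$(( 0 - d ))
    fi
    echo $(( d * 1000 ))
}

# within_skew A B
# Succeeds if the two timestamps differ by no more than MAX_SKEW_MS.
within_skew() {
    [ "$(skew_ms "$1" "$2")" -le "$MAX_SKEW_MS" ]
}

# check_nodes HOST...
# Compare each host's clock with the local clock over ssh
# (assumes passwordless ssh; unreachable hosts are skipped).
check_nodes() {
    for host in "$@"; do
        remote=$(ssh "$host" date +%s) || continue
        if within_skew "$(date +%s)" "$remote"; then
            echo "$host ok"
        else
            echo "$host out of sync"
        fi
    done
}
```

For example, "check_nodes gd02 gd03" reports each node's status. The difference in the log, 50375801 ms, is roughly 14 hours, far beyond the limit; one common way to correct it is to run ntpdate against a shared NTP server on every node, assuming ntpdate is installed.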
4.2 Permission error
2011-06-21 23:14:16,338 WARN org.apache.hadoop.mapred.JobTracker: Failed to operate on mapred.system.dir (hdfs://gd02:9000/mapred/system) because of permissions.
Fix: delete the mapred.system.dir directory on the datanode.
rm -rf /local/dfs