Today I tried installing Hadoop in pseudo-distributed mode on RedHat. The process is as follows.
1. JDK installation — guides are everywhere online, so it is skipped here.
2. Passwordless SSH configuration
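The original glosses over this step; a minimal sketch of what it usually looks like, assuming the OpenSSH defaults (key path `~/.ssh/id_rsa`, single-node login to localhost):

```shell
# Create the .ssh directory and generate an RSA key pair with an empty
# passphrase, unless a key already exists
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa -q
# Authorize the public key for logins to this same machine
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
# ssh localhost   # should now log in without prompting for a password
```

Hadoop's start scripts use ssh to launch the daemons, so `ssh localhost` must work without a password prompt before moving on.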
3. Setting the hostname
[root@yourHostName data]# hostname
yourHostName
[root@yourHostName data]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=yourHostName
[root@yourHostName data]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
172.17.6.35 yourHostName yourHostName
4. Editing hadoop-env.sh, core-site.xml, hdfs-site.xml, and mapred-site.xml
hadoop-env.sh:
Add: export JAVA_HOME=/usr/program/jdk1.6.0_06
core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/joe/hadoop/hadooptmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://172.17.6.35:9000</value>
    <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>
hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
  </property>
</configuration>
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>172.17.6.35:9001</value>
    <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
  </property>
</configuration>
5. Formatting HDFS and starting Hadoop
[root@yourHostName hadoop-0.20.2]# hadoop namenode -format
11/08/18 00:46:55 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
11/08/18 00:46:55 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = daishubin/172.17.6.35
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
11/08/18 00:46:56 INFO namenode.FSNamesystem: fsOwner=root,root,bin,daemon,sys,adm,disk,wheel
11/08/18 00:46:56 INFO namenode.FSNamesystem: supergroup=supergroup
11/08/18 00:46:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/08/18 00:46:56 INFO common.Storage: Image file of size 94 saved in 0 seconds.
11/08/18 00:46:56 INFO common.Storage: Storage directory /home/joe/hadoop/hadooptmp/dfs/name has been successfully formatted.
11/08/18 00:46:56 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at daishubin/172.17.6.35
************************************************************/
[root@yourHostName hadoop-0.20.2]# start-all.sh
starting namenode, logging to /usr/local/hadoop-0.20.2/bin/../logs/hadoop-root-namenode-daishubin.out
localhost: starting datanode, logging to /usr/local/hadoop-0.20.2/bin/../logs/hadoop-root-datanode-daishubin.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop-0.20.2/bin/../logs/hadoop-root-secondarynamenode-daishubin.out
starting jobtracker, logging to /usr/local/hadoop-0.20.2/bin/../logs/hadoop-root-jobtracker-daishubin.out
localhost: starting tasktracker, logging to /usr/local/hadoop-0.20.2/bin/../logs/hadoop-root-tasktracker-daishubin.out
[root@yourHostName hadoop-0.20.2]# jps
6601 DataNode
6927 TaskTracker
6492 NameNode
6732 SecondaryNameNode
7023 Jps
6816 JobTracker
If jps shows all of the daemons above, the installation and startup succeeded.
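The check above can be scripted; a small sketch that scans jps-style output for the five daemons a pseudo-distributed 0.20.x node should run (the sample lines fed in below are copied from the transcript above for demonstration; on a live node you would capture real `jps` output instead):

```shell
# check_daemons: read jps-style output from the file given as $1 and report
# any of the five expected Hadoop daemons that are missing from it
check_daemons() {
  ok=1
  for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    grep -q "$d" "$1" || { echo "missing: $d"; ok=0; }
  done
  [ "$ok" -eq 1 ] && echo "all daemons running"
}
# Demonstrate on the sample transcript; on a live node: jps > /tmp/jps.out
printf '6492 NameNode\n6601 DataNode\n6732 SecondaryNameNode\n6816 JobTracker\n6927 TaskTracker\n' > /tmp/jps.out
check_daemons /tmp/jps.out
# → all daemons running
```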
6. Running the wordcount example
[root@yourHostName hadoop-0.20.2]# hadoop fs -mkdir input
[root@yourHostName tmp]# hadoop fs -put HibernateUtil.java input
[root@yourHostName hadoop-0.20.2]# hadoop jar hadoop-0.20.2-examples.jar wordcount input output
...
[root@yourHostName hadoop-0.20.2]# hadoop fs -ls output/
drwxr-xr-x - root supergroup 0 2011-08-18 01:00 /user/root/output/_logs
-rw-r--r-- 1 root supergroup 1144 2011-08-18 01:00 /user/root/output/part-r-00000
[root@yourHostName hadoop-0.20.2]# hadoop fs -cat output/part-r-00000
Now you can view the results.
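For reference, part-r-00000 holds one tab-separated "word&lt;TAB&gt;count" line per word. A sketch of listing the most frequent words first; the local sample file and its counts are made up for illustration, since the real command assumes the running cluster above:

```shell
# On the cluster you would pipe the real output:
#   hadoop fs -cat output/part-r-00000 | sort -t"$TAB" -k2,2nr | head
# Here we simulate with a tiny local file (hypothetical counts)
TAB="$(printf '\t')"
printf 'public\t4\nclass\t1\nimport\t6\n' > /tmp/part-r-00000
# Sort numerically on the second (count) field, descending
sort -t"$TAB" -k2,2nr /tmp/part-r-00000 | head -n 1
# → import	6
```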