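Cluster information:
I. First, set up the master
1. Create the user group
groupadd hadoop              # add the hadoop group
useradd hadoop -g hadoop     # add the hadoop user to that group
2. Install the JDK
Version installed: jdk-7u79-linux-x64.gz
Run tar zxf jdk-7u79-linux-x64.gz -C /opt/ to extract it into /opt, then rename the extracted folder jdk1.7.0_79 to java.
With the JDK in place, configure the environment variables: open /etc/profile with vi /etc/profile and add the following: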
export JAVA_HOME=/opt/java
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
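Then run source /etc/profile so the changes take effect; the JDK installation is now complete. Don't forget to set the owner to hadoop afterwards.
From /opt, run chown -R hadoop:hadoop java/
3. Install Hadoop
The Hadoop version is hadoop-0.20.2.tar.gz; extract it into /opt as well and rename the folder to hadoop.
Hadoop also needs environment variables; open /etc/profile with vi /etc/profile and add the following: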
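Editing /etc/profile by hand works, but repeating the setup can leave duplicate export lines behind. The following is a minimal sketch of an idempotent alternative. The helper name append_once and the PROFILE variable are illustrative assumptions, not part of the original setup, and the demo writes to a scratch file instead of /etc/profile.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line is absent.
append_once() {
    line=$1
    file=$2
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Assumption: PROFILE defaults to a scratch file; set PROFILE=/etc/profile (as root) for real use.
PROFILE=${PROFILE:-$(mktemp)}

append_once 'export JAVA_HOME=/opt/java' "$PROFILE"
append_once 'export CLASSPATH=.:$JAVA_HOME/lib/tools.jar' "$PROFILE"
append_once 'export PATH=$JAVA_HOME/bin:$PATH' "$PROFILE"

# A repeated call leaves the file unchanged.
append_once 'export JAVA_HOME=/opt/java' "$PROFILE"

cat "$PROFILE"
```

The single quotes matter: they keep $JAVA_HOME and $PATH unexpanded, so the variables are resolved when the profile is sourced rather than when the line is written.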
export HADOOP_HOME=/opt/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
Again run source /etc/profile so the changes take effect, then run chown -R hadoop:hadoop hadoop/ to change the owner to hadoop.
4. Edit the name resolution file /etc/hosts and add the master entry:
[root@dcw ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
#192.168.75.131 dcw.localdomain dcw
192.168.75.131 master
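To confirm that the name master resolves to the intended address, the file can be checked with a small script. This is only a sketch: lookup_host is a hypothetical helper, and the demo runs against a scratch copy of the entries shown above rather than the real /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper: print the IP that a hosts-format file maps a name to.
# Lines starting with # are skipped, so the commented-out dcw entry is ignored.
lookup_host() {
    awk -v name="$1" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$2"
}

# Demo on a scratch copy of the entries (not the real /etc/hosts).
HOSTS=$(mktemp)
cat > "$HOSTS" <<'EOF'
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
#192.168.75.131 dcw.localdomain dcw
192.168.75.131 master
EOF

lookup_host master "$HOSTS"
```

On the machine itself, getent hosts master performs a similar check through the system resolver.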
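5. Edit the Hadoop configuration files
First switch to the hadoop user: su hadoop
① Edit conf/hadoop-env.sh under the hadoop directory
Add the Java installation path: export JAVA_HOME=/opt/java
② Change conf/core-site.xml under the hadoop directory to the following; it sets the HDFS address and port number. First create the directories that the configuration files refer to (run these as root). /hadoop/tmp is created before the chown so that it ends up owned by hadoop as well:
mkdir /hadoop
mkdir /hadoop/name
mkdir /hadoop/data
mkdir /hadoop/mapred_system
mkdir /hadoop/mapred_local
mkdir /hadoop/tmp
chown -R hadoop:hadoop /hadoop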
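The directory setup can also be written as a loop, which makes it safe to repeat and guarantees that every directory exists before ownership is changed. A minimal sketch, assuming a BASE variable (not in the original) that stands in for /hadoop and defaults to a scratch directory for a dry run:

```shell
#!/bin/sh
# Assumption: BASE stands in for /hadoop; it defaults to a scratch directory for a dry run.
BASE=${BASE:-$(mktemp -d)}

# Create every directory first, so the ownership change below covers all of them.
for d in name data mapred_system mapred_local tmp; do
    mkdir -p "$BASE/$d"
done

# Hand the tree to the hadoop user only when running as root and that user exists.
if [ "$(id -u)" -eq 0 ] && id hadoop >/dev/null 2>&1; then
    chown -R hadoop:hadoop "$BASE"
fi

ls "$BASE"
```

For the real setup this would be run as root with BASE=/hadoop.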
[root@dcw ~]# cat /opt/hadoop/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.75.131:9000</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/hadoop/name</value>
</property>
</configuration>
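Note: if hadoop.tmp.dir is not set, the default temporary directory is /tmp/hadoop-hadoop (that is, /tmp/hadoop-${user.name}). That directory is wiped on every reboot, so the NameNode would have to be formatted again each time, otherwise startup fails.
③ Change conf/hdfs-site.xml under the hadoop directory to the following:
It sets the replication factor, which defaults to 3.
(Note: replication is the number of copies kept of each data block. The default is 3; with fewer than 3 slaves that default cannot be satisfied and HDFS reports problems such as under-replicated blocks, so it is set to 1 here.)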
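When checking which value a configuration file actually carries, reading the XML by eye is error-prone. Here is a rough sketch of a lookup helper. The name get_prop is hypothetical, and it assumes each <name> and <value> tag sits on its own line, as in the listing above. It is not a general XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> that follows a given <name> in a Hadoop *-site.xml file.
# Assumption: each <name> and <value> tag sits on its own line.
get_prop() {
    awk -v key="$1" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == key) { print v; exit } }
    ' "$2"
}

# Demo on a scratch file holding two of the properties from core-site.xml.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.75.131:9000</value>
</property>
</configuration>
EOF

get_prop hadoop.tmp.dir "$CONF"
get_prop fs.default.name "$CONF"
```

Against the real installation the second argument would be /opt/hadoop/conf/core-site.xml.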
[root@dcw ~]# cat /opt/hadoop/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/hadoop/data</value>
</property>
</configuration>
④ Change conf/mapred-site.xml under the hadoop directory to the following:
It sets the JobTracker address and port.
[root@dcw ~]# cat /opt/hadoop/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>192.168.75.131:9001</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/hadoop/mapred_system</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/hadoop/mapred_local</value>
</property>
</configuration>
⑤ Change conf/masters under the hadoop directory to the following:
master
⑥ Change conf/slaves under the hadoop directory to the following:
master
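Since the replication factor should not exceed the number of datanodes, the two settings can be cross-checked before starting the cluster. This is a sketch only: count_slaves and check_replication are hypothetical helpers, and the demo uses a scratch slaves file with the single entry used in this setup.

```shell
#!/bin/sh
# Hypothetical helper: count the hosts in a slaves file (blank lines and # comment lines ignored).
count_slaves() {
    awk 'NF && $1 !~ /^#/ { n++ } END { print n + 0 }' "$1"
}

# Hypothetical helper: fail if the requested replication exceeds the number of listed slaves.
check_replication() {
    replication=$1
    slaves=$(count_slaves "$2")
    if [ "$replication" -gt "$slaves" ]; then
        echo "dfs.replication=$replication exceeds $slaves datanode(s)"
        return 1
    fi
    echo "dfs.replication=$replication fits $slaves datanode(s)"
}

# Demo: one slave (master), as in this single-machine setup.
SLAVES=$(mktemp)
printf 'master\n' > "$SLAVES"

check_replication 1 "$SLAVES"
check_replication 3 "$SLAVES" || true
```

With the real files, the first argument would be the dfs.replication value from hdfs-site.xml and the second /opt/hadoop/conf/slaves.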
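Because this is a single machine, steps 6 and 7 can be skipped.
6. Clone the virtual machine
7. Set up passwordless SSH
This is worth doing even on a single machine; otherwise start-all.sh prompts for a password.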
[hadoop@master ~]$ ssh-keygen -t rsa -P ''
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
8b:80:6c:fa:f0:c7:49:fe:ec:19:fc:44:c8:e9:b2:f1 hadoop@master
[hadoop@master ~]$
[hadoop@master ~]$ cd .ssh/
[hadoop@master .ssh]$ ls
id_rsa id_rsa.pub known_hosts
[hadoop@master .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@master .ssh]$ chmod 600 ~/.ssh/authorized_keys
[hadoop@master .ssh]$ vim /etc/ssh/sshd_config
[hadoop@master .ssh]$ vim /etc/ssh/sshd_config
[hadoop@master .ssh]$ exit
logout
[root@dcw ~]# vim /etc/ssh/sshd_config
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
[root@dcw ~]# service sshd restart
Stopping sshd: [ OK ]
Starting sshd: [ OK ]
[hadoop@master ~]$ ssh master
(no password is requested)
8. Run Hadoop
As the hadoop user, change to the hadoop/bin directory.
Format the distributed file system with ./hadoop namenode -format
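If ssh master still asks for a password after these steps, one thing worth checking is the permissions on ~/.ssh and authorized_keys, since sshd in its default strict mode may ignore key files that are writable by group or others. A minimal sketch follows; fix_ssh_perms is a hypothetical helper and the demo works on a scratch directory rather than the real ~/.ssh.

```shell
#!/bin/sh
# Hypothetical helper: apply the permissions commonly used for key-based login.
fix_ssh_perms() {
    dir=$1
    chmod 700 "$dir"
    [ -f "$dir/authorized_keys" ] && chmod 600 "$dir/authorized_keys"
    return 0
}

# Demo on a scratch directory; for real use pass "$HOME/.ssh" as the hadoop user.
SSH_DIR=$(mktemp -d)
touch "$SSH_DIR/authorized_keys"
chmod 664 "$SSH_DIR/authorized_keys"   # deliberately too permissive

fix_ssh_perms "$SSH_DIR"
ls -ld "$SSH_DIR" "$SSH_DIR/authorized_keys"
```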
[hadoop@master bin]$ ./hadoop namenode -format
15/04/01 19:08:15 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.168.75.131
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
15/04/01 19:08:15 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
15/04/01 19:08:15 INFO namenode.FSNamesystem: supergroup=supergroup
15/04/01 19:08:15 INFO namenode.FSNamesystem: isPermissionEnabled=true
15/04/01 19:08:15 INFO common.Storage: Image file of size 96 saved in 0 seconds.
15/04/01 19:08:15 INFO common.Storage: Storage directory /hadoop/tmp/dfs/name has been successfully formatted.
15/04/01 19:08:15 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.75.131
************************************************************/
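The following is the output of running the format again. Note that this run was aborted because the prompt was answered with a lowercase y; this version accepts only an uppercase Y.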
[hadoop@master bin]$ ./hadoop namenode -format
15/04/01 13:56:21 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.168.75.131
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
Re-format filesystem in /hadoop/dfs/name ? (Y or N) y
Format aborted in /hadoop/dfs/name
15/04/01 13:56:30 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.75.131
************************************************************/
[hadoop@master bin]$
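Because formatting an existing NameNode directory discards its metadata, it can help to test whether the directory is already formatted before running the command. A rough sketch, assuming that a formatted storage directory contains current/VERSION. The helper is_formatted is hypothetical, and the demo uses scratch directories in place of the real name directory.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a NameNode storage directory already looks formatted.
# Assumption: a formatted directory contains current/VERSION.
is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Demo on scratch directories standing in for the NameNode storage directory.
FRESH=$(mktemp -d)
USED=$(mktemp -d)
mkdir -p "$USED/current"
touch "$USED/current/VERSION"

for dir in "$FRESH" "$USED"; do
    if is_formatted "$dir"; then
        echo "already formatted, skipping"
    else
        echo "not formatted yet"   # real use: run ./hadoop namenode -format here
    fi
done
```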
Run ./start-all.sh to start Hadoop (the password prompts below appear because step 7 had not been done yet).
[hadoop@master bin]$ ./start-all.sh
starting namenode, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-namenode-master.out
hadoop@master's password:
master: starting datanode, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-datanode-master.out
hadoop@master's password:
master: starting secondarynamenode, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-jobtracker-master.out
hadoop@master's password:
master: starting tasktracker, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-tasktracker-master.out
[hadoop@master bin]$ ps -ef|grep -v grep|grep opt
hadoop 4406 1 1 13:59 pts/2 00:00:02 /opt/java/bin/java -Xmx1000m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.log.dir=/opt/hadoop/bin/../logs -Dhadoop.log.file=hadoop-hadoop-namenode-master.log -Dhadoop.home.dir=/opt/hadoop/bin/.. -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml -classpath /opt/hadoop/bin/../conf:/opt/java/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.hdfs.server.namenode.NameNode
hadoop 4507 1 1 13:59 ? 00:00:02 /opt/java/bin/java -Xmx1000m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.log.dir=/opt/hadoop/bin/../logs -Dhadoop.log.file=hadoop-hadoop-datanode-master.log -Dhadoop.home.dir=/opt/hadoop/bin/.. -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml -classpath /opt/hadoop/bin/../conf:/opt/java/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.hdfs.server.datanode.DataNode
hadoop 4621 1 0 13:59 ? 00:00:01 /opt/java/bin/java -Xmx1000m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.log.dir=/opt/hadoop/bin/../logs -Dhadoop.log.file=hadoop-hadoop-secondarynamenode-master.log -Dhadoop.home.dir=/opt/hadoop/bin/.. -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml -classpath /opt/hadoop/bin/../conf:/opt/java/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
hadoop 4677 1 1 13:59 pts/2 00:00:02 /opt/java/bin/java -Xmx1000m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.log.dir=/opt/hadoop/bin/../logs -Dhadoop.log.file=hadoop-hadoop-jobtracker-master.log -Dhadoop.home.dir=/opt/hadoop/bin/.. -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml -classpath /opt/hadoop/bin/../conf:/opt/java/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.mapred.JobTracker
hadoop 4799 1 1 13:59 ? 00:00:02 /opt/java/bin/java -Xmx1000m -Dhadoop.log.dir=/opt/hadoop/bin/../logs -Dhadoop.log.file=hadoop-hadoop-tasktracker-master.log -Dhadoop.home.dir=/opt/hadoop/bin/.. -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml -classpath /opt/hadoop/bin/../conf:/opt/java/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.mapred.TaskTracker
Run jps on master to list the running processes:
[hadoop@master bin]$ jps
4507 DataNode
4799 TaskTracker
5022 Jps
4677 JobTracker
4406 NameNode
4621 SecondaryNameNode
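The jps listing can be checked automatically for the five daemons expected on this single-machine setup. The sketch below uses a hypothetical helper, missing_daemons, that reads jps output on standard input and prints any daemon that is absent. It compares the whole class-name column, because a plain substring match on NameNode would also match SecondaryNameNode.

```shell
#!/bin/sh
# Hypothetical helper: read jps output on stdin and print each expected daemon that is missing.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' ||
            printf '%s\n' "$daemon"
    done
}

# Demo 1: the listing from this walkthrough; nothing is printed because all daemons are up.
missing_daemons <<'EOF'
4507 DataNode
4799 TaskTracker
5022 Jps
4677 JobTracker
4406 NameNode
4621 SecondaryNameNode
EOF

# Demo 2: a made-up listing without the DataNode line; prints "DataNode".
missing_daemons <<'EOF'
4799 TaskTracker
4677 JobTracker
4406 NameNode
4621 SecondaryNameNode
EOF
```

On the live system the call would be jps | missing_daemons.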
[hadoop@master bin]$ ./hadoop dfsadmin -safemode leave
Safe mode is OFF
Check the node status; the output should list the available Datanodes, similar to the following:
[hadoop@master bin]$ ./hadoop dfsadmin -report
Configured Capacity: 18763378688 (17.47 GB)
Present Capacity: 4126322688 (3.84 GB)
DFS Remaining: 4126285824 (3.84 GB)
DFS Used: 36864 (36 KB)
DFS Used%: 0%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)
Name: 192.168.75.131:50010
Decommission Status : Normal
Configured Capacity: 18763378688 (17.47 GB)
DFS Used: 36864 (36 KB)
Non DFS Used: 14637056000 (13.63 GB)
DFS Remaining: 4126285824(3.84 GB)
DFS Used%: 0%
DFS Remaining%: 21.99%
Last contact: Wed Apr 01 16:50:56 CST 2015
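For scripted checks, the number of live datanodes can be pulled out of the report. A small sketch, assuming the summary line has the form shown above; live_datanodes is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: extract the live datanode count from dfsadmin -report output on stdin.
live_datanodes() {
    sed -n 's/^Datanodes available: \([0-9][0-9]*\).*/\1/p'
}

# Demo with the summary line from the report above.
echo 'Datanodes available: 1 (1 total, 0 dead)' | live_datanodes
```

Against the running cluster this would be ./hadoop dfsadmin -report | live_datanodes.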
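9. Run the word count program
WordCount is an example that ships with Hadoop. It counts how many times each word occurs in a set of text files and writes the result to the given output directory; the job fails if the output directory already exists.
[hadoop@master hadoop]$ cd /opt/hadoop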
[hadoop@master hadoop]$ hadoop fs -mkdir input
[hadoop@master hadoop]$ hadoop fs -copyFromLocal /opt/hadoop/*.txt input/
[hadoop@master hadoop]$ hadoop jar hadoop-0.20.2-examples.jar wordcount input output
15/04/01 17:11:08 INFO input.FileInputFormat: Total input paths to process : 4
15/04/01 17:11:09 INFO mapred.JobClient: Running job: job_201504011400_0001
15/04/01 17:11:10 INFO mapred.JobClient: map 0% reduce 0%
15/04/01 17:11:29 INFO mapred.JobClient: map 50% reduce 0%
15/04/01 17:11:35 INFO mapred.JobClient: map 100% reduce 0%
15/04/01 17:11:47 INFO mapred.JobClient: map 100% reduce 100%
15/04/01 17:11:49 INFO mapred.JobClient: Job complete: job_201504011400_0001
15/04/01 17:11:49 INFO mapred.JobClient: Counters: 17
15/04/01 17:11:49 INFO mapred.JobClient: Job Counters
15/04/01 17:11:49 INFO mapred.JobClient: Launched reduce tasks=1
15/04/01 17:11:49 INFO mapred.JobClient: Launched map tasks=4
15/04/01 17:11:49 INFO mapred.JobClient: Data-local map tasks=4
15/04/01 17:11:49 INFO mapred.JobClient: FileSystemCounters
15/04/01 17:11:49 INFO mapred.JobClient: FILE_BYTES_READ=179182
15/04/01 17:11:49 INFO mapred.JobClient: HDFS_BYTES_READ=363457
15/04/01 17:11:49 INFO mapred.JobClient: FILE_BYTES_WRITTEN=358510
15/04/01 17:11:49 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=133548
15/04/01 17:11:49 INFO mapred.JobClient: Map-Reduce Framework
15/04/01 17:11:49 INFO mapred.JobClient: Reduce input groups=10500
15/04/01 17:11:49 INFO mapred.JobClient: Combine output records=10840
15/04/01 17:11:49 INFO mapred.JobClient: Map input records=8968
15/04/01 17:11:49 INFO mapred.JobClient: Reduce shuffle bytes=179200
15/04/01 17:11:49 INFO mapred.JobClient: Reduce output records=10500
15/04/01 17:11:49 INFO mapred.JobClient: Spilled Records=21680
15/04/01 17:11:49 INFO mapred.JobClient: Map output bytes=524840
15/04/01 17:11:49 INFO mapred.JobClient: Combine input records=47258
15/04/01 17:11:49 INFO mapred.JobClient: Map output records=47258
15/04/01 17:11:49 INFO mapred.JobClient: Reduce input records=10840
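To sanity-check the job's result on a small input, the same counting can be approximated locally with standard tools. This is only a sketch of the idea, assuming words are split on whitespace. The helper local_wordcount and the sample text are made up for illustration and are not part of Hadoop.

```shell
#!/bin/sh
# Hypothetical local stand-in for the WordCount job:
# split on whitespace, count each token, print "word<TAB>count".
# LC_ALL=C keeps the sort order byte-wise and the same on every machine.
local_wordcount() {
    cat "$@" | tr -s '[:space:]' '\n' | grep -v '^$' | LC_ALL=C sort | uniq -c |
        awk '{ print $2 "\t" $1 }'
}

# Demo on a tiny sample file.
SAMPLE=$(mktemp)
printf 'hello hadoop\nhello world\n' > "$SAMPLE"
local_wordcount "$SAMPLE"
```

The output can then be compared with the files the job wrote under the output directory in HDFS, for example via hadoop fs -cat.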
Note: this walkthrough was set up on a single machine, so master acts as both the master and the slave.
一、首先是搞好master
1、创建用户组
groupadd hadoop 添加一个组
useradd hadoop -g hadoop 添加用户
2、jdk的安装
安装版本:jdk-7u79-linux-x64.gz
使用 tar zxf jdk-7u79-linux-x64.gz -C /opt/ 命令将其解压到/opt目录下,并将解压后的文件夹jdk1.7.0_79改名为java.
jdk安装好就要配置环境变量了,使用vi /etc/profile命令编辑添加如下内容:
export JAVA_HOME=/opt/java
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
配置好之后要用命令source /etc/profile使配置文件生效,这样jdk就安装完毕了。安装完之后不要忘了将所有者设置为hadoop。
使用命令chown -R hadoop:hadoop java/
3.hadoop的安装
hadoop的版本是hadoop-0.20.2.tar.gz,也把它解压到/opt目录下面,改名为hadoop。
hadoop也要设置环境变量,使用vi /etc/profile命令编辑添加如下内容:
export HADOOP_HOME=/opt/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
同样也要执行source /etc/profile使配置文件生效,然后执行命令使用命令chown -R hadoop:hadoop hadoop/将其所有者改为hadoop
4、修改地址解析文件/etc/hosts,加入
[root@dcw ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
#192.168.75.131 dcw.localdomain dcw
192.168.75.131 master
5、修改hadoop的配置文件
首先切换到hadoop用户,su hadoop
①修改hadoop目录下的conf/hadoop-env.sh文件
加入java的安装路径export JAVA_HOME=/opt/java
②把hadoop目录下的conf/core-site.xml文件修改成如下:这里配置的是HDFS的地址和端口号。
mkdir /hadoop
mkdir /hadoop/name
mkdir /hadoop/data
mkdir /hadoop/mapred_system
mkdir /hadoop/mapred_local
chown -R hadoop:hadoop /hadoop
mkdir /hadoop/tmp
[root@dcw ~]# cat /opt/hadoop/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.75.131:9000</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/hadoop/name</value>
</property>
</configuration>
备注:如没有配置hadoop.tmp.dir参数,此时系统默认的临时目录为:/tmp/hadoo-hadoop。而这个目录在每次重启后都会被干掉,必须重新执行format才行,否则会出错。
③把hadoop目录下的conf/ hdfs-site.xml文件修改成如下:
配置的备份方式默认为3,
(备注:replication 是数据副本数量,默认为3,salve少于3台就会报错)
[root@dcw ~]# cat /opt/hadoop/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/hadoop/data</value>
</property>
</configuration>
④把hadoop目录下的conf/ mapred-site.xml文件修改成如下:
配置的是JobTracker的地址和端口
[root@dcw ~]# cat /opt/hadoop/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>192.168.75.131:9001</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/hadoop/mapred_system</value>
</property>
<property>
<name>mapred.localdir</name>
<value>/hadoop/mapred_local</value>
</property>
</configuration>
⑤把hadoop目录下的conf/ masters文件修改成如下:
master
⑥把hadoop目录下的conf/ slaves文件修改成如下:
master
由于是单机,所以6,7步可以略过
6、复制虚拟机
7、SSH设置无密码验证
单机也可以设置下,否则会在start-all.sh时候会提示输入密码
[hadoop@master ~]$ ssh-keygen -t rsa -P ''
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
8b:80:6c:fa:f0:c7:49:fe:ec:19:fc:44:c8:e9:b2:f1 hadoop@master
[hadoop@master ~]$
[hadoop@master ~]$ cd .ssh/
[hadoop@master .ssh]$ ls
id_rsa id_rsa.pub known_hosts
[hadoop@master .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@master .ssh]$ chmod 600 ~/.ssh/authorized_keys
[hadoop@master .ssh]$ vim /etc/ssh/sshd_config
[hadoop@master .ssh]$ vim /etc/ssh/sshd_config
[hadoop@master .ssh]$ exit
logout
[root@dcw ~]# vim /etc/ssh/sshd_config
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
[root@dcw ~]# service sshd restart
Stopping sshd: [ OK ]
Starting sshd: [ OK ]
[hadoop@master ~]$ ssh master
无需密码
8、运行hadoop
使用Hadoop用户,切换到hadoop/bin目录下
格式化分布式文件系统./hadoop namenode -format
[hadoop@master bin]$ ./hadoop namenode -format
15/04/01 19:08:15 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.168.75.131
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
15/04/01 19:08:15 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
15/04/01 19:08:15 INFO namenode.FSNamesystem: supergroup=supergroup
15/04/01 19:08:15 INFO namenode.FSNamesystem: isPermissionEnabled=true
15/04/01 19:08:15 INFO common.Storage: Image file of size 96 saved in 0 seconds.
15/04/01 19:08:15 INFO common.Storage: Storage directory /hadoop/tmp/dfs/name has been successfully formatted.
15/04/01 19:08:15 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.75.131
************************************************************/
以下是重新格式化后的
[hadoop@master bin]$ ./hadoop namenode -format
15/04/01 13:56:21 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.168.75.131
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
Re-format filesystem in /hadoop/dfs/name ? (Y or N) y
Format aborted in /hadoop/dfs/name
15/04/01 13:56:30 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.75.131
************************************************************/
[hadoop@master bin]$
执行命令./start-all.sh启动hadoop(提示输入密码,确实第7步)
[hadoop@master bin]$ ./start-all.sh
starting namenode, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-namenode-master.out
hadoop@master's password:
master: starting datanode, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-datanode-master.out
hadoop@master's password:
master: starting secondarynamenode, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-jobtracker-master.out
hadoop@master's password:
master: starting tasktracker, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-tasktracker-master.out
[hadoop@master bin]$ ps -ef|grep -v grep|grep opt
hadoop 4406 1 1 13:59 pts/2 00:00:02 /opt/java/bin/java -Xmx1000m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.log.dir=/opt/hadoop/bin/../logs -Dhadoop.log.file=hadoop-hadoop-namenode-master.log -Dhadoop.home.dir=/opt/hadoop/bin/.. -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml -classpath /opt/hadoop/bin/../conf:/opt/java/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.hdfs.server.namenode.NameNode
hadoop 4507 1 1 13:59 ? 00:00:02 /opt/java/bin/java -Xmx1000m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.log.dir=/opt/hadoop/bin/../logs -Dhadoop.log.file=hadoop-hadoop-datanode-master.log -Dhadoop.home.dir=/opt/hadoop/bin/.. -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml -classpath /opt/hadoop/bin/../conf:/opt/java/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.hdfs.server.datanode.DataNode
hadoop 4621 1 0 13:59 ? 00:00:01 /opt/java/bin/java -Xmx1000m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.log.dir=/opt/hadoop/bin/../logs -Dhadoop.log.file=hadoop-hadoop-secondarynamenode-master.log -Dhadoop.home.dir=/opt/hadoop/bin/.. -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml -classpath /opt/hadoop/bin/../conf:/opt/java/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
hadoop 4677 1 1 13:59 pts/2 00:00:02 /opt/java/bin/java -Xmx1000m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.log.dir=/opt/hadoop/bin/../logs -Dhadoop.log.file=hadoop-hadoop-jobtracker-master.log -Dhadoop.home.dir=/opt/hadoop/bin/.. -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml -classpath /opt/hadoop/bin/../conf:/opt/java/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.mapred.JobTracker
hadoop 4799 1 1 13:59 ? 00:00:02 /opt/java/bin/java -Xmx1000m -Dhadoop.log.dir=/opt/hadoop/bin/../logs -Dhadoop.log.file=hadoop-hadoop-tasktracker-master.log -Dhadoop.home.dir=/opt/hadoop/bin/.. -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml -classpath /opt/hadoop/bin/../conf:/opt/java/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.mapred.TaskTracker
Run the jps command on master to list the running processes:
[hadoop@master bin]$ jps
4507 DataNode
4799 TaskTracker
5022 Jps
4677 JobTracker
4406 NameNode
4621 SecondaryNameNode
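The jps listing above can also be checked by a script instead of by eye. Below is a minimal sketch, assuming all five daemons (NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker) are expected on this machine, as in the single-node listing above. The name check_daemons is a hypothetical helper, not part of Hadoop.

```shell
# Sketch: read `jps` output on stdin and report any expected daemon that is missing.
# check_daemons is a hypothetical helper name, not a Hadoop command.
check_daemons() {
    jps_output=$(cat)
    missing=""
    for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
        # Compare against the whole second column so that "NameNode"
        # is not satisfied by a "SecondaryNameNode" line.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            missing="$missing $daemon"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all daemons running"
}
```

Usage: jps | check_daemons. On a multi-node cluster the master would normally not run DataNode or TaskTracker, so the list in the loop would have to be adjusted.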
[hadoop@master bin]$ ./hadoop dfsadmin -safemode leave
Safe mode is OFF
Check the node status. The output should look similar to the following and list the available DataNodes:
[hadoop@master bin]$ ./hadoop dfsadmin -report
Configured Capacity: 18763378688 (17.47 GB)
Present Capacity: 4126322688 (3.84 GB)
DFS Remaining: 4126285824 (3.84 GB)
DFS Used: 36864 (36 KB)
DFS Used%: 0%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)
Name: 192.168.75.131:50010
Decommission Status : Normal
Configured Capacity: 18763378688 (17.47 GB)
DFS Used: 36864 (36 KB)
Non DFS Used: 14637056000 (13.63 GB)
DFS Remaining: 4126285824(3.84 GB)
DFS Used%: 0%
DFS Remaining%: 21.99%
Last contact: Wed Apr 01 16:50:56 CST 2015
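The figures in this report are consistent with each other: Present Capacity equals Configured Capacity minus Non DFS Used, and DFS Remaining equals Present Capacity minus DFS Used. The short sketch below redoes that arithmetic with the numbers copied from the report above. The variable names are made up for illustration.

```shell
# Values copied from the dfsadmin -report output above (bytes).
configured=18763378688
non_dfs_used=14637056000
dfs_used=36864

# Space that HDFS can actually use on this node.
present=$((configured - non_dfs_used))
# What is left after the blocks already stored in HDFS.
remaining=$((present - dfs_used))
# Remaining space as a percentage of the configured capacity.
pct=$(awk -v r="$remaining" -v c="$configured" 'BEGIN { printf "%.2f", r * 100 / c }')

echo "Present Capacity: $present"
echo "DFS Remaining: $remaining"
echo "DFS Remaining%: $pct%"
```

The large Non DFS Used value (13.63 GB) means that most of the disk is taken by files outside HDFS, which is why only 3.84 GB is available to the cluster.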
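9、Run the word count program
WordCount is an example that ships with Hadoop. It counts how many times each word occurs in a set of text files and writes the result to the specified output directory. The job fails if the output directory already exists.
[hadoop@master hadoop]$ cd /opt/hadoop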
[hadoop@master hadoop]$ hadoop fs -mkdir input
[hadoop@master hadoop]$ hadoop fs -copyFromLocal /opt/hadoop/*.txt input/
[hadoop@master hadoop]$ hadoop jar hadoop-0.20.2-examples.jar wordcount input output
15/04/01 17:11:08 INFO input.FileInputFormat: Total input paths to process : 4
15/04/01 17:11:09 INFO mapred.JobClient: Running job: job_201504011400_0001
15/04/01 17:11:10 INFO mapred.JobClient: map 0% reduce 0%
15/04/01 17:11:29 INFO mapred.JobClient: map 50% reduce 0%
15/04/01 17:11:35 INFO mapred.JobClient: map 100% reduce 0%
15/04/01 17:11:47 INFO mapred.JobClient: map 100% reduce 100%
15/04/01 17:11:49 INFO mapred.JobClient: Job complete: job_201504011400_0001
15/04/01 17:11:49 INFO mapred.JobClient: Counters: 17
15/04/01 17:11:49 INFO mapred.JobClient: Job Counters
15/04/01 17:11:49 INFO mapred.JobClient: Launched reduce tasks=1
15/04/01 17:11:49 INFO mapred.JobClient: Launched map tasks=4
15/04/01 17:11:49 INFO mapred.JobClient: Data-local map tasks=4
15/04/01 17:11:49 INFO mapred.JobClient: FileSystemCounters
15/04/01 17:11:49 INFO mapred.JobClient: FILE_BYTES_READ=179182
15/04/01 17:11:49 INFO mapred.JobClient: HDFS_BYTES_READ=363457
15/04/01 17:11:49 INFO mapred.JobClient: FILE_BYTES_WRITTEN=358510
15/04/01 17:11:49 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=133548
15/04/01 17:11:49 INFO mapred.JobClient: Map-Reduce Framework
15/04/01 17:11:49 INFO mapred.JobClient: Reduce input groups=10500
15/04/01 17:11:49 INFO mapred.JobClient: Combine output records=10840
15/04/01 17:11:49 INFO mapred.JobClient: Map input records=8968
15/04/01 17:11:49 INFO mapred.JobClient: Reduce shuffle bytes=179200
15/04/01 17:11:49 INFO mapred.JobClient: Reduce output records=10500
15/04/01 17:11:49 INFO mapred.JobClient: Spilled Records=21680
15/04/01 17:11:49 INFO mapred.JobClient: Map output bytes=524840
15/04/01 17:11:49 INFO mapred.JobClient: Combine input records=47258
15/04/01 17:11:49 INFO mapred.JobClient: Map output records=47258
15/04/01 17:11:49 INFO mapred.JobClient: Reduce input records=10840
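Because WordCount fails when the output directory already exists, a second run needs the old output removed first. Below is a minimal sketch, assuming the jar path /opt/hadoop/hadoop-0.20.2-examples.jar from the commands above. The name rerun_wordcount is a hypothetical helper, not part of Hadoop.

```shell
# Sketch: clear a stale output directory, then run WordCount again.
# rerun_wordcount is a hypothetical helper; both paths are HDFS paths
# relative to the current user's HDFS home directory, as in the run above.
rerun_wordcount() {
    input_dir=$1
    output_dir=$2
    # "fs -rmr" deletes recursively in HDFS; the error is ignored
    # when the directory does not exist yet.
    hadoop fs -rmr "$output_dir" 2>/dev/null || true
    hadoop jar /opt/hadoop/hadoop-0.20.2-examples.jar wordcount "$input_dir" "$output_dir"
}
```

Usage: rerun_wordcount input output. Note that this deletes the previous results without asking, so copy them out of HDFS first if they are still needed.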