1. Install three CentOS machines and configure the network so that all three can reach each other
This walkthrough uses VirtualBox VMs. After installing CentOS 6.5 on each of the three VMs, give every machine two network adapters so that each guest can reach both the host/internal network and the Internet: one adapter in NAT mode and one in host-only mode.
2. Set the hostnames
In /etc/sysconfig/network on the three machines, set HOSTNAME=master, HOSTNAME=slave1, and HOSTNAME=slave2 respectively.
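The hostname change is a one-line edit; a minimal sketch using sed, run here against a throwaway copy of the file so it is safe to try (on a real node you would edit /etc/sysconfig/network itself, as root):

```shell
# Work on a disposable copy of /etc/sysconfig/network for illustration.
f=/tmp/network.demo
printf 'NETWORKING=yes\nHOSTNAME=localhost.localdomain\n' > "$f"

# Set the hostname; use slave1 / slave2 on the other two machines.
sed -i 's/^HOSTNAME=.*/HOSTNAME=master/' "$f"
grep '^HOSTNAME=' "$f"
```

A reboot (or running `hostname master` once) is needed for the change to take effect.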
Configure hosts: append the following to the end of /etc/hosts on all three machines (the format is IP address first, then hostname):
ip1 master
ip2 slave1
ip3 slave2
Reboot all three machines, then verify that ping master, ping slave1, and ping slave2 succeed from each of them.
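Appending the hosts entries can be done with a heredoc; the IP addresses below are illustrative placeholders (a typical host-only range) that must be replaced with your own ip1/ip2/ip3, and the sketch writes to a demo file rather than the real /etc/hosts:

```shell
# Demo target; on a real node this would be /etc/hosts.
hosts=/tmp/hosts.demo
: > "$hosts"

# One "IP hostname" line per node; replace the IPs with your own.
cat >> "$hosts" <<'EOF'
192.168.56.101 master
192.168.56.102 slave1
192.168.56.103 slave2
EOF
grep -c 'slave' "$hosts"
```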
3. Configure passwordless SSH among the three machines
a. Generate a key pair on each of the three machines
On master:
cd /root/.ssh
ssh-keygen -t rsa -P ""
ssh slave1
cd /root/.ssh
ssh-keygen -t rsa -P ""
ssh slave2
cd /root/.ssh
ssh-keygen -t rsa -P ""
b. On slave1 and slave2, make a copy of the public key and rename it:
slave1: cp id_rsa.pub id1
slave2: cp id_rsa.pub id2
c. Copy the renamed public keys to /home on master; on master, append the two slave keys plus master's own public key into /root/.ssh/authorized_keys, then distribute authorized_keys to /root/.ssh on slave1 and slave2.
slave1:
scp id1 master:/home
slave2:
scp id2 master:/home
master:
cat /home/id1 >> /root/.ssh/authorized_keys
cat /home/id2 >> /root/.ssh/authorized_keys
cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
scp /root/.ssh/authorized_keys slave1:/root/.ssh
scp /root/.ssh/authorized_keys slave2:/root/.ssh
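Step c amounts to concatenating the three public keys and tightening permissions; here is a sketch using stand-in key files in a temp directory (on master the real inputs are /home/id1, /home/id2, and /root/.ssh/id_rsa.pub):

```shell
demo=$(mktemp -d)
# Stand-ins for the three public keys collected on master (fake key material).
echo 'ssh-rsa FAKEKEY1 root@slave1' > "$demo/id1"
echo 'ssh-rsa FAKEKEY2 root@slave2' > "$demo/id2"
echo 'ssh-rsa FAKEKEY3 root@master' > "$demo/id_rsa.pub"

# Merge every public key into authorized_keys; sshd ignores the file
# unless its permissions are strict.
cat "$demo/id1" "$demo/id2" "$demo/id_rsa.pub" > "$demo/authorized_keys"
chmod 600 "$demo/authorized_keys"
wc -l < "$demo/authorized_keys"
```

After distributing the file, `ssh slave1` from master (and the reverse) should no longer prompt for a password.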
4. Install the JDK
After installing it on master, simply copy the JDK directory to the two slaves; the /etc/profile environment-variable settings can be copied over the same way.
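The /etc/profile additions are just two export lines; the JDK path below matches the one used later in hadoop-env.sh (adjust it to wherever your JDK actually lives):

```shell
# Append to /etc/profile on every node, then run: source /etc/profile
export JAVA_HOME=/usr/jdk1.7.0_79
export PATH=$JAVA_HOME/bin:$PATH
```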
5. Install Hadoop
Unpack the tarball, then edit the configuration files: masters, slaves, hadoop-env.sh, core-site.xml, hdfs-site.xml, and mapred-site.xml.
hadoop-env.sh:
export JAVA_HOME=/usr/jdk1.7.0_79
core-site.xml:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master:9001</value>
</property>
</configuration>
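The masters and slaves files are plain hostname lists, one per line; this sketch writes example copies into a temp directory (on a real install they go in Hadoop's conf/ directory):

```shell
conf=$(mktemp -d)
# masters names the SecondaryNameNode host; slaves lists the
# DataNode/TaskTracker hosts that start-all.sh will start daemons on.
printf 'master\n' > "$conf/masters"
printf 'slave1\nslave2\n' > "$conf/slaves"
cat "$conf/slaves"
```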
Disable the firewall on all machines (stop it now and keep it off across reboots):
service iptables stop
chkconfig iptables off
service iptables status
Format the NameNode and start all daemons:
bin/hadoop namenode -format
bin/start-all.sh
If you need to change the configuration later, first run bin/stop-all.sh, edit the config files, delete the /tmp directory (the hadoop.tmp.dir set above) on every node, and then run bin/hadoop namenode -format again.
6. Run the WordCount example from the Hadoop shell:
[root@master home]# mkdir hadooplocalfile
[root@master home]# ls
hadooplocalfile
[root@master home]# cd hadooplocalfile
[root@master hadooplocalfile]# echo "Hello world" > file1.txt
[root@master hadooplocalfile]# echo "Hello hadoop" > file2.txt
[root@master hadooplocalfile]# ls
file1.txt file2.txt
[root@master hadooplocalfile]# more file1.txt
Hello world
[root@master hadooplocalfile]# cd
[root@master ~]# cd /usr/hadoop-1.0.1
[root@master hadoop-1.0.1]# bin/hadoop fs -mkdir input
[root@master hadoop-1.0.1]# bin/hadoop fs -ls /
Found 3 items
drwxr-xr-x - root supergroup 0 2015-12-21 10:57 /test
drwxr-xr-x - root supergroup 0 2015-12-21 10:40 /tmp
drwxr-xr-x - root supergroup 0 2015-12-21 11:14 /user
[root@master hadoop-1.0.1]# bin/hadoop fs -ls /user
Found 1 items
drwxr-xr-x - root supergroup 0 2015-12-21 11:14 /user/root
[root@master hadoop-1.0.1]# bin/hadoop fs -ls /user/root
Found 1 items
drwxr-xr-x - root supergroup 0 2015-12-21 11:14 /user/root/input
[root@master hadoop-1.0.1]# bin/hadoop fs -put /home/hadooplocalfile/file*.txt input
[root@master hadoop-1.0.1]# bin/hadoop -ls input
Unrecognized option: -ls
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
(The command above fails because the fs subcommand was omitted; the correct form follows.)
[root@master hadoop-1.0.1]# bin/hadoop fs -ls input
Found 2 items
-rw-r--r-- 2 root supergroup 12 2015-12-21 11:16 /user/root/input/file1.txt
-rw-r--r-- 2 root supergroup 13 2015-12-21 11:16 /user/root/input/file2.txt
[root@master hadoop-1.0.1]# bin/hadoop jar /usr/hadoop-1.0.1/hadoop-examples-1.0.1.jar wordcount input output
hdfs://master:9000/user/root/input
15/12/21 11:18:01 INFO input.FileInputFormat: Total input paths to process : 2
15/12/21 11:18:02 INFO mapred.JobClient: Running job: job_201512211045_0001
15/12/21 11:18:03 INFO mapred.JobClient: map 0% reduce 0%
15/12/21 11:18:23 INFO mapred.JobClient: map 50% reduce 0%
15/12/21 11:18:26 INFO mapred.JobClient: map 100% reduce 0%
15/12/21 11:18:38 INFO mapred.JobClient: map 100% reduce 100%
15/12/21 11:18:43 INFO mapred.JobClient: Job complete: job_201512211045_0001
15/12/21 11:18:43 INFO mapred.JobClient: Counters: 29
15/12/21 11:18:43 INFO mapred.JobClient: Job Counters
15/12/21 11:18:43 INFO mapred.JobClient: Launched reduce tasks=1
15/12/21 11:18:43 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=27147
15/12/21 11:18:43 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
15/12/21 11:18:43 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
15/12/21 11:18:43 INFO mapred.JobClient: Launched map tasks=2
15/12/21 11:18:43 INFO mapred.JobClient: Data-local map tasks=2
15/12/21 11:18:43 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=13091
15/12/21 11:18:43 INFO mapred.JobClient: File Output Format Counters
15/12/21 11:18:43 INFO mapred.JobClient: Bytes Written=25
15/12/21 11:18:43 INFO mapred.JobClient: FileSystemCounters
15/12/21 11:18:43 INFO mapred.JobClient: FILE_BYTES_READ=55
15/12/21 11:18:43 INFO mapred.JobClient: HDFS_BYTES_READ=243
15/12/21 11:18:43 INFO mapred.JobClient: FILE_BYTES_WRITTEN=64327
15/12/21 11:18:43 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=25
15/12/21 11:18:43 INFO mapred.JobClient: File Input Format Counters
15/12/21 11:18:43 INFO mapred.JobClient: Bytes Read=25
15/12/21 11:18:43 INFO mapred.JobClient: Map-Reduce Framework
15/12/21 11:18:43 INFO mapred.JobClient: Map output materialized bytes=61
15/12/21 11:18:43 INFO mapred.JobClient: Map input records=2
15/12/21 11:18:43 INFO mapred.JobClient: Reduce shuffle bytes=61
15/12/21 11:18:43 INFO mapred.JobClient: Spilled Records=8
15/12/21 11:18:43 INFO mapred.JobClient: Map output bytes=41
15/12/21 11:18:43 INFO mapred.JobClient: Total committed heap usage (bytes)=234758144
15/12/21 11:18:43 INFO mapred.JobClient: CPU time spent (ms)=3220
15/12/21 11:18:43 INFO mapred.JobClient: Combine input records=4
15/12/21 11:18:43 INFO mapred.JobClient: SPLIT_RAW_BYTES=218
15/12/21 11:18:43 INFO mapred.JobClient: Reduce input records=4
15/12/21 11:18:43 INFO mapred.JobClient: Reduce input groups=3
15/12/21 11:18:43 INFO mapred.JobClient: Combine output records=4
15/12/21 11:18:43 INFO mapred.JobClient: Physical memory (bytes) snapshot=403873792
15/12/21 11:18:43 INFO mapred.JobClient: Reduce output records=3
15/12/21 11:18:43 INFO mapred.JobClient: Virtual memory (bytes) snapshot=3180650496
15/12/21 11:18:43 INFO mapred.JobClient: Map output records=4
[root@master hadoop-1.0.1]# bin/hadoop fs -ls output
Found 3 items
-rw-r--r-- 2 root supergroup 0 2015-12-21 11:18 /user/root/output/_SUCCESS
drwxr-xr-x - root supergroup 0 2015-12-21 11:18 /user/root/output/_logs
-rw-r--r-- 2 root supergroup 25 2015-12-21 11:18 /user/root/output/part-r-00000
[root@master hadoop-1.0.1]# bin/hadoop fs -cat output/part-r-00000
Hello 2
hadoop 1
world 1
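You can sanity-check the WordCount result with plain shell tools on the same two input lines:

```shell
# Split the two input lines into words and count duplicates;
# "Hello" should appear with count 2, "hadoop" and "world" with count 1.
printf 'Hello world\nHello hadoop\n' | tr ' ' '\n' | sort | uniq -c
```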