I. Preliminary preparation
1. Prepare the installation environment
For IP address configuration and the rest of the base environment setup, see the earlier reference article.
2. Check whether CentOS is 32-bit or 64-bit
[root@CDHnode1 ~]# file /bin/ls
/bin/ls: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=3d705971a4c4544545cb78fd890d27bf792af6d4, stripped
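An alternative quick check is the machine hardware name reported by uname; `x86_64` means a 64-bit system:

```shell
# Print the machine architecture; "x86_64" indicates a 64-bit OS
uname -m
```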
3. Unpack Hadoop and create a symlink to it
[root@CDHnode1 hadoop-2.6.0-cdh5.4.5]# ln -sf /home/hadoopcdh/soft/hadoop-2.6.0-cdh5.4.5 /opt/hadoop
4. Hostname mapping in /etc/hosts (later steps address this node as cdhhadoop, so map that name as well)
[root@CDHnode1 ~]# vi /etc/hosts
192.168.146.189 CDHnode1 cdhhadoop
5. Create a dedicated Hadoop user and group
[root@CDHnode1 ~]# groupadd hadoop
[root@CDHnode1 ~]# useradd -g hadoop hadoop
[root@CDHnode1 ~]# passwd hadoop
Changing password for user hadoop.
New password:
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.
II. Setting up SSH
1. Switch to the hadoop user
[root@CDHnode1 ~]# su hadoop
[hadoop@CDHnode1 root]$ cd
[hadoop@CDHnode1 ~]$ pwd
/home/hadoop
2. Create the .ssh directory and generate an RSA key pair
[hadoop@CDHnode1 ~]$ mkdir .ssh
[hadoop@CDHnode1 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
f0:fe:1c:b4:f8:66:ec:14:69:b2:4a:bf:6c:f0:fa:11 hadoop@CDHnode1
The key's randomart image is:
+--[ RSA 2048]----+
| |
| |
| . |
| o . |
| E = |
| .. B o |
| .o=.+ |
| . +o*+. |
| o+=== |
+-----------------+
Change into .ssh and inspect the private and public keys:
[hadoop@CDHnode1 ~]$ cd .ssh
[hadoop@CDHnode1 .ssh]$ ls
id_rsa id_rsa.pub
Copy the public key into the authorized_keys file, which sshd consults for key-based login:
[hadoop@CDHnode1 .ssh]$ cp id_rsa.pub authorized_keys
[hadoop@CDHnode1 .ssh]$ ls
authorized_keys id_rsa id_rsa.pub
Go back to /home/hadoop and tighten the permissions (sshd refuses keys whose files are too permissive):
[hadoop@CDHnode1 .ssh]$ cd ..
[hadoop@CDHnode1 ~]$ pwd
/home/hadoop
[hadoop@CDHnode1 ~]$ chmod 700 .ssh
[hadoop@CDHnode1 ~]$ chmod 600 .ssh/*
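The whole key-generation and permission sequence above can be sketched as one script. This version works in a scratch directory instead of the real ~/.ssh so it can be tried safely; `-N ""` gives an empty passphrase, matching the interactive session above:

```shell
# Sketch: generate a key pair, authorize it, and lock down permissions.
# Uses a scratch directory so it does not touch your real ~/.ssh.
SSHDIR=$(mktemp -d)/.ssh
mkdir -p "$SSHDIR"
ssh-keygen -t rsa -N "" -f "$SSHDIR/id_rsa" -q   # empty passphrase, no prompts
cp "$SSHDIR/id_rsa.pub" "$SSHDIR/authorized_keys"
chmod 700 "$SSHDIR"
chmod 600 "$SSHDIR"/*
ls -l "$SSHDIR"
```

On a real node, `ssh-copy-id hadoop@cdhhadoop` performs the same authorization step against a remote host's authorized_keys.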
3. Switch to root and install the OpenSSH client package
[hadoop@CDHnode1 ~]$ su root
Password:
[root@CDHnode1 hadoop]# yum -y install openssh-clients
Switch back to the hadoop user and test passwordless SSH access:
[root@CDHnode1 hadoop]# su hadoop
[hadoop@CDHnode1 ~]$ ssh cdhhadoop
The authenticity of host 'cdhhadoop (192.168.146.189)' can't be established.
ECDSA key fingerprint is dc:c3:a8:6a:ac:10:63:15:43:52:51:ce:c9:9b:40:7d.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'cdhhadoop,192.168.146.189' (ECDSA) to the list of known hosts.
Last login: Sat May 27 23:08:14 2017
[hadoop@CDHnode1 ~]$
III. Hadoop configuration and environment setup
1. Give the hadoop user ownership of the Hadoop directory symlinked earlier
[root@CDHnode1 opt]# chown -R hadoop:hadoop hadoop
[root@CDHnode1 opt]# ls -l
total 0
lrwxrwxrwx. 1 hadoop hadoop 42 May 27 22:49 hadoop -> /home/hadoopcdh/soft/hadoop-2.6.0-cdh5.4.5
lrwxrwxrwx. 1 root root 32 May 5 08:58 jdk1.8 -> /home/hadoopcdh/soft/jdk1.8.0_60
2. Create the Hadoop data directories, then give the hadoop user ownership of the whole /data tree
[root@CDHnode1 opt]# mkdir -p /data/dfs/name
[root@CDHnode1 opt]# mkdir -p /data/dfs/data
[root@CDHnode1 opt]# mkdir -p /data/tmp
[root@CDHnode1 opt]# chown -R hadoop:hadoop hadoop /data
[root@CDHnode1 opt]# ls -l /data
total 0
drwxr-xr-x. 4 hadoop hadoop 30 May 27 23:13 dfs
drwxr-xr-x. 2 hadoop hadoop 6 May 27 23:13 tmp
3. Edit the Hadoop configuration files: switch to the hadoop user and change into the hadoop directory
[root@CDHnode1 opt]# su hadoop
[hadoop@CDHnode1 opt]$ ls
hadoop jdk1.8
[hadoop@CDHnode1 opt]$ cd hadoop/
[hadoop@CDHnode1 hadoop]$
4. Edit etc/hadoop/core-site.xml and add the following properties.
(1) The address and port of the HDFS filesystem, i.e. the default filesystem URI:
<property>
<name>fs.defaultFS</name>
<value>hdfs://cdhhadoop:9000</value>
</property>
(2) The base directory Hadoop uses for temporary and working data:
<property>
<name>hadoop.tmp.dir</name>
<value>file:/data/tmp</value>
</property>
(3) Hadoop's security model (introduced in the Hadoop 1.0 line) would otherwise record every job submitted from a client as coming from the hadoop user, regardless of who actually submitted it. The impersonation (proxy-user) settings below allow the hadoop superuser to submit jobs or run commands on behalf of other users, while the effective user remains the original submitter.
Allow impersonation requests from any host:
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
Allow impersonation of users in any group:
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
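Assembled into a complete file (the property snippets above must sit inside the `<configuration>` element), core-site.xml would read:

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cdhhadoop:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/data/tmp</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
</configuration>
```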
5. Edit etc/hadoop/hdfs-site.xml and add the following properties.
The NameNode metadata directory:
<property>
<name>dfs.namenode.name.dir</name>
<value>/data/dfs/name</value>
<final>true</final>
</property>
The DataNode block storage directory:
<property>
<name>dfs.datanode.data.dir</name>
<value>/data/dfs/data</value>
<final>true</final>
</property>
The block replication factor (1, since this is a single-node cluster) and HDFS permission checking (disabled here for convenience):
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
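As with core-site.xml, the snippets go inside the `<configuration>` wrapper; the full hdfs-site.xml would read:

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/dfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/dfs/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
```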
6. Edit etc/hadoop/mapred-site.xml and add the following property. The file does not exist by default, so create it from the template first:
[root@CDHnode1 hadoop]# cp mapred-site.xml.template mapred-site.xml
Unlike Hadoop 1.0, MapReduce jobs now run on YARN:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
7. Edit etc/hadoop/yarn-site.xml and add the following property.
To run MapReduce jobs, the NodeManager must load the shuffle auxiliary service at startup:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
8. Edit etc/hadoop/slaves and add the worker hostname:
[root@CDHnode1 hadoop]# vi slaves
cdhhadoop
9. Set the Hadoop environment variables
vi /etc/profile
export HADOOP_HOME=/opt/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
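Reload the profile afterwards with `source /etc/profile`. A slightly extended sketch of the same lines (putting sbin on PATH as well is an optional extra, not in the original, so the start/stop scripts can be run from anywhere):

```shell
# /etc/profile additions; the sbin entry is an optional extra
export HADOOP_HOME=/opt/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
echo "$PATH" | grep -q '/opt/hadoop/bin' && echo "hadoop on PATH"
```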
10. Format the NameNode (run once, before the first start; in Hadoop 2.x this form still works but is deprecated in favor of `hdfs namenode -format`)
hadoop namenode -format
11. Start the cluster
[hadoop@cdhhadoop hadoop]$ sbin/start-all.sh
Once startup finishes, `jps` should list NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager.