Hadoop HDFS Deployment
Hadoop 2.x: hadoop.apache.org
HDFS: distributed file system (storage)
MapReduce: distributed computation
YARN: resource management (memory + CPU) and job scheduling/monitoring
Documentation: http://hadoop.apache.org/docs/r2.8.2/
Deployment modes:
1. Standalone mode: a single Java process
2. Pseudo-Distributed Mode: for development/learning; multiple Java processes on one machine
3. Cluster Mode: for production; multiple Java processes across multiple machines
http://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/hadoop-common/SingleCluster.html
Hadoop HDFS deployment steps:
1. Create a dedicated user for the Hadoop service
[root@xkhadoop001 ~]# useradd hadoop
[root@xkhadoop001 ~]# id hadoop
uid=515(hadoop) gid=515(hadoop) groups=515(hadoop)
(Grant the hadoop user passwordless sudo; using visudo is the safer way to edit this file)
[root@xkhadoop001 ~]# vi /etc/sudoers
hadoop ALL=(root) NOPASSWD:ALL
2. Install Java
Oracle JDK 1.8 (avoid OpenJDK where possible)
See the Hadoop build document for details.
2.1 Extract the JDK and set the environment variables
2.2 In the CDH course the JDK lives under /usr/java
[root@xkhadoop001 ~]# which java
/usr/java/jdk1.8.0_45/bin/java
3. Verify that the ssh service is running
[root@xkhadoop001 .ssh]# service sshd status
4. Extract Hadoop
(1) Locate the tarball produced by the earlier Hadoop build
[root@xkhadoop001 ~]# cd /opt/sourcecode/hadoop-2.8.1-src/hadoop-dist/target
[root@xkhadoop001 target]# ll
total 572404
drwxr-xr-x. 2 root root 4096 Nov 10 15:19 antrun
drwxr-xr-x. 3 root root 4096 Nov 10 15:19 classes
-rw-r--r--. 1 root root 2112 Nov 10 15:19 dist-layout-stitching.sh
-rw-r--r--. 1 root root 645 Nov 10 15:20 dist-tar-stitching.sh
drwxr-xr-x. 9 root root 4096 Nov 10 15:20 hadoop-2.8.1
-rw-r--r--. 1 root root 194983644 Nov 10 15:20 hadoop-2.8.1.tar.gz
-rw-r--r--. 1 root root 30236 Nov 10 15:20 hadoop-dist-2.8.1.jar
-rw-r--r--. 1 root root 391023487 Nov 10 15:20 hadoop-dist-2.8.1-javadoc.jar
-rw-r--r--. 1 root root 27736 Nov 10 15:20 hadoop-dist-2.8.1-sources.jar
-rw-r--r--. 1 root root 27736 Nov 10 15:20 hadoop-dist-2.8.1-test-sources.jar
drwxr-xr-x. 2 root root 4096 Nov 10 15:20 javadoc-bundle-options
drwxr-xr-x. 2 root root 4096 Nov 10 15:20 maven-archiver
drwxr-xr-x. 3 root root 4096 Nov 10 15:19 maven-shared-archive-resources
drwxr-xr-x. 3 root root 4096 Nov 10 15:19 test-classes
drwxr-xr-x. 2 root root 4096 Nov 10 15:19 test-dir
(2) Copy the compiled Hadoop tarball to /opt/software/
[root@xkhadoop001 target]# cp hadoop-2.8.1.tar.gz /opt/software/
[root@xkhadoop001 target]# cd
[root@xkhadoop001 ~]# cd /opt/software/
[root@xkhadoop001 software]# ll
total 208564
drwxr-xr-x. 6 root root 4096 Nov 10 2015 apache-maven-3.3.9
-rw-r--r--. 1 root root 8617253 Nov 4 13:28 apache-maven-3.3.9-bin.zip
drwxr-xr-x. 7 root root 4096 Aug 21 2009 findbugs-1.3.9
-rw-r--r--. 1 root root 7546219 Nov 4 13:28 findbugs-1.3.9.zip
-rw-r--r--. 1 root root 194983644 Nov 14 19:27 hadoop-2.8.1.tar.gz
drwxr-xr-x. 10 root root 4096 Nov 8 20:46 protobuf-2.5.0
-rw-r--r--. 1 root root 2401901 Nov 4 13:29 protobuf-2.5.0.tar.gz
(3) Extract the tarball
[root@xkhadoop001 software]# tar -xzvf hadoop-2.8.1.tar.gz
[root@xkhadoop001 software]# ll
total 208572
drwxr-xr-x. 6 root root 4096 Nov 10 2015 apache-maven-3.3.9
-rw-r--r--. 1 root root 8617253 Nov 4 13:28 apache-maven-3.3.9-bin.zip
drwxr-xr-x. 7 root root 4096 Aug 21 2009 findbugs-1.3.9
-rw-r--r--. 1 root root 7546219 Nov 4 13:28 findbugs-1.3.9.zip
drwxr-xr-x. 9 root root 4096 Nov 10 15:20 hadoop-2.8.1
-rw-r--r--. 1 root root 194983644 Nov 14 19:27 hadoop-2.8.1.tar.gz
drwxr-xr-x. 10 root root 4096 Nov 8 20:46 protobuf-2.5.0
-rw-r--r--. 1 root root 2401901 Nov 4 13:29 protobuf-2.5.0.tar.gz
(4) Create a symlink and assign ownership
[root@xkhadoop001 software]# ln -s hadoop-2.8.1 hadoop
[root@xkhadoop001 software]# chown -R hadoop:hadoop hadoop
[root@xkhadoop001 software]# chown -R hadoop:hadoop hadoop/*
[root@xkhadoop001 software]# chown -R hadoop:hadoop hadoop-2.8.1
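The symlink decouples the working path from the release version: a later upgrade only needs the link repointed, while scripts and environment variables keep using /opt/software/hadoop. A minimal sketch of the pattern, run in a throwaway directory rather than /opt/software:

```shell
# Sketch: version-independent symlink, demonstrated in a temp dir
tmp=$(mktemp -d)
cd "$tmp"
mkdir hadoop-2.8.1            # stands in for the extracted release
ln -s hadoop-2.8.1 hadoop     # stable path -> versioned directory
readlink hadoop               # prints the link target: hadoop-2.8.1
```

Upgrading would then be `ln -sfn hadoop-2.9.0 hadoop` with no other path changes.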
[hadoop@xkhadoop001 software]$ ll
total 208572
drwxr-xr-x. 6 root root 4096 Nov 10 2015 apache-maven-3.3.9
-rw-r--r--. 1 root root 8617253 Nov 4 13:28 apache-maven-3.3.9-bin.zip
drwxr-xr-x. 7 root root 4096 Aug 21 2009 findbugs-1.3.9
-rw-r--r--. 1 root root 7546219 Nov 4 13:28 findbugs-1.3.9.zip
lrwxrwxrwx. 1 hadoop hadoop 12 Nov 14 19:38 hadoop -> hadoop-2.8.1
drwxr-xr-x. 9 hadoop hadoop 4096 Nov 10 15:20 hadoop-2.8.1
-rw-r--r--. 1 root root 194983644 Nov 14 19:27 hadoop-2.8.1.tar.gz
drwxr-xr-x. 10 root root 4096 Nov 8 20:46 protobuf-2.5.0
-rw-r--r--. 1 root root 2401901 Nov 4 13:29 protobuf-2.5.0.tar.gz
Note:
bin: command-line tools
etc: configuration files
sbin: scripts that start and stop the Hadoop daemons
5. Switch to the hadoop user and configure
[root@xkhadoop001 ~]# su - hadoop
[hadoop@xkhadoop001 ~]$ ll
total 0
[hadoop@xkhadoop001 ~]$ cd /opt/software/hadoop
[hadoop@xkhadoop001 hadoop]$ cd etc/hadoop/
[hadoop@xkhadoop001 hadoop]$ ll
total 156
-rw-r--r--. 1 root root 4942 Nov 10 15:20 capacity-scheduler.xml
-rw-r--r--. 1 root root 1335 Nov 10 15:20 configuration.xsl
-rw-r--r--. 1 root root 318 Nov 10 15:20 container-executor.cfg
-rw-r--r--. 1 root root 774 Nov 10 15:19 core-site.xml
-rw-r--r--. 1 root root 3719 Nov 10 15:19 hadoop-env.cmd
-rw-r--r--. 1 root root 4666 Nov 10 15:19 hadoop-env.sh
-rw-r--r--. 1 root root 2598 Nov 10 15:19 hadoop-metrics2.properties
-rw-r--r--. 1 root root 2490 Nov 10 15:19 hadoop-metrics.properties
-rw-r--r--. 1 root root 9683 Nov 10 15:19 hadoop-policy.xml
-rw-r--r--. 1 root root 775 Nov 10 15:20 hdfs-site.xml
-rw-r--r--. 1 root root 1449 Nov 10 15:20 httpfs-env.sh
-rw-r--r--. 1 root root 1657 Nov 10 15:20 httpfs-log4j.properties
-rw-r--r--. 1 root root 21 Nov 10 15:20 httpfs-signature.secret
-rw-r--r--. 1 root root 620 Nov 10 15:20 httpfs-site.xml
-rw-r--r--. 1 root root 3518 Nov 10 15:20 kms-acls.xml
-rw-r--r--. 1 root root 1611 Nov 10 15:20 kms-env.sh
-rw-r--r--. 1 root root 1631 Nov 10 15:20 kms-log4j.properties
-rw-r--r--. 1 root root 5546 Nov 10 15:20 kms-site.xml
-rw-r--r--. 1 root root 13661 Nov 10 15:19 log4j.properties
-rw-r--r--. 1 root root 931 Nov 10 15:20 mapred-env.cmd
-rw-r--r--. 1 root root 1383 Nov 10 15:20 mapred-env.sh
-rw-r--r--. 1 root root 4113 Nov 10 15:20 mapred-queues.xml.template
-rw-r--r--. 1 root root 758 Nov 10 15:20 mapred-site.xml.template
-rw-r--r--. 1 root root 10 Nov 10 15:20 slaves
-rw-r--r--. 1 root root 2316 Nov 10 15:19 ssl-client.xml.example
-rw-r--r--. 1 root root 2697 Nov 10 15:19 ssl-server.xml.example
-rw-r--r--. 1 root root 2191 Nov 10 15:20 yarn-env.cmd
-rw-r--r--. 1 root root 4567 Nov 10 15:20 yarn-env.sh
-rw-r--r--. 1 root root 690 Nov 10 15:20 yarn-site.xml
Note:
hadoop-env.sh: Hadoop environment settings
core-site.xml: Hadoop core configuration file
hdfs-site.xml: HDFS configuration --> daemons are started from it
[mapred-site.xml: configuration for MapReduce jobs] only needed when submitting jar jobs
yarn-site.xml: YARN configuration --> daemons are started from it
slaves: hostnames of the cluster machines
[hadoop@xkhadoop001 hadoop]$ vi core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
[hadoop@xkhadoop hadoop]$ vi hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
6. Set up a passwordless ssh trust for the hadoop user
[hadoop@xkhadoop hadoop]$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
[hadoop@xkhadoop hadoop]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@xkhadoop hadoop]$ chmod 0600 ~/.ssh/authorized_keys
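The three commands above can be tried safely in a throwaway directory first (a sketch, assuming ssh-keygen is installed; the real setup writes to ~/.ssh):

```shell
# Sketch: generate a passphrase-less key pair and authorize it, in a temp dir
tmp=$(mktemp -d)
ssh-keygen -t rsa -P '' -f "$tmp/id_rsa" -q       # empty passphrase, no prompts
cat "$tmp/id_rsa.pub" >> "$tmp/authorized_keys"   # trust our own public key
chmod 0600 "$tmp/authorized_keys"                 # sshd rejects the file if group/world writable
stat -c '%a' "$tmp/authorized_keys"               # prints 600
```

The chmod matters: with StrictModes (the sshd default), overly permissive authorized_keys permissions silently break passwordless login.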
7. Format the NameNode
[hadoop@xkhadoop hadoop]$ bin/hdfs namenode -format
Storage directory: /tmp/hadoop-hadoop/dfs/name
1. Which setting determines this default storage path?
2. What does "hadoop-hadoop" mean? (hadoop- plus the name of the user who ran the format)
core-site.xml
hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}
hdfs-site.xml
dfs.namenode.name.dir defaults to file://${hadoop.tmp.dir}/dfs/name
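How the two defaults combine into /tmp/hadoop-hadoop/dfs/name can be traced by hand (${user.name} is the Java system property holding the OS user who runs the format command):

```shell
# Sketch: expanding the default property values step by step
user=$(id -un)                  # "hadoop" when formatting as the hadoop user
tmpdir="/tmp/hadoop-${user}"    # hadoop.tmp.dir = /tmp/hadoop-${user.name}
namedir="${tmpdir}/dfs/name"    # dfs.namenode.name.dir, minus the file:// scheme
echo "$namedir"
```

Since /tmp is typically cleared on reboot, production setups override hadoop.tmp.dir (or the dfs.*.dir properties) to a persistent location.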
8. Start the HDFS service
[hadoop@xkhadoop sbin]$ ./start-dfs.sh
Starting namenodes on [localhost]
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is e7:e5:27:8f:1d:20:0c:d2:fe:0b:b3:c7:9f:2c:26:92.
Are you sure you want to continue connecting (yes/no)? ys^H^[^[
Please type 'yes' or 'no': yes
localhost: Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
localhost: Error: JAVA_HOME is not set and could not be found.
localhost: Error: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is e7:e5:27:8f:1d:20:0c:d2:fe:0b:b3:c7:9f:2c:26:92.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: Error: JAVA_HOME is not set and could not be found.
Note: JAVA_HOME is reported as not found even though `which java` locates the JDK, and the HDFS daemons fail to start.
[hadoop@xkhadoop sbin]$ which java
/usr/java/jdk1.8.0_45/bin/java
[hadoop@xkhadoop sbin]$ vi ../etc/hadoop/hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_45
Note: hadoop-env.sh previously read JAVA_HOME from the global environment (export JAVA_HOME=${JAVA_HOME}); the remote shells that start-dfs.sh spawns over ssh do not inherit that variable, so set it to an explicit path and run the start-dfs.sh script again.
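The underlying behavior can be reproduced with any variable: a shell started with a clean environment (as the ssh-spawned shells effectively are) does not see exports made in the login shell. A sketch using a hypothetical DEMO_HOME, with `env -i` simulating the clean environment:

```shell
# Sketch: exports in this shell do not survive into a clean-environment child,
# which is why hadoop-env.sh needs a hardcoded JAVA_HOME.
export DEMO_HOME=/usr/java/jdk1.8.0_45         # hypothetical stand-in for JAVA_HOME
env -i sh -c 'echo "DEMO_HOME=[$DEMO_HOME]"'   # prints DEMO_HOME=[] -- the export is gone
```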
[hadoop@xkhadoop sbin]$ ./start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-namenode-xkhadoop.out
localhost: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-datanode-xkhadoop.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-secondarynamenode-xkhadoop.out
namenode (name node): localhost
datanode (data node): localhost
secondary namenode: 0.0.0.0
Web UI: http://localhost:50070/
Default port: 50070