I. Pseudo-distributed cluster configuration
Hadoop's configuration files are all stored in the $HADOOP_HOME/etc/hadoop directory.
1. core-site.xml
<configuration>
<property>
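<!-- fs.default.name is the older key name; Hadoop 2.x still accepts it but calls it fs.defaultFS -->
<name>fs.default.name</name>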
<value>hdfs://localhost/</value>
</property>
</configuration>
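To confirm which value a configuration file actually holds after editing, a small helper can read a property back out of it. This is only a sketch: `get_prop` is a hypothetical helper name, and it assumes each `<name>` and `<value>` tag sits on its own line, as in the files shown here. It is not a general XML parser.

```shell
# Print the <value> of a named property from a Hadoop *-site.xml file.
# Assumes one tag per line; get_prop is a hypothetical helper, not a Hadoop tool.
get_prop() {
  # $1 = config file, $2 = property name
  awk -v key="$2" '
    /<name>/  { gsub(/.*<name>|<\/name>.*/, "");  name = $0 }
    /<value>/ { gsub(/.*<value>|<\/value>.*/, ""); if (name == key) print $0 }
  ' "$1"
}

# Example usage:
# get_prop "$HADOOP_HOME/etc/hadoop/core-site.xml" fs.default.name
```

For the core-site.xml above, the call in the comment would print hdfs://localhost/.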
2. hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
3. mapred-site.xml
<configuration>
<property>
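<!-- mapred.job.tracker is the MRv1 JobTracker address; Hadoop 2.x treats it as deprecated (new key: mapreduce.jobtracker.address) -->
<name>mapred.job.tracker</name>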
<value>localhost:8021</value>
</property>
</configuration>
4. hadoop-env.sh
By default JAVA_HOME refers to the environment variable ${JAVA_HOME}. If the following error appears when starting Hadoop:
localhost: Error: JAVA_HOME is not set and could not be found.
then change JAVA_HOME in hadoop-env.sh to an absolute path:
# The java implementation to use.
export JAVA_HOME=/usr/software/java/jdk1.7.0_80
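If the absolute JDK path is not known, it can usually be derived from the java binary on the PATH. The sketch below assumes GNU `readlink -f` is available; `java_home_from_bin` is a hypothetical helper name. It simply strips the trailing /bin/java, so if java resolves to a path under jre/bin, the result is the JRE directory rather than the JDK root and may need adjusting by hand.

```shell
# Derive a Java home directory from the resolved path of a java executable.
# java_home_from_bin is a hypothetical helper, not part of Hadoop or the JDK.
java_home_from_bin() {
  # $1 = path to the java executable, e.g. /opt/jdk/bin/java
  dirname "$(dirname "$1")"
}

# Resolve symlinks for the java found on PATH (if any) and print its home.
if command -v java >/dev/null 2>&1; then
  java_home_from_bin "$(readlink -f "$(command -v java)")"
fi
```

The printed directory is the value to put after `export JAVA_HOME=` in hadoop-env.sh.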
5. Disable the firewall or add rules
[root@localhost sbin]# systemctl stop firewalld
[root@localhost sbin]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:firewalld(1)
8月 28 11:23:27 localhost.localdomain systemd[1]: Starting firewalld - dynamic firewall daemon...
8月 28 11:23:28 localhost.localdomain systemd[1]: Started firewalld - dynamic firewall daemon.
8月 28 14:22:35 localhost.localdomain systemd[1]: Stopping firewalld - dynamic firewall daemon...
8月 28 14:22:37 localhost.localdomain systemd[1]: Stopped firewalld - dynamic firewall daemon.
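As an alternative to stopping firewalld entirely, individual ports can be opened. The sketch below only prints the firewall-cmd commands so they can be reviewed first. The port list is an assumption: 50070 is the NameNode HTTP port that appears in the fsck output later, and 8020 is the default NameNode RPC port in Hadoop 2.x when the filesystem URI gives no port. Other daemons use further ports that are not listed here.

```shell
# Ports to open (assumed list; adjust to the services you need to reach).
HADOOP_PORTS="8020 50070"

# Print the firewall-cmd invocations instead of running them (dry run).
open_ports() {
  for port in $HADOOP_PORTS; do
    # --permanent stores the rule in the saved configuration
    echo firewall-cmd --permanent --add-port="${port}/tcp"
  done
  # --reload makes the permanent rules active
  echo firewall-cmd --reload
}

open_ports
```

Once the printed commands look right, they can be applied with `open_ports | sh` as root.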
6. Configure SSH public-key passwordless login
[root@localhost ~]# ssh-keygen -t rsa -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:ZHyrcRwGJU/0ALDPNuIJ6oDS5x/UDzovaKtf8hXHKZE root@localhost.localdomain
The key's randomart image is:
+---[RSA 2048]----+
| ..++= |
| o * o |
| . E = . |
| * * + |
| . o S B |
|.. . + = @ |
|+ o o.B o . |
|.o oo+.= |
| o++oo.. |
+----[SHA256]-----+
[root@localhost ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
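Running the `cat ... >>` step more than once appends the same key again, and sshd in its default strict mode may ignore an authorized_keys file whose permissions are too open. The following is a minimal sketch of an idempotent variant; `setup_authorized_key` and its two parameters are hypothetical names chosen for illustration.

```shell
# Append a public key to authorized_keys only if it is not already there,
# and tighten permissions. setup_authorized_key is a hypothetical helper.
setup_authorized_key() {
  # $1 = public key file, $2 = .ssh directory
  pub="$1"
  dir="$2"
  mkdir -p "$dir" && chmod 700 "$dir"
  touch "$dir/authorized_keys"
  # -x: whole-line match, -F: fixed string, so the key is compared literally
  grep -qxF -- "$(cat "$pub")" "$dir/authorized_keys" \
    || cat "$pub" >> "$dir/authorized_keys"
  chmod 600 "$dir/authorized_keys"
}

# Example usage:
# setup_authorized_key ~/.ssh/id_rsa.pub ~/.ssh
```

Afterwards, `ssh localhost true` should return without asking for a password, provided the key was generated with an empty passphrase.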
II. Starting the Hadoop services
Hadoop's startup scripts are located in the sbin directory.
[root@localhost sbin]# ls
distribute-exclude.sh hdfs-config.cmd mr-jobhistory-daemon.sh start-all.cmd start-dfs.cmd start-yarn.cmd stop-all.sh stop-dfs.sh stop-yarn.sh
hadoop-daemon.sh hdfs-config.sh refresh-namenodes.sh start-all.sh start-dfs.sh start-yarn.sh stop-balancer.sh stop-secure-dns.sh yarn-daemon.sh
hadoop-daemons.sh httpfs.sh slaves.sh start-balancer.sh start-secure-dns.sh stop-all.cmd stop-dfs.cmd stop-yarn.cmd yarn-daemons.sh
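We can start all Hadoop services with start-all.sh. In the run shown here most daemons were already running, which is why the script reports "Stop it first" for them.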
[root@localhost sbin]# start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [localhost]
localhost: namenode running as process 21275. Stop it first.
localhost: datanode running as process 21385. Stop it first.
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:ZgCkJ8K9D73BLDKDKNBEInyreE+p9i4lQHO2KAVQUjk.
ECDSA key fingerprint is MD5:51:ae:88:6e:88:7b:8a:cc:8d:33:3f:6d:19:a8:24:b8.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/software/hadoop/hadoop-2.2.0/logs/hadoop-root-secondarynamenode-localhost.localdomain.out
starting yarn daemons
resourcemanager running as process 20413. Stop it first.
localhost: nodemanager running as process 21671. Stop it first.
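The first line of the output above says that start-all.sh is deprecated and names start-dfs.sh and start-yarn.sh as its replacements. A minimal sketch of calling them in order follows; `start_hadoop` is a hypothetical wrapper, and it assumes both scripts live under sbin in the installation directory passed as its argument.

```shell
# Start HDFS first, then YARN, using the two non-deprecated scripts.
# start_hadoop is a hypothetical wrapper, not a script shipped with Hadoop.
start_hadoop() {
  # $1 = Hadoop installation directory (e.g. the value of $HADOOP_HOME)
  "$1/sbin/start-dfs.sh" && "$1/sbin/start-yarn.sh"
}

# Example usage:
# start_hadoop "$HADOOP_HOME"
```

The `&&` means YARN is only started if the HDFS script exits successfully.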
After startup, we can use the jps command to list all Java processes and confirm that the Hadoop services have fully started.
[root@localhost sbin]# jps
21671 NodeManager
21385 DataNode
22151 SecondaryNameNode
22359 Jps
21275 NameNode
20413 ResourceManager
1120 Bootstrap
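Checking the jps listing by eye is easy to get wrong, since NameNode and SecondaryNameNode look alike. The sketch below reads jps output on standard input and prints any of the five daemons from the listing above that are missing. `missing_daemons` is a hypothetical helper, and the daemon list is taken from this pseudo-cluster setup only.

```shell
# Read `jps` output on stdin and print each expected daemon that is absent.
# missing_daemons is a hypothetical helper; the list matches the jps output above.
missing_daemons() {
  out=$(cat)
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare the second column exactly, so NameNode does not match SecondaryNameNode
    printf '%s\n' "$out" \
      | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' \
      || echo "$d"
  done
}

# Example usage:
# jps | missing_daemons
```

Empty output means all five daemons were found.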
The hdfs fsck / -files -blocks command shows detailed information about HDFS.
[root@localhost sbin]# hdfs fsck / -files -blocks
Connecting to namenode via http://localhost:50070
FSCK started by root (auth:SIMPLE) from /127.0.0.1 for path / at Tue Aug 28 14:24:17 CST 2018
/ <dir>
Status: HEALTHY
Total size: 0 B
Total dirs: 1
Total files: 0
Total symlinks: 0
Total blocks (validated): 0
Minimally replicated blocks: 0
Over-replicated blocks: 0
Under-replicated blocks: 0
Mis-replicated blocks: 0
Default replication factor: 1
Average block replication: 0.0
Corrupt blocks: 0
Missing replicas: 0
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Tue Aug 28 14:24:17 CST 2018 in 14 milliseconds
The filesystem under path '/' is HEALTHY
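For scripted checks, the overall result can be pulled out of the report rather than read manually. This sketch assumes the report contains a "Status:" line in the form shown above; `fsck_status` is a hypothetical helper name.

```shell
# Read fsck output on stdin and print the word following "Status:".
# fsck_status is a hypothetical helper; it relies on the report format shown above.
fsck_status() {
  awk '$1 == "Status:" { print $2; exit }'
}

# Example usage:
# hdfs fsck / | fsck_status
```

For the report above this would print HEALTHY.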