Reposted from: https://blog.csdn.net/qq_41045774/article/details/92851175
All three machines use the root account.
Three virtual machines: master, slave1, slave2
192.168.199.130 master
192.168.199.131 slave1
192.168.199.132 slave2
Permanent hostname setup
vi /etc/hostname, then reboot
https://blog.csdn.net/u014204541/article/details/80761165
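On systemd-based distributions (CentOS 7+, recent Ubuntu) the hostname can also be set without a reboot; a minimal sketch, assuming the three hostnames above:
hostnamectl set-hostname master   # run on 192.168.199.130
hostnamectl set-hostname slave1   # run on 192.168.199.131
hostnamectl set-hostname slave2   # run on 192.168.199.132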
Static IP address
CentOS: https://mp.csdn.net/mdeditor/92839899
Ubuntu: https://mp.csdn.net/mdeditor/93243251
Configure passwordless SSH login
Remove the lines that map the hostname to 127.0.x addresses, otherwise Hadoop refuses remote access on port 9000.
https://blog.csdn.net/mzjwx/article/details/78547573
vim /etc/hosts
192.168.199.130 master
192.168.199.131 slave1
192.168.199.132 slave2
Run the following on every machine:
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave2
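To confirm the passwordless login actually works, a quick check (each command should print the remote hostname without asking for a password):
ssh root@master hostname
ssh root@slave1 hostname
ssh root@slave2 hostname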
Installing Hadoop and Java, and adding them to the environment variables
CentOS:
[root@localhost hadoop-3.1.2]# vim /etc/profile
JAVA_HOME=/usr/local/software/jdk1.8.0_212
JRE_HOME=/usr/local/software/jdk1.8.0_212/jre
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export JAVA_HOME JRE_HOME CLASSPATH PATH
export JAVA_HOME=/usr/local/software/jdk1.8.0_212
export HADOOP_HOME=/usr/local/software/hadoop-3.1.2
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# Add the following as needed:
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Make the environment variables take effect: source /etc/profile
Ubuntu: edit ~/.bashrc instead (vim ~/.bashrc), then source ~/.bashrc
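To verify the variables took effect, a quick check (assuming the install paths above):
java -version       # should report 1.8.0_212
hadoop version      # should report 3.1.2
echo $HADOOP_HOME   # should print /usr/local/software/hadoop-3.1.2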
Below are the 6 configuration files on master.
cd /usr/local/software/hadoop-3.1.2
1 [root@cent hadoop-3.1.2]# vim etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/local/software/jdk1.8.0_212
export HADOOP_HOME=/usr/local/software/hadoop-3.1.2
export PATH=$PATH:/usr/local/software/hadoop-3.1.2/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_COMMON_LIB_NATIVE_DIR"
2 vim etc/hadoop/core-site.xml — related to the NameNode
master is the hostname; if you see code1 here it is fine to leave it unchanged. This configures a single HDFS namespace, i.e. one NameNode; you could also configure two HDFS namespaces, in which case code1 would start a NameNode of its own.
This HDFS address must match the one in hbase-site.xml.
<configuration>
<!-- master is the hostname -->
<property>
<name>fs.defaultFS</name>
<!-- the file system's default port is 8020; hbase.rootdir uses the same address -->
<value>hdfs://master:8020</value>
</property>
<!-- Size of read/write buffer used in SequenceFiles. -->
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
==== Don't set the following: the namenode/datanode tmp files under dfs are gone once the cluster has finished starting,
==== and if the DataNode didn't come up you can't find the tmp files at all.
<!-- Specify a Hadoop temp directory and create it yourself; if you don't, the system default tmp is used and cleared every time, which causes errors -->
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/software/hadoop-3.1.2/tmp</value>
</property>
</configuration>
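To confirm the value Hadoop actually reads, hdfs getconf is a quick sanity check once the config is in place:
hdfs getconf -confKey fs.defaultFS   # should print hdfs://master:8020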
3 vim etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/software/hadoop-3.1.2/dfs/data</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/software/hadoop-3.1.2/dfs/name</value>
</property>
</configuration>
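The name and data directories are worth creating up front so the daemons don't stumble over missing paths (a sketch using the paths above; the format step can create the name directory itself, so this is just being explicit):
mkdir -p /usr/local/software/hadoop-3.1.2/dfs/name   # on master
mkdir -p /usr/local/software/hadoop-3.1.2/dfs/data   # on every node that runs a DataNode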
4 vim etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
5 vim etc/hadoop/yarn-site.xml — related to MapReduce
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<!-- Configuration for a MapReduce error; note this property is normally placed in mapred-site.xml -->
<property>
<name>mapreduce.application.classpath</name>
<value>/usr/local/software/hadoop-3.1.2/share/hadoop/mapreduce/*,/usr/local/software/hadoop-3.1.2/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
6 [root@cent hadoop-3.1.2]# vim etc/hadoop/workers
slave1's IP address
slave2's IP address
(see the example below)
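For example, with the /etc/hosts mapping from the beginning of this post, etc/hadoop/workers holds one worker per line, either
192.168.199.131
192.168.199.132
or equivalently the hostnames slave1 and slave2, since /etc/hosts maps them.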
There are also 4 scripts to edit:
[root@cent hadoop-3.1.2]# vim sbin/start-dfs.sh
[root@cent hadoop-3.1.2]# vim sbin/stop-dfs.sh
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
[root@cent hadoop-3.1.2]# vim sbin/start-yarn.sh
[root@cent hadoop-3.1.2]# vim sbin/stop-yarn.sh
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
Format the NameNode on the master:
hdfs namenode -format   # the older form, hadoop namenode -format, still works but is marked deprecated
slave1 and slave2 configuration
Everything above is needed except etc/hadoop/workers, etc/hadoop/mapred-site.xml, and etc/hadoop/yarn-site.xml, which don't have to be configured on the slaves. The NameNode format on the master is still required. A shortcut is sketched below.
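Rather than re-editing every file by hand, one common shortcut is to copy the configured install from master to the slaves (a sketch, assuming the paths above and the SSH keys already set up; note it also copies workers, mapred-site.xml, and yarn-site.xml, so adjust those on the slaves afterwards if you follow this post exactly):
scp -r /usr/local/software/hadoop-3.1.2 root@slave1:/usr/local/software/
scp -r /usr/local/software/hadoop-3.1.2 root@slave2:/usr/local/software/
scp /etc/profile root@slave1:/etc/profile   # then log in and run: source /etc/profile
scp /etc/profile root@slave2:/etc/profile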
1. If slave1/slave2 misbehave, check the log files ([root@cent hadoop-3.1.2]# cd logs) for errors, or look at the errors printed by the NameNode format.
2. slave: mv: cannot stat '/usr/local/software/hadoop-3.1.2/logs/hadoop-root-nodemanager-slave.out.4': No such file or directory — after I removed the etc/hadoop/mapred-site.xml configuration the error stopped, but it came back a moment later; it's flaky.
Start the cluster on the master.
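With $HADOOP_HOME/sbin on the PATH as configured above, the start commands on master are:
start-dfs.sh    # NameNode, SecondaryNameNode, and the DataNodes listed in workers
start-yarn.sh   # ResourceManager and NodeManagers
(start-all.sh does both in one go.)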
Check with jps.
On the master:
4210 NodeManager
4035 ResourceManager
3508 DataNode
25716 Jps
3764 SecondaryNameNode
3337 NameNode
On the slaves:
9936 Jps
9077 DataNode
9257 NodeManager
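Besides jps, the web UIs are a quick health check (Hadoop 3.x defaults, with master as above):
http://master:9870   # HDFS NameNode UI
http://master:8088   # YARN ResourceManager UI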
Thanks to https://blog.csdn.net/qq_25863199/article/details/88791498
**Problems**
**Problem 1: after formatting, the DataNode won't start.** Formatting the NameNode multiple times can cause this.
Error: java.io.IOException: Incompatible clusterIDs in /usr/local/softwre/hadoop-3.1.2/tmp/dfs/data: namenode clusterID = CID-a36b2872-1780-4854-97c8-18c23a1bb054; datanode clusterID = CID-2c83826e-89bb-4397-b59d-a589381acf54
Solution: https://blog.csdn.net/cl723401/article/details/82892703
Sometimes the hadoop-3.1.2 directory seems to contain only tmp, with conf and bin gone; wait a while and they show up again.
Summary: when the DataNode starts, it checks its clusterID against the one in the NameNode's VERSION file; if the two don't match, you get the "Incompatible clusterIDs" exception.
…Before re-formatting, delete HDFS's tmp directory and the directories that store the NameNode and DataNode data. The format does not remove these pre-format files automatically; it generates new ones alongside them, and the old and new clusterIDs then conflict. The commands are sketched after the two references below.
https://www.cnblogs.com/wangshen31/p/9900987.html
https://www.cnblogs.com/neo98/articles/6305999.html
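Putting that cleanup into commands, a sketch using this post's paths (run the rm on every node, the format on master only):
stop-yarn.sh && stop-dfs.sh                  # on master
rm -rf /usr/local/software/hadoop-3.1.2/dfs/name \
       /usr/local/software/hadoop-3.1.2/dfs/data \
       /usr/local/software/hadoop-3.1.2/tmp  # on every node
hdfs namenode -format                        # on master only
start-dfs.sh && start-yarn.sh                # on master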
First check whether the current/VERSION files under the dfs.namenode.name.dir directory configured in hdfs-site.xml match between master and the slaves; you can simply copy one over the other so they are identical, then restart the cluster.
cat /usr/local/software/hadoop-3.1.2/dfs/name/current/VERSION
vim /usr/local/software/hadoop-3.1.2/dfs/data/current/VERSION
Change the clusterID in the data VERSION file to the clusterID from the name VERSION file.
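If you would rather not wipe the data, the clusterID edit can be scripted (a sketch using the two VERSION paths above; the name directory lives on master, the data directories on the DataNodes):
# On master: read the NameNode's clusterID
CID=$(sed -n 's/^clusterID=//p' /usr/local/software/hadoop-3.1.2/dfs/name/current/VERSION)
# Push it into each DataNode's VERSION file
ssh root@slave1 "sed -i 's/^clusterID=.*/clusterID=$CID/' /usr/local/software/hadoop-3.1.2/dfs/data/current/VERSION"
ssh root@slave2 "sed -i 's/^clusterID=.*/clusterID=$CID/' /usr/local/software/hadoop-3.1.2/dfs/data/current/VERSION"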
Default ports
Hadoop NameNode web UI: 9870
HDFS RPC (fs.defaultFS, also used by hbase.rootdir): 8020