1. Basic Environment
1.1 Versions
- CentOS 7.8
- JDK 1.8
- Hadoop 2.6.5
Note: several commands below use SysVinit-era tooling (service, chkconfig, /etc/sysconfig/network); where a stock CentOS 7 install differs, the systemd equivalent is pointed out.
1.2 Configure the Network
1. Edit the NIC configuration:
   vi /etc/sysconfig/network-scripts/ifcfg-eth0
2. Add or modify the following entries:
   DEVICE=eth0
   TYPE=Ethernet
   ONBOOT=yes
   NM_CONTROLLED=yes
   BOOTPROTO=static
   IPADDR=192.168.72.11
   NETMASK=255.255.255.0
   GATEWAY=192.168.72.2
   DNS1=223.5.5.5
   DNS2=114.114.114.114
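To apply the settings and verify them (eth0 is the interface name used above):
   service network restart   # reload the static IP configuration
   ip addr show eth0         # confirm 192.168.72.11 is assigned
   ping -c 3 223.5.5.5       # confirm the gateway and DNS path work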
1.3 Set the Hostname
1. Edit the network file:
   vi /etc/sysconfig/network
2. Modify the content:
   NETWORKING=yes
   HOSTNAME=node01
3. Add the IP-to-hostname mapping in /etc/hosts:
   192.168.72.11 node01
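On CentOS 7 the supported way to set the hostname is hostnamectl; either way, a quick check after re-login:
   hostnamectl set-hostname node01   # CentOS 7 equivalent of editing /etc/sysconfig/network
   hostname                          # should print node01
   ping -c 1 node01                  # verifies the /etc/hosts mapping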
1.4 Disable the Firewall
service iptables stop
chkconfig iptables off
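Note that a stock CentOS 7.8 install ships firewalld rather than the iptables service, so the systemd equivalents are:
   systemctl stop firewalld      # stop the running firewall
   systemctl disable firewalld   # keep it off across reboots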
1.5 Disable SELinux
1. Edit the SELinux configuration:
   vi /etc/selinux/config
2. Modify the content:
   SELINUX=disabled
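SELINUX=disabled only takes effect after a reboot; to drop enforcement immediately and confirm the state:
   setenforce 0   # permissive mode for the current session
   getenforce     # prints Permissive now, Disabled after a reboot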
1.6 Time Synchronization
1. Install NTP:
   yum install ntp -y
2. Add the time server to the configuration:
   vim /etc/ntp.conf
   server ntp1.aliyun.com
3. Start NTP:
   service ntpd start
4. Enable NTP at boot:
   chkconfig ntpd on
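To confirm the daemon is actually syncing against the Alibaba server:
   ntpq -p   # ntp1.aliyun.com should be listed; an asterisk marks the selected source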
1.7 Install the JDK
1. Install the RPM:
   rpm -i jdk-8u181-linux-x64.rpm
2. Add the Java environment variables:
   vim /etc/profile
   export JAVA_HOME=<Java installation path>
   export PATH=$PATH:$JAVA_HOME/bin
3. Apply the changes:
   source /etc/profile
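The Oracle JDK RPM installs under /usr/java by default (verify the exact directory on your machine before setting JAVA_HOME):
   ls /usr/java/    # e.g. jdk1.8.0_181-amd64 (assumed install directory)
   java -version    # should report 1.8.0_181
   echo $JAVA_HOME  # confirms the profile change took effect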
1.8 Configure Passwordless SSH Login
1. Generate a key pair and authorize it:
   ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
   cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
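Newer OpenSSH releases reject DSA keys by default, so RSA is a safer choice; either way, verify that a local login no longer prompts for a password:
   ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa          # RSA alternative to the DSA key above
   cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
   ssh node01 date   # should print the date without asking for a password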
2. Pseudo-Distributed Mode
2.1 Prepare Hadoop
mkdir /opt/bigdata
tar -xf hadoop-2.6.5.tar.gz
mv hadoop-2.6.5 /opt/bigdata/
2.2 Configure the Hadoop Environment Variables
vi /etc/profile
export HADOOP_HOME=/opt/bigdata/hadoop-2.6.5
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
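A quick check that the new PATH entries work from any directory:
   hadoop version       # should report Hadoop 2.6.5
   which start-dfs.sh   # should resolve into /opt/bigdata/hadoop-2.6.5/sbin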
2.3 Configure the Hadoop Roles
1. Switch to the Hadoop configuration directory:
   cd $HADOOP_HOME/etc/hadoop
2. Configure the environment script (the daemons are started over ssh and do not read /etc/profile, so JAVA_HOME must be set explicitly here):
   vi hadoop-env.sh
   export JAVA_HOME=<Java installation path>
3. Configure the NameNode:
   vi core-site.xml
   Add the configuration:
   <property>
       <name>fs.defaultFS</name>
       <value>hdfs://node01:9000</value>
   </property>
4. Configure hdfs-site.xml:
   <property>
       <name>dfs.replication</name>
       <value>1</value>
   </property>
   <property>
       <name>dfs.namenode.name.dir</name>
       <value>/var/bigdata/hadoop/local/dfs/name</value>
   </property>
   <property>
       <name>dfs.datanode.data.dir</name>
       <value>/var/bigdata/hadoop/local/dfs/data</value>
   </property>
   <property>
       <name>dfs.namenode.secondary.http-address</name>
       <value>node01:50090</value>
   </property>
   <property>
       <name>dfs.namenode.checkpoint.dir</name>
       <value>/var/bigdata/hadoop/local/dfs/secondary</value>
   </property>
5. Configure the DataNode:
   vi slaves
   node01
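To confirm the XML edits parse and are picked up (hdfs getconf reads the same configuration the daemons will use):
   hdfs getconf -confKey fs.defaultFS      # should print hdfs://node01:9000
   hdfs getconf -confKey dfs.replication   # should print 1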
2.4 Initialize and Start
hdfs namenode -format   # only on first setup; reformatting wipes the HDFS metadata
start-dfs.sh
Visit:
http://node01:50070
http://node01:50090
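If startup succeeded, jps on node01 should list all three HDFS daemons (PIDs below are illustrative):
   jps
   # 1234 NameNode
   # 1345 DataNode
   # 1456 SecondaryNameNode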
2.5 Usage
Create directories:
hdfs dfs -mkdir /bigdata
hdfs dfs -mkdir -p /user/root
Upload a file:
hdfs dfs -put hadoop*.tar.gz /user/root
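To confirm the upload landed in the user's home directory:
   hdfs dfs -ls /user/root   # the hadoop-2.6.5 tarball should be listed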
3. Distributed Mode
3.1 Node Planning
NameNode: node01
SecondaryNameNode: node02
DataNode: node02, node03, node04
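Every machine needs hostname mappings for all four nodes in /etc/hosts; a sketch, assuming the IPs continue the 192.168.72.x scheme from section 1.2 (adjust to your network):
   192.168.72.11 node01
   192.168.72.12 node02   # assumed IP
   192.168.72.13 node03   # assumed IP
   192.168.72.14 node04   # assumed IP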
3.2 Configure the Nodes
cd $HADOOP_HOME/etc/hadoop
1. Configure the NameNode:
   vi core-site.xml
   Add the configuration:
   <property>
       <name>fs.defaultFS</name>
       <value>hdfs://node01:9000</value>
   </property>
2. Configure hdfs-site.xml:
   <property>
       <name>dfs.replication</name>
       <value>2</value>
   </property>
   <property>
       <name>dfs.namenode.name.dir</name>
       <value>/var/bigdata/hadoop/full/dfs/name</value>
   </property>
   <property>
       <name>dfs.datanode.data.dir</name>
       <value>/var/bigdata/hadoop/full/dfs/data</value>
   </property>
   <property>
       <name>dfs.namenode.secondary.http-address</name>
       <value>node02:50090</value>
   </property>
   <property>
       <name>dfs.namenode.checkpoint.dir</name>
       <value>/var/bigdata/hadoop/full/dfs/secondary</value>
   </property>
3. Configure the DataNodes:
   vi slaves
   node02
   node03
   node04
4. Distribute Hadoop to the other nodes (see the SSH note below):
   cd /opt
   scp -r ./bigdata/ node02:`pwd`
   scp -r ./bigdata/ node03:`pwd`
   scp -r ./bigdata/ node04:`pwd`
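start-dfs.sh on node01 starts the remote daemons over ssh, so node01's public key and the /etc/profile changes must also reach node02 through node04; a minimal sketch for one node, assuming root and the DSA key from section 1.8 (repeat for node03 and node04):
   scp ~/.ssh/id_dsa.pub node02:/root/.ssh/node01.pub
   ssh node02 'cat ~/.ssh/node01.pub >> ~/.ssh/authorized_keys'
   scp /etc/profile node02:/etc/profile   # propagate JAVA_HOME and HADOOP_HOME
   ssh node02 date   # must succeed without a password before start-dfs.sh will work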
3.3 Start
1. Format:
   hdfs namenode -format
2. Start:
   start-dfs.sh
3. Check the processes:
   jps
4. Visit:
   http://node01:50070
   http://node02:50090
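In this layout the daemons are spread across machines, so jps gives a different answer on each node:
   # node01: NameNode
   # node02: SecondaryNameNode, DataNode
   # node03, node04: DataNode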
3.4 Usage
Create directories:
hdfs dfs -mkdir /bigdata
hdfs dfs -mkdir -p /user/root
Upload a file:
hdfs dfs -put hadoop*.tar.gz /user/root
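With dfs.replication=2, each block of the uploaded file should be stored on two of the three DataNodes, which fsck can show (the path assumes the hadoop-2.6.5 tarball from section 2.1):
   hdfs fsck /user/root/hadoop-2.6.5.tar.gz -files -blocks -locations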