Notes
1. This document contains installation, deployment, and configuration information for some of the competition components; it is for reference only.
2. The parts marked in blue must be adjusted to your environment; pay attention to them.
3. Some commands may need to change with the competition requirements; backup commands have been added in places, and others are still being tried out.
I. Basic environment
1. Switch the yum repository
cd /etc/yum.repos.d
wget -O bigdata.repo http://172.16.47.240/bigdata/repofile/bigdata.repo
Backup: wget http://49.232.221.239/scene.pkg
yum clean all
yum makecache
2. Set the hostname (master shown; run the slave1/slave2 equivalents on each slave)
hostnamectl set-hostname master
bash
3. Disable the firewall (stop it, check its status, keep it off at boot)
systemctl stop firewalld.service
systemctl status firewalld.service
systemctl disable firewalld.service
4. Edit /etc/hosts
echo "172.18.52.170 master" >> /etc/hosts
echo "172.18.52.171 slave1" >> /etc/hosts
echo "172.18.52.172 slave2" >> /etc/hosts
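The three appends above can be wrapped in a small idempotent loop; `HOSTS_FILE` is a hypothetical override (not in the original steps) so the snippet can be dry-run against a scratch file before touching the real /etc/hosts:

```shell
HOSTS_FILE="${HOSTS_FILE:-/etc/hosts}"
while read -r ip name; do
  # skip entries that already exist, so re-running is safe
  grep -q "[[:space:]]$name\$" "$HOSTS_FILE" 2>/dev/null || echo "$ip $name" >> "$HOSTS_FILE"
done <<'EOF'
172.18.52.170 master
172.18.52.171 slave1
172.18.52.172 slave2
EOF
```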
5. Install the JDK
Check for an existing Java install: rpm -qa | grep java
Remove it if present:
rpm -e --nodeps java-1.8.0-openjdk
rpm -e --nodeps java-1.7.0-openjdk
rpm -e --nodeps java-1.8.0-openjdk-headless
rpm -e --nodeps java-1.7.0-openjdk-headless
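The four removals can also be done in one pass; a hedged one-liner, assuming an RPM-based system and GNU xargs (whose -r flag skips the call when grep finds nothing):

```shell
# Remove every installed OpenJDK package without resolving dependencies,
# mirroring the rpm -e --nodeps calls above.
rpm -qa | grep -i openjdk | xargs -r rpm -e --nodeps
```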
Extract, install, and distribute to the slaves.
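A minimal sketch of that step; the tarball name is an assumption inferred from the JAVA_HOME (/usr/java/jdk1.8.0_171) used in the environment variables later:

```shell
mkdir -p /usr/java
tar -zxf jdk-8u171-linux-x64.tar.gz -C /usr/java   # unpacks to /usr/java/jdk1.8.0_171
# distribute the same tree to the slaves
for host in slave1 slave2; do
  scp -r /usr/java "$host":/usr/
done
```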
6. Configure passwordless SSH login
ssh-keygen -t dsa (backup: ssh-keygen -t rsa; newer OpenSSH disables DSA keys by default, so rsa is the safer choice)
ssh-copy-id master
scp /root/.ssh/authorized_keys slave1:/root/.ssh/
scp /root/.ssh/authorized_keys slave2:/root/.ssh/
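A quick check that passwordless login now works to every node (BatchMode makes ssh fail instead of prompting if a key is missing):

```shell
for host in master slave1 slave2; do
  ssh -o BatchMode=yes "$host" hostname   # should print each hostname, no password prompt
done
```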
7. Time synchronization
master:
tzselect
Get the timezone variable: TZ='Asia/Shanghai'; export TZ
Check whether ntp is installed: rpm -qa | grep ntp
Install ntp: yum install -y ntp
Stop ntpd before editing: systemctl stop ntpd
Edit the config file on master: vim /etc/ntp.conf
a) Change 1 (authorize every machine on the 192.168.1.0-192.168.1.255 segment to query and sync time from this host): change
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
to
restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
b) Change 2 (the cluster sits on a LAN; do not use Internet time sources): change
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org iburst
server 2.centos.pool.ntp.org iburst
server 3.centos.pool.ntp.org iburst
to
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
c) Addition 3 (serve time from the local clock):
server 127.127.1.0
fudge 127.127.1.0 stratum 10
Sync the hardware clock: echo SYNC_HWCLOCK=yes >> /etc/sysconfig/ntpd
Start the service again: systemctl start ntpd
slave:
Install ntpdate: yum install -y ntpdate
Scheduled sync: crontab -e
Entry: */30 8-17 * * * /usr/sbin/ntpdate master
Test: date -s "2020-6-11 11:11:11"
ntpdate master
Sync the hardware clock: echo SYNC_HWCLOCK=yes >> /etc/sysconfig/ntpd
8. Environment variables (/etc/profile)
export JAVA_HOME=/usr/java/jdk1.8.0_171
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_HOME=/usr/hadoop/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HIVE_HOME=/usr/hive/apache-hive-2.1.1-bin
export PATH=$HIVE_HOME/bin:$PATH
export ZOOKEEPER_HOME=/usr/zookeeper/zookeeper-3.4.10
export PATH=$ZOOKEEPER_HOME/bin:$PATH
export HBASE_HOME=/usr/hbase/hbase-1.2.4
export PATH=$PATH:$HBASE_HOME/bin
export SPARK_HOME=/usr/spark/spark-2.4.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
export TZ=Asia/Shanghai
Reload: source /etc/profile
Distribute to the slaves.
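The distribute step could be sketched as follows, assuming the variables above were appended to /etc/profile on master and passwordless SSH is in place:

```shell
for host in slave1 slave2; do
  scp /etc/profile "$host":/etc/profile
done
# note: remote shells pick up the new /etc/profile at their next login
```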
II. Hadoop configuration files
Set JAVA_HOME in the env scripts (hadoop-env.sh, yarn-env.sh, mapred-env.sh).
core-site.xml
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/hadoop/hadoop-2.7.3/data/tmp</value>
</property>
hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/hadoop/hadoop-2.7.3/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/hadoop/hadoop-2.7.3/hdfs/data</value>
</property>
yarn-site.xml
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:18030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:18088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:18025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:18141</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>slave1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>slave1:19888</value>
</property>
slaves
slave1
slave2
Extract, install, configure, and distribute.
Format the NameNode: hadoop namenode -format (backup: hdfs namenode -format)
Start: start-dfs.sh then start-yarn.sh
Start the history server on the node it is configured for (slave1 above):
mr-jobhistory-daemon.sh start historyserver
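A post-start sanity check, assuming the layout above (the daemon names are those of Hadoop 2.7.3):

```shell
jps                    # master: NameNode, SecondaryNameNode, ResourceManager
ssh slave1 jps         # slaves: DataNode, NodeManager
hdfs dfsadmin -report  # "Live datanodes" should be 2
yarn node -list        # both NodeManagers registered
```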
III. Zookeeper configuration files
zoo.cfg
dataDir=/usr/zookeeper/zookeeper-3.4.10/zkdata
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
vim /usr/zookeeper/zookeeper-3.4.10/zkdata/myid (write 1 on master, matching server.1)
Distribute.
Change myid on slave1 and slave2 to 2 and 3 respectively.
Start the ZK service (on every node): zkServer.sh start
Check ZK status: zkServer.sh status
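The myid files and the per-node start can be scripted from master in one go; the host-to-id mapping follows the server.N lines in zoo.cfg above:

```shell
i=1
for host in master slave1 slave2; do
  ssh "$host" "echo $i > /usr/zookeeper/zookeeper-3.4.10/zkdata/myid"
  i=$((i + 1))
done
for host in master slave1 slave2; do
  ssh "$host" "/usr/zookeeper/zookeeper-3.4.10/bin/zkServer.sh start"
done
```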
IV. MySQL configuration
Install: yum -y install mysql-community-server
Start: systemctl start mysqld
Find the initial password:
grep 'temporary password' /var/log/mysqld.log
Check the password policy: show variables like '%password%';
set global validate_password_policy=0;  -- password policy level
set global validate_password_length=6;  -- minimum password length
These can also be set in the /etc/my.cnf config file.
Allow remote login: use mysql; update user set host='%' where user='root';
Set the password: SET PASSWORD = PASSWORD('123456');
Apply: flush privileges;
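schematool will create the hive database itself (createDatabaseIfNotExist=true in the JDBC URL used later), but pre-creating it keeps the charset predictable on older MySQL; a hedged sketch, run on the MySQL node with the password set above:

```shell
# latin1 avoids index-length problems with some Hive schema versions on MySQL 5.x
mysql -uroot -p123456 -e "CREATE DATABASE IF NOT EXISTS hive DEFAULT CHARACTER SET latin1;"
mysql -uroot -p123456 -e "SHOW DATABASES;"
```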
V. Hive configuration
hive-env.sh
HADOOP_HOME=/usr/hadoop/hadoop-2.7.3
export HIVE_CONF_DIR=/usr/hive/apache-hive-2.1.1-bin/conf
export HIVE_AUX_JARS_PATH=/usr/hive/apache-hive-2.1.1-bin/lib
Client side:
hive-site.xml
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive_remote/warehouse</value>
</property>
<property>
<name>hive.metastore.local</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://slave1:9083</value>
</property>
</configuration>
Server side (slave1, per the thrift URI above):
hive-site.xml
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive_remote/warehouse</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://slave2:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
</configuration>
Copy the MySQL JDBC driver jar into the server's lib directory.
Initialize the metastore schema: schematool -initSchema -dbType mysql
Backup: schematool -dbType mysql -initSchema
Start in the background: hive --service metastore &
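A quick check that the metastore is up and reachable; run the ss line on the server node and the hive line on the client (9083 is the metastore thrift port configured above):

```shell
ss -tlnp | grep 9083        # server side: thrift port listening
hive -e "show databases;"   # client side: goes through thrift://slave1:9083
```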
VI. Spark configuration
spark-env.sh
export SPARK_MASTER_IP=master
export SPARK_WORKER_MEMORY=8g
slaves
slave1
slave2
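Starting and checking the standalone cluster; the start scripts live under $SPARK_HOME/sbin (which is why sbin was added to PATH earlier), and 8080 is Spark's default master Web UI port:

```shell
$SPARK_HOME/sbin/start-all.sh
jps          # master: Master; on the slaves (ssh slaveN jps): Worker
# Web UI: http://master:8080
```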
VII. HBase configuration
hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_171
export HBASE_MANAGES_ZK=false
hbase-site.xml
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master.info.port</name>
<value>60010</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,slave1,slave2</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/usr/zookeeper/zookeeper-3.4.10/zkdata</value>
</property>
regionservers
slave1
slave2
Distribute.
Start: start-hbase.sh
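A post-start check (HMaster runs on master, HRegionServer on each slave; port 60010 matches hbase.master.info.port above):

```shell
jps                           # master: HMaster
ssh slave1 jps                # slaves: HRegionServer
echo "status" | hbase shell   # cluster summary
# Web UI: http://master:60010
```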