一. Environment Overview
This example installs a six-node cluster: one control node, two management nodes, and three data nodes. The control node hosts Ambari, Ambari Metrics, and related services, and is used for cluster control, monitoring, and task submission. The management nodes host the master components of each service, such as NameNode, ResourceManager, Hive, HBase, KDC, OpenLDAP, Ranger, and Ambari Infra. The data nodes host the slave components of each service, such as DataNode, NodeManager, and RegionServer.
IP address | Hostname | Node type |
198.218.36.1 | bigdata-cn-01 | Control node |
198.218.36.2 | bigdata-nn-01 | Management node |
198.218.36.3 | bigdata-nn-02 | Management node |
198.218.36.4 | bigdata-dn-01 | Data node |
198.218.36.5 | bigdata-dn-02 | Data node |
198.218.36.6 | bigdata-dn-03 | Data node |
二. Environment Preparation
0、Prepare the batch-execution environment for the cluster
-
Edit /etc/profile and add the following variables (plain space-separated strings; the `${all[@]}` expansions in the loops below rely on shell word splitting):
# vi /etc/profile
export all="198.218.36.1 198.218.36.2 198.218.36.3 198.218.36.4 198.218.36.5 198.218.36.6"
export sync="198.218.36.2 198.218.36.3 198.218.36.4 198.218.36.5 198.218.36.6"
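A quick sanity check (a hypothetical helper, not part of the original procedure) can confirm the host lists expand as expected after sourcing /etc/profile:

```shell
# Verify the host lists from /etc/profile are usable in loops. These are
# plain space-separated strings, so iterating over ${all} relies on shell
# word splitting, not on a real bash array.
all="198.218.36.1 198.218.36.2 198.218.36.3 198.218.36.4 198.218.36.5 198.218.36.6"
sync="198.218.36.2 198.218.36.3 198.218.36.4 198.218.36.5 198.218.36.6"

count=0
for h in ${all}; do
    count=$((count + 1))
    echo "host ${count}: ${h}"
done
echo "total: ${count}"
```

If the totals are wrong (six for `all`, five for `sync`), fix /etc/profile before running any of the distribution loops below.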
1、Configure hosts (using bigdata-cn-01 as the example)
-
CentOS 6:
1). Edit /etc/sysconfig/network and change HOSTNAME.
# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=bigdata-cn-01
2). Edit /etc/hosts and add one line per node:
# vi /etc/hosts
198.218.36.1 bigdata-cn-01.cars.com bigdata-cn-01
198.218.36.2 bigdata-nn-01.cars.com bigdata-nn-01
…..
198.218.36.6 bigdata-dn-03.cars.com bigdata-dn-03
Log out of the SSH session and log back in for the change to take effect.
-
CentOS 7:
1). hostnamectl set-hostname bigdata-cn-01
2). Edit /etc/hosts and add one line per node:
# vi /etc/hosts
198.218.36.1 bigdata-cn-01.cars.com bigdata-cn-01
198.218.36.2 bigdata-nn-01.cars.com bigdata-nn-01
…..
198.218.36.6 bigdata-dn-03.cars.com bigdata-dn-03
Log out of the SSH session and log back in for the change to take effect.
- Note: if the cluster is installed on the company OpenStack, use the OpenStack internal IPs in the hosts file, not the floating IPs.
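A mistyped hosts entry (for example, an IP mapped to the wrong short name) is a common cause of agent registration failures later. The following hypothetical helper checks that every cluster line has the form "IP FQDN SHORTNAME" and that the short name matches the FQDN's first label; it is shown here against an inline sample, but on a real node you would pipe /etc/hosts into it:

```shell
# Check that cluster lines in a hosts file are "IP FQDN SHORTNAME" and that
# SHORTNAME is the first label of FQDN. Exits non-zero on any mismatch.
check_hosts() {
    awk 'NF >= 3 && $1 ~ /^[0-9.]+$/ {
        split($2, parts, ".")
        if (parts[1] != $3) { print "mismatch: " $0; bad = 1 }
    } END { exit bad }'
}

cat <<'EOF' | check_hosts && echo "hosts file OK"
198.218.36.1 bigdata-cn-01.cars.com bigdata-cn-01
198.218.36.2 bigdata-nn-01.cars.com bigdata-nn-01
198.218.36.6 bigdata-dn-03.cars.com bigdata-dn-03
EOF
```

On a real node: `check_hosts < /etc/hosts`.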
2、Passwordless SSH login (applies to CentOS 6 and CentOS 7)
-
1) Generate the key files (run on test1; each ssh here will still prompt for the root password, since the keys have not been distributed yet):
for h in ${all[@]}
do
    ssh ${h} -C "ssh-keygen -t rsa -P '' -f /root/.ssh/id_rsa"
    ssh ${h} -C "cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys"
done
2) Distribute the keys (run on test1):
for h in ${sync[@]}
do
    ssh-copy-id ${h}
done
3、Cluster time synchronization
-
One node in the cluster is usually used as the time server; in this example the Ambari host serves as the NTP server.
Install:
for h in ${all[@]}
do
    ssh ${h} -C "yum install ntp -y"
done
Edit /etc/ntp.conf on the server:
$ vi /etc/ntp.conf
# Uncomment this restrict line and set the network to the one your hosts
# are on (on the company OpenStack, use the internal network)
# Hosts on local network are less restricted.
restrict 172.16.0.0 mask 255.255.255.0 nomodify notrap
# Comment out all the public servers below and add the two lines after
# them, so the local clock is used as the time source
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.rhel.pool.ntp.org iburst
#server 1.rhel.pool.ntp.org iburst
#server 2.rhel.pool.ntp.org iburst
#server 3.rhel.pool.ntp.org iburst
server 127.127.1.0
fudge 127.127.1.0 stratum 10
Start ntpd on the server:
CentOS 6: service ntpd start
CentOS 7: systemctl start ntpd
Edit /etc/ntp.conf for the clients (the other cluster nodes). For convenience, copy it to the local /tmp directory first, edit it there, then distribute it:
$ scp root@bigdata-nn-01:/etc/ntp.conf /tmp
$ vi /tmp/ntp.conf
# Comment out all the public servers and add the NTP server's address
# (IP or hostname both work)
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.rhel.pool.ntp.org iburst
#server 1.rhel.pool.ntp.org iburst
#server 2.rhel.pool.ntp.org iburst
#server 3.rhel.pool.ntp.org iburst
server 172.16.0.118 iburst
Distribute the configuration file to the other nodes:
for h in ${sync[@]}
do
    scp /tmp/ntp.conf ${h}:/etc
done
Synchronize the cluster time:
for h in ${sync[@]}
do
    ssh $h -C "ntpdate <AMBARI HOSTNAME/IP>"
done
Enable automatic time synchronization:
CentOS 6:
for h in ${all[@]}
do
    ssh ${h} -C "chkconfig ntpd on; service ntpd start"
    ssh ${h} -C "chkconfig --list | grep ntpd"
done
CentOS 7:
for h in ${all[@]}
do
    ssh ${h} -C "systemctl enable ntpd; systemctl start ntpd"
    ssh ${h} -C "systemctl is-enabled ntpd"
done
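A leftover pool.ntp.org line in a client ntp.conf will silently compete with the cluster's own server. The sketch below (a hypothetical check, using the example server IP 172.16.0.118 from above) verifies that all public pool servers are commented out and the internal server line is present; it is demonstrated on an inline sample, but on a real node you would point it at /tmp/ntp.conf before distributing:

```shell
# Return success only if no active pool.ntp.org server remains and the
# internal server line is present in the given ntp.conf.
ntp_conf_ok() {
    ! grep -Eq '^[[:space:]]*server .*pool\.ntp\.org' "$1" &&
    grep -Eq '^server 172\.16\.0\.118 iburst' "$1"
}

tmp=$(mktemp)
cat > "$tmp" <<'EOF'
#server 0.rhel.pool.ntp.org iburst
#server 1.rhel.pool.ntp.org iburst
server 172.16.0.118 iburst
EOF
ntp_conf_ok "$tmp" && echo "client ntp.conf OK"
rm -f "$tmp"
```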
4、Install PostgreSQL
-
In production, the metadata database used by Ambari and the one used by the Hadoop cluster must be installed separately, and the PG database used by the Hadoop cluster needs HA and streaming replication; that work is usually done with the help of the PG team. Below is only a simple PG installation from rpm packages:
Download the PostgreSQL rpm packages from https://www.postgresql.org/download/linux/redhat/ or https://yum.postgresql.org/rpmchart.php . The main packages are postgresql95-9.5.5-1PGDG.rhel6.x86_64.rpm, postgresql95-contrib-9.5.5-1PGDG.rhel6.x86_64.rpm, postgresql95-devel-9.5.5-1PGDG.rhel6.x86_64.rpm, postgresql95-libs-9.5.5-1PGDG.rhel6.x86_64.rpm, and postgresql95-server-9.5.5-1PGDG.rhel6.x86_64.rpm
-
Install:
yum install postgresql*.rpm -y
-
Verify the installation:
sudo rpm -qa | grep postgres
-
Initialize the database:
CentOS 6: service postgresql-9.5 initdb
CentOS 7: /usr/pgsql-9.5/bin/postgresql95-setup initdb
-
Enable start on boot:
CentOS 6: chkconfig postgresql-9.5 on
CentOS 7: systemctl enable postgresql-9.5
-
Change the listen address:
vi /var/lib/pgsql/9.5/data/postgresql.conf
listen_addresses = '*'
-
Allow access from all IPs:
vi /var/lib/pgsql/9.5/data/pg_hba.conf
host all all 0.0.0.0/0 md5
-
Start the database:
CentOS 6: service postgresql-9.5 start
CentOS 7: systemctl start postgresql-9.5
-
Enter the command line and change the password:
su - postgres
psql
postgres=# \password
- Reference: http://blog.csdn.net/shanzhizi/article/details/46484481
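If remote clients cannot connect later, the pg_hba.conf rule above is the first thing to check. A small hypothetical helper (demonstrated on an inline sample; on a real node point it at /var/lib/pgsql/9.5/data/pg_hba.conf) confirms the open md5 host rule is present:

```shell
# Return success if the "host all all 0.0.0.0/0 md5" rule exists in the
# given pg_hba.conf-style file.
pg_hba_ok() {
    grep -Eq '^host[[:space:]]+all[[:space:]]+all[[:space:]]+0\.0\.0\.0/0[[:space:]]+md5' "$1"
}

tmp=$(mktemp)
printf 'host all all 0.0.0.0/0 md5\n' > "$tmp"
pg_hba_ok "$tmp" && echo "pg_hba rule present"
rm -f "$tmp"
```

Remember that postgresql.conf's listen_addresses and pg_hba.conf must both allow the connection, and the server must be reloaded after either change.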
5、Install the JDK (run on test1; applies to CentOS 6 and CentOS 7)
-
1) Prepare the package jdk-8u65-linux-x64.tar.gz and extract it:
tar -xvf jdk-8u65-linux-x64.tar.gz -C /opt/
2) Download jce_policy-8.zip and install JCE:
unzip -o -j -q jce_policy-8.zip -d /opt/jdk1.8.0_65/jre/lib/security/
chown 10:143 /opt/jdk1.8.0_65/jre/lib/security/local_policy.jar
chown 10:143 /opt/jdk1.8.0_65/jre/lib/security/US_export_policy.jar
3) Configure the environment variables:
# vi /etc/profile
export JAVA_HOME=/opt/jdk1.8.0_65
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
4) Distribute:
for h in ${sync[@]}
do
    scp -r /opt/jdk1.8.0_65 ${h}:/opt
    scp -r /etc/profile ${h}:/etc
    ssh ${h} -C "source /etc/profile"
done
5) Check:
for h in ${all[@]}
do
    ssh ${h} -C "java -version"
done
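Reading `java -version` output from six nodes by eye is error-prone. A hypothetical helper like the one below fails loudly when a node reports anything other than the expected 1.8.0_65; it is shown on a captured sample line, but on the cluster you would feed it `ssh ${h} -C "java -version" 2>&1`:

```shell
# Return success only if the given version line reports JDK 1.8.0_65.
java_version_ok() {
    echo "$1" | grep -q 'version "1\.8\.0_65"'
}

sample='java version "1.8.0_65"'
java_version_ok "$sample" && echo "JDK version OK"
```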
6、Disable the firewall
-
CentOS 6:
chkconfig iptables off      # disable permanently
/etc/init.d/iptables stop   # stop for the current session
Distribute:
for h in ${all[@]}
do
    ssh ${h} -C "chkconfig iptables off; /etc/init.d/iptables stop"
    ssh ${h} -C "/etc/init.d/iptables status"
done
-
CentOS 7:
systemctl disable firewalld
systemctl stop firewalld
Distribute:
for h in ${all[@]}
do
    ssh ${h} -C "systemctl disable firewalld; systemctl stop firewalld"
    ssh ${h} -C "systemctl status firewalld"
done
7、Disable SELinux (applies to CentOS 6 and CentOS 7)
-
On test1:
# setenforce 0              # disable for the current session
# vi /etc/selinux/config    # disable permanently
SELINUX=disabled
Distribute the configuration:
for h in ${sync[@]}
do
    scp -r /etc/selinux/config ${h}:/etc/selinux
done
8、Disable PackageKit
-
vim /etc/yum/pluginconf.d/refresh-packagekit.conf
[main]
enabled=0
Distribute to the other machines:
for h in ${sync[@]}
do
    scp -r /etc/yum/pluginconf.d/refresh-packagekit.conf ${h}:/etc/yum/pluginconf.d/
done
9、Set the umask
-
1) Change for the current session:
umask 0022
2) Change permanently for all users:
echo umask 0022 >> /etc/profile
3) Distribute:
for h in ${all[@]}
do
    ssh root@${h} "echo umask 0022 >> /etc/profile"
done
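What umask 0022 actually does can be demonstrated directly: it masks the group and other write bits, so new files are created 644 and new directories 755. A runnable sketch:

```shell
# Demonstrate the effect of umask 0022: new files get mode 644 and new
# directories get mode 755 (group/other lose write permission).
workdir=$(mktemp -d)
(
    umask 0022
    touch "$workdir/f"
    mkdir "$workdir/d"
)
stat -c '%a' "$workdir/f" "$workdir/d"   # prints 644 then 755
rm -rf "$workdir"
```

Hadoop's installation checks expect exactly this 022 umask for the root account, which is why the setting is appended to /etc/profile above.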
10、Raise the maximum number of open files
-
CentOS 6:
Current session:
ulimit -n 65535
Permanently:
vim /etc/security/limits.conf
* soft nproc 65535
* hard nproc 65535
* soft nofile 65535
* hard nofile 65535
Distribute:
for h in ${sync[@]}
do
    scp -r /etc/security/limits.conf ${h}:/etc/security
done
-
CentOS 7:
Current session:
ulimit -n 65535
Permanently:
1) vim /etc/security/limits.conf
* soft nproc 65535
* hard nproc 65535
* soft nofile 65535
* hard nofile 65535
2) vim /etc/systemd/system.conf and vim /etc/systemd/user.conf
DefaultLimitCORE=infinity
DefaultLimitNOFILE=65535
DefaultLimitNPROC=65535
Distribute:
for h in ${sync[@]}
do
    scp /etc/security/limits.conf ${h}:/etc/security
    scp /etc/systemd/system.conf ${h}:/etc/systemd
    scp /etc/systemd/user.conf ${h}:/etc/systemd
    ssh $h -C "systemctl daemon-reload"
done
- Reference: http://smilejay.com/2016/06/centos-7-systemd-conf-limits/
11、Disable THP
-
CentOS 6:
Current session:
# echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
# cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
always madvise [never]
Permanently:
[root@getlnx06 ~]# vi /etc/rc.local
#!/bin/sh
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
touch /var/lock/subsys/local
if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then
    echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
fi
Distribute:
for h in ${sync[@]}
do
    scp /etc/rc.local ${h}:/etc/
done
-
CentOS 7:
Current session:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
Permanently:
vi /etc/systemd/system/disable-thp.service (create the file if it does not exist)
[Unit]
Description=Disable Transparent Huge Pages (THP)
[Service]
Type=simple
ExecStart=/bin/sh -c "echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled && echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag"
[Install]
WantedBy=multi-user.target
Distribute:
for h in ${sync[@]}
do
    scp /etc/systemd/system/disable-thp.service ${h}:/etc/systemd/system/
    ssh ${h} -C "systemctl daemon-reload; systemctl start disable-thp; systemctl enable disable-thp"
done
- Reference: https://blacksaildivision.com/how-to-disable-transparent-huge-pages-on-centos
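The kernel reports the active THP mode as the bracketed word in the `enabled` file (e.g. `always madvise [never]`). A small hypothetical helper extracts that word so a script can assert the mode is "never" on every node after the service above has run:

```shell
# Extract the bracketed (active) THP mode from a line like
# "always madvise [never]".
thp_mode() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

echo "always madvise [never]" | thp_mode   # prints: never
```

On a real CentOS 7 node: `thp_mode < /sys/kernel/mm/transparent_hugepage/enabled`.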
12、Install the HTTP server (only on the Ambari host)
-
CentOS 6:
Install: yum install httpd -y
Enable on boot: chkconfig httpd on
Start: service httpd start
-
CentOS 7:
Install: yum install httpd -y
Enable on boot: systemctl enable httpd
Start: systemctl start httpd
三. Set Up the Local Repositories
1、Download the packages (addresses)
- Ambari: http://docs.hortonworks.com/HDPDocuments/Ambari-2.5.0.3/bk_ambari-installation/content/ambari_repositories.html
- HDP: http://docs.hortonworks.com/HDPDocuments/Ambari-2.5.0.3/bk_ambari-installation/content/hdp_25_repositories.html
2、Extract the packages into the HTTP server directory
-
mkdir -p /var/www/html/
tar -zxvf ambari-2.4.2.0-centos7.tar.gz -C /var/www/html/
tar -zxvf HDP-2.5.3.0-centos7-rpm.tar.gz -C /var/www/html/
tar -zxvf HDP-UTILS-1.1.0.21-centos7.tar.gz -C /var/www/html/
3、Download ambari.repo and edit it
-
Edit: vim ambari.repo
#VERSION_NUMBER=2.5.0.3-7
[ambari-2.5.0.3]
name=ambari Version - ambari-2.5.0.3
baseurl=http://198.218.36.1/ambari/centos7/
gpgcheck=1
gpgkey=http://198.218.36.1/ambari/centos7/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
-
Note: baseurl and gpgkey must point at the directory level that contains RPM-GPG-KEY; the same applies to the HDP and HDP-UTILS repo configuration. The downloaded ambari, HDP, and HDP-UTILS packages ship with their own ambari.repo, hdp.repo, and hdp-utils.repo files.
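Since baseurl and gpgkey must live under the same directory level, a quick hypothetical consistency check catches the most common repo mistake, a gpgkey that does not sit under baseurl. It is shown on an inline copy of the ambari.repo above; point it at /etc/yum.repos.d/ambari.repo on a real node:

```shell
# Return success if gpgkey is a URL underneath baseurl in the given .repo
# file (i.e. both point at the same directory tree).
repo_urls_consistent() {
    base=$(sed -n 's/^baseurl=//p' "$1")
    key=$(sed -n 's/^gpgkey=//p' "$1")
    case "$key" in
        "$base"*) return 0 ;;
        *) return 1 ;;
    esac
}

tmp=$(mktemp)
cat > "$tmp" <<'EOF'
[ambari-2.5.0.3]
name=ambari Version - ambari-2.5.0.3
baseurl=http://198.218.36.1/ambari/centos7/
gpgkey=http://198.218.36.1/ambari/centos7/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
EOF
repo_urls_consistent "$tmp" && echo "repo URLs OK"
rm -f "$tmp"
```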
4、Install the Ambari yum repository
-
1) If root passwordless SSH works between the cluster nodes (it is enough for the ambari-server machine to have the ambari repo):
cp ambari.repo /etc/yum.repos.d/
2) If root passwordless SSH is not available (every machine in the cluster needs the ambari repo):
for h in ${sync[@]}
do
    scp -r /etc/yum.repos.d/ambari.repo ${h}:/etc/yum.repos.d
done
四. Install Ambari
1、Install ambari-server
-
Install:
yum install ambari-server
-
Setup: ambari-server setup. For more detailed setup options see http://docs.hortonworks.com/HDPDocuments/Ambari-2.5.0.3/bk_ambari-installation/content/set_up_the_ambari_server.html . Note: the metadata database used by Ambari must be created before running the setup.
-
Start:
ambari-server start
Visit the web UI: http://<ambari server host>:8080/
2、Install ambari-agent (optional)
-
If root passwordless SSH is not available between the cluster nodes, ambari-agent must also be installed on every machine.
1) Install the ambari agent on every machine:
for h in ${sync[@]}
do
    ssh $h -C "yum install ambari-agent -y"
done
2) Edit the ambari-agent.ini configuration:
# vi /etc/ambari-agent/conf/ambari-agent.ini
[server]
hostname=<ambari server host>
url_port=8440
secured_url_port=8441
3) Distribute the ambari-agent.ini file:
for h in ${sync[@]}
do
    scp /etc/ambari-agent/conf/ambari-agent.ini $h:/etc/ambari-agent/conf
done
4) Start the ambari agent:
# ambari-agent start
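Step 2) can also be done non-interactively with sed instead of hand editing, which is handy when the file must be prepared before distribution. The sketch below works on a temporary copy with the Ambari server name filled in as a placeholder (this cluster's Ambari host is bigdata-cn-01 per the table in section 一):

```shell
# Rewrite the hostname key of an ambari-agent.ini-style file in place.
# bigdata-cn-01.cars.com is a placeholder for this sketch; substitute your
# actual Ambari server host.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
[server]
hostname=localhost
url_port=8440
secured_url_port=8441
EOF
sed -i 's/^hostname=.*/hostname=bigdata-cn-01.cars.com/' "$tmp"
grep '^hostname=' "$tmp"   # prints: hostname=bigdata-cn-01.cars.com
rm -f "$tmp"
```

On a real node the same sed line would target /etc/ambari-agent/conf/ambari-agent.ini before the scp distribution loop.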
五. Install HDP via the Web UI
1、Install HDP
- Visit http://<ambari host name>:8080/ and log in as admin/admin. The installation steps are as follows:
- Click "Launch Install Wizard" to start the installation, as shown below:
- Enter a name for the cluster
- Select the local yum repositories:
- Enter the nodes to install on, and choose how the ambari agents are installed and registered
- Confirm the cluster node information
At the bottom of the screen above there is a "Click here to see the warnings." link. It lists warnings for each host: even with ambari-agent installed, there may still be problems that can make the cluster installation fail, such as NTP not being set up or firewall rules still in place. SSH may be allowed through, but installing the Hadoop cluster opens many TCP ports, so blocked ports can cause errors. Check according to your own situation (in this environment HDP had been installed before, so many warnings said a package was already installed or a configuration file already existed). Make sure every check passes, otherwise errors will occur during installation.
- Select the services to install
- Choose the nodes for each service's master components; these usually go on the management nodes
- Choose the nodes for each service's slave and client components; these usually go on the data nodes
- Customize the service configurations as needed. Any item highlighted in red must be set, otherwise you cannot proceed to the next step. Wherever a database is involved, create the corresponding database beforehand. Besides the settings below, production deployments need some additional settings; see the service configuration standards document for the details.
-
- On the Hive tab, configure the Hive metadata database
- On the Ambari Metrics tab, set the Grafana admin password
- On the SmartSense tab, set the Activity Explorer admin password
- On the Misc tab, choose the service users; for convenience we merged them all into a single user, hadoop. When changing the yarn and hdfs users, recommendations for other parameter changes pop up; apply them all, as shown below:
-
- Review the information for the services to be installed:
- Install, start, and test. If there are no errors during installation, click through to finish; if an error appears, click the error link to see the details, fix the problem, then click Retry to re-run the installation.
- Click Complete to finish the installation.
- Reference: http://docs.hortonworks.com/HDPDocuments/Ambari-2.5.0.3/bk_ambari-installation/content/ch_Deploy_and_Configure_a_HDP_Cluster.html
2、Reinstall HDP
- Stop all running services
- Drop Ambari's metadata database from the PG instance
-
Re-run the Ambari server setup:
ambari-server setup
-
Remove the Hadoop-cluster-related files (run on every machine):
yum -y erase hdp-select
python /usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py --silent --skip=users
- Install again through the web UI
六. Service Availability Checks
1. HDFS check
-
su - hadoop
hdfs dfs -put ~/.bashrc /
hdfs dfs -cat /.bashrc
2. YARN and MapReduce check
-
su - hadoop
hdfs dfs -mkdir /testin
hdfs dfs -put ~/.bashrc /testin
hadoop jar /usr/hdp/2.6.0.3-8/hadoop-mapreduce/hadoop-mapreduce-examples-2.7.3.2.6.0.3-8.jar wordcount /testin/ /testout
Also log in to http://<resource_manager_host>:8088 --> Applications --> running and check that the job just submitted is listed.
3. Hive check
-
su - hadoop
hive
hive> create table people(id string, name string);
OK
hive> insert into people values("1","louis");
Query ID = hadoop_20170620112529_356f40fa-60d4-4c25-9728-ee3ff6adb49c
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1497866533048_0004)
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 7.23 s
--------------------------------------------------------------------------------
Loading data to table default.people
Table default.people stats: [numFiles=1, numRows=1, totalSize=8, rawDataSize=7]
OK
Time taken: 11.96 seconds
hive> select * from people;
OK
1       louis
Time taken: 0.393 seconds, Fetched: 1 row(s)
4. HBase check
-
su - hadoop
hbase shell
// create a table
hbase(main):001:0> create 'test','cf'
0 row(s) in 4.8910 seconds
=> Hbase::Table - test
// list all tables
hbase(main):002:0> list
TABLE
test
1 row(s) in 0.0560 seconds
=> ["test"]
// insert rows into the table
hbase(main):003:0> put 'test','row1','cf:a','value1'
0 row(s) in 0.3720 seconds
hbase(main):004:0> put 'test','row2','cf:b','value2'
0 row(s) in 0.0270 seconds
hbase(main):005:0> put 'test','row3','cf:3','value3'
0 row(s) in 0.0220 seconds
// scan all rows in the table
hbase(main):006:0> scan 'test'
ROW     COLUMN+CELL
row1    column=cf:a, timestamp=1497929482464, value=value1
row2    column=cf:b, timestamp=1497929489695, value=value2
row3    column=cf:3, timestamp=1497929495188, value=value3
3 row(s) in 0.0680 seconds
// get a single row from the table
hbase(main):007:0> get 'test','row1'
COLUMN  CELL
cf:a    timestamp=1497929482464, value=value1
1 row(s) in 0.0550 seconds
5. Spark check
-
su - hadoop
// submit a job with spark-submit
spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn --deploy-mode client \
    --driver-memory 1G --executor-memory 1G \
    --executor-cores 1 /usr/hdp/2.6.0.3-8/spark/lib/spark-examples-1.6.3.2.6.0.3-8-hadoop2.7.3.2.6.0.3-8.jar 40
The output will contain:
17/06/20 11:48:02 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:36, took 4.955870 s
Pi is roughly 3.1410777852694465
// connect to hive with spark-shell
spark-shell
scala> val result = sqlContext.sql("select * from default.people")
scala> result.show
+---+-----+
| id| name|
+---+-----+
|  1|louis|
+---+-----+