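1. CDH overview
Two Hadoop distributions are in common use today: Apache Hadoop and Cloudera's distribution.
- Apache Hadoop: has a large community and a fast release cadence, but it is less stable and tedious to install and configure, and it is rarely used directly in practice.
- Cloudera Hadoop (CDH): Cloudera's distribution, built on top of Apache Hadoop. It improves component compatibility and the interfaces between components, simplifies installation and configuration, and provides a single management UI.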
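2. Introduction to Cloudera Manager
Cloudera Manager (CM) is an end-to-end application for installing and managing CDH clusters from one place. Besides CM, CDH can also be installed with yum, tarballs, or rpm packages. CM consists of the following parts:
- Server: the core of Cloudera Manager. It hosts the web server and the application logic, and it installs software, applies configuration, starts and stops services, and manages the cluster those services run on.
- Agent: installed on every host. It starts and stops processes, deploys configuration, triggers installations, and monitors the host.
- Database: stores configuration and monitoring information. Typically several logical databases run on one or more database servers; for example, the Cloudera Manager Server and the monitoring daemons use different logical databases.
- Cloudera Repository: the software distribution repository provided by Cloudera Manager.
- Clients: provide an interface for interacting with the Server.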
3. Environment preparation
3.1. Node preparation (four nodes)
Hostname | IP | CM software |
---|---|---|
nn01 | 192.168.18.110 | Cloudera Manager Server&Agent ,MariaDB |
dn01 | 192.168.18.111 | Cloudera Manager Agent |
dn02 | 192.168.18.112 | Cloudera Manager Agent |
dn03 | 192.168.18.113 | Cloudera Manager Agent |
This lab
Two virtual machines are used, each with 4 GB of memory and a 40 GB disk:
Host | IP | CM software |
---|---|---|
cdh01 | 192.168.140.39 | Cloudera Manager Server&Agent ,MariaDB |
cdh02 | 192.168.140.40 | Cloudera Manager Agent |
Network configuration
Static IP settings (every node)
vim /etc/sysconfig/network-scripts/ifcfg-ens33
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens33
UUID=35157ce9-db59-4b7b-a853-73b50b5f6ef8
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.140.39
PREFIX=24
GATEWAY=192.168.140.2
DNS1=192.168.140.2
IPV6_PRIVACY=no
service network restart # restart networking to apply the change
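A quick check before restarting the network can catch a missing key. The sketch below is only an illustration: `check_static_ifcfg` is a hypothetical helper name, and the set of keys it looks for (IPADDR, PREFIX, GATEWAY, DNS1, and ONBOOT=yes) is taken from the sample file above, not from any official requirement.

```shell
#!/bin/sh
# check_static_ifcfg FILE: hypothetical helper, not part of any CentOS tool.
# Prints one line per problem and returns non-zero if anything is missing.
check_static_ifcfg() {
    cfg_file=$1
    problems=0
    for key in IPADDR PREFIX GATEWAY DNS1; do
        if ! grep -q "^${key}=" "$cfg_file"; then
            echo "missing ${key}"
            problems=$((problems + 1))
        fi
    done
    # The interface only comes up at boot when ONBOOT is yes.
    if ! grep -Eq '^ONBOOT="?yes"?$' "$cfg_file"; then
        echo "ONBOOT is not yes"
        problems=$((problems + 1))
    fi
    [ "$problems" -eq 0 ]
}
```

On a node it could be called as `check_static_ifcfg /etc/sysconfig/network-scripts/ifcfg-ens33`.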
3.2. Configure the hostname and hosts resolution (all nodes)
Edit /etc/hostname to set the hostname permanently, and run the hostname command so the change takes effect immediately. Then add the entries below to /etc/hosts.
Steps:
cdh01:
# Set the hostname for the running system
[root@localhost ~]# hostname cdh01
# Edit /etc/hosts
[root@localhost ~]# vim /etc/hosts
192.168.140.39 cdh01
192.168.140.40 cdh02
cdh02:
# Set the hostname for the running system
[root@localhost ~]# hostname cdh02
# Edit /etc/hosts
[root@localhost ~]# vim /etc/hosts
192.168.140.39 cdh01
192.168.140.40 cdh02
# Verify
[root@localhost ~]# hostname
cdh02
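Editing /etc/hosts by hand on every node is easy to get wrong, so here is a minimal sketch of doing it in a repeatable way. `add_host_entry` is a hypothetical helper, and the addresses are the ones from the lab table in 3.1. The demonstration writes to a scratch file, not to /etc/hosts.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME: hypothetical helper.
# Appends "IP NAME" to FILE unless a non-comment line already lists NAME.
add_host_entry() {
    hosts_file=$1; ip=$2; name=$3
    if awk -v n="$name" '
            $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$hosts_file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Demonstration on a scratch file; on a real node, pass /etc/hosts as root.
scratch=$(mktemp)
add_host_entry "$scratch" 192.168.140.39 cdh01
add_host_entry "$scratch" 192.168.140.40 cdh02
add_host_entry "$scratch" 192.168.140.40 cdh02   # repeated call changes nothing
cat "$scratch"
```

Because the helper skips names that are already present, it can be run again safely after adding nodes.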
3.3. Disable the firewall
# Run on cdh01 and cdh02
[root@localhost resource]# systemctl stop firewalld.service # stop it now
[root@localhost resource]# systemctl disable firewalld.service # keep it off after reboot
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
3.4. Disable SELinux
# Run on cdh01 and cdh02
[root@localhost resource]# sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
[root@localhost resource]# setenforce 0
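The sed command above only matches a file that currently says SELINUX=enforcing. As a small sketch of a variant that works whatever the current value is, the hypothetical helper `disable_selinux_in` below rewrites the whole SELINUX= line. It is demonstrated on a scratch file; on a node the argument would be /etc/selinux/config. Note that `setenforce 0` only switches the running system to permissive mode; the disabled setting in the file applies after a reboot.

```shell
#!/bin/sh
# disable_selinux_in FILE: hypothetical helper.
# Rewrites the SELINUX= line to disabled; SELINUXTYPE= is left untouched
# because the pattern requires "=" directly after "SELINUX".
disable_selinux_in() {
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Demonstration on a scratch copy (GNU sed, as shipped with CentOS).
scratch=$(mktemp)
printf 'SELINUX=permissive\nSELINUXTYPE=targeted\n' > "$scratch"
disable_selinux_in "$scratch"
cat "$scratch"
```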
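3.5. Configure time synchronization
NTP is the usual approach.
Option 1: chrony
chrony can act as both a time server and a client. It generally performs better than ntpd and is simpler to configure and manage. This walkthrough, however, does not set up chrony and instead syncs against a public time server from a scheduled job.
- Add the scheduled job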
[root@localhost resource]# echo "$((RANDOM%60)) $((RANDOM%24)) * * * /usr/sbin/ntpdate time1.aliyun.com" >> /var/spool/cron/root
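The command above picks a random minute and hour so that the nodes do not all query the time server at the same moment. `$RANDOM` is a bash feature; the sketch below shows the same idea in a form that also works in a plain POSIX shell. The function name `ntp_cron_line` is hypothetical.

```shell
#!/bin/sh
# ntp_cron_line SERVER: hypothetical helper.
# Prints a crontab line that runs ntpdate once a day at a random minute and hour.
ntp_cron_line() {
    awk -v server="$1" 'BEGIN {
        srand()
        printf "%d %d * * * /usr/sbin/ntpdate %s\n", int(rand() * 60), int(rand() * 24), server
    }'
}

ntp_cron_line time1.aliyun.com
```

To install the job, the output can be appended to root's crontab file in the same way as the original command: `ntp_cron_line time1.aliyun.com >> /var/spool/cron/root`.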
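Option 2: NTP
NTP is an essential service in a cluster: it keeps the clocks of all nodes in step. If the internal network already has a time service, each node only needs an NTP client configured to sync against it. If not, an NTP server has to be set up.
The plan is as follows. When an external time service is reachable, sync directly with it, for example the Asia NTP pool. When it is not reachable, configure cdh1.example.com as the NTP server and have the other cluster nodes sync with it. The NTP examples below use the host names cdh1 and cdh2 and the 192.168.33.x network; substitute your own values (cdh01, cdh02 and 192.168.140.x in this lab).
step1 ntpd service
cdh01, cdh02
# Set the time zone (China Standard Time)
[root@localhost resource]# ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
# Install ntp
[root@localhost resource]# yum -y install ntp
# Check the NTP service; install it first if it is missing
[root@localhost resource]# systemctl status ntpd.service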
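step2 Sync the hardware clock together with the system clock
Important: the hardware clock should be synchronized along with the system time. Edit /etc/sysconfig/ntpd and append SYNC_HWCLOCK=yes at the end.
cdh01, cdh02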
# Command line options for ntpd
#OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"
OPTIONS="-g"
SYNC_HWCLOCK=yes
step3 Add the NTP server list
Edit /etc/ntp/step-tickers:
# List of NTP servers used by the ntpdate service.
#0.centos.pool.ntp.org
cdh1
step4 ntp.conf on the NTP server
Edit /etc/ntp.conf:
cdh01:
driftfile /var/lib/ntp/drift
logfile /var/log/ntp.log
pidfile /var/run/ntpd.pid
leapfile /etc/ntp.leapseconds
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
# Allow clients from any IP to sync time, but not to modify the NTP server's settings; default is similar to 0.0.0.0
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
#restrict 10.135.3.58 nomodify notrap nopeer noquery
# Allow unrestricted access through the local loopback interface
restrict 127.0.0.1
restrict -6 ::1
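# Allow other hosts on the internal network to sync time. Give the network address and netmask. Some clusters have unusual gateways; the commands below show the current settings.
# Inspect the gateway: /etc/sysconfig/network-scripts/ifcfg-INTERFACE; route -n; ip route show
restrict 192.168.33.2 mask 255.255.255.0 nomodify notrap
# Upstream time server that this host syncs from
#server asia.pool.ntp.org minpoll 4 maxpoll 4 prefer
# When no external time server is reachable, serve time from the local clock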
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
step5 ntp.conf on the NTP client
cdh02
driftfile /var/lib/ntp/drift
logfile /var/log/ntp.log
pidfile /var/run/ntpd.pid
leapfile /etc/ntp.leapseconds
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
restrict 127.0.0.1
restrict -6 ::1
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server 192.168.33.3 iburst
step6 Restart the NTP service and synchronize
# Restart the service
systemctl restart ntpd.service
# Enable it at boot
systemctl enable ntpd.service
ntpq -p
#ntpd -q -g
#ss -tunlp | grep -w :123
# Trigger a manual sync
#ntpdate -uv cdh1.example.com
ntpdate -u cdh1.example.com
# Check the sync status; it takes a while before it reports "synchronised"
ntpstat
timedatectl
ntptime
step7 Check the NTP service status
If the output looks like the following, synchronization is working (the status line shows PLL,NANO):
[root@cdh2 ~]# ntptime
ntp_gettime() returns code 0 (OK)
time e0b2b842.b180f51c Fri, Apr 19 2019 11:09:20.333, (.693374110),
maximum error 27426 us, estimated error 0 us, TAI offset 0
ntp_adjtime() returns code 0 (OK)
modes 0x0 (),
offset 0.000 us, frequency 3.932 ppm, interval 1 s,
maximum error 27426 us, estimated error 0 us,
status 0x2001 (PLL,NANO),
time constant 6, precision 0.001 us, tolerance 500 ppm,
Alternatively, check with timedatectl (NTP synchronized: yes means synchronization succeeded):
[root@cdh2 ~]# timedatectl
Local time: Fri 2019-04-19 11:09:20 CST
Universal time: Fri 2019-04-19 11:09:20 UTC
RTC time: Fri 2019-04-19 11:09:20
Time zone: Asia/Shanghai (CST, +0800)
NTP enabled: no
NTP synchronized: yes
RTC in local TZ: no
DST active: n/a
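When this check has to run on several nodes, reading the output by eye gets tedious. A minimal sketch of automating it follows; `is_ntp_synchronized` is a hypothetical helper that reads timedatectl output on standard input and looks for the "NTP synchronized: yes" line shown above. Newer systemd releases label this line differently, so treat the pattern as specific to the CentOS 7 output in this post.

```shell
#!/bin/sh
# is_ntp_synchronized: hypothetical helper.
# Reads timedatectl output on stdin; succeeds only if it reports
# "NTP synchronized: yes" (the CentOS 7 wording).
is_ntp_synchronized() {
    grep -Eq '^[[:space:]]*NTP synchronized:[[:space:]]*yes[[:space:]]*$'
}

# Demonstration with captured text instead of a live timedatectl call.
if printf '      NTP enabled: no\nNTP synchronized: yes\n' | is_ntp_synchronized; then
    echo "clock is synchronized"
fi
```

On a node the call would be `timedatectl | is_ntp_synchronized`.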
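*Install httpd on the master node*
# Check whether httpd is already installed on this CentOS 7 host
rpm -qa|grep httpd
# Install it if it is missing
yum install -y httpd
# Start, stop, or restart the service
systemctl start httpd.service # start
systemctl stop httpd.service # stop
systemctl restart httpd.service # restart
# Enable or disable start at boot
systemctl enable httpd.service # start at boot
systemctl disable httpd.service # do not start at boot
# Check the service status
systemctl status httpd.service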
3.6. Disable transparent huge pages (required by CDH)
# echo never > /sys/kernel/mm/transparent_hugepage/defrag
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
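- Also add the two commands above to /etc/rc.local so they run at every boot. On CentOS 7, /etc/rc.d/rc.local must be executable (chmod +x /etc/rc.d/rc.local) for this to take effect.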
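To confirm the setting took effect, read the two files back: the kernel lists all choices and puts the active one in square brackets, for example `always madvise [never]`. The sketch below extracts the bracketed value; `thp_active_value` is a hypothetical helper name.

```shell
#!/bin/sh
# thp_active_value TEXT: hypothetical helper.
# Prints the choice enclosed in square brackets, i.e. the active setting.
thp_active_value() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Demonstration with a literal string in the kernel's format.
thp_active_value 'always madvise [never]'
```

On a node, pass the file contents: `thp_active_value "$(cat /sys/kernel/mm/transparent_hugepage/enabled)"`.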
3.7. Tune swap usage
# echo "vm.swappiness = 10" >> /etc/sysctl.conf
# sysctl -p
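Appending with `>>` adds a duplicate line every time the step is repeated. As a sketch of an alternative, the hypothetical helper `set_sysctl_value` below replaces an existing entry or appends one if the key is absent. It is demonstrated on a scratch file; on a node the file would be /etc/sysctl.conf, followed by `sysctl -p`.

```shell
#!/bin/sh
# set_sysctl_value FILE KEY VALUE: hypothetical helper.
# Replaces an existing "KEY = ..." line in FILE, or appends one if none exists.
set_sysctl_value() {
    conf_file=$1; key=$2; value=$3
    tmp_file=$(mktemp)
    awk -v k="$key" -v v="$value" '
        {
            split($0, parts, "=")
            name = parts[1]
            gsub(/[[:space:]]/, "", name)
            if (name == k) {
                # Keep only one line for the key, with the new value.
                if (!written) print k " = " v
                written = 1
            } else {
                print $0
            }
        }
        END { if (!written) print k " = " v }
    ' "$conf_file" > "$tmp_file" && cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
}

# Demonstration on a scratch file.
scratch=$(mktemp)
printf 'vm.swappiness = 60\n' > "$scratch"
set_sysctl_value "$scratch" vm.swappiness 10
cat "$scratch"
```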
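3.8. Configure passwordless SSH login
- Run the following on every node (press Enter at each prompt):
# ssh-keygen -t rsa
- Distribute the public key by copying it. Run the following on every node, once per target host:
# ssh-copy-id HOST   # repeat with HOST = nn01, dn01, dn02, dn03
- That is four copies in total; each time, answer yes at the prompt and enter the target node's password.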
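The four copies can be written as a loop. The sketch below only prints the commands so they can be reviewed first; `print_copy_id_commands` is a hypothetical helper, and logging in as root is an assumption based on the root prompts used throughout this post.

```shell
#!/bin/sh
# print_copy_id_commands HOST...: hypothetical helper.
# Prints one ssh-copy-id command per host; nothing is executed.
print_copy_id_commands() {
    for host in "$@"; do
        printf 'ssh-copy-id root@%s\n' "$host"
    done
}

print_copy_id_commands nn01 dn01 dn02 dn03
```

Piping the output to `sh` runs the commands; each one still asks for confirmation and the target node's password.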
4. Install CM and CDH
4.1. Configure the JDK (all nodes); skip this step if you use the default JDK
# 1. Check for preinstalled JDK packages
[root@cdh01 jdk1.8.0_65]# rpm -qa | grep jdk
copy-jdk-configs-3.3-10.el7_5.noarch
java-1.8.0-openjdk-headless-1.8.0.222.b03-1.el7.x86_64
java-1.8.0-openjdk-1.8.0.222.b03-1.el7.x86_64
# If any packages are listed, remove them: rpm -e --nodeps PACKAGE
[root@cdh01 jdk1.8.0_65]# rpm -e --nodeps copy-jdk-configs-3.3-10.el7_5.noarch
[root@cdh01 jdk1.8.0_65]# rpm -e --nodeps java-1.8.0-openjdk-headless-1.8.0.222.b03-1.el7.x86_64
[root@cdh01 jdk1.8.0_65]# rpm -e --nodeps java-1.8.0-openjdk-1.8.0.222.b03-1.el7.x86_64
# 2. Upload and unpack the JDK
# 3. Configure the environment
vim /etc/profile
unset i
unset -f pathmunge
# Append the JDK settings below the existing lines
export JAVA_HOME=/data/software/jdk1.8.0_65
export CLASSPATH=$JAVA_HOME/lib/
export PATH=$PATH:$JAVA_HOME/bin
export PATH JAVA_HOME CLASSPATH
[root@localhost software]# source /etc/profile
# Check the JDK version; the expected output follows the command
[root@cdh01 jdk1.8.0_65]# java -version
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)
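To check the version on several nodes from a script, the version string can be pulled out of this output. Keep in mind that `java -version` writes to standard error, so it has to be redirected before it can be piped. A minimal sketch, with `java_version_from` as a hypothetical helper name:

```shell
#!/bin/sh
# java_version_from: hypothetical helper.
# Reads `java -version` text on stdin and prints the quoted version string
# from the first line that contains one.
java_version_from() {
    sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Demonstration with captured text instead of a live JVM.
printf 'java version "1.8.0_65"\nJava(TM) SE Runtime Environment (build 1.8.0_65-b17)\n' | java_version_from
```

On a node: `java -version 2>&1 | java_version_from`.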
4.2 Install the MySQL database (on cdh01)
# 1. Remove any previously installed MySQL packages
rpm -qa | grep -i mysql
# Remove each package found, for example:
rpm -e --nodeps mysql-libs-5.1.71-1.el6.x86_64
# 2. Upload the bundle, create a directory, and unpack the bundle inside it
[root@cdh01 software]# mkdir mysql
[root@cdh01 software]# cd mysql/
[root@cdh01 mysql]# tar -xvf mysql-5.7.26-1.el7.x86_64.rpm-bundle.tar
# 3. Remove the bundled mariadb-libs package (mariadb-libs-VERSION)
rpm -qa|grep mariadb
rpm -e --nodeps mariadb-libs-5.5.64-1.el7.x86_64
# If it cannot be removed this way, use: yum remove PACKAGE -y
# 4. Install mysql-community-common-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-common-5.7.26-1.el7.x86_64.rpm
# 5. Install mysql-community-libs-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-libs-5.7.26-1.el7.x86_64.rpm
# 6. Install mysql-community-client-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-client-5.7.26-1.el7.x86_64.rpm
# 7. Install libaio
yum install libaio
# 8. Install mysql-community-server-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-server-5.7.26-1.el7.x86_64.rpm
# 9. Initialize the database
mysqld --initialize --user=mysql
# 10. Start mysql
service mysqld start
# 11. Get the initial (temporary) root password
grep 'temporary password' /var/log/mysqld.log
# 12. Change the root password
mysqladmin -u root -p password
# 13. Log in to the database
mysql -uroot -p
# 14. Edit /etc/my.cnf (its default location)
vim /etc/my.cnf
# Contents:
[mysqld]
datadir=/var/lib/mysql # already present
socket=/var/lib/mysql/mysql.sock # already present
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0 # already present
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
# The lab VMs have only 4 GB of RAM; consider a smaller buffer pool there.
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log # already present
pid-file=/var/run/mysqld/mysqld.pid # already present
sql_mode=STRICT_ALL_TABLES
# 15. Enable mysql at boot
systemctl enable mysqld
# 16. Restart mysql
service mysqld restart
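Step 11 above prints the whole log line, and the password is its last field. A small sketch of extracting just the password follows. `extract_temp_password` is a hypothetical helper, and it assumes the generated password contains no spaces; the sample log line in the demonstration is made up for illustration.

```shell
#!/bin/sh
# extract_temp_password: hypothetical helper.
# Reads mysqld log text on stdin and prints the last field of the most
# recent "temporary password" line.
extract_temp_password() {
    grep 'temporary password' | tail -n 1 | awk '{ print $NF }'
}

# Demonstration with an invented log line in the MySQL 5.7 format.
printf '%s\n' '2019-08-01T02:03:04.000000Z 1 [Note] A temporary password is generated for root@localhost: aB3xYz9q' | extract_temp_password
```

On cdh01: `extract_temp_password < /var/log/mysqld.log`.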
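4.4 Install the MySQL JDBC driver (all nodes)
Each node uses the driver to connect to the database.
# 1. Change to the target directory
cd /usr/share/java/
# 2. Rename the driver jar
mv mysql-connector-java-5.1.47.jar mysql-connector-java.jar
# 3. Copy it to the other nodes with scp
scp mysql-connector-java.jar root@192.168.140.40:/usr/share/java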
4.5. Configure CM and CDH
step1. Create databases for the Cloudera components (on cdh01)
Service | Database | User |
---|---|---|
Cloudera Manager Server | scm | scm |
Activity Monitor | amon | amon |
Reports Manager | rman | rman |
Sentry Server | sentry | sentry |
Cloudera Navigator Audit Server | nav | nav |
Cloudera Navigator Metadata Server | navms | navms |
Hive Metastore Server | hive | hive |
Hue | hue | hue |
Oozie | oozie | oozie |
step2. Log in to MySQL and run the following
CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
CREATE DATABASE hive DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
flush privileges;
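The statements above follow one pattern: the database name, the user name, and the password are the same word. A minimal sketch that generates them from a list of names is shown below. `print_db_sql` is a hypothetical helper; reusing the name as the password mirrors this lab setup and is not something to carry into production.

```shell
#!/bin/sh
# print_db_sql NAME...: hypothetical helper.
# Prints CREATE DATABASE and GRANT statements for each name, using the name
# as database, user, and password, followed by FLUSH PRIVILEGES.
print_db_sql() {
    for name in "$@"; do
        printf 'CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n' "$name"
        printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$name" "$name" "$name"
    done
    printf 'FLUSH PRIVILEGES;\n'
}

print_db_sql scm amon rman hue hive sentry nav navms oozie
```

The output can be reviewed and then piped into the client, for example `print_db_sql scm amon | mysql -uroot -p`.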
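Set up the Cloudera Manager database. The script below is installed by the cloudera-manager-server package (step4), so run it once that package is in place.
# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
When prompted, enter the password of the scm database user:
scm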
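step3. Configure the Cloudera Manager repository (all nodes)
Once CM is installed, CDH can be installed through CM to build the big-data platform. For that, the CDH parcel files must first be available on the CM server. To speed up the installation, download the required packages in advance, or set up a private CDH repository.
# Create a directory for the packages
mkdir -p /data/clouderaManager
# Place these files in it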
cloudera-manager-agent-6.2.0-968826.el7.x86_64.rpm
cloudera-manager-server-6.2.0-968826.el7.x86_64.rpm
cloudera-manager-daemons-6.2.0-968826.el7.x86_64.rpm
step4. Install CM Server and Agent
cdh01:
# yum localinstall cloudera-manager-daemons-6.2.0-968826.el7.x86_64.rpm -y
# yum localinstall cloudera-manager-agent-6.2.0-968826.el7.x86_64.rpm -y
# yum localinstall cloudera-manager-server-6.2.0-968826.el7.x86_64.rpm -y
cdh[02-03]:
# yum localinstall cloudera-manager-daemons-6.2.0-968826.el7.x86_64.rpm -y
# yum localinstall cloudera-manager-agent-6.2.0-968826.el7.x86_64.rpm -y
Problem encountered and its fix
2:postfix-2.10.1-6.el7.x86_64 has missing requires of libmysqlclient.so.18()(64bit)
2:postfix-2.10.1-6.el7.x86_64 has missing requires of libmysqlclient.so.18(libmysqlclient_18)(64bit)
The key part is: libmysqlclient.so.18()(64bit)
Fix: the package Percona-XtraDB-Cluster-shared-55-5.5.37-25.10.756.el6.x86_64.rpm is missing. Download and install it:
# wget http://www.percona.com/redir/downloads/Percona-XtraDB-Cluster/5.5.37-25.10/RPM/rhel6/x86_64/Percona-XtraDB-Cluster-shared-55-5.5.37-25.10.756.el6.x86_64.rpm
# rpm -ivh Percona-XtraDB-Cluster-shared-55-5.5.37-25.10.756.el6.x86_64.rpm
step5. Install CDH (on cdh01)
# 1. Create the directory and upload the files
[root@localhost software]# mkdir -p /opt/cloudera/parcel-repo
Upload these files:
CDH-6.2.0-1.cdh6.2.0.p0.967373-el7.parcel
manifest.json
# 2. Generate the sha file (run inside /opt/cloudera/parcel-repo)
# sha1sum CDH-6.2.0-1.cdh6.2.0.p0.967373-el7.parcel | awk '{ print $1 }' > CDH-6.2.0-1.cdh6.2.0.p0.967373-el7.parcel.sha
# 3. Change the owner and group
chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo/*
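The sha file must contain only the hex digest, which is why the command in step 2 pipes sha1sum through awk. A minimal sketch that wraps this for any parcel file follows; `write_parcel_sha` is a hypothetical helper, and the demonstration uses a tiny scratch file in place of a real parcel.

```shell
#!/bin/sh
# write_parcel_sha PARCEL: hypothetical helper.
# Writes PARCEL.sha containing only the SHA-1 hex digest of PARCEL.
write_parcel_sha() {
    sha1sum "$1" | awk '{ print $1 }' > "$1.sha"
}

# Demonstration on a scratch file instead of a real parcel.
scratch=$(mktemp)
printf 'abc' > "$scratch"
write_parcel_sha "$scratch"
cat "$scratch.sha"
```

For the real parcel, run it in /opt/cloudera/parcel-repo: `write_parcel_sha CDH-6.2.0-1.cdh6.2.0.p0.967373-el7.parcel`.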
step6. Start Cloudera Manager Server (on cdh01)
# systemctl start cloudera-scm-server
If anything goes wrong during startup, check the log.
# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
The CDH web UI steps are covered in the next blog post.