Building a Big Data Platform with CentOS 7.5 + CDH 6.2

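1. CDH overview

Two Hadoop distributions are currently in wide use: Apache and Cloudera.

  • Apache Hadoop: has a large community and a fast release cadence, but it is less stable, tedious to install and configure, and has comparatively few production users.

  • Cloudera Hadoop (CDH): Cloudera's distribution, built on top of Apache Hadoop. It improves component compatibility and interfaces, simplifies installation and configuration, and provides a unified management UI.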

 

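2. Introduction to Cloudera Manager

Cloudera Manager (CM) is an end-to-end application for installing and managing CDH clusters from one place. Besides CM, CDH can also be installed with yum, tar, or rpm. Cloudera Manager consists mainly of the following parts:

  • Server:

    The core of Cloudera Manager. It hosts the web server and the application logic, and is responsible for installing software, configuring, starting and stopping services, and managing the cluster on which the services run.

  • Agent: installed on every host. It starts and stops processes, deploys configuration, triggers installations, and monitors the host.

  • Database: stores configuration and monitoring information. Typically several logical databases run on one or more database servers; for example, the Cloudera Manager Server and the monitoring daemons use different logical databases.

  • Cloudera Repository: the software distribution repository provided by Cloudera Manager.

  • Clients: provide an interface for interacting with the Server.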

 

3. Environment preparation

3.1. Node preparation (four nodes)

Hostname  IP              CM software
nn01      192.168.18.110  Cloudera Manager Server & Agent, MariaDB
dn01      192.168.18.111  Cloudera Manager Agent
dn02      192.168.18.112  Cloudera Manager Agent
dn03      192.168.18.113  Cloudera Manager Agent

This walkthrough

Prepare two virtual machines, each with 4 GB of memory and a 40 GB disk:

Hostname  IP              CM software
cdh01     192.168.140.39  Cloudera Manager Server & Agent, MariaDB
cdh02     192.168.140.40  Cloudera Manager Agent
 

Network configuration

Set a static IP (on every node)

vim /etc/sysconfig/network-scripts/ifcfg-ens33

TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens33
UUID=35157ce9-db59-4b7b-a853-73b50b5f6ef8
DEVICE=ens33
ONBOOT=yes
# Address of cdh01; use 192.168.140.40 on cdh02
IPADDR=192.168.140.39
PREFIX=24
GATEWAY=192.168.140.2
DNS1=192.168.140.2
IPV6_PRIVACY=no

service network restart # restart networking to apply the change
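
A quick sanity check before restarting the network can catch the most common mistakes in a file like the one above. The following is a minimal sketch, not part of the original procedure: the function name check_static_ifcfg is hypothetical, and it only relies on the ifcfg key names shown in the listing.

```shell
#!/usr/bin/env bash
# Report problems in an ifcfg-style file that would break a static-IP setup.
# Usage: check_static_ifcfg /etc/sysconfig/network-scripts/ifcfg-ens33
check_static_ifcfg() {
    local file="$1" status=0 key
    # Each of these keys must be present for a usable static address.
    for key in IPADDR GATEWAY DNS1; do
        if ! grep -q "^${key}=" "$file"; then
            echo "missing ${key}"
            status=1
        fi
    done
    # The interface has to be brought up at boot.
    if ! grep -Eq '^ONBOOT="?yes"?$' "$file"; then
        echo "ONBOOT is not yes"
        status=1
    fi
    # DHCP would override the static address.
    if grep -Eq '^BOOTPROTO="?dhcp"?$' "$file"; then
        echo "BOOTPROTO is dhcp"
        status=1
    fi
    return "$status"
}
```

The function prints nothing and returns 0 when the file looks consistent, so it can be used as a guard before the restart command.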

 

3.2. Configure hostnames and hosts resolution (all nodes)

Edit /etc/hostname to change the hostname, and run the hostname command so that the change takes effect immediately. Then edit /etc/hosts and add the entries below.

Steps:

cdh01:
# Change the hostname for the current session
[root@localhost ~]# hostname cdh01

# Edit /etc/hosts
[root@localhost ~]# vim /etc/hosts
192.168.140.39 cdh01
192.168.140.40 cdh02

cdh02:
# Change the hostname for the current session
[root@localhost ~]# hostname cdh02

# Edit /etc/hosts
[root@localhost ~]# vim /etc/hosts
192.168.140.39 cdh01
192.168.140.40 cdh02

# Verify
[root@localhost ~]# hostname
cdh02
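
Editing /etc/hosts by hand on every node is easy to get wrong. As a sketch only, the entries could instead be appended by a small helper that skips lines that already exist; add_host_entry is a hypothetical name, and the IP addresses are taken from the node table in section 3.1.

```shell
#!/usr/bin/env bash
# Append "IP NAME" to a hosts file unless the exact entry already exists.
# Usage: add_host_entry FILE IP NAME
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    # -x matches the whole line, -F treats the entry as a fixed string.
    if ! grep -qxF "${ip} ${name}" "$file" 2>/dev/null; then
        echo "${ip} ${name}" >> "$file"
    fi
}

# Example (run as root on every node):
#   add_host_entry /etc/hosts 192.168.140.39 cdh01
#   add_host_entry /etc/hosts 192.168.140.40 cdh02
```

Because the helper is idempotent, running it twice leaves a single entry per host.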
​

3.3. Disable the firewall

# Run on cdh01 and cdh02
[root@localhost resource]# systemctl stop firewalld.service          # stop it now
[root@localhost resource]# systemctl disable firewalld.service       # keep it disabled after a reboot
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.

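3.4. Disable SELinux

# Run on cdh01 and cdh02
# Permanent change, effective after the next reboot
[root@localhost resource]# sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
# Switch to permissive mode for the current boot
[root@localhost resource]# setenforce 0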
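
The sed command above only rewrites the file when the current setting is exactly SELINUX=enforcing; a host already set to permissive is left unchanged. A minimal sketch of a variant that sets the value regardless of the current mode follows. The function name disable_selinux_in_config is hypothetical.

```shell
#!/usr/bin/env bash
# Force SELINUX=disabled in an SELinux config file, whatever the current mode.
# Usage: disable_selinux_in_config [FILE]   (defaults to /etc/selinux/config)
disable_selinux_in_config() {
    local file="${1:-/etc/selinux/config}"
    # Only the SELINUX= line is rewritten; SELINUXTYPE= and comments are kept.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$file"
    # Print the resulting setting so the change can be confirmed.
    grep '^SELINUX=' "$file"
}
```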

 

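3.5. Configure time synchronization

NTP is the usual approach.

Method 1: chrony

chrony can act as both a time server and a client. It performs considerably better than ntp and is simple to configure and manage. In this walkthrough, however, we synchronize against a public time server with a scheduled cron job instead.
  • Add the cron job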

[root@localhost resource]# echo "$((RANDOM%60)) $((RANDOM%24)) * * * /usr/sbin/ntpdate time1.aliyun.com" >> /var/spool/cron/root
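
The cron line above picks a random minute and hour so that the nodes do not all query the time server at the same moment. The sketch below isolates that line in a function so it can be inspected before being appended to the crontab; make_ntp_cron_line is a hypothetical name and the default server is the one used above.

```shell
#!/usr/bin/env bash
# Build a crontab line that runs ntpdate once a day at a random time.
# Usage: make_ntp_cron_line [SERVER]
make_ntp_cron_line() {
    local server="${1:-time1.aliyun.com}"
    # RANDOM is a bash built-in returning 0..32767; the modulo keeps the
    # minute within 0-59 and the hour within 0-23.
    echo "$((RANDOM % 60)) $((RANDOM % 24)) * * * /usr/sbin/ntpdate ${server}"
}

# Example (as root):
#   make_ntp_cron_line >> /var/spool/cron/root
```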

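Method 2: NTP

NTP is a very important service in a cluster: it keeps the clocks of all nodes in step. If the internal network already has a time service, each node only needs an NTP client configuration pointing at it. If not, we have to set up an NTP server ourselves.

The plan is as follows. When a public time service is reachable, synchronize directly with it, for example the Asia NTP pool. When it is not reachable, configure cdh01 as the NTP server and have the other nodes in the cluster synchronize with it.

step1 ntpd service

cdh01, cdh02
# Set the time zone to China Standard Time
[root@localhost resource]# ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
# Install ntp
[root@localhost resource]# yum -y install ntp

# Check the NTP service (install it first if it is missing)
[root@localhost resource]# systemctl status ntpd.service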

 

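step2 Sync the hardware clock with the system time

Important: keep the hardware clock in sync with the system time. Edit the configuration file (vim /etc/sysconfig/ntpd) and append SYNC_HWCLOCK=yes at the end.

cdh01, cdh02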
# Command line options for ntpd
#OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"
OPTIONS="-g"
SYNC_HWCLOCK=yes

step3 Add the NTP server list

Edit the file (vim /etc/ntp/step-tickers)

# List of NTP servers used by the ntpdate service.

#0.centos.pool.ntp.org
cdh01

step4 ntp.conf on the NTP server

Edit the ntp configuration file (vim /etc/ntp.conf)

cdh01:
driftfile /var/lib/ntp/drift
logfile /var/log/ntp.log
pidfile   /var/run/ntpd.pid
leapfile  /etc/ntp.leapseconds
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
# Allow clients from any IP to sync time, but not to modify the server's settings; "default" is similar to 0.0.0.0
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
#restrict 10.135.3.58 nomodify notrap nopeer noquery
# Allow all access over the local loopback interface
restrict 127.0.0.1
restrict  -6 ::1
# Allow other machines on the internal network to sync time (network address and netmask). Some clusters use an unusual gateway; check it with the commands below
# Show gateway information: /etc/sysconfig/network-scripts/ifcfg-<interface>; route -n; ip route show
restrict 192.168.140.0 mask 255.255.255.0 nomodify notrap
# Allow the upstream time server to adjust the local clock
#server asia.pool.ntp.org minpoll 4 maxpoll 4 prefer
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
# Fall back to the local clock as the time source when no external time server is reachable
server  127.127.1.0     # local clock
fudge   127.127.1.0 stratum 10


step5 ntp.conf on the NTP client

CDH02
driftfile /var/lib/ntp/drift
logfile /var/log/ntp.log
pidfile   /var/run/ntpd.pid
leapfile  /etc/ntp.leapseconds
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
restrict 127.0.0.1
restrict -6 ::1
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
# cdh01, the NTP server configured in step4
server 192.168.140.39 iburst
​

step6 Restart the NTP service and synchronize

# Restart the service
systemctl restart ntpd.service
# Start on boot
systemctl enable ntpd.service

ntpq -p
#ntpd -q -g
#ss -tunlp | grep -w :123
# Trigger a sync manually
#ntpdate -uv cdh01
ntpdate -u  cdh01

# Check the sync status. It takes a while before the status changes to synchronised
ntpstat
timedatectl
ntptime


step7 Check the NTP service status

Output like the following means synchronization is working (the status shows PLL,NANO):

[root@cdh2 ~]# ntptime
ntp_gettime() returns code 0 (OK)
  time e0b2b842.b180f51c  Fri, Apr 19 2019  11:09:20.333, (.693374110),
  maximum error 27426 us, estimated error 0 us, TAI offset 0
ntp_adjtime() returns code 0 (OK)
  modes 0x0 (),
  offset 0.000 us, frequency 3.932 ppm, interval 1 s,
  maximum error 27426 us, estimated error 0 us,
  status 0x2001 (PLL,NANO),
  time constant 6, precision 0.001 us, tolerance 500 ppm,
​

Alternatively, check with the timedatectl command (NTP synchronized: yes means the sync succeeded):

[root@cdh2 ~]#  timedatectl
      Local time: Fri 2019-04-19 11:09:20 CST
  Universal time: Fri 2019-04-19 11:09:20 UTC
        RTC time: Fri 2019-04-19 11:09:20
       Time zone: Asia/Shanghai (CST, +0800)
     NTP enabled: no
NTP synchronized: yes
 RTC in local TZ: no
      DST active: n/a
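
Since synchronization takes a while to settle, the check can be scripted rather than repeated by hand. The sketch below tests timedatectl-style text for the "NTP synchronized: yes" line shown above. The function name ntp_is_synchronized is hypothetical, and the label is assumed to match the CentOS 7 output in this post; other systemd versions may word it differently.

```shell
#!/usr/bin/env bash
# Succeed when timedatectl-style text on stdin reports a synchronized clock.
ntp_is_synchronized() {
    grep -Eq '^[[:space:]]*NTP synchronized:[[:space:]]*yes[[:space:]]*$'
}

# Example: poll a live host for up to 5 minutes
#   for i in $(seq 1 30); do
#       timedatectl | ntp_is_synchronized && break
#       sleep 10
#   done
```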
​

Install httpd on the master node

# Check whether httpd is already installed on this CentOS 7 host
rpm -qa|grep httpd

# Install it if it is missing
yum install -y httpd

# Start, stop, or restart the service
systemctl start httpd.service #start
systemctl stop httpd.service #stop
systemctl restart httpd.service #restart

# Enable or disable start on boot
systemctl enable httpd.service #start on boot
systemctl disable httpd.service #do not start on boot

# Check the service status
systemctl status httpd.service

 

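3.6. Disable transparent huge pages (required by CDH)

# echo never > /sys/kernel/mm/transparent_hugepage/defrag
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
  • Also add the two commands above to /etc/rc.local so that they run at boot. On CentOS 7 the file must be executable (chmod +x /etc/rc.d/rc.local) for this to take effect.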
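
To confirm that the setting took effect, read the sysfs files back: the kernel marks the active value with square brackets, for example "always madvise [never]". A small sketch that extracts the bracketed value follows; thp_active_value is a hypothetical name.

```shell
#!/usr/bin/env bash
# Print the active value from a sysfs line such as "always madvise [never]".
thp_active_value() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Example on a live host:
#   thp_active_value < /sys/kernel/mm/transparent_hugepage/enabled
#   thp_active_value < /sys/kernel/mm/transparent_hugepage/defrag
```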

     

3.7. Tune swap usage

# echo "vm.swappiness = 10" >> /etc/sysctl.conf
# sysctl -p
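
Appending with echo adds a duplicate line each time the step is repeated. As a sketch under that observation, the helper below replaces an existing entry and only appends when the key is absent. The name set_sysctl_value is hypothetical, and GNU sed (as shipped with CentOS) is assumed for the -i option.

```shell
#!/usr/bin/env bash
# Set "KEY = VALUE" in a sysctl-style file, replacing an existing entry
# instead of appending a duplicate.
# Usage: set_sysctl_value FILE KEY VALUE
set_sysctl_value() {
    local file="$1" key="$2" value="$3" key_re pattern
    # Escape the dots in the key so they match literally in the regex.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    pattern="^[[:space:]]*${key_re}[[:space:]]*="
    if grep -Eq "$pattern" "$file" 2>/dev/null; then
        sed -i -E "s|${pattern}.*|${key} = ${value}|" "$file"
    else
        echo "${key} = ${value}" >> "$file"
    fi
}

# Example (as root):
#   set_sysctl_value /etc/sysctl.conf vm.swappiness 10 && sysctl -p
```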

 

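3.8. Configure passwordless SSH login

  • Run the following command on every node (press Enter at each prompt to accept the defaults):

# ssh-keygen -t rsa
  • Distribute the public key by copying it. Run the following on every node, once for each target host:

# ssh-copy-id [nn01,dn01-dn03]
  • That is four copies in total; for each one, answer yes at the prompt and enter the target node's password.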
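
The bracket notation above stands for one ssh-copy-id call per host. A minimal sketch that expands it into explicit commands is shown below. It only prints the commands (a dry run) so they can be reviewed first; print_ssh_copy_commands is a hypothetical name, and the root user is an assumption based on the prompts in this post.

```shell
#!/usr/bin/env bash
# Print one ssh-copy-id command per target host (dry run).
# Usage: print_ssh_copy_commands USER HOST...
print_ssh_copy_commands() {
    local user="$1" host
    shift
    for host in "$@"; do
        echo "ssh-copy-id ${user}@${host}"
    done
}

# Example for the four-node layout:
#   print_ssh_copy_commands root nn01 dn01 dn02 dn03
```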

4. Install CM and CDH

 

4.1. Configure the JDK (all nodes): skip this step if you keep the default JDK

# 1. Check which JDK packages are already installed
[root@cdh01 jdk1.8.0_65]# rpm -qa | grep jdk
copy-jdk-configs-3.3-10.el7_5.noarch
java-1.8.0-openjdk-headless-1.8.0.222.b03-1.el7.x86_64
java-1.8.0-openjdk-1.8.0.222.b03-1.el7.x86_64
# Remove every package listed: rpm -e --nodeps <package name>
[root@cdh01 jdk1.8.0_65]# rpm -e --nodeps copy-jdk-configs-3.3-10.el7_5.noarch
[root@cdh01 jdk1.8.0_65]# rpm -e --nodeps java-1.8.0-openjdk-headless-1.8.0.222.b03-1.el7.x86_64
[root@cdh01 jdk1.8.0_65]# rpm -e --nodeps java-1.8.0-openjdk-1.8.0.222.b03-1.el7.x86_64

# 2. Upload and extract the Oracle JDK

# 3. Configure the environment
vim /etc/profile
unset i
unset -f pathmunge
# Append the JDK settings below
export JAVA_HOME=/data/software/jdk1.8.0_65
export CLASSPATH=$JAVA_HOME/lib/
export PATH=$PATH:$JAVA_HOME/bin
export PATH JAVA_HOME CLASSPATH
[root@localhost software]# source /etc/profile

# Check the JDK version
[root@cdh01 jdk1.8.0_65]# java -version
java version "1.8.0_65"  # expected output
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)

 

4.2. Install the MySQL database (on the cdh01 node)

# 1. Remove any previously installed MySQL packages
rpm -qa | grep -i mysql
# Remove each package found, for example:
rpm -e --nodeps mysql-libs-5.1.71-1.el6.x86_64

# 2. Upload and extract: create a directory and extract the bundle into it
[root@cdh01 software]# mkdir mysql
[root@cdh01 software]# cd mysql/
[root@cdh01 mysql]# tar -xvf mysql-5.7.26-1.el7.x86_64.rpm-bundle.tar

# 3. Remove the bundled mariadb-libs package
rpm -qa|grep mariadb
# Remove the package reported above (rpm -e --nodeps mariadb-libs-<version>), for example:
rpm -e --nodeps mariadb-libs-5.5.64-1.el7.x86_64

# If rpm cannot remove it, use: yum remove <package name> -y

# 4. Install mysql-community-common-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-common-5.7.26-1.el7.x86_64.rpm

# 5. Install mysql-community-libs-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-libs-5.7.26-1.el7.x86_64.rpm

# 6. Install mysql-community-client-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-client-5.7.26-1.el7.x86_64.rpm

# 7. Install libaio
yum install -y libaio

# 8. Install mysql-community-server-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-server-5.7.26-1.el7.x86_64.rpm

# 9. Initialize the database
mysqld --initialize --user=mysql

# 10. Start MySQL
service mysqld start

# 11. Get the temporary root password generated during initialization
grep 'temporary password' /var/log/mysqld.log

# 12. Change the MySQL root password
mysqladmin -u root -p password

# 13. Log in to the database
mysql -uroot -p
​
# 14. Edit /etc/my.cnf (the default location)
vim /etc/my.cnf
# Contents:
[mysqld]
# datadir and socket are already present in the default file
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line (already present in the default file):
symbolic-links = 0

key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1

max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M

#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log

#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1

binlog_format = mixed

read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M

# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
# The test VMs in this walkthrough have only 4 GB of memory; lower this value on such hosts
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M

[mysqld_safe]
# log-error and pid-file are already present in the default file
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

sql_mode=STRICT_ALL_TABLES

# 15. Enable MySQL at boot
systemctl enable mysqld

# 16. Restart MySQL
service mysqld restart
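
Step 11 prints the whole log line, with the temporary password at the end after "root@localhost:". A minimal sketch for pulling out just the password, for instance to paste it at the step 12 prompt, is shown below; extract_temp_password is a hypothetical name.

```shell
#!/usr/bin/env bash
# Print the temporary root password found in MySQL 5.7 log text on stdin.
# The password is the last whitespace-separated field of the matching line;
# if the log contains several such lines, the most recent one is used.
extract_temp_password() {
    grep 'temporary password' | tail -n 1 | awk '{ print $NF }'
}

# Example on a live host:
#   extract_temp_password < /var/log/mysqld.log
```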

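4.4. Install the MySQL JDBC driver (all nodes)

The driver lets each node connect to the database.

# 1. Change to the target directory
cd /usr/share/java/
# 2. Rename the extracted driver jar
mv mysql-connector-java-5.1.47.jar mysql-connector-java.jar
# 3. Copy it to the other nodes with scp
scp mysql-connector-java.jar root@192.168.140.40:/usr/share/java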

 

4.5. Configure CM and CDH

step1. Create databases for the Cloudera components (on the cdh01 node)

Service                             Database  User
Cloudera Manager Server             scm       scm
Activity Monitor                    amon      amon
Reports Manager                     rman      rman
Sentry Server                       sentry    sentry
Cloudera Navigator Audit Server     nav       nav
Cloudera Navigator Metadata Server  navms     navms
Hive Metastore Server               hive      hive
Hue                                 hue       hue
Oozie                               oozie     oozie

step2. Log in to MySQL and create the databases and users

CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
CREATE DATABASE hive DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
flush privileges; 
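
The statements above follow one pattern per service, so they can also be generated instead of typed. The sketch below reproduces that pattern, including the choice of using the database name as both user name and password, which is only suitable for a test setup. The function name gen_cdh_db_sql is hypothetical.

```shell
#!/usr/bin/env bash
# Emit CREATE DATABASE / GRANT statements for each database name given.
# As in the listing above, user name and password equal the database name.
gen_cdh_db_sql() {
    local db
    for db in "$@"; do
        echo "CREATE DATABASE ${db} DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;"
        echo "GRANT ALL ON ${db}.* TO '${db}'@'%' IDENTIFIED BY '${db}';"
    done
    echo "FLUSH PRIVILEGES;"
}

# Example:
#   gen_cdh_db_sql scm amon rman hue hive sentry nav navms oozie | mysql -uroot -p
```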

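Set up the Cloudera Manager database

This script is installed with the cloudera-manager-server package, so run it once the CM Server has been installed in step4.

# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
When prompted, enter the password of the scm database user:
scm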

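step3. Configure the Cloudera Manager repository (all nodes)

Once CM is installed, CDH can be installed through CM to build the enterprise big data platform, so the CDH parcels first have to be downloaded to the CM server. To speed up the installation, download the required packages in advance, or set up a private CDH repository.

# Create a directory for the packages
mkdir -p /data/clouderaManager
# Place the following packages in it
cloudera-manager-agent-6.2.0-968826.el7.x86_64.rpm
cloudera-manager-server-6.2.0-968826.el7.x86_64.rpm
cloudera-manager-daemons-6.2.0-968826.el7.x86_64.rpm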
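
Before moving on, it can help to confirm that all three packages actually arrived in the directory. The following sketch lists any that are missing; missing_cm_rpms is a hypothetical name, and the version string is the one from the file names above.

```shell
#!/usr/bin/env bash
# List the expected Cloudera Manager RPMs that are missing from a directory.
# Usage: missing_cm_rpms DIR
missing_cm_rpms() {
    local dir="$1" version="6.2.0-968826.el7.x86_64" name file
    for name in daemons agent server; do
        file="cloudera-manager-${name}-${version}.rpm"
        # Print the file name only when it is not present in DIR.
        [ -f "${dir}/${file}" ] || echo "$file"
    done
}

# Example:
#   missing_cm_rpms /data/clouderaManager
```

An empty result means the directory is complete.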

 

step4. Install the CM Server and Agent

cdh01:
# yum localinstall cloudera-manager-daemons-6.2.0-968826.el7.x86_64.rpm -y
# yum localinstall cloudera-manager-agent-6.2.0-968826.el7.x86_64.rpm -y
# yum localinstall cloudera-manager-server-6.2.0-968826.el7.x86_64.rpm -y

cdh02 (and any other agent nodes):
# yum localinstall cloudera-manager-daemons-6.2.0-968826.el7.x86_64.rpm -y
# yum localinstall cloudera-manager-agent-6.2.0-968826.el7.x86_64.rpm -y
​

 

Problems encountered and how to fix them

2:postfix-2.10.1-6.el7.x86_64 has missing requires of libmysqlclient.so.18()(64bit)
2:postfix-2.10.1-6.el7.x86_64 has missing requires of libmysqlclient.so.18(libmysqlclient_18)(64bit)
Key point: libmysqlclient.so.18()(64bit)
Fix: the package Percona-XtraDB-Cluster-shared-55-5.5.37-25.10.756.el6.x86_64.rpm is missing. Download and install it:

# wget http://www.percona.com/redir/downloads/Percona-XtraDB-Cluster/5.5.37-25.10/RPM/rhel6/x86_64/Percona-XtraDB-Cluster-shared-55-5.5.37-25.10.756.el6.x86_64.rpm
# rpm -ivh Percona-XtraDB-Cluster-shared-55-5.5.37-25.10.756.el6.x86_64.rpm
 
​

step5. Install CDH (on the cdh01 node)

# 1. Create the directory and upload the files
[root@localhost software]# mkdir -p /opt/cloudera/parcel-repo
[root@localhost software]# cd /opt/cloudera/parcel-repo
# Upload these files:
CDH-6.2.0-1.cdh6.2.0.p0.967373-el7.parcel
manifest.json
# 2. Generate the .sha file
# sha1sum CDH-6.2.0-1.cdh6.2.0.p0.967373-el7.parcel | awk '{ print $1 }' > CDH-6.2.0-1.cdh6.2.0.p0.967373-el7.parcel.sha

# 3. Change the owner and group
chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo/*
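
The .sha file has to contain only the SHA-1 hex digest of the parcel, which is why the command above pipes sha1sum through awk. As a sketch, the same step can be wrapped in a function that also re-checks the stored digest; write_parcel_sha is a hypothetical name.

```shell
#!/usr/bin/env bash
# Write PARCEL.sha containing only the SHA-1 hex digest of PARCEL, then
# confirm that the stored digest matches a freshly computed one.
# Usage: write_parcel_sha PARCEL
write_parcel_sha() {
    local parcel="$1"
    sha1sum "$parcel" | awk '{ print $1 }' > "${parcel}.sha"
    [ "$(cat "${parcel}.sha")" = "$(sha1sum "$parcel" | awk '{ print $1 }')" ]
}

# Example:
#   cd /opt/cloudera/parcel-repo
#   write_parcel_sha CDH-6.2.0-1.cdh6.2.0.p0.967373-el7.parcel
```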
​

step6. Start the Cloudera Manager Server (on the cdh01 node)

# systemctl start cloudera-scm-server

If anything goes wrong during startup, check the log.

# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
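
When the log is long, counting the error lines first gives a quick picture before reading it in full. The sketch below assumes log4j-style lines in which the level is the third whitespace-separated field (date, time, level); that layout is an assumption about the log, and count_log_errors is a hypothetical name.

```shell
#!/usr/bin/env bash
# Count ERROR-level lines in log text on stdin.
# Assumes lines shaped like "<date> <time> <LEVEL> <message>".
count_log_errors() {
    awk '$3 == "ERROR" { n++ } END { print n + 0 }'
}

# Example on a live host:
#   count_log_errors < /var/log/cloudera-scm-server/cloudera-scm-server.log
```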

 

For the CDH web UI steps, see the next blog post.
