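1. Downloading the installation files
Download the Cloudera Manager files. The files outlined in red in the screenshot do not need to be downloaded.
Also download the allkeys.asc file, as shown:
Download the CDH6 files that match your operating system:
2. System configuration
2.1 Environment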
CentOS7.5.1804 3.10.0-862.el7.x86_64
2.2 Network configuration
Configure /etc/hostname and /etc/hosts on all nodes.
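As a rough sketch of what the /etc/hosts entries might look like, the helper below prints one line per node. The IP addresses and host names (cdh01.example.com and so on) are placeholders, not values from this setup.

```shell
#!/bin/sh
# Print /etc/hosts-style lines from "ip fqdn" pairs read on stdin.
# All addresses and names used with it are hypothetical placeholders.
render_hosts() {
    while read -r ip fqdn; do
        # Skip blank lines; derive the short name from the FQDN.
        [ -n "$ip" ] || continue
        printf '%s %s %s\n' "$ip" "$fqdn" "${fqdn%%.*}"
    done
}

# Example: review the output, then append it to /etc/hosts on every node.
render_hosts <<'EOF'
192.168.1.11 cdh01.example.com
192.168.1.12 cdh02.example.com
EOF
```

Every node should end up with the same set of entries, so that each host name resolves identically across the cluster.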
2.3 Disabling SELinux
On all nodes, edit /etc/selinux/config as follows:
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
# SELINUX=enforcing
SELINUX=disabled
# SELINUXTYPE= can take one of three two values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
Run sestatus to check that the change took effect.
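Note that a change to /etc/selinux/config only applies from the next boot. As a small sketch, the helper below (a hypothetical name, not part of any package) reads the configured value back from the file, and the commented commands show how the running system can be switched to permissive mode until the reboot.

```shell
#!/bin/sh
# Print the value of the active SELINUX= line in a config file
# (commented-out lines and SELINUXTYPE= are ignored).
selinux_configured() {
    sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "$1" | tail -n 1
}

# On a real node (as root):
#   selinux_configured /etc/selinux/config   # expect "disabled"
#   setenforce 0    # permissive until the next reboot
#   sestatus
```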
2.4 Disabling the firewall
Run the following on all nodes:
systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld
2.5 Edit /etc/security/limits.conf on all nodes
Set the following:
* soft nofile 65535
* hard nofile 1029345
* soft nproc unlimited
* hard nproc unlimited
* soft memlock unlimited
* hard memlock unlimited
2.6 Cluster clock synchronization
On all nodes, remove chrony: yum -y remove chrony
On all nodes, install ntp: yum install ntp -y
Edit /etc/ntp.conf:
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server 0.pool.ntp.org
server 1.pool.ntp.org
server 2.pool.ntp.org
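Start the service: systemctl start ntpd
Enable it at boot: systemctl enable ntpd
Verify clock synchronization: run ntpq -p on every node. An asterisk (*) at the start of a line marks the peer the node is synchronized to.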
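The check above can be scripted. This is a minimal sketch with a hypothetical helper name; it only looks for a line beginning with "*" in the ntpq -p output.

```shell
#!/bin/sh
# Read `ntpq -p` output on stdin; succeed if some peer line starts with "*",
# which marks the peer currently selected for synchronization.
ntp_synced() {
    grep -q '^\*'
}

# On a real node:
#   ntpq -p | ntp_synced && echo "synced" || echo "not synced yet"
```

Right after ntpd starts it can take several minutes before a peer is selected, so a negative result at first is not necessarily an error.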
2.7 Setting swappiness
Run on all nodes:
echo "vm.swappiness = 10" >> /etc/sysctl.conf
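Appending with echo adds a duplicate line each time it is run, and the new value is not loaded until the file is re-read. A possible alternative, sketched here with a hypothetical helper, replaces any existing line for the key; sysctl -p then applies the file to the running kernel.

```shell
#!/bin/sh
# Set "key = value" in a sysctl-style file, replacing an existing line for
# the same key instead of appending a duplicate.
set_sysctl_line() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    # Keep every line that does not assign this key, then add the new one.
    grep -v "^[[:space:]]*${key}[[:space:]]*=" "$file" > "$tmp"
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# On a real node (as root):
#   set_sysctl_line /etc/sysctl.conf vm.swappiness 10
#   sysctl -p
```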
2.8 Disabling transparent huge pages
Run the following on all nodes:
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
Add the following lines to /etc/rc.d/rc.local so the setting survives a reboot:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
Then run: chmod +x /etc/rc.d/rc.local
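To confirm the setting, the kernel files can be read back: the active choice is the word in square brackets. The helper name below is hypothetical, and this is only a sketch of such a check.

```shell
#!/bin/sh
# Print the active setting from a transparent_hugepage file, i.e. the word
# shown in square brackets, e.g. "always madvise [never]" -> "never".
thp_active() {
    sed -n 's/.*\[\([a-z+]*\)].*/\1/p' "$1"
}

# On a real node:
#   thp_active /sys/kernel/mm/transparent_hugepage/enabled   # expect "never"
#   thp_active /sys/kernel/mm/transparent_hugepage/defrag    # expect "never"
```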
2.9 Configuring passwordless SSH
Use the following commands to set up passwordless login between the cluster nodes:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
ssh-copy-id -p <port> -i ~/.ssh/id_rsa.pub "<user>@<hostname>"
chmod 0600 ~/.ssh/authorized_keys
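With more than a couple of nodes, ssh-copy-id has to be repeated for each host. One way to do that, sketched below, is to print the commands first so the list can be reviewed. The user, port and host names are placeholders.

```shell
#!/bin/sh
# Print one ssh-copy-id command per host so the list can be reviewed first.
# The user, port and host names passed in are placeholders.
print_copy_id_cmds() {
    user=$1 port=$2
    shift 2
    for host in "$@"; do
        printf 'ssh-copy-id -p %s -i ~/.ssh/id_rsa.pub "%s@%s"\n' "$port" "$user" "$host"
    done
}

# Example with hypothetical hosts; pipe the output to `sh` to actually run it.
print_copy_id_cmds root 22 cdh01 cdh02 cdh03
```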
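2.10 Configuring the web server and the Cloudera Manager yum repo
On the Cloudera Manager Server node, create the directories cm6 and cdh6 under /var/www/html. Upload the downloaded Cloudera Manager files to cm6 and the CDH6 files to cdh6.
Run the following: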
yum install -y createrepo
cd /var/www/html/cm6
createrepo .
Add the local repo: vim /etc/yum.repos.d/cloudera-repo.repo
[cloudera-repo]
name=cloudera-repo
baseurl=http://<hostname>/cm6
enabled=1
gpgcheck=0
Note: <hostname> here is the Cloudera Manager Server node. Copy cloudera-repo.repo to /etc/yum.repos.d/ on every node in the cluster.
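Instead of editing the file by hand on each node, the repo definition can be generated from the server's host name. A minimal sketch, with a hypothetical function name and a placeholder host:

```shell
#!/bin/sh
# Print the repo definition for a given Cloudera Manager Server host name.
render_cm_repo() {
    cat <<EOF
[cloudera-repo]
name=cloudera-repo
baseurl=http://$1/cm6
enabled=1
gpgcheck=0
EOF
}

# Example (cm-server.example.com is a placeholder):
#   render_cm_repo cm-server.example.com > /etc/yum.repos.d/cloudera-repo.repo
#   yum clean all && yum makecache
```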
On every node, place mysql-connector-java.jar in /usr/share/java/ (the jar must be renamed to exactly this name).
Run the following:
chmod -R ugo+rX /var/www/html/cm6
chmod -R ugo+rX /var/www/html/cdh6
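2.11 Installing the HTTP service
Run the following on all nodes. The Cloudera Manager Server node needs httpd running before the repo from section 2.10 can be served from /var/www/html.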
yum -y install httpd
systemctl start httpd
systemctl enable httpd
2.12 Installing all dependencies of the big data platform
yum -y install chkconfig python bind-utils psmisc libxslt zlib sqlite cyrus-sasl-plain cyrus-sasl-gssapi fuse fuse-libs redhat-lsb postgresql* portmap mod_ssl openssl-devel python-psycopg2 MySQL-python
yum install python-pip
pip install psycopg2==2.7.5 --ignore-installed
2.13 Installing the JDK
Copy /var/www/html/cm6/oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm to every node in the cluster.
Install the JDK on all nodes:
rpm -ivh /var/www/html/cm6/oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm
Configure the JDK environment variables:
export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
export PATH
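2.14 Installing the Cloudera Manager metadata database
Install it on the Cloudera Manager Server node.
First remove the packages that ship with the system:
Check whether MariaDB packages are already installed: rpm -qa | grep -i mariadb
Remove any existing packages: yum remove <package name>
Install MariaDB: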
yum install mariadb-server
systemctl stop mariadb
Edit /etc/my.cnf. The officially recommended settings are:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
# Settings user and group are ignored when systemd is used.
# If you need to run mysqld under a different user or group,
# customize your systemd unit file for mariadb according to the
# instructions in http://fedoraproject.org/wiki/Systemd
key_buffer = 16M
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MariaDB, if you enable the binary log and do not set
#a server_id, MariaDB will not start. The server_id must be unique within
#the replicating group.
server_id=1
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid
Run the following:
systemctl enable mariadb
systemctl start mariadb
Set the MySQL root password:
/usr/bin/mysql_secure_installation
[...]
Enter current password for root (enter for none):
OK, successfully used password, moving on...
[...]
Set root password? [Y/n] Y
New password:
Re-enter new password:
[...]
Remove anonymous users? [Y/n] Y
[...]
Disallow root login remotely? [Y/n] N
[...]
Remove test database and access to it [Y/n] Y
[...]
Reload privilege tables now? [Y/n] Y
[...]
All done! If you've completed all of the above steps, your MariaDB
installation should now be secure.
Thanks for using MariaDB!
Create the databases:
Log in to MySQL (mysql -uroot -p) and use the statements below to create each of the required databases:
CREATE DATABASE <database> DEFAULT CHARACTER SET <character set> DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON <database>.* TO '<user>'@'%' IDENTIFIED BY '<password>';
FLUSH PRIVILEGES;
SHOW DATABASES;
SHOW GRANTS FOR '<user>'@'%';
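Since the same two statements are repeated for every database, they can be generated from a list. The sketch below is illustrative only: the database names, users and passwords are placeholders, and utf8 is an assumed character set rather than something stated above.

```shell
#!/bin/sh
# Print CREATE DATABASE / GRANT statements for "database user password"
# triples read on stdin. utf8 is an assumed character set.
render_db_sql() {
    while read -r db user pass; do
        [ -n "$db" ] || continue
        printf 'CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n' "$db"
        printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$db" "$user" "$pass"
    done
    printf 'FLUSH PRIVILEGES;\n'
}

# Example with placeholder names and passwords; pipe into `mysql -uroot -p`.
render_db_sql <<'EOF'
scm scm change_me
hue hue change_me
EOF
```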
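2.15 Initializing the database
The script used here is available once cloudera-manager-server (section 3.1) has been installed, so run this step after that package is in place.
The syntax of scm_prepare_database.sh is: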
/opt/cloudera/cm/schema/scm_prepare_database.sh [options] <databaseType> <databaseName> <databaseUser> <password>
For example (the password is omitted, so the script prompts for it):
/opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
A successful run logs the following:
Enter SCM password:
JAVA_HOME=/usr/java/jdk1.8.0_141-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/java/jdk1.8.0_141-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!
3. Installing Cloudera Manager
3.1 Installing cloudera-manager-server
yum install -y cloudera-manager-server
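3.2 Enabling Auto-TLS (officially recommended)
With Auto-TLS enabled, the installation did not succeed for me, so I recommend leaving it off. This step can be skipped.
Auto-TLS must be enabled before any hosts are added to Cloudera Manager. To enable Auto-TLS with the embedded Cloudera Manager CA, run: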
sudo JAVA_HOME=/usr/java/jdk1.8.0_141-cloudera /opt/cloudera/cm-agent/bin/certmanager setup --configure-services
Replace jdk1.8.0_141-cloudera with your JDK version. To store the files in a directory other than the default (/var/lib/cloudera-scm-server/certmanager), add the --location option, as shown:
sudo JAVA_HOME=/usr/java/jdk1.8.0_141-cloudera /opt/cloudera/cm-agent/bin/certmanager --location /opt/cloudera/CMCA setup --configure-services
3.3 Starting Cloudera Manager Server
systemctl start cloudera-scm-server
tail -300f /var/log/cloudera-scm-server/cloudera-scm-server.log
When this log line appears, the Cloudera Manager admin console is up:
INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
Open it in a browser: http://<server_host>:7180
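Rather than watching tail by hand, a script can poll the log for the Jetty message. This is a minimal sketch with a hypothetical helper name; the timeout of 300 attempts in the usage comment is an arbitrary choice.

```shell
#!/bin/sh
# Succeed once the given log file contains the Jetty start-up message,
# polling once per second for up to the given number of attempts.
wait_for_cm() {
    log=$1 attempts=$2
    while [ "$attempts" -gt 0 ]; do
        grep -q 'Started Jetty server' "$log" 2>/dev/null && return 0
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# On the server node:
#   wait_for_cm /var/log/cloudera-scm-server/cloudera-scm-server.log 300 \
#       && echo "console is up on port 7180"
```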
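3.4 Search for all the servers configured earlier, as shown in the screenshot:
When the search finishes, click "Continue" in the lower-right corner.
3.5 Select the custom repository and enter the repository configured earlier, as shown in the screenshot:
3.6 Choose the Parcel option, click "More Options", click "-" to remove all the other URLs, enter http://<ip>/cdh6, and click "Save Changes".
3.7 Click "Continue" to reach the JDK installation step. The JDK is already installed, so leave the checkbox unticked and continue.
3.8 Provide the SSH login credentials, as shown in the screenshot:
3.10 Once you see this screen, you can pop the champagne. If any node reports an error, check the logs and fix it. If it cannot be fixed, you are out of luck and need to reinstall.
Click "Continue" in the lower-right corner.
3.11 Host inspection
Click "Finish" in the lower-right corner.
3.12 Select services: choose "Custom Services", as shown in the screenshot:
This is not covered in detail here; choose according to your needs.
3.13 At this point the installation is essentially complete. If anything reports an error, check the logs. If it cannot be resolved, reinstall.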
3.14 Configuring the Oozie web UI:
On every server that runs the Oozie service:
Extract ext-2.2.zip into /opt/cloudera/parcels/CDH/lib/oozie/libext.
In that directory, run: chown -R oozie:oozie ext-2.2
Restart the Oozie service from the admin console.
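4. Removing a host
4.1 Select the host to remove
4.2 Click to stop the roles on the host
4.3 Click Begin Maintenance (decommission)
Note: if the host being removed runs a DataNode, take data migration into account and choose the options that fit your situation.
When this finishes, the commission state changes to Decommissioned.
4.4 Click to remove the host from the cluster
At this point the host has been removed from the cluster, but Cloudera Manager still lists it.
4.5 Click Remove From Cloudera Manager
This removes the host from Cloudera Manager, after which it no longer appears there.
4.6 Checking HDFS blocks
Whether this check is needed depends on whether the removed host ran a DataNode and on the choice made in step 4.3; decide based on your situation.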
[bigdata@testserver5 ~]$ hdfs fsck /
Connecting to namenode via http://testserver3:9870/fsck?ugi=bigdata&path=%2F
FSCK started by bigdata (auth:SIMPLE) from /172.16.3.15 for path / at Fri Jan 14 15:48:41 CST 2022
Status: HEALTHY
Number of data-nodes: 5
Number of racks: 1
Total dirs: 3690
Total symlinks: 0
Replicated Blocks:
Total size: 34347236977 B
Total files: 5031 (Files currently being written: 9)
Total blocks (validated): 4755 (avg. block size 7223393 B) (Total open file blocks (not validated): 7)
Minimally replicated blocks: 4755 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.9566772
Missing blocks: 0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Erasure Coded Block Groups:
Total size: 0 B
Total files: 0
Total block groups (validated): 0
Minimally erasure-coded block groups: 0
Over-erasure-coded block groups: 0
Under-erasure-coded block groups: 0
Unsatisfactory placement block groups: 0
Average block group size: 0.0
Missing block groups: 0
Corrupt block groups: 0
Missing internal blocks: 0
FSCK ended at Fri Jan 14 15:48:41 CST 2022 in 92 milliseconds
The filesystem under path '/' is HEALTHY
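Notes:
Status: the current state of the HDFS blocks, either HEALTHY or CORRUPT
Total size: the total size of the files under /
Total files: the number of files under /
Mis-replicated blocks: blocks that violate the block placement policy
Missing blocks: the number of missing blocks
Missing replicas: the number of missing replicas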
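The fields described above can also be checked automatically after a host is removed. The helper below is a sketch with a hypothetical name; it only inspects the Status, Missing blocks and Corrupt blocks lines of the report.

```shell
#!/bin/sh
# Read `hdfs fsck` output on stdin and succeed only if the status is HEALTHY
# and both the missing and corrupt block counts are reported as 0.
fsck_ok() {
    awk '
        /^[[:space:]]*Status:/          { status = $2 }
        /^[[:space:]]*Missing blocks:/  { missing = $3 }
        /^[[:space:]]*Corrupt blocks:/  { corrupt = $3 }
        END { exit !(status == "HEALTHY" && missing == "0" && corrupt == "0") }
    '
}

# On a cluster node:
#   hdfs fsck / | fsck_ok && echo "HDFS blocks look fine"
```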
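5. Summary
1. In terms of installation method, CDH6 differs little from CDH5, which makes it easier for CDH5 users to migrate to CDH6 and get used to installing and running it.
2. The installation wizard has changed somewhat: it now shows at a glance how many steps there are and what each one does.
3. The installation prerequisites are unchanged, including disabling the firewall and SELinux, NTP synchronization, and so on. See Fayson's earlier article 《CDH安装前置准备》 (CDH installation prerequisites).
4. The main interface has not changed much either. The main difference is that the Cloudera logo is now black, matching the overall style of the Cloudera home page.
5. The script for configuring the Cloudera Manager database connection has moved. It used to be /usr/share/cmf/schema/scm_prepare_database.sh and is now /opt/cloudera/cm/schema/scm_prepare_database.sh.
6. On Red Hat 7, systemctl status cloudera-scm-server now reports the state of the Cloudera Manager service correctly, which was not the case before. See Fayson's earlier article 《Cloudera Manager Server服务在RedHat7状态显示异常分析》 (analysis of the incorrect Cloudera Manager Server service status on Red Hat 7).
7. The number of Cloudera Manager rpm packages has dropped from 7 to 5. The JDK 6 package has been removed, JDK 1.8.0_141 is bundled, and JDK 1.7 is no longer supported.
8. Besides the rpm packages, installing CM also requires the allkeys.asc file; without it, installing the agent fails with the following error:
9. When installing CDH 6.0 offline, distributing the parcel can fail hash verification. CM6 fixed a bug so that it no longer ignores the "Content-Encoding" header sent by the HTTP server, but the httpd service installed on Red Hat sets "Content-Encoding" incorrectly by default when it transfers parcel files. CM Server then wrongly concludes that httpd has compressed the parcel and tries to decompress it, which fails. The fix is to add AddType application/x-gzip .gz .tgz .parcel to the httpd configuration file and then restart httpd and the CM service. This problem already existed in the beta; for details, see 《Redhat7.4安装CDH6.0_beta1时分发Parcel异常分析》 (analysis of the parcel distribution failure when installing CDH6.0_beta1 on Red Hat 7.4).
10. During installation a page prompts for Auto-TLS. This step can be skipped, but if host communication or access to the CM pages requires SSL/TLS, it can be configured by following the prompts.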
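The httpd change from item 9 can be applied with a small idempotent helper. This is a sketch only: the helper name is hypothetical, and /etc/httpd/conf/httpd.conf is the usual CentOS 7 location, assumed here rather than taken from the text.

```shell
#!/bin/sh
# Append a line to a config file unless the exact line is already present.
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# On the node serving the parcels (as root); the config path is an assumption.
#   ensure_line /etc/httpd/conf/httpd.conf 'AddType application/x-gzip .gz .tgz .parcel'
#   systemctl restart httpd
#   systemctl restart cloudera-scm-server
```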