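File Downloads
Some of the files required to install a CDH6 cluster must first be downloaded on a machine with internet access.
Cloudera Manager 6.0.1
CM6 RPMs: https://archive.cloudera.com/cm6/6.0.1/redhat7/yum/RPMS/x86_64/
Download the RPM files under this link. Because I already installed JDK 1.8 manually in the environment preparation part, the JDK package oracle-j2sdk1.8-1.8.0+update141-1.x86_64.rpm in RPMS/x86_64/ can be skipped, but the other 4 RPM packages are required. Save them to the cloudera-repos directory.
ASC file: https://archive.cloudera.com/cm6/6.0.1/allkeys.asc
Also download this ASC file and save it to the same cloudera-repos directory, which should then look like this:
/upload/cloudera-repos/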
├── allkeys.asc
├── cloudera-manager-agent-6.0.1-610811.el7.x86_64.rpm
├── cloudera-manager-daemons-6.0.1-610811.el7.x86_64.rpm
├── cloudera-manager-server-6.0.1-610811.el7.x86_64.rpm
└── cloudera-manager-server-db-2-6.0.1-610811.el7.x86_64.rpm
0 directories, 5 files
MySQL JDBC Driver
A JDBC driver of version 5.1.26 or later is required. This guide uses mysql-connector-java-5.1.47.tar.gz.
CDH 6.0.1
CDH6 Parcels: https://archive.cloudera.com/cdh6/6.0.1/parcels/
Two files need to be downloaded from here: CDH-6.0.1-1.cdh6.0.1.p0.590678-el7.parcel and manifest.json.
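The downloads listed above can be scripted. The following is a minimal sketch, assuming wget is installed and the archive URLs are still reachable from the download machine. The function names are hypothetical helpers, not part of any Cloudera tooling.

```shell
# Base URLs and file names are copied from the listing above.
CM_BASE="https://archive.cloudera.com/cm6/6.0.1"
CDH_BASE="https://archive.cloudera.com/cdh6/6.0.1/parcels"

# Print the URLs of the four Cloudera Manager RPMs and the ASC file.
list_cm_urls() {
    for pkg in agent daemons server server-db-2; do
        echo "$CM_BASE/redhat7/yum/RPMS/x86_64/cloudera-manager-$pkg-6.0.1-610811.el7.x86_64.rpm"
    done
    echo "$CM_BASE/allkeys.asc"
}

# Print the URLs of the CDH parcel and its manifest.
list_parcel_urls() {
    echo "$CDH_BASE/CDH-6.0.1-1.cdh6.0.1.p0.590678-el7.parcel"
    echo "$CDH_BASE/manifest.json"
}

# Download every URL read from stdin into directory $1 (assumes wget).
download_into() {
    mkdir -p "$1" && wget -c -P "$1" -i -
}
```

For example, `list_cm_urls | download_into /upload/cloudera-repos` fetches the RPMs and the ASC file, and `list_parcel_urls | download_into /upload/parcels` fetches the parcel files (the second directory name is only an illustration).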
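Configuring the Cloudera Manager yum Repository
Note: do not try to build the CM yum repository with FTP!
First install httpd and createrepo:
yum -y install httpd createrepo
Start the httpd service and enable it at boot:
systemctl start httpd
systemctl enable httpd
Then change into cloudera-repos, the directory prepared earlier that holds the Cloudera Manager RPM packages:
cd /upload/cloudera-repos/
Generate the RPM metadata with createrepo: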
[root@cdh601 cloudera-repos]# createrepo .
Spawning worker 0 with 2 pkgs
Spawning worker 1 with 2 pkgs
Workers Finished
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete
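Then move the cloudera-repos directory into httpd's html directory:
mv /upload/cloudera-repos /var/www/html/
Make sure the RPM packages can be viewed in a browser, for example at http://cdh601/cloudera-repos/.
Next, create the cm6 repo file on the Cloudera Manager Server host (create it on whichever node will act as the Cloudera Manager Server):
cd /etc/yum.repos.d
vim cloudera-manager.repo
Add the following content: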
[cloudera-manager]
name=Cloudera Manager 6.0.1
baseurl=http://cdh601/cloudera-repos/
gpgcheck=0
enabled=1
Save and exit, then run:
yum clean all && yum makecache
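If the repo file has to be created on several machines, writing it from a script avoids editing it by hand each time. The sketch below assumes the same repo content as shown above; `write_cm_repo` is a hypothetical helper name.

```shell
# Write a yum repo definition for the local Cloudera Manager mirror.
# $1: target .repo file, $2: base URL of the directory served by httpd
write_cm_repo() {
    cat > "$1" <<EOF
[cloudera-manager]
name=Cloudera Manager 6.0.1
baseurl=$2
gpgcheck=0
enabled=1
EOF
}
```

A typical call would be `write_cm_repo /etc/yum.repos.d/cloudera-manager.repo http://cdh601/cloudera-repos/`, followed by `yum clean all && yum makecache`.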
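Installing Cloudera Manager Server
This step only needs to be performed on the CM Server node.
Run the following command:
yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server
This pulls in many dependency packages, so it is worth having a yum mirror on the local network.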
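Configuring the Local Parcel Repository
Once Cloudera Manager Server is installed, change into the local Parcel repository directory:
cd /opt/cloudera/parcel-repo
Upload the CDH Parcel files downloaded in the first part (CDH-6.0.1-1.cdh6.0.1.p0.590678-el7.parcel and manifest.json) to this directory, then generate the sha file:
sha1sum CDH-6.0.1-1.cdh6.0.1.p0.590678-el7.parcel | awk '{ print $1 }' > CDH-6.0.1-1.cdh6.0.1.p0.590678-el7.parcel.sha
Then change the file owner:
chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo/*
In the end, /opt/cloudera/parcel-repo should contain the parcel, its .sha file, and manifest.json.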
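The sha step can be wrapped in a small function so that it works for any parcel placed in the repository. This is only a sketch using the same sha1sum and awk pipeline as above; the function names are hypothetical.

```shell
# Create <parcel>.sha containing only the SHA-1 hex digest of the parcel.
make_parcel_sha() {
    parcel="$1"
    [ -f "$parcel" ] || { echo "no such file: $parcel" >&2; return 1; }
    sha1sum "$parcel" | awk '{ print $1 }' > "$parcel.sha"
}

# Generate a .sha file for every *.parcel file in directory $1.
make_all_parcel_shas() {
    for p in "$1"/*.parcel; do
        [ -e "$p" ] || continue   # the directory contains no parcels
        make_parcel_sha "$p" || return 1
    done
}
```

Running `make_all_parcel_shas /opt/cloudera/parcel-repo` before the chown command keeps the ownership change covering the new .sha files as well.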
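Installing the Database
MySQL installation was covered in the environment preparation part, so it is skipped here.
Database Configuration
Cloudera's official documentation provides a recommended MySQL configuration: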
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES
Configuring the MySQL JDBC Driver
Extract mysql-connector-java-5.1.47-bin.jar from the mysql-connector-java-5.1.47.tar.gz package downloaded earlier, then copy it to /usr/share/java/ on the CM Server node under the name mysql-connector-java.jar (create /usr/share/java/ manually if it does not exist):
tar zxvf mysql-connector-java-5.1.47.tar.gz
mkdir -p /usr/share/java/
cp mysql-connector-java-5.1.47/mysql-connector-java-5.1.47-bin.jar /usr/share/java/mysql-connector-java.jar
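The same driver is needed again later on the Hive metastore host, so it can help to put the extract-and-copy steps into one function. The sketch below assumes the tarball unpacks into a directory named after the archive, with the -bin.jar inside it; `install_mysql_jdbc` is a hypothetical name.

```shell
# Extract the connector jar and install it under the version-free name.
# $1: path to mysql-connector-java-<version>.tar.gz
# $2: destination directory (defaults to /usr/share/java)
install_mysql_jdbc() (
    tarball="$1"
    dest="${2:-/usr/share/java}"
    name=$(basename "$tarball" .tar.gz)   # e.g. mysql-connector-java-5.1.47
    work=$(mktemp -d) || exit 1
    tar -xzf "$tarball" -C "$work" &&
        mkdir -p "$dest" &&
        cp "$work/$name/$name-bin.jar" "$dest/mysql-connector-java.jar"
    status=$?
    rm -rf "$work"                        # clean up the scratch directory
    exit "$status"
)
```

The function body runs in a subshell, so its variables and its exit call do not affect the calling shell. On the CM Server node, `install_mysql_jdbc mysql-connector-java-5.1.47.tar.gz` is enough.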
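Creating the Databases Required by CDH
Create the databases and database users for the services you plan to install, following the table below. The databases must use utf8 encoding. Record each user name and its password as you create them:
Service | Database | User |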
---|---|---|
Cloudera Manager Server | scm | scm |
Activity Monitor | amon | amon |
Reports Manager | rman | rman |
Hue | hue | hue |
Hive Metastore Server | metastore | hive |
Sentry Server | sentry | sentry |
Cloudera Navigator Audit Server | nav | nav |
Cloudera Navigator Metadata Server | navms | navms |
Oozie | oozie | oozie |
For now I create only 4 databases and their users:
mysql> CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.11 sec)
mysql> CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
mysql> CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
mysql> CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
mysql> GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';
Query OK, 0 rows affected, 1 warning (0.16 sec)
mysql> GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)
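When more services from the table are added later, typing these statements by hand gets repetitive. The sketch below generates the same CREATE DATABASE and GRANT statements from "database:user" pairs. It reuses the user name as the password, exactly as the session above does, which is an assumption suitable only for a test cluster. The function names are hypothetical, and names or passwords containing quotes are not handled.

```shell
# Print the CREATE DATABASE and GRANT statements for one service.
# $1: database name, $2: user name, $3: password
cdh_db_sql() {
    printf "CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n" "$1"
    printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$1" "$2" "$3"
}

# Emit SQL for a list of "database:user" pairs, then FLUSH PRIVILEGES.
cdh_all_db_sql() {
    for pair in "$@"; do
        db=${pair%%:*}      # text before the colon
        user=${pair##*:}    # text after the colon
        cdh_db_sql "$db" "$user" "$user"
    done
    echo "FLUSH PRIVILEGES;"
}
```

The output can be reviewed first and then piped into the client, for example `cdh_all_db_sql scm:scm amon:amon rman:rman metastore:hive | mysql -u root -p`.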
Check that the grants are correct:
mysql> SHOW GRANTS FOR 'scm'@'%';
+----------------------------------------------+
| Grants for scm@% |
+----------------------------------------------+
| GRANT USAGE ON *.* TO 'scm'@'%' |
| GRANT ALL PRIVILEGES ON `scm`.* TO 'scm'@'%' |
+----------------------------------------------+
2 rows in set (0.00 sec)
mysql> SHOW GRANTS FOR 'amon'@'%';
+------------------------------------------------+
| Grants for amon@% |
+------------------------------------------------+
| GRANT USAGE ON *.* TO 'amon'@'%' |
| GRANT ALL PRIVILEGES ON `amon`.* TO 'amon'@'%' |
+------------------------------------------------+
2 rows in set (0.00 sec)
mysql> SHOW GRANTS FOR 'rman'@'%';
+------------------------------------------------+
| Grants for rman@% |
+------------------------------------------------+
| GRANT USAGE ON *.* TO 'rman'@'%' |
| GRANT ALL PRIVILEGES ON `rman`.* TO 'rman'@'%' |
+------------------------------------------------+
2 rows in set (0.00 sec)
mysql> SHOW GRANTS FOR 'hive'@'%';
+----------------------------------------------------------+
| Grants for hive@% |
+----------------------------------------------------------+
| GRANT USAGE ON *.* TO 'hive'@'%' |
| GRANT ALL PRIVILEGES ON `metastore`.* TO 'hive'@'%' |
+----------------------------------------------------------+
2 rows in set (0.00 sec)
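Setting Up the Cloudera Manager Database
Cloudera Manager Server includes a script that configures its database. The last two arguments are the database name and the user name; the script prompts for the password if it is not given on the command line.
- MySQL and CM Server are on the same host
Run: /opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
- MySQL and CM Server are on different hosts
Run: /opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h <mysql-host-ip> --scm-host <cm-server-ip> scm scm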
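The two cases above differ only in the -h and --scm-host options. As a small illustration, the hypothetical helper below prints the matching command line for a given pair of hosts without running it. It compares the two host strings literally, so it assumes both are written the same way (both hostnames or both IP addresses).

```shell
SCM_PREPARE=/opt/cloudera/cm/schema/scm_prepare_database.sh

# Print the scm_prepare_database.sh command line for the given layout.
# $1: MySQL host, $2: CM Server host; database and user are both "scm".
scm_prepare_cmd() {
    if [ "$1" = "$2" ]; then
        echo "$SCM_PREPARE mysql scm scm"
    else
        echo "$SCM_PREPARE mysql -h $1 --scm-host $2 scm scm"
    fi
}
```

Printing the command first makes it easy to check the host arguments before the script touches the database.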
Installing the CDH Nodes
Starting the Cloudera Manager Server Service
systemctl start cloudera-scm-server
Then wait for Cloudera Manager Server to start, which may take a while. You can monitor the startup with:
tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
Once the log prints INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server., the service has started and the Cloudera Manager web UI can be opened in a browser.
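Instead of watching tail -f by hand, a script can poll the log until the startup message appears. The following is a minimal sketch; `wait_for_log` and its 300-second default timeout are assumptions, not Cloudera tooling.

```shell
# Poll a log file until a fixed string appears or the timeout expires.
# $1: log file, $2: text to wait for, $3: timeout in seconds (default 300)
wait_for_log() {
    remaining="${3:-300}"
    while [ "$remaining" -gt 0 ]; do
        # -F treats the text as a literal string, not a regular expression
        if grep -qF -- "$2" "$1" 2>/dev/null; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}
```

For example, `wait_for_log /var/log/cloudera-scm-server/cloudera-scm-server.log "Started Jetty server." 600 && echo "CM Server is up"` returns as soon as the message is logged, or fails after ten minutes.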
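Accessing the Cloudera Manager Web UI
Open a browser and go to http://<server_host>:7180. The default user name and password are both admin.
Welcome Page
The first page is the Cloudera Manager welcome page. Click the [Continue] button at the bottom right to go on.
Accepting the Terms
Check the box to accept the terms and click [Continue].
Edition Selection
I chose the free edition here.
Second Welcome Page
After the edition is chosen a second welcome page appears, this time for the cluster installation.
Selecting Hosts
This step searches for and selects the hosts on which the CDH cluster will be installed. Enter the hostname of each node in the input box next to the host name label, separated by ASCII commas, click Search, and then check the nodes in the result list that should run CDH.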
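Specifying Repositories
Cloudera Manager Agent
Select Custom here and enter the URL of the Cloudera Manager yum repository set up with httpd above.
CDH and other software
If the earlier local Parcel repository step was done correctly, [Use Parcels] is selected automatically and the CDH version is loaded. Confirm it and click [Continue].
JDK Installation Options
I leave JDK installation unchecked because the JDK was already installed in the environment preparation part. Uncheck the box and continue.
SSH Login Configuration
This configures the SSH login used to reach the cluster hosts. Enter the root user's password and set [Number of Simultaneous Installations] to a value that suits the cluster.
Installing the Agent
The Agent is installed on each node automatically at this step; it completes after a short wait.
Installing Parcels
This step is also automatic. The speed of the distribution phase depends mainly on the network, so be patient (my 3 virtual machines are so underpowered that this step took a long time).
Host Inspection
Wait for the inspection to finish.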
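Installing the CDH Cluster
Selecting Services
I chose Custom Services here: HDFS, Hive, and YARN.
Role Assignment
CDH proposes a role assignment automatically. If it looks unreasonable, adjust it manually, keeping the roles balanced across hosts.
Database Setup
Of the services I selected, only Hive needs a database, so only the Hive metastore database has to be configured here. Note that the MySQL JDBC driver must be placed in the /usr/share/java/ directory on the Hive metastore host.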