Single-node CDH installation in Docker (offline)
Prerequisites
- OS : CentOS Linux release 7.9.2009 (Core)
- mysql : 5.7.32
- jdk : jdk-8u45
- cdh : 5.16.1
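Setup steps
Note: this setup runs inside a Docker container, so MySQL is not installed in it; component metadata is stored in the MySQL instance on the host.
Machine configuration :
Memory : 32 GB
CPU : i7-6700
Upload the packages
# Upload the required files to /root/software/cdh. Any directory will do; it only holds the installation packages.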
[root@quickstart ~]# mkdir -p /root/software/cdh/
[root@quickstart ~]# cd /root/software/cdh/
[root@quickstart cdh]# ll
total 3069728
-rw-r--r-- 1 root root 2127506677 Nov 26 11:19 CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel
-rw-r--r-- 1 root root 41 Nov 26 11:18 CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel.sha1
-rw-r--r-- 1 root root 841524318 Nov 26 11:18 cloudera-manager-centos7-cm5.16.1_x86_64.tar.gz
-rw-r--r-- 1 root root 173271626 Nov 26 11:18 jdk-8u45-linux-x64.gz
-rw-r--r-- 1 root root 66538 Nov 26 11:18 manifest.json
-rw-r--r-- 1 root root 1007502 Nov 26 11:18 mysql-connector-java-5.1.47.jar
[root@quickstart cdh]#
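Before going further, it can be worth checking that the parcel survived the upload intact. The following is a minimal sketch, assuming the .sha1 file holds the parcel's SHA-1 digest as the first field of its first line. verify_parcel is a hypothetical helper name, not part of CDH or CM.

```shell
# Compare a parcel's SHA-1 digest with the digest stored in its checksum file.
# verify_parcel is a hypothetical helper, not something shipped with CDH or CM.
verify_parcel() {
    local parcel="$1"      # path to the .parcel file
    local sha_file="$2"    # path to the .sha1 file holding the expected digest
    local expected actual
    expected=$(awk 'NR==1 {print $1}' "$sha_file")   # first field of the first line
    actual=$(sha1sum "$parcel" | awk '{print $1}')   # digest computed from the parcel
    if [ "$expected" = "$actual" ]; then
        echo "OK: $parcel"
    else
        echo "MISMATCH: $parcel" >&2
        return 1
    fi
}
```

For the files listed above, this would be run from /root/software/cdh as: verify_parcel CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel.sha1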
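Install the JDK
# Extract the JDK under /usr/java/. It must be this directory, because CM looks for the JDK there when it starts services. The directory has to be created by hand.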
[root@quickstart cdh]# mkdir -p /usr/java
[root@quickstart cdh]# tar -zxvf jdk-8u45-linux-x64.gz -C /usr/java/ && cd /usr/java/jdk1.8.0_45
# Configure the JDK environment variables
[root@quickstart jdk1.8.0_45]# vim /etc/profile
# Append the following
#JAVA_HOME
export JAVA_HOME=/usr/java/jdk1.8.0_45
export PATH=$PATH:$JAVA_HOME/bin
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
[root@quickstart jdk1.8.0_45]# source /etc/profile
# Verify the installation
[root@quickstart jdk1.8.0_45]# java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
[root@quickstart jdk1.8.0_45]#
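When the container is built from a script, editing /etc/profile in vim is awkward. One alternative is to write the same exports to a separate file; on CentOS 7, /etc/profile sources every *.sh file under /etc/profile.d. This is a sketch under that assumption: write_java_profile and the file name java.sh are hypothetical, not something CM requires.

```shell
# Write the JDK environment variables from this guide to a profile snippet.
# write_java_profile and the target file name are illustrative assumptions.
write_java_profile() {
    local jdk_home="$1"   # e.g. /usr/java/jdk1.8.0_45
    local target="$2"     # e.g. /etc/profile.d/java.sh
    # \$ keeps the variable references literal so they expand at login time.
    cat > "$target" <<EOF
export JAVA_HOME=$jdk_home
export PATH=\$PATH:\$JAVA_HOME/bin
export JRE_HOME=\$JAVA_HOME/jre
export CLASSPATH=.:\$JAVA_HOME/lib/dt.jar:\$JAVA_HOME/lib/tools.jar
EOF
}
```

A call such as write_java_profile /usr/java/jdk1.8.0_45 /etc/profile.d/java.sh, followed by source /etc/profile, should give the same result as the manual edit.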
Configure passwordless SSH
[root@quickstart jdk1.8.0_45]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:1UgwqRRem05Guba902epmNt5yzukkI3IblWEZ43Y+Dk root@quickstart.cloudera
The key's randomart image is:
+---[RSA 2048]----+
| ..=+.= o |
| ..oo=+o* . |
| ...=.o=.. |
| .+o. E |
| oS+ = . |
| + * . . |
| . . + o . |
| o o+oo* |
| . +o+=++ |
+----[SHA256]-----+
[root@quickstart jdk1.8.0_45]# ssh-copy-id quickstart
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'quickstart (172.17.0.2)' can't be established.
ECDSA key fingerprint is SHA256:bjlAdm5grZgA3Jb96dDtMRsvnkbjo6r2m0O4eQHPBNs.
ECDSA key fingerprint is MD5:70:69:4f:b8:83:6c:ee:3e:d2:03:23:16:c2:73:51:4a.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@quickstart's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'quickstart'"
and check to make sure that only the key(s) you wanted were added.
[root@quickstart jdk1.8.0_45]#
Install the dependencies CM needs
[root@quickstart jdk1.8.0_45]# yum -y install ntp python-lxml httpd mod_ssl cyrus-sasl-plain cyrus-sasl-devel cyrus-sasl-gssapi
Start the related services and enable them at boot
# 1. ntpd service
[root@quickstart ~]# service ntpd start
Redirecting to /bin/systemctl start ntpd.service
[root@quickstart ~]# chkconfig ntpd on
Note: Forwarding request to 'systemctl enable ntpd.service'.
Created symlink from /etc/systemd/system/multi-user.target.wants/ntpd.service to /usr/lib/systemd/system/ntpd.service.
[root@quickstart ~]#
# 2. httpd service
[root@quickstart ~]# service httpd start
Redirecting to /bin/systemctl start httpd.service
[root@quickstart ~]# chkconfig httpd on
Note: Forwarding request to 'systemctl enable httpd.service'.
Created symlink from /etc/systemd/system/multi-user.target.wants/httpd.service to /usr/lib/systemd/system/httpd.service.
[root@quickstart ~]#
# 3. Stop the firewall and disable it at boot
yum install firewalld systemd -y
systemctl stop firewalld.service
systemctl disable firewalld.service
systemctl status firewalld
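Create the databases in MySQL and grant remote login privileges
# Why: each component is managed by a user named after the component. Without these databases, the database check fails during the CDH installation and the component cannot be managed.
# If root is not allowed to grant privileges, run: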
# UPDATE mysql.user SET Grant_priv='Y', Super_priv='Y' WHERE User='root';
# FLUSH PRIVILEGES;
create database cmf DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
grant all privileges on cmf.* to cmf@'%' identified by 'cmf';
grant all privileges on cmf.* to cmf@'quickstart' identified by 'cmf';
create database metastore DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
grant all privileges on metastore.* to metastore@'%' identified by 'metastore';
grant all privileges on metastore.* to metastore@'quickstart' identified by 'metastore';
create database oozie DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
grant all privileges on oozie.* to oozie@'%' identified by 'oozie';
grant all privileges on oozie.* to oozie@'quickstart' identified by 'oozie';
create database hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
grant all privileges on hive.* to hive@'%' identified by 'hive';
grant all privileges on hive.* to hive@'quickstart' identified by 'hive';
create database hue DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
grant all privileges on hue.* to hue@'%' identified by 'hue';
grant all privileges on hue.* to hue@'quickstart' identified by 'hue';
create database monitor DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
grant all privileges on monitor.* to monitor@'%' identified by 'monitor';
grant all privileges on monitor.* to monitor@'quickstart' identified by 'monitor';
flush privileges;
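The statements above repeat one pattern per component: the database name, user name and password are all the same string. A minimal sketch that generates them in a loop, assuming that convention holds for every component; gen_cdh_sql is a hypothetical helper and it only prints SQL, it does not connect to MySQL.

```shell
# Print the create/grant statements for every database used in this guide.
# gen_cdh_sql is a hypothetical helper; database = user = password, as above.
gen_cdh_sql() {
    local host="$1"   # hostname the components connect from, e.g. quickstart
    local db
    for db in cmf metastore oozie hive hue monitor; do
        echo "create database ${db} DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;"
        echo "grant all privileges on ${db}.* to ${db}@'%' identified by '${db}';"
        echo "grant all privileges on ${db}.* to ${db}@'${host}' identified by '${db}';"
    done
    echo "flush privileges;"
}
```

The output can be reviewed first and then piped into the mysql client on the host, for example gen_cdh_sql quickstart | mysql -uroot -p. Passwords equal to the user name are only reasonable for a local test setup like this one.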
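Install CM
[root@quickstart ~]# mkdir -p /opt/cloudera-manager && cd /opt/cloudera-manager
# Note: do not extract into any other directory, or the installation runs into many problems, because this is the default path.
# Extract the tar.gz uploaded earlier into the directory above: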
[root@quickstart cloudera-manager]# tar -zxvf /root/software/cdh/cloudera-manager-centos7-cm5.16.1_x86_64.tar.gz -C ./
[root@quickstart cloudera-manager]# ll
total 8
drwxr-xr-x 4 1106 4001 4096 Nov 21 2018 cloudera
drwxr-xr-x 9 1106 4001 4096 Nov 21 2018 cm-5.16.1
[root@quickstart cloudera-manager]#
Set the master node hostname in the agent configuration file
[root@quickstart cloudera-manager]# vim /opt/cloudera-manager/cm-5.16.1/etc/cloudera-scm-agent/config.ini
# Change server_host=localhost to server_host=quickstart
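The same change can be made without opening an editor, which helps when the steps are scripted into an image build. A sketch, assuming the file contains a single line starting with server_host=; set_server_host is a hypothetical helper name.

```shell
# Point the agent at the Cloudera Manager server by rewriting server_host.
# set_server_host is a hypothetical helper, not part of the CM tarball.
set_server_host() {
    local config="$1"   # path to the agent's config.ini
    local host="$2"     # hostname of the CM server, e.g. quickstart
    # Rewrite only the server_host line; -i.bak leaves a backup next to the file.
    sed -i.bak "s/^server_host=.*/server_host=$host/" "$config"
}
```

For this guide the call would be set_server_host /opt/cloudera-manager/cm-5.16.1/etc/cloudera-scm-agent/config.ini quickstart.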
Add the cloudera-scm user on the node (the user name must be exactly cloudera-scm)
[root@quickstart cloudera-manager]# useradd --system --home=/opt/cloudera-manager/cm-5.16.1/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
[root@quickstart cloudera-manager]#
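Create the directory for the unpacked CDH files and change its owner to the cloudera-scm user
Note: this directory must also be at exactly this path; it is CM's default configured directory.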
[root@quickstart ~]# mkdir -p /opt/cloudera/parcels
[root@quickstart ~]# chown cloudera-scm:cloudera-scm /opt/cloudera/parcels
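# This is where the CDH files live once they are installed
Create the CDH parcel repository directory and copy the installation packages into it
Note: the directory must be this one, for the same reason as above.
# 1. Create the directory
[root@quickstart ~]# mkdir -p /opt/cloudera/parcel-repo
# 2. Copy the installation packages into this directory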
[root@quickstart ~]# cp /root/software/cdh/cloudera-manager-centos7-cm5.16.1_x86_64.tar.gz /opt/cloudera/parcel-repo/cloudera-manager-centos7-cm5.16.1_x86_64.tar.gz
[root@quickstart ~]# cp /root/software/cdh/CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel /opt/cloudera/parcel-repo/CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel
# Note: the target file name ends in .sha, not .sha1
[root@quickstart ~]# cp /root/software/cdh/CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel.sha1 /opt/cloudera/parcel-repo/CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel.sha
[root@quickstart ~]# cp /root/software/cdh/manifest.json /opt/cloudera/parcel-repo/manifest.json
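# 3. Change the owner of the directory and its files to cloudera-scm, the user created above
[root@quickstart ~]# chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
# This is the parcel repository: when CM installs the CDH Hadoop distribution, it reads the installation files from this directory and unpacks them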
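The .sha1 to .sha rename is easy to get wrong when done by hand. The copy step can be sketched as a small function, assuming every parcel in the source directory has a matching .sha1 file and that manifest.json sits beside it; stage_parcel is a hypothetical helper and leaves the chown step to the command above.

```shell
# Copy parcels, their checksum files and manifest.json into a parcel repository.
# stage_parcel is a hypothetical helper; ownership is not changed here.
stage_parcel() {
    local src="$1"    # directory holding the uploaded files
    local repo="$2"   # parcel repository, /opt/cloudera/parcel-repo in this guide
    local parcel
    mkdir -p "$repo"
    for parcel in "$src"/*.parcel; do
        [ -e "$parcel" ] || continue    # glob matched nothing: skip
        cp "$parcel" "$repo/"
        # The repository copy must end in .sha, so drop the trailing "1" of .sha1.
        cp "$parcel.sha1" "$repo/$(basename "$parcel").sha"
    done
    cp "$src/manifest.json" "$repo/"
}
```

Here that would be stage_parcel /root/software/cdh /opt/cloudera/parcel-repo, followed by the chown -R shown above.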
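Configure MySQL
Database driver
Note: the driver must go in this directory.
# 1. Create the directory for the driver jar. Components look in it when they connect to the database.
[root@quickstart parcel-repo]# mkdir -p /usr/share/java
# 2. The jar must be renamed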
[root@quickstart parcel-repo]# cp /root/software/cdh/mysql-connector-java-5.1.47.jar /usr/share/java/mysql-connector-java.jar
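Initialize the database user
# No space is needed after -h; the database host follows it directly. --scm-host takes the hostname mapped in your hosts file.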
[root@quickstart parcel-repo]# /opt/cloudera-manager/cm-5.16.1/share/cmf/schema/scm_prepare_database.sh mysql -h172.17.0.1 --scm-host quickstart cmf cmf cmf
JAVA_HOME=/usr/java/jdk1.8.0_45
Verifying that we can write to /opt/cloudera-manager/cm-5.16.1/etc/cloudera-scm-server
Creating SCM configuration file in /opt/cloudera-manager/cm-5.16.1/etc/cloudera-scm-server
Executing: /usr/java/jdk1.8.0_45/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera-manager/cm-5.16.1/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /opt/cloudera-manager/cm-5.16.1/etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
log4j:ERROR Could not find value for key log4j.appender.A
log4j:ERROR Could not instantiate appender named "A".
[2020-11-26 12:32:09,080] INFO 0[main] - com.cloudera.enterprise.dbutil.DbCommandExecutor.testDbConnection(DbCommandExecutor.java) - Successfully connected to database.
All done, your SCM database is configured correctly!
Start the CM server
# 1. Start
[root@quickstart parcel-repo]# /opt/cloudera-manager/cm-5.16.1/etc/init.d/cloudera-scm-server start
/opt/cloudera-manager/cm-5.16.1/etc/init.d/cloudera-scm-server: line 109: pstree: command not found
Starting cloudera-scm-server: [ OK ]
[root@quickstart parcel-repo]#
# Check the server startup log
[root@quickstart parcel-repo]# tail -f -n 1000 /opt/cloudera-manager/cm-5.16.1/log/cloudera-scm-server/cloudera-scm-server.log
# Once port 7180 is listening, the server has started successfully.
2020-11-26 12:38:46,731 INFO WebServerImpl:org.mortbay.log: jetty-6.1.26.cloudera.4
2020-11-26 12:38:46,732 INFO WebServerImpl:org.mortbay.log: Started SelectChannelConnector@0.0.0.0:7180
2020-11-26 12:38:46,732 INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
2020-11-26 12:38:49,344 INFO ParcelUpdateService:com.cloudera.parcel.components.LocalParcelManagerImpl: Discovered parcel on CM server: CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel
2020-11-26 12:38:49,344 INFO ParcelUpdateService:com.cloudera.parcel.components.LocalParcelManagerImpl: Created torrent file: /opt/cloudera/parcel-repo/CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel.torrent
2020-11-26 12:38:49,346 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: Creating single-file torrent for CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel...
2020-11-26 12:38:49,346 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: Hashing data from CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel with 8 threads (4058 pieces)...
2020-11-26 12:38:49,633 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: ... 10% complete
2020-11-26 12:38:49,853 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: ... 20% complete
2020-11-26 12:38:50,084 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: ... 30% complete
2020-11-26 12:38:50,296 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: ... 40% complete
2020-11-26 12:38:50,505 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: ... 50% complete
2020-11-26 12:38:50,717 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: ... 60% complete
2020-11-26 12:38:50,927 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: ... 70% complete
2020-11-26 12:38:51,136 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: ... 80% complete
2020-11-26 12:38:51,345 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: ... 90% complete
2020-11-26 12:38:51,565 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: Hashed 1 file(s) (2127506677 bytes) in 4058 pieces (4058 expected) in 2218.1ms.
2020-11-26 12:38:51,568 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: Single-file torrent information:
2020-11-26 12:38:51,568 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: Torrent name: CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel
2020-11-26 12:38:51,568 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: Announced at: Seems to be trackerless
2020-11-26 12:38:51,568 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: Created on..: Thu Nov 26 12:38:49 CST 2020
2020-11-26 12:38:51,568 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: Created by..: cm-server
2020-11-26 12:38:51,568 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: Pieces......: 4058 piece(s) (524288 byte(s)/piece)
2020-11-26 12:38:51,568 INFO ParcelUpdateService:com.turn.ttorrent.common.Torrent: Total size..: 2,127,506,677 byte(s)
2020-11-26 12:38:52,013 INFO ScmActive-0:com.cloudera.server.cmf.components.ScmActive: ScmActive completed successfully.
2020-11-26 12:39:11,239 INFO 1353021032@scm-web-5:com.cloudera.server.web.cmf.CMFUserDetailsService: First user 'admin' logging in.
2020-11-26 12:39:11,550 INFO 1353021032@scm-web-5:com.cloudera.server.web.cmf.AuthenticationSuccessEventListener: Authentication success for user: 'admin' from 192.168.50.186
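Instead of watching tail -f by eye, a script can poll the log for the "Started Jetty server" line shown above. A sketch under the assumption that this line is a reliable readiness marker; wait_for_cm_server is a hypothetical helper, and a log left over from an earlier run will already contain the line.

```shell
# Poll the server log until the Jetty startup line appears or the timeout expires.
# wait_for_cm_server is a hypothetical helper; it returns 0 on success, 1 on timeout.
wait_for_cm_server() {
    local log="$1"       # path to cloudera-scm-server.log
    local timeout="$2"   # seconds to wait before giving up
    local waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if grep -q 'Started Jetty server' "$log" 2>/dev/null; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

For example: wait_for_cm_server /opt/cloudera-manager/cm-5.16.1/log/cloudera-scm-server/cloudera-scm-server.log 300 before starting the agent.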
Start the agent
# 1. Start the CM agent
[root@quickstart ~]# /opt/cloudera-manager/cm-5.16.1/etc/init.d/cloudera-scm-agent start
/opt/cloudera-manager/cm-5.16.1/etc/init.d/cloudera-scm-agent: line 109: pstree: command not found
Starting cloudera-scm-agent: [ OK ]
# 2. Cloudera recommends setting /proc/sys/vm/swappiness to at most 10
# Add the following parameter to /etc/sysctl.conf
# vm.swappiness=10
[root@quickstart ~]# vim /etc/sysctl.conf
[root@quickstart ~]# sysctl vm.swappiness=10
vm.swappiness = 10
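The edit to /etc/sysctl.conf can also be scripted so that running it twice does not leave duplicate entries. A minimal sketch; set_sysctl_param is a hypothetical helper, and it only rewrites the file, so sysctl vm.swappiness=10 (or sysctl -p) is still needed to apply the value to the running kernel.

```shell
# Set key=value in a sysctl-style file, replacing any earlier assignment.
# set_sysctl_param is a hypothetical helper; commented-out lines are left alone.
set_sysctl_param() {
    local file="$1"    # e.g. /etc/sysctl.conf
    local key="$2"     # e.g. vm.swappiness
    local value="$3"   # e.g. 10
    touch "$file"
    # Keep every line except existing assignments of this key
    # (grep exits non-zero when nothing is left, hence "|| true").
    grep -v "^[[:space:]]*$key[[:space:]]*=" "$file" > "$file.tmp" || true
    echo "$key=$value" >> "$file.tmp"
    cat "$file.tmp" > "$file"   # overwrite in place to keep the original inode
    rm -f "$file.tmp"
}
```

For this step the call would be set_sysctl_param /etc/sysctl.conf vm.swappiness 10.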
Open the CM web UI: http://192.168.50.173:7180/
Troubleshooting
HDFS NameNode fails to start
Solution
Change the NameNode data directory
HDFS DataNode fails to start
Cause : the port is in use
Solution
[root@quickstart ~]# yum whatprovides nc
Loaded plugins: fastestmirror, ovl
Loading mirror speeds from cached hostfile
2:nmap-ncat-6.40-19.el7.x86_64 : Nmap's Netcat replacement
Repo : base
Matched from:
Provides : nc
[root@quickstart ~]#
[root@quickstart ~]# yum install nmap-ncat-6.40-19.el7.x86_64
Loaded plugins: fastestmirror, ovl
Loading mirror speeds from cached hostfile
Resolving Dependencies
--> Running transaction check
---> Package nmap-ncat.x86_64 2:6.40-19.el7 will be installed
--> Processing Dependency: libpcap.so.1()(64bit) for package: 2:nmap-ncat-6.40-19.el7.x86_64
--> Running transaction check
---> Package libpcap.x86_64 14:1.5.3-12.el7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
======================================================================================================
Package Arch Version Repository Size
======================================================================================================
Installing:
nmap-ncat x86_64 2:6.40-19.el7 base 206 k
Installing for dependencies:
libpcap x86_64 14:1.5.3-12.el7 base 139 k
Transaction Summary
======================================================================================================
Install 1 Package (+1 Dependent package)
Total download size: 345 k
Installed size: 740 k
Is this ok [y/d/N]: y
Downloading packages:
(1/2): libpcap-1.5.3-12.el7.x86_64.rpm | 139 kB 00:00:00
(2/2): nmap-ncat-6.40-19.el7.x86_64.rpm | 206 kB 00:00:00
------------------------------------------------------------------------------------------------------
Total 1.3 MB/s | 345 kB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : 14:libpcap-1.5.3-12.el7.x86_64 1/2
Installing : 2:nmap-ncat-6.40-19.el7.x86_64 2/2
Verifying : 2:nmap-ncat-6.40-19.el7.x86_64 1/2
Verifying : 14:libpcap-1.5.3-12.el7.x86_64 2/2
Installed:
nmap-ncat.x86_64 2:6.40-19.el7
Dependency Installed:
libpcap.x86_64 14:1.5.3-12.el7
Complete!
[root@quickstart ~]# nc -l 50010
^C
[root@quickstart ~]#
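Analysis:
Cause: when an application crashes, it leaves a lingering socket behind. To bind to that port again before the socket is released, an application has to set the SO_REUSEADDR flag on its socket, but HDFS does not do this. The workaround is to bind to the port with an application that does set SO_REUSEADDR and then stop that application. netcat can do this.
Solution: install nc, use it to listen on port 50010, then stop nc. The DataNode starts normally afterwards.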
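Open the home page
CM configuration issues
ZooKeeper warning
HDFS: under-replicated blocks
Cause analysis:
Only one machine is deployed, while CDH initializes with a default replication factor of 3.
The warning appears because the configured replication factor does not match the number of DataNodes.
Solution:
1. Set the target replication factor to 1
2. Change the replication factor of existing files from the command line
Go to Cluster > HDFS > Configuration, search for dfs.replication, set it to 1, and save the changes.
The dfs.replication parameter only takes effect when a file is written to HDFS. Changing the configuration does not change the replication factor of files written earlier, so step 2 is still needed.
Change the replication factor from the command line:
The 1 in "-R 1" is the target replication factor, which here equals the number of DataNodes.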
su hdfs
hadoop fs -setrep -R 1 /
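Things to do after building the Docker image
- Migrate the metadata
- Disable case sensitivity in the host's MySQL
- Make sure the entries in /etc/hosts are correct
- Clear the rules on restart
- Start the services
- Access CDH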
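The /etc/hosts item is the one most likely to go wrong silently, since a container's address can change between runs. A sketch of a check that could run at container start, assuming the expected IP and hostname are known in advance; hosts_maps is a hypothetical helper, and the address 172.17.0.2 is simply the one seen earlier in this guide.

```shell
# Succeed only if the hosts file maps the given IP to the given hostname.
# hosts_maps is a hypothetical helper; commented-out lines are ignored.
hosts_maps() {
    local file="$1"   # e.g. /etc/hosts
    local ip="$2"     # e.g. 172.17.0.2
    local name="$3"   # e.g. quickstart
    awk -v ip="$ip" -v name="$name" '
        /^[[:space:]]*#/ { next }
        $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}
```

A startup script could then call hosts_maps /etc/hosts 172.17.0.2 quickstart and stop before starting the CM server and agent if the check fails.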