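Big Data Learning: Setting Up CDH 6.3.2

Setting Up CDH 6.3.2

CDH, short for Cloudera Distribution Including Apache Hadoop, is a distribution from Cloudera that bundles Apache Hadoop with its related ecosystem components. It is a big data platform designed to simplify and speed up the deployment and management of big data processing and analytics.

Enterprises can use it to manage server clusters.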

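1. Prerequisites and preparation

# Set the hostname; run the matching command on each node
hostnamectl set-hostname cdh01   # on the first node
hostnamectl set-hostname cdh02   # on the second node
hostnamectl set-hostname cdh03   # on the third node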

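2. Map IP addresses to hostnames (all nodes)

Append the following entries to /etc/hosts on every node, and add the same entries to the hosts file on Windows (C:\Windows\System32\drivers\etc\hosts) as well.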

192.168.115.201 cdh01
192.168.115.202 cdh02
192.168.115.203 cdh03
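
Editing the hosts file by hand on every node is easy to get wrong, so the mappings above could also be appended with a small helper. This is a minimal sketch, not part of the original procedure: the function name add_host_entry is hypothetical, and it takes the target file as an argument so it can be tried on a scratch file first.

```bash
# Hypothetical helper: append "IP NAME" to a hosts file only if NAME is not already present.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    # grep -w matches NAME as a whole word, so rerunning the script does not duplicate lines.
    if ! grep -qw -- "$name" "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Example usage as root (commented out so that sourcing this file changes nothing):
# add_host_entry /etc/hosts 192.168.115.201 cdh01
# add_host_entry /etc/hosts 192.168.115.202 cdh02
# add_host_entry /etc/hosts 192.168.115.203 cdh03
```

Because the check happens before every append, the helper can be rerun safely on all three nodes.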

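3. Disable the firewall and SELinux (all nodes)

# Disable the firewall
systemctl disable firewalld
systemctl stop firewalld

# Disable SELinux
vim /etc/selinux/config
# Set SELINUX to disabled in the config file (takes effect after a reboot)
SELINUX=disabled

# Switch to permissive mode immediately, without rebooting
setenforce 0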
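
When the same change has to be made on several nodes, the vim step can be replaced by a non-interactive edit. The following is a sketch under the assumption that the config file contains a line starting with SELINUX=; the function name set_selinux_disabled is hypothetical.

```bash
# Hypothetical helper: rewrite the SELINUX= line in place (GNU sed, as shipped with CentOS 7).
set_selinux_disabled() {
    local cfg="${1:-/etc/selinux/config}"
    # The pattern requires "=" directly after SELINUX, so the SELINUXTYPE= line is left untouched.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}

# Example usage as root (commented out):
# set_selinux_disabled /etc/selinux/config
```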

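4. Configure passwordless SSH to simplify file transfer between nodes (all nodes)

Set up passwordless login from cdh01 to cdh01, cdh02, and cdh03.

# Run on cdh01
ssh-keygen -t rsa     # Press Enter at every prompt to generate a key pair without a passphrase.

# Copy the public key to each node (repeat on the other nodes if they also need passwordless access)
ssh-copy-id cdh01
ssh-copy-id cdh02
ssh-copy-id cdh03

# Test
ssh cdh01
ssh cdh02
ssh cdh03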

5. Install the JDK (all nodes)

# Remove any preinstalled JDK (does nothing if none is found)
rpm -qa | grep jdk | xargs -r rpm -e --nodeps

# Installs into /usr/java by default
rpm -ivh oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm

# Configure environment variables: edit /etc/profile or ~/.bash_profile
vim /etc/profile
---
export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
export PATH=$PATH:$JAVA_HOME/bin
---

source /etc/profile

# Verify
java -version

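6. Configure the yum repository (all nodes)

# Download the yum repository configuration file
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
# On servers that are not Alibaba Cloud ECS instances, run the following to remove the internal mirror addresses
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo

# Refresh the cache
yum clean all
yum makecache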

6. Configure time synchronization (all nodes)

# Install chrony
yum -y install chrony

# Configure chrony
vim /etc/chrony.conf

---
server ntp1.aliyun.com
server ntp2.aliyun.com
server ntp3.aliyun.com
#server 0.centos.pool.ntp.org iburst
---

# Start chrony and enable it at boot
systemctl enable chronyd
systemctl restart chronyd

# Check the synchronization sources
chronyc sources -v

7. Install third-party dependencies and configure system parameters (all nodes)

yum install -y chkconfig python bind-utils psmisc libxslt zlib sqlite cyrus-sasl-plain cyrus-sasl-gssapi fuse fuse-libs redhat-lsb

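8. Tune system parameters (all nodes)

# Lower swappiness now and persist it across reboots
sysctl vm.swappiness=10
echo 'vm.swappiness=10'>> /etc/sysctl.conf
# Disable transparent huge pages (lasts until the next reboot)
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled

# Disable IPv6 for the running system
sysctl -w net.ipv6.conf.all.disable_ipv6=1
sysctl -w net.ipv6.conf.default.disable_ipv6=1

# Equivalent to the two sysctl -w commands above, written through /proc
echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6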
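
Appending to /etc/sysctl.conf with >> adds a duplicate line each time the step is rerun, and the IPv6 settings above are only applied to the running system. One possible way to persist all of them idempotently is sketched below; the function name set_conf_value is hypothetical, and the key is used as a regular expression, so the dots in it match loosely.

```bash
# Hypothetical helper: set KEY=VALUE in a sysctl-style file, replacing an existing line if present.
set_conf_value() {
    local file="$1" key="$2" value="$3"
    if grep -q "^[[:space:]]*${key}[[:space:]]*=" "$file" 2>/dev/null; then
        # Key already present: rewrite that line instead of appending a duplicate.
        sed -i "s|^[[:space:]]*${key}[[:space:]]*=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example usage as root (commented out):
# set_conf_value /etc/sysctl.conf vm.swappiness 10
# set_conf_value /etc/sysctl.conf net.ipv6.conf.all.disable_ipv6 1
# set_conf_value /etc/sysctl.conf net.ipv6.conf.default.disable_ipv6 1
# sysctl -p    # load the values from /etc/sysctl.conf
```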

9. Install the MySQL server (cdh01)

  • Online installation
Install MySQL 5.7:
Download the yum repository package
wget -c http://dev.mysql.com/get/mysql57-community-release-el7-10.noarch.rpm
Install the yum repository
yum -y install mysql57-community-release-el7-10.noarch.rpm

# Import the updated GPG key
rpm --import https://repo.mysql.com/RPM-GPG-KEY-mysql-2022

Install the MySQL 5.7 server
yum -y install mysql-community-server

Remove the yum repository
With the yum repository installed, every later yum operation would automatically update MySQL, so remove it:
yum -y remove mysql57-community-release-el7-10.noarch

  • Offline installation

    # Find the installed mariadb packages first
    rpm -qa | grep mariadb
    # Remove mariadb-libs-5.5.68-1.el7.x86_64
    rpm -e --nodeps mariadb-libs-5.5.68-1.el7.x86_64
    
    # Install the MySQL rpm packages in numbered order
    rpm -ivh 01_mysql-community-common-5.7.16-1.el7.x86_64.rpm
    rpm -ivh 02_mysql-community-libs-5.7.16-1.el7.x86_64.rpm
    rpm -ivh 03_mysql-community-libs-compat-5.7.16-1.el7.x86_64.rpm
    rpm -ivh 04_mysql-community-client-5.7.16-1.el7.x86_64.rpm
    # If the libaio dependency is reported missing, run: yum install libaio
    rpm -ivh 05_mysql-community-server-5.7.16-1.el7.x86_64.rpm
    
  • Follow-up steps

    Enable start at boot
    systemctl enable mysqld.service
    Start MySQL
    systemctl start mysqld.service
    Check the status
    systemctl status mysqld.service
    
    Get the temporary password
    grep "password" /var/log/mysqld.log
    
    Log in to MySQL
    mysql -uroot -p
    
    Relax password validation
    set global validate_password_policy=0;
    set global validate_password_length=1;
    
    Set the password
    alter user user() identified by "123456";
    
    Adjust privileges
    use mysql;
    GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123456' WITH GRANT OPTION;  -- allow root to connect from any host
    delete from user where host!='%'; -- remove every account not bound to '%', including the built-in localhost accounts
    flush privileges;  -- reload the privilege tables
    select host,user,authentication_string from user; -- review the remaining accounts
    

10. Create the databases needed later when installing CDH.

create database hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
create database hue DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
create database monitor DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
create database cmf DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
create database oozie DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

exit
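
The five statements above differ only in the database name, so they could also be generated from a list. This is a sketch rather than the original procedure: the function name gen_create_db_sql is hypothetical, and IF NOT EXISTS is added here so the step can be rerun without errors.

```bash
# Hypothetical helper: print one CREATE DATABASE statement per name given as an argument.
gen_create_db_sql() {
    local db
    for db in "$@"; do
        printf 'CREATE DATABASE IF NOT EXISTS %s DEFAULT CHARSET utf8 COLLATE utf8_general_ci;\n' "$db"
    done
}

# Example usage (commented out): pipe the generated SQL into the mysql client.
# gen_create_db_sql hive hue monitor cmf oozie | mysql -uroot -p
```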

11. Install the MySQL driver (all nodes)

 # The MySQL driver JAR must be placed in /usr/share/java and renamed to mysql-connector-java.jar; it is required on every node
 # Download the MySQL driver package first

 mv mysql-connector-java-5.1.48.jar mysql-connector-java.jar
 mkdir -p /usr/share/java
 cp mysql-connector-java.jar /usr/share/java/mysql-connector-java.jar

11. Install Cloudera Manager

Upload the downloaded installation packages to the server.

12. Install the daemons and agent on all nodes

yum -y localinstall cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm
yum -y localinstall cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm

13. Install the server on cdh01

yum -y localinstall cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm

14. Point the agent configuration at the server node cdh01 (all nodes)

sed -i "s/server_host=localhost/server_host=cdh01/g" /etc/cloudera-scm-agent/config.ini
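
The sed command above changes nothing, and reports no error, if server_host has already been set to something other than localhost. A variant that rewrites whatever value is present and then verifies the result might look like the sketch below; the function name set_agent_server_host is hypothetical.

```bash
# Hypothetical helper: set server_host in the agent config and confirm the edit took effect.
set_agent_server_host() {
    local cfg="$1" host="$2"
    sed -i "s/^server_host=.*/server_host=${host}/" "$cfg"
    # Returns non-zero if the expected line is missing after the edit.
    grep -qx "server_host=${host}" "$cfg"
}

# Example usage as root (commented out):
# set_agent_server_host /etc/cloudera-scm-agent/config.ini cdh01 || echo "server_host was not updated"
```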

15. Edit the server configuration on the master node cdh01

vim /etc/cloudera-scm-server/db.properties

# Copyright (c) 2012 Cloudera, Inc. All rights reserved.
#
# This file describes the database connection.
#

# The database type
# Currently 'mysql', 'postgresql' and 'oracle' are valid databases.
com.cloudera.cmf.db.type=mysql

# The database host
# If a non standard port is needed, use 'hostname:port'
com.cloudera.cmf.db.host=cdh01:3306

# The database name
com.cloudera.cmf.db.name=cmf

# The database user
com.cloudera.cmf.db.user=root

# The database user's password
com.cloudera.cmf.db.password=123456

# The db setup type
# After fresh install it is set to INIT
# and will be changed post config.
# If scm-server uses Embedded DB then it is set to EMBEDDED
# If scm-server uses External DB then it is set to EXTERNAL
com.cloudera.cmf.db.setupType=EXTERNAL

16. Deploy the offline parcel repository on cdh01, in the /opt/cloudera/parcel-repo directory

CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel
CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha
CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha256
manifest.json

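-- Note: the official documentation provides a .sha1 file, but that did not work here; the checksum file has to use the .sha extension instead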
CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha1
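
If only the .sha1 file is available, the .sha file can be produced either by copying the .sha1 file to the new name or by recomputing the hash from the parcel. The sketch below takes the second route, assuming the .sha file is expected to hold just the SHA-1 hex digest of the parcel; the function name write_parcel_sha is hypothetical.

```bash
# Hypothetical helper: write the SHA-1 hex digest of a parcel to "<parcel>.sha".
write_parcel_sha() {
    local parcel="$1"
    # sha1sum prints "<digest>  <filename>"; keep only the digest.
    sha1sum "$parcel" | awk '{print $1}' > "${parcel}.sha"
}

# Example usage (commented out):
# cd /opt/cloudera/parcel-repo
# write_parcel_sha CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel
```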

17. Start the CM server and agents, then begin installing the CDH cluster

# Start the server on cdh01
systemctl start cloudera-scm-server
# Open another terminal and follow the log; resolve any exceptions that appear
tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
# This error can be ignored: ERROR ParcelUpdateService:com.cloudera.parcel.component
# ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
# Once the log prints "started jetty server", cm-server has started successfully

# Start the agent on all nodes
systemctl start cloudera-scm-agent
# When `netstat -antp | grep 7180` on cdh01 returns output, the web page is reachable
# Check the running status
systemctl status cloudera-scm-agent
systemctl status cloudera-scm-server

# Restart if needed
systemctl restart cloudera-scm-agent
systemctl restart cloudera-scm-server
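
Instead of watching the log by hand, the wait for the startup message can be scripted. This is a minimal sketch: the function name wait_for_log and the default 300-second timeout are assumptions, and the match is case-insensitive because the exact capitalization of the log message is not confirmed here.

```bash
# Hypothetical helper: poll a log file once per second until PATTERN appears or TIMEOUT seconds pass.
wait_for_log() {
    local file="$1" pattern="$2" timeout="${3:-300}" waited=0
    until grep -qi -- "$pattern" "$file" 2>/dev/null; do
        # Give up with a non-zero status once the timeout is reached.
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}

# Example usage on cdh01 (commented out):
# wait_for_log /var/log/cloudera-scm-server/cloudera-scm-server.log 'started jetty server' 600 \
#     && echo "cm-server is up"
```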

18. Install CDH

Complete the installation through the Cloudera Manager web page on port 7180 of cdh01.

19. Test Hive

CREATE EXTERNAL TABLE IF NOT EXISTS student(
    id string ,
    `name` string ,
    age string  ,
    gender string  ,
    clazz string 
) 
ROW FORMAT DELIMITED 
    FIELDS TERMINATED BY ',' 
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' 
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'  
location '/data/student'; 

