Cloudera Big Data Cluster: Deployment and Installation in Detail

This guide deploys a Cloudera big data cluster on CentOS 7.

Node name      IP               Specs         Role
Manager-node   192.168.42.100   4C/8G/100G    Management node
Agent1-node    192.168.42.101   8C/32G/1T     Data node
Agent2-node    192.168.42.102   8C/32G/1T     Data node
Agent3-node    192.168.42.103   8C/32G/1T     Data node
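
Later steps refer to the management node by hostname (for example, server Manager-node in ntp.conf), so every node must be able to resolve these names. The original write-up does not show how this is done; assuming no DNS is available, one option is to add entries to /etc/hosts. The helper name cluster_hosts below is hypothetical, and the sketch only prints the entries.

```shell
# Print /etc/hosts entries for the four nodes listed in the table above.
# Assumption: names are resolved through /etc/hosts rather than DNS.
cluster_hosts() {
    cat << 'EOF'
192.168.42.100 Manager-node
192.168.42.101 Agent1-node
192.168.42.102 Agent2-node
192.168.42.103 Agent3-node
EOF
}

# Usage (as root, on every node):
#   cluster_hosts >> /etc/hosts
```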

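I. Pre-deployment preparation

System performance tuning

For CDH services to perform well, the following system parameters need to be adjusted. Simply run the script below on the system.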

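# Append kernel parameters: avoid swapping, disable IPv6, widen the local port range
cat << EOF >> /etc/sysctl.conf
vm.swappiness = 0
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv4.ip_local_port_range = 1024 65000
EOF
sysctl -p
# Disable transparent hugepages on the running system
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
# Re-apply the transparent hugepage setting at every boot
cat << EOF >> /etc/rc.local
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
EOF
# On CentOS 7, rc.local only runs at boot if it is executable
chmod +x /etc/rc.d/rc.local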
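
To confirm that the transparent hugepage setting took effect, a small check can help. The sketch below is an illustration, not part of the original procedure, and the function name thp_is_disabled is hypothetical. It relies on the kernel marking the active mode with square brackets in the sysfs file, for example "always madvise [never]".

```shell
# Return 0 if the given transparent-hugepage sysfs file has "never" selected.
# $1 = path to check (defaults to the "enabled" file).
thp_is_disabled() {
    file="${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
    [ -r "$file" ] && grep -q '\[never\]' "$file"
}

# Report the state of both files touched by the tuning script.
for f in enabled defrag; do
    if thp_is_disabled "/sys/kernel/mm/transparent_hugepage/$f"; then
        echo "$f: never"
    else
        echo "$f: not disabled (or file missing)"
    fi
done
```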

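Configure the NTP time synchronization service

Install and enable NTP on all nodes. On an isolated internal network, the other machines must synchronize against one host inside the cluster.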

yum -y install ntp

service ntpd restart

Configure the time synchronization server on the management node

vi /etc/ntp.conf

restrict 127.0.0.1

restrict -6 ::1

restrict default nomodify notrap

server ntp1.aliyun.com prefer 

includefile /etc/ntp/crypto/pw

keys /etc/ntp/keys

Configure the time synchronization client on the data nodes

vi /etc/ntp.conf

restrict 127.0.0.1

restrict -6 ::1

restrict default kod nomodify notrap nopeer noquery

restrict -6 default kod nomodify notrap nopeer noquery

# Hostname or IP address of the management node

server Manager-node

includefile /etc/ntp/crypto/pw

keys /etc/ntp/keys

Once the configuration file is complete, save and exit, then start the service with: service ntpd start
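
Once ntpd is running, synchronization can be checked with ntpq -p, which marks the currently selected time source with a leading asterisk. The helper below is a sketch; the function name ntp_selected_peer is made up for illustration. It reads ntpq -p output on standard input and prints the selected peer. A peer may not be selected until several minutes after startup.

```shell
# Print the peer that ntpd has selected as its synchronization source.
# ntpq -p prefixes that peer's line with '*'.
# Exit status is 1 if no peer has been selected yet.
ntp_selected_peer() {
    awk '/^\*/ { sub(/^\*/, "", $1); print $1; found = 1 } END { exit !found }'
}

# Usage on a node where ntpd is running:
#   ntpq -p | ntp_selected_peer
```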

Install the JDK

Install the JDK on every server in the cluster (the CM management node and each agent node).

wget -O jdk-7u80-linux-x64.tar.gz 'http://download.oracle.com/otn/java/jdk/7u80-b15/jdk-7u80-linux-x64.tar.gz?AuthParam=1528156044_59d0d3a22c59b5ac6d9f0dddd4418808'

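# The target directory must exist before extracting into it
mkdir -p /usr/local/java
tar -zxvf jdk-7u80-linux-x64.tar.gz -C /usr/local/java
# Quote the heredoc delimiter so $JAVA_HOME and $PATH are written literally
cat >> ~/.bashrc << 'EOF'
export JAVA_HOME=/usr/local/java/jdk1.7.0_80
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
EOF

source ~/.bashrc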

Install the MySQL database

root# wget https://dev.mysql.com/get/mysql80-community-release-el7-1.noarch.rpm

root# sudo rpm -Uvh mysql80-community-release-el7-1.noarch.rpm

root# yum repolist all | grep mysql

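root# sudo yum-config-manager --disable mysql80-community

root# sudo yum-config-manager --enable mysql57-community

Edit the repository file so that only mysql57-community-server is installed: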

root# vi /etc/yum.repos.d/mysql-community.repo

# Enable to use MySQL 5.7
[mysql57-community]
name=MySQL 5.7 Community Server
baseurl=http://repo.mysql.com/yum/mysql-5.7-community/el/7/$basearch/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql

root# yum repolist enabled | grep mysql

root# yum install mysql-community-server

root# systemctl start mysqld.service

root# sudo grep 'temporary password' /var/log/mysqld.log

root# mysql -uroot -p

mysql>ALTER USER 'root'@'localhost' IDENTIFIED BY 'MyNewPass4!';

MySQL configuration recommended on the Cloudera website:

vi /etc/my.cnf

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0

key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1

max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M

#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log

#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1

binlog_format = mixed

read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M

# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit  = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

sql_mode=STRICT_ALL_TABLES

 

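Enable start on boot

The following applies to a binary (tarball) installation under /usr/local/mysql; with the yum installation above, systemctl enable mysqld does the same job.

cd /usr/local/mysql/support-files

cp mysql.server /etc/init.d/mysql

Start MySQL

/etc/init.d/mysql start

Change the initial password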

alter user root@localhost identified by '123456HW';

flush privileges;

Configure the MySQL users and passwords that Cloudera Manager needs to access

  • Oozie Server - Contains Oozie workflow, coordinator, and bundle data. Can grow very large.
  • Sqoop Server - Contains entities such as the connector, driver, links and jobs. Relatively small.
  • Activity Monitor - Contains information about past activities. In large clusters, this database can grow large. Configuring an Activity Monitor database is only necessary if a MapReduce service is deployed.
  • Reports Manager - Tracks disk utilization and processing activities over time. Medium-sized.
  • Hive Metastore Server - Contains Hive metadata. Relatively small.
  • Hue Server - Contains user account information, job submissions, and Hive queries. Relatively small.
  • Sentry Server - Contains authorization metadata. Relatively small.
  • Cloudera Navigator Audit Server - Contains auditing information. In large clusters, this database can grow large.
  • Cloudera Navigator Metadata Server - Contains authorization, policies, and audit report metadata. Relatively small.
Role                                 Database   User     Password
root                                 -          root     123456HW
Activity Monitor                     amon       amon     amon
Reports Manager                      rman       rman     rman
Hive Metastore Server                hive       hive     hive
Sentry Server                        sentry     sentry   sentry
Cloudera Navigator Audit Server      nav        nav      nav
Cloudera Navigator Metadata Server   navms      navms    navms
Oozie                                oozie      oozie    oozie
Hue                                  hue        hue      hue
Cloudera Manager Server              cmf        cmf      cmf

Create the databases and grant privileges

create database amon DEFAULT CHARACTER SET utf8;
grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
 
create database rman DEFAULT CHARACTER SET utf8;
grant all on rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
 
create database hive DEFAULT CHARACTER SET utf8;
grant all on hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';
 
create database sentry DEFAULT CHARACTER SET utf8;
grant all on sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
 
create database nav DEFAULT CHARACTER SET utf8;
grant all on nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
 
create database navms DEFAULT CHARACTER SET utf8;
grant all on navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';
 
create database oozie DEFAULT CHARACTER SET utf8;
grant all on oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
 
create database hue DEFAULT CHARACTER SET utf8;
grant all on hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
 
create database cmf DEFAULT CHARACTER SET utf8;
grant all on cmf.* TO 'cmf'@'%' IDENTIFIED BY 'cmf';
 
flush privileges;
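
The create/grant pairs above all follow one pattern, so they can be generated rather than typed. The sketch below assumes, as in the table above, that each database, user, and password share the same name; gen_cdh_sql is a hypothetical helper. Reusing the user name as the password is only reasonable on a throwaway test cluster.

```shell
# Emit CREATE DATABASE / GRANT statements for each name given as an argument.
# Assumption: database name, user name and password are identical.
gen_cdh_sql() {
    for name in "$@"; do
        printf "create database %s DEFAULT CHARACTER SET utf8;\n" "$name"
        printf "grant all on %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$name" "$name" "$name"
    done
    printf "flush privileges;\n"
}

# Usage (pipe into the mysql client as root):
#   gen_cdh_sql amon rman hive sentry nav navms oozie hue cmf | mysql -uroot -p
```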

 

Passwordless SSH authentication

Run the following on the manager node:

ssh-keygen -t rsa

ssh-copy-id root@192.168.42.101

ssh-copy-id root@192.168.42.102

ssh-copy-id root@192.168.42.103
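
After copying the key, it is worth confirming that login really works without a password before handing the hosts to Cloudera Manager. A minimal sketch, with the hypothetical helper name check_passwordless; it uses the standard ssh options BatchMode and ConnectTimeout.

```shell
# Return 0 only if key-based login to root@<host> works.
# BatchMode=yes makes ssh fail instead of prompting for a password.
# $1 = host name or IP address.
check_passwordless() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true
}

# Usage on the manager node:
#   for ip in 192.168.42.101 192.168.42.102 192.168.42.103; do
#       if check_passwordless "$ip"; then echo "$ip ok"; else echo "$ip FAILED"; fi
#   done
```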

Disable the firewall

systemctl stop firewalld

systemctl disable firewalld

setenforce 0
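
Note that setenforce 0 only switches SELinux to permissive mode until the next reboot. To make the change persistent, the SELINUX= line in /etc/selinux/config has to be edited as well. A minimal sketch, using the hypothetical helper name selinux_set_mode and taking the file path as a parameter so it can be tried on a copy first; it assumes GNU sed, as shipped with CentOS.

```shell
# Rewrite the SELINUX= line of an SELinux config file.
# $1 = mode (disabled, permissive or enforcing)
# $2 = path to the config file (normally /etc/selinux/config)
# The pattern does not touch the SELINUXTYPE= line.
selinux_set_mode() {
    sed -i "s/^SELINUX=.*/SELINUX=$1/" "$2"
}

# Usage (as root):
#   selinux_set_mode disabled /etc/selinux/config
```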

 

Install the JDBC driver

wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz

tar zxvf mysql-connector-java-5.1.46.tar.gz
sudo mkdir -p /usr/share/java/
cd mysql-connector-java-5.1.46
sudo cp mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar

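II. Deployment

1. Install Cloudera Manager

Deploy CM on Manager-node. The latest CM, version 5.15, can be downloaded from the Cloudera website for a binary installation; here I install it with yum.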

sudo rpm --import https://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/RPM-GPG-KEY-cloudera
sudo yum install cloudera-manager-daemons cloudera-manager-server

Initialize the Cloudera Manager database with the MySQL preparation script (MySQL and CM are installed on the same node)

sudo /usr/share/cmf/schema/scm_prepare_database.sh mysql -h 192.168.42.100 --scm-host 192.168.42.100 cmf cmf
Enter SCM password:
JAVA_HOME=/usr/java/jdk1.7.0_80-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing:  /usr/java/jdk1.7.0_80-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/usr/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[                          main] DbCommandExecutor              INFO  Successfully connected to database.
All done, your SCM database is configured correctly!

Start the Cloudera Manager Server

systemctl start cloudera-scm-server

Log in to CM: http://192.168.42.100:7180/cmf

Username/password: admin/admin

     

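2. Install CDH and other software

Before installing the cluster, log in to CM on the manager node at http://192.168.42.100:7180/cmf

3. Install the cluster

Choose one of the following options for the installation.

Select services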

  • Core Hadoop

    HDFS, YARN (MapReduce 2 Included), ZooKeeper, Oozie, Hive, and Hue

  • Core with HBase

    HDFS, YARN (MapReduce 2 Included), ZooKeeper, Oozie, Hive, Hue, and HBase

  • Core with Impala

    HDFS, YARN (MapReduce 2 Included), ZooKeeper, Oozie, Hive, Hue, and Impala

  • Core with Search

    HDFS, YARN (MapReduce 2 Included), ZooKeeper, Oozie, Hive, Hue, and Solr

  • Core with Spark

    HDFS, YARN (MapReduce 2 Included), ZooKeeper, Oozie, Hive, Hue, and Spark

  • All Services

    HDFS, YARN (MapReduce 2 Included), ZooKeeper, Oozie, Hive, Hue, HBase, Impala, Solr, Spark, and Key-Value Store Indexer
