- Disable the firewall
- Disable SELinux
- Set up passwordless SSH login between the Cloudera Manager Server and the Cloudera Manager Agents
- Install JDK 1.7
- Install Python 2.6 or 2.7
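The first two prerequisites can be scripted. A minimal sketch for RHEL6/CentOS6 (service names and file paths are the RHEL6 defaults; the SELinux edit is shown against a scratch copy so it can be dry-run without root — point SELINUX_CONF at the real /etc/selinux/config on an actual node):

```shell
# Persistently disable SELinux by rewriting the SELINUX= line.
# SELINUX_CONF defaults to a scratch copy here; on a real node set it to
# /etc/selinux/config and run as root.
SELINUX_CONF=${SELINUX_CONF:-./selinux-config.demo}
[ -f "$SELINUX_CONF" ] || printf 'SELINUX=enforcing\n' > "$SELINUX_CONF"
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$SELINUX_CONF"

# The remaining steps need root and a RHEL6-style init system:
#   setenforce 0            # SELinux permissive for the running session
#   service iptables stop   # stop the firewall now
#   chkconfig iptables off  # keep it off across reboots
```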
The tarball for RHEL6/CentOS6 is cloudera-manager-el6-cm5.0.0_x86_64.tar.gz.
$ mkdir -p /opt/cloudera-manager
$ tar -xzvf cloudera-manager*.tar.gz -C /opt/cloudera-manager (/opt/cloudera-manager/cm-5.0.0 will be the Cloudera Manager root directory)
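The extraction can be dry-run with a stand-in archive mirroring the cm-5.0.0 layout (./opt replaces /opt so no root is needed; the real tarball comes from Cloudera's archive site):

```shell
# Build a stand-in tarball with the cm-5.0.0 layout, then extract it the
# same way as above.
mkdir -p staging/cm-5.0.0/etc/init.d
tar -czf cloudera-manager-el6-cm5.0.0_x86_64.tar.gz -C staging cm-5.0.0
mkdir -p ./opt/cloudera-manager   # tar -C requires the target directory to exist
tar -xzf cloudera-manager-el6-cm5.0.0_x86_64.tar.gz -C ./opt/cloudera-manager
ls ./opt/cloudera-manager         # shows cm-5.0.0
```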
$ vim /etc/cloudera-scm-agent/config.ini
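Each Agent's config.ini must point at the Server host. In the fragment below, server_host uses this document's example hostname hadoop-1, and 7182 is the default Server-Agent port:

```ini
[General]
# Hostname of the Cloudera Manager Server (hadoop-1 is this document's example)
server_host=hadoop-1
# Default port on which the Server listens for Agent heartbeats
server_port=7182
```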
Download the CDH Parcels:
$
$ mv
$ mv manifest.json /var/www/html/cdh5.0
$ chmod -R ugo+rX /var/www/html/cdh5.0
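Assuming the parcel files have been downloaded to the working directory, the publish steps amount to the following sketch. The parcel filename is an assumption inferred from the CDH build (CDH-5.0.0-1.cdh5.0.0.p0.47) listed later in this document, WEB_ROOT defaults to a scratch directory so the sketch can be dry-run, and stand-in files replace the real downloads:

```shell
# Publish the parcel, its .sha checksum, and manifest.json to the local
# web server root. On the real host WEB_ROOT is /var/www/html/cdh5.0.
WEB_ROOT=${WEB_ROOT:-./cdh5.0}
mkdir -p "$WEB_ROOT"
# Stand-ins for the real downloads (filename assumed from the CDH build):
touch CDH-5.0.0-1.cdh5.0.0.p0.47-el6.parcel \
      CDH-5.0.0-1.cdh5.0.0.p0.47-el6.parcel.sha \
      manifest.json
mv CDH-5.0.0-1.cdh5.0.0.p0.47-el6.parcel* manifest.json "$WEB_ROOT"
chmod -R ugo+rX "$WEB_ROOT"   # world-readable, directories traversable
```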
- Install an external database (Oracle/MySQL/PostgreSQL); the RDBMS character set must support UTF-8 (for Oracle, set it to AL32UTF8)
- Download and install the database driver (on the master node)
- Create the Cloudera CDH configuration database user and grant privileges
- Run the script that automatically creates the Cloudera CDH configuration databases
- Create the Cloudera CDH databases and user accounts for:
- Reports Manager (required)
- Hive Metastore (required)
- Activity Monitor (needed only for MRv1)
- Cloudera Navigator (available with a Data Hub Edition Trial or Cloudera Enterprise)
mysql
mysql
mysql> GRANT ALL ON
mysql
mysql> GRANT ALL ON
mysql
mysql> GRANT ALL ON
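The truncated mysql statements above follow Cloudera's documented pattern: one UTF-8 database plus one GRANT per service. A hedged reconstruction — the database names, account names, and passwords below are illustrative assumptions, not taken from this document:

```sql
-- Illustrative names/passwords; substitute your own.
CREATE DATABASE amon      DEFAULT CHARACTER SET utf8;  -- Activity Monitor
CREATE DATABASE rman      DEFAULT CHARACTER SET utf8;  -- Reports Manager
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8;  -- Hive Metastore

GRANT ALL ON amon.*      TO 'amon'@'%' IDENTIFIED BY 'amon_password';
GRANT ALL ON rman.*      TO 'rman'@'%' IDENTIFIED BY 'rman_password';
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive_password';
```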
(8) Install using the wizard
Start the Cloudera Manager Admin Console.
Add the local parcel repository URL http://hadoop-1/cdh5.0/ as follows:
- Do one of the following to open the parcel settings page:
  - Click the parcel icon in the top navigation bar, then click the Edit Settings button.
  - Select Administration > Settings, then click the Parcels category.
  - Click the Hosts tab, select Configuration > View and Edit, click the Parcels category, then click the Edit Settings button.
- In the Remote Parcel Repository URLs list, click to open an additional row.
- Enter the path to the parcel, for example http://hostname:80/cdh5.0/.
- Click Save Changes to commit the changes.
Installed Component Versions
CDH Packaging and Tarball Information:

Component | Version |
--- | --- |
Apache Hadoop | 2.3.0-cdh5.0.0 |
Apache Hadoop MRv1 | 2.3.0-mr1-cdh5.0.0 |
Apache Hive | 0.12.0-cdh5.0.0 |
Apache HBase | 0.96.1.1-cdh5.0.0 |
Apache ZooKeeper | 3.4.5-cdh5.0.0 |
Apache Sqoop 1 | 1.4.4-cdh5.0.0 |
Apache Sqoop 2 | 1.99.3-cdh5.0.0 |
Apache Pig | 0.12.0-cdh5.0.0 |
Apache Flume | 1.4.0-cdh5.0.0 |
Apache Oozie | 4.0.0-cdh5.0.0 |
Apache Mahout | 0.8-cdh5.0.0 |
Apache Whirr | 0.9.0-cdh5.0.0 |
DataFu | 1.1.0-cdh5.0.0 |
Apache Sentry (incubating) | 1.2.0-cdh5.0.0 |
Parquet | 1.2.5-cdh5.0.0 |
Llama | 1.0.0-cdh5.0.0 |
Apache Spark | 0.9.0-cdh5.0.0 |
Apache Crunch | 0.9.0-cdh5.0.0 |
Apache Avro | 1.7.5-cdh5.0.0 |
Kite SDK | 0.10.0-cdh5.0.0 |
Apache Solr | 4.4.0-cdh5.0.0 |
Cloudera Search | 1.0.0-cdh5.0.0 |
Lily HBase Indexer | 1.3-cdh5.0.0 |
Cloudera Manager

Service | Role | Description | Path |
--- | --- | --- | --- |
cloudera-manager | Server | Component directory | /opt/cloudera-manager/cm-5.0.0/lib/cloudera-scm-server |
cloudera-manager | Server | Init script | /opt/cloudera-manager/cm-5.0.0/etc/init.d/cloudera-scm-server |
cloudera-manager | Server | Log directory | /opt/cloudera-manager/cm-5.0.0/log/cloudera-scm-server |
cloudera-manager | Agent | Component directory | /opt/cloudera-manager/cm-5.0.0/lib/cloudera-scm-agent |
cloudera-manager | Agent | Init script | /opt/cloudera-manager/cm-5.0.0/etc/init.d/cloudera-scm-agent |
cloudera-manager | Agent | Log directory | /opt/cloudera-manager/cm-5.0.0/log/cloudera-scm-agent |
Management Service

Service | Role | Description | Path |
--- | --- | --- | --- |
mgmt | alertpublisher | Alert Publisher data directory | /var/lib/cloudera-scm-alertpublisher |
mgmt | alertpublisher | Alert Publisher log directory | /var/log/cloudera-scm-alertpublisher |
mgmt | eventserver | Event Server data directory | /var/lib/cloudera-scm-eventserver |
mgmt | eventserver | Event Server log directory | /var/log/cloudera-scm-eventserver |
mgmt | hostmonitor | Host Monitor data directory | /var/lib/cloudera-host-monitor |
mgmt | hostmonitor | Host Monitor log directory | /var/log/cloudera-scm-firehose |
mgmt | servicemonitor | Service Monitor data directory | /var/lib/cloudera-service-monitor |
mgmt | servicemonitor | Service Monitor log directory | /var/log/cloudera-scm-firehose |
mgmt | headlamp | Headlamp data directory | /var/lib/cloudera-scm-headlamp |
mgmt | headlamp | Headlamp log directory | /var/log/cloudera-scm-headlamp |
Component Libraries

Service | Path |
--- | --- |
zookeeper | /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/ |
hadoop | /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hadoop |
hadoop-hdfs | /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hadoop-hdfs |
hadoop-mapreduce | /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hadoop-mapreduce |
hadoop-yarn | /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hadoop-yarn |
hbase | /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hbase |
hive | /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hive |
spark | /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/spark |
Component Configuration

Service | Path |
--- | --- |
zookeeper | /etc/zookeeper/conf |
hadoop | /etc/hadoop/conf |
hadoop-hdfs | /etc/hadoop/conf.cloudera.hdfs |
hadoop-mapreduce | /etc/hadoop/conf.cloudera.mapreduce1 |
hadoop-yarn | /etc/hadoop/conf.cloudera.yarn |
hbase | /etc/hbase/conf |
hive | /etc/hive/conf |
spark | /etc/spark/conf |
Component Shell Commands

Service | Path |
--- | --- |
zookeeper | /usr/bin/zookeeper-client -> /etc/alternatives/zookeeper-client |
zookeeper | /usr/bin/zookeeper-server -> /etc/alternatives/zookeeper-server |
hadoop | /usr/bin/hadoop -> /etc/alternatives/hadoop |
hadoop-hdfs | /usr/bin/hdfs -> /etc/alternatives/hdfs |
hadoop-mapred | /usr/bin/mapred -> /etc/alternatives/mapred |
hadoop-yarn | /usr/bin/yarn -> /etc/alternatives/yarn |
spark | /usr/bin/spark-shell -> /etc/alternatives/spark-shell |
spark | /usr/bin/spark-executor -> /etc/alternatives/spark-executor |
hbase | /usr/bin/hbase -> /etc/alternatives/hbase |
hive | /usr/bin/hive -> /etc/alternatives/hive |
Component Logs

Service | Path |
--- | --- |
zookeeper | /var/log/zookeeper/ |
hadoop-hdfs | /var/log/hadoop-hdfs |
hadoop-mapred | /var/log/hadoop-mapreduce |
hadoop-yarn | /var/log/hadoop-yarn |
spark | /var/log/spark/ |
hbase | /var/log/hbase |
hive | /var/log/hive |
Depending on whether MapReduce jobs are configured to run on YARN or on the MapReduce service, view them in the corresponding console page:
- Clusters > ClusterName > yarn Applications
- Clusters > ClusterName > mapreduce Activities
$ vim /etc/sysctl.conf
Add vm.swappiness = 0
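The resulting fragment of /etc/sysctl.conf; the setting takes effect at the next boot, or immediately with sysctl -p (or sysctl -w vm.swappiness=0):

```
# /etc/sysctl.conf -- discourage the kernel from swapping out Hadoop processes
vm.swappiness = 0
```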
- Master node
  - Set the time zone as prompted: $ tzselect [ 5) Asia -> 9) China -> 1) east China -> 1) Yes ]
  - Check the system time: $ date ; set the system time: $ date --set "04/25/09 10:19" (month/day/year hour:minute)
  - Check the hardware clock: $ hwclock --show
  - Sync the clocks: $ clock -w writes the system time to the hardware clock; $ hwclock --hctosys sets the system time from the hardware clock (hc = hardware clock, sys = system time)
  - Edit the NTP configuration: $ vim /etc/ntp.conf
    - restrict 172.16.66.0 mask 255.255.255.0
    - server 127.127.1.0
    - fudge 127.127.1.0
  - Restart ntpd and enable it at boot: $ service ntpd restart ; $ chkconfig ntpd on
- Slave nodes
  - Edit the NTP configuration ($ vim /etc/ntp.conf): comment out the other server lines and sync with the master node:
    - server 172.16.66.138
  - Restart ntpd and enable it at boot: $ service ntpd restart ; $ chkconfig ntpd on
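Put together, the two NTP configurations look like this. The IP addresses are this document's example network, and the stratum value on the fudge line is a common convention assumed here, not stated above:

```
# /etc/ntp.conf on the master node (172.16.66.138)
restrict 172.16.66.0 mask 255.255.255.0   # let the cluster subnet query/sync
server 127.127.1.0                        # local clock as a fallback source
fudge  127.127.1.0 stratum 10             # stratum 10 is an assumed value

# /etc/ntp.conf on each slave node (other server lines commented out)
server 172.16.66.138                      # sync from the master node
```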
Solution:
The DataNode data directory's permissions were set to 0777, which is too permissive and therefore insecure; change them to 755 or back to the default permissions.
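A minimal sketch of the fix, shown against a scratch directory (on a real node the path is the DataNode data directory configured in dfs.data.dir):

```shell
# Tighten an over-permissive DataNode data directory from 0777 to 0755.
# ./dn-data.demo stands in for the real data directory.
DN_DIR=${DN_DIR:-./dn-data.demo}
mkdir -p "$DN_DIR"
chmod 0777 "$DN_DIR"    # the insecure state described above
chmod 0755 "$DN_DIR"    # owner rwx, group/other r-x
stat -c '%a' "$DN_DIR"  # prints 755
```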