## References

Download site: https://archive.cloudera.com/cdh5/cdh/5/
| Component | Version | Download | Notes |
|---|---|---|---|
| jdk | jdk-8u172-linux-x64 | Download | |
| hadoop | hadoop-2.6.0-cdh5.14.2 | Download | |
| zookeeper | zookeeper-3.4.5-cdh5.14.2 | Download | |
| hbase | hbase-1.2.0-cdh5.14.2 | Download | |
| hive | hive-1.1.0-cdh5.14.2 | Download | |
| phoenix | apache-phoenix-4.12.0-HBase-1.2 | Download | |
## Prerequisites

### Time synchronization

```shell
yum install ntp
ntpdate -u ntp.api.bz
```
### JDK

Install JDK 1.8:

```shell
tar -zxf jdk-8u144-linux-x64.tar.gz -C /usr
mkdir -p /usr/java
mv /usr/jdk1.8.0_144 /usr/java/latest
```

Open the profile with `vim /etc/profile`, press `Shift+G` to jump to the end, and append:

```shell
export JAVA_HOME=/usr/java/latest
export JRE_HOME=/usr/java/latest/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
```

Reload the profile:

```shell
source /etc/profile
```
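A quick sanity check that the JDK is wired up; the expected values follow from the exports above:

```shell
# Reload the profile in the current shell, then verify
source /etc/profile
echo $JAVA_HOME   # /usr/java/latest, per the export above
java -version     # should report 1.8.0_144
```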
### Base utilities

These are usually preinstalled:

```shell
yum install ssh rsync
```

### Hostname

```shell
hostnamectl set-hostname z01
```

Map the host in `/etc/hosts` (`vim /etc/hosts`):

```
10.0.0.233 z01
```
## Hadoop

```shell
tar -zxf hadoop-2.6.0-cdh5.14.2.tar.gz -C /usr/local
mv /usr/local/hadoop-2.6.0-cdh5.14.2 /usr/local/hadoop
cd /usr/local/hadoop
```

### Environment variables

Note: the same `vim /etc/profile` and `source /etc/profile` steps apply to the other components below; they are shown once here and omitted later.

Append to `/etc/profile` (`vim /etc/profile`):

```shell
export HADOOP_HOME=/usr/local/hadoop
```

Then reload it:

```shell
source /etc/profile
```
### Configuration

Edit `etc/hadoop/hadoop-env.sh`:

```shell
# set to the root of your Java installation
export JAVA_HOME=/usr/java/latest
```

Still in the `/usr/local/hadoop` directory, edit `etc/hadoop/core-site.xml` to read:

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <!-- point the tmp dir somewhere persistent so the default under /tmp is not wiped -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoopDir/tmp/data</value>
  </property>
</configuration>
```
Edit `etc/hadoop/hdfs-site.xml`:

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```
### Passwordless SSH

Check whether passwordless login already works:

```shell
ssh localhost
```

If it prompts for a password, generate a key and authorize it:

```shell
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
```
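If login still prompts for a password, the usual cause is file permissions: sshd ignores keys when `~/.ssh` or `authorized_keys` are too open. A common fix, then a non-interactive re-test:

```shell
chmod 700 ~/.ssh
chmod 0600 ~/.ssh/authorized_keys
# BatchMode makes ssh fail outright instead of falling back to a password prompt
ssh -o BatchMode=yes localhost 'echo passwordless ok'
```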
### Run

- Format the filesystem:

```shell
bin/hdfs namenode -format
```

With `hadoop.tmp.dir` set above, the formatted metadata lands under `/data/hadoopDir/tmp/data`; without that setting it would go to the default `/tmp/hadoop-root`.

- Start the NameNode and DataNode daemons:

```shell
sbin/start-dfs.sh
```

- Web UI: http://z01:50070

- Create the HDFS directory required for MapReduce jobs:

```shell
bin/hdfs dfs -mkdir /user
```

Daemon logs are under the `logs/` directory. Verify the processes are up:

```shell
➜ hadoop jps
11395 DataNode
11671 Jps
11274 NameNode
11549 SecondaryNameNode
```
### YARN configuration

- Configuration

Edit `etc/hadoop/mapred-site.xml` (if only `mapred-site.xml.template` exists, copy it to `mapred-site.xml` first):

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```

Edit `etc/hadoop/yarn-site.xml`:

```xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```

- Start the ResourceManager and NodeManager daemons:

```shell
sbin/start-yarn.sh
```

- Web UI: http://z01:8088

- Run a MapReduce job.

- Stop the daemons when finished:

```shell
sbin/stop-yarn.sh
```

While YARN is running, `jps` shows:

```shell
➜ hadoop jps
11395 DataNode
11784 ResourceManager
11274 NameNode
11549 SecondaryNameNode
12190 Jps
11871 NodeManager
```
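For the "run a MapReduce job" step above, the examples jar bundled with the tarball gives a quick smoke test; run it while YARN is still up. The jar path assumes the stock CDH tarball layout:

```shell
cd /usr/local/hadoop
# Estimate pi with 2 map tasks x 10 samples each; a working cluster
# ends the run with a line like "Estimated value of Pi is ..."
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.14.2.jar pi 2 10
```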
## Zookeeper

```shell
tar -zxf zookeeper-3.4.5-cdh5.14.2.tar.gz -C /usr/local
mv /usr/local/zookeeper-3.4.5-cdh5.14.2 /usr/local/zookeeper
cd /usr/local/zookeeper
```

Environment variable (same `/etc/profile` procedure as above):

```shell
export ZOOKEEPER_HOME=/usr/local/zookeeper
```

### Configuration

Edit `conf/zoo.cfg` with the following content:

```
tickTime=2000
clientPort=2181
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/logs
```

Create the `data` and `logs` directories:

```shell
mkdir -p data logs
```

A `myid` file is only needed in cluster mode:

```shell
echo 1 > /usr/local/zookeeper/data/myid
```

### Start

```shell
bin/zkServer.sh start
```

Test the connection:

```shell
bin/zkCli.sh -server 127.0.0.1:2181
```

```shell
➜ zookeeper jps
11395 DataNode
12437 Jps
11784 ResourceManager
11274 NameNode
11549 SecondaryNameNode
12414 QuorumPeerMain
11871 NodeManager
```
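Beyond `jps`, the server answers four-letter-word commands on the client port, and `zkCli.sh` accepts a single command after the server address, so a scripted round-trip is possible:

```shell
# Liveness probe: a healthy server replies "imok"
echo ruok | nc 127.0.0.1 2181
# Create, read back, then remove a throwaway znode
bin/zkCli.sh -server 127.0.0.1:2181 create /smoke hello
bin/zkCli.sh -server 127.0.0.1:2181 get /smoke
bin/zkCli.sh -server 127.0.0.1:2181 delete /smoke
```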
## HBase

Extract:

```shell
tar -zxf hbase-1.2.0-cdh5.14.2.tar.gz -C /usr/local
mv /usr/local/hbase-1.2.0-cdh5.14.2 /usr/local/hbase
cd /usr/local/hbase
```

Environment variable:

```shell
export HBASE_HOME=/usr/local/hbase
```

### hbase-env.sh

Edit `conf/hbase-env.sh`:

- Comment out lines 46 and 47.
- Uncomment line 120 (`:120` jumps there in vim) and set:

```shell
export HBASE_PID_DIR=/usr/local/hbase/tmp/pids
```

- Change line 128 to:

```shell
export HBASE_MANAGES_ZK=false
```

Create the PID directory:

```shell
mkdir -p tmp/pids
```

### hbase-site.xml

Edit `conf/hbase-site.xml`:

```xml
<configuration>
  <property>
    <name>hbase.tmp.dir</name>
    <value>/usr/local/hbase/tmp</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost:2181</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
```
### Start HBase

```shell
bin/start-hbase.sh
```

Check the master log:

```shell
cat logs/hbase-root-master-z01.log
tail -f logs/hbase-root-master-z01.log
```

Connect to HBase:

```shell
bin/hbase shell
```

```
hbase(main):001:0> list
TABLE
0 row(s) in 0.2900 seconds

=> []
```
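With `list` coming back empty, a small round-trip in the same shell confirms writes actually reach the RegionServer; the table and column family names below are arbitrary:

```
hbase(main):001:0> create 'smoke', 'cf'
hbase(main):002:0> put 'smoke', 'row1', 'cf:msg', 'hello'
hbase(main):003:0> scan 'smoke'
hbase(main):004:0> disable 'smoke'
hbase(main):005:0> drop 'smoke'
```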
## Phoenix

```shell
tar -zxf apache-phoenix-4.14.0-cdh5.14.2-bin.tar.gz -C /usr/local
mv /usr/local/apache-phoenix-4.14.0-cdh5.14.2-bin /usr/local/phoenix
cd /usr/local/phoenix
```

Environment variable:

```shell
export PHOENIX_HOME=/usr/local/phoenix
```

Copy the server jar into HBase's lib directory and restart HBase:

```shell
cd $PHOENIX_HOME
cp phoenix-4.14.0-cdh5.14.2-server.jar $HBASE_HOME/lib
$HBASE_HOME/bin/stop-hbase.sh
$HBASE_HOME/bin/start-hbase.sh
```

Connect with sqlline:

```shell
bin/sqlline.py
```

```
0: jdbc:phoenix:> !tables
0: jdbc:phoenix:> !quit
```
### Configure hbase-site.xml

Without this property, creating a secondary index fails. Edit `$HBASE_HOME/conf/hbase-site.xml` and add the following, then restart HBase for it to take effect:

```xml
<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
```
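Once the codec is in place and HBase has been restarted, a secondary index can be exercised end-to-end from sqlline; the table and index names here are illustrative:

```sql
CREATE TABLE IF NOT EXISTS users (id BIGINT NOT NULL PRIMARY KEY, name VARCHAR);
UPSERT INTO users VALUES (1, 'alice');
-- without IndexedWALEditCodec, this CREATE INDEX is the statement that errors out
CREATE INDEX idx_users_name ON users (name);
SELECT * FROM users WHERE name = 'alice';
```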
## Hive

### Install

```shell
tar -zxf hive-1.1.0-cdh5.14.2.tar.gz -C /usr/local
mv /usr/local/hive-1.1.0-cdh5.14.2 /usr/local/hive
cd /usr/local/hive
```

Create `conf/hive-site.xml` with:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:postgresql://localhost:5432/hive_metadata</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.postgresql.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>postgres</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>postgres</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
    <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
  </property>
  <!-- warehouse location on HDFS -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
</configuration>
```

```shell
cp conf/hive-env.sh.template conf/hive-env.sh
chmod +x conf/hive-env.sh
vim conf/hive-env.sh
```

Set HADOOP_HOME (around line 48):

```shell
HADOOP_HOME=/usr/local/hadoop
```
### Install PostgreSQL 11

```shell
yum install https://download.postgresql.org/pub/repos/yum/11/redhat/rhel-7-ppc64le/pgdg-centos11-11-2.noarch.rpm
yum install postgresql11-server
/usr/pgsql-11/bin/postgresql-11-setup initdb
systemctl enable postgresql-11
systemctl start postgresql-11
sudo -i -u postgres psql -d postgres -c "create database hive_metadata"
sudo -i -u postgres psql -d postgres -c "alter user postgres password 'postgres'"
```

Edit `/var/lib/pgsql/11/data/pg_hba.conf` and change the IPv4 local connection method to `md5`:

```
# IPv4 local connections:
host    all             all             127.0.0.1/32            md5
```

```shell
systemctl restart postgresql-11
```
### Initialize the metastore schema

Download the PostgreSQL JDBC driver into Hive's `lib` directory:

```shell
wget http://maven.aliyun.com/nexus/content/groups/public/org/postgresql/postgresql/42.1.4/postgresql-42.1.4.jar
mv postgresql-42.1.4.jar lib
```

Run the initialization:

```shell
bin/schematool -dbType postgres -initSchema
```
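To confirm the schema landed, `schematool` can report the recorded version, or the tables can be inspected directly in PostgreSQL:

```shell
# Prints metastore connection info and schema version on success
bin/schematool -dbType postgres -info
# The metastore tables ("DBS", "TBLS", "VERSION", ...) should be listed
sudo -i -u postgres psql -d hive_metadata -c '\dt'
```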
### Log configuration

```shell
cp conf/hive-log4j.properties.template conf/hive-log4j.properties
```

Adjust the log level and the log file location:

```
hive.root.logger=WARN,DRFA
hive.log.dir=/usr/local/hive/logs/hive
```
### Start

```shell
nohup bin/hive --service metastore &
nohup bin/hive --service hiveserver2 &
```

```shell
➜ hive jps
17858 HRegionServer
24131 RunJar
15509 DataNode
15912 NodeManager
15387 NameNode
15821 ResourceManager
24317 Jps
12414 QuorumPeerMain
17742 HMaster
24222 RunJar
15663 SecondaryNameNode
```

The two `RunJar` processes are the metastore and HiveServer2. Web UI: http://10.0.0.233:10002/hiveserver2.jsp
### Test

```shell
bin/beeline
```

```
Beeline version 1.1.0-cdh5.14.2 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000
scan complete in 2ms
Connecting to jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000:
Enter password for jdbc:hive2://localhost:10000:
Connected to: Apache Hive (version 1.1.0-cdh5.14.2)
Driver: Hive JDBC (version 1.1.0-cdh5.14.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000> show tables;
INFO  : Compiling command(queryId=root_20190627035555_9574997c-321b-4da0-8e76-b5ce8bf11032): show tables
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
INFO  : Completed compiling command(queryId=root_20190627035555_9574997c-321b-4da0-8e76-b5ce8bf11032); Time taken: 0.903 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing command(queryId=root_20190627035555_9574997c-321b-4da0-8e76-b5ce8bf11032): show tables
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing command(queryId=root_20190627035555_9574997c-321b-4da0-8e76-b5ce8bf11032); Time taken: 0.195 seconds
INFO  : OK
+-----------+--+
| tab_name  |
+-----------+--+
+-----------+--+
No rows selected (1.478 seconds)
```
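From the same beeline session, a minimal end-to-end check creates a table, writes a row, and reads it back; the data files land under the `/hive/warehouse` path configured earlier, and the table name is arbitrary:

```sql
CREATE TABLE smoke_test (id INT, msg STRING);
INSERT INTO smoke_test VALUES (1, 'hello');  -- runs as a MapReduce job on YARN
SELECT * FROM smoke_test;
DROP TABLE smoke_test;
```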