1. Deployment overview
- hadoop-2.6.0-cdh5.7.0 for data storage
- hive-1.1.0-cdh5.7.0 for SQL queries
- mysql-5.6.35 for metadata storage

By default Hive stores its metadata in Derby; unfortunately Derby only supports a single user. In production the metastore is therefore backed by a database that supports multiple users, usually with master-slave read/write splitting and backups. The most common choice is MySQL.
2. Hadoop pseudo-distributed deployment
There are two ways to obtain the Hadoop package:
1. Build it from source with Maven (the approach used in this walkthrough)
2. Download the pre-built binary package
3. Planning
System | IP Address | Hostname | Software
---|---|---|---
CentOS-6.8-x86_64 | 192.168.10.10 | hadoop-master | hadoop-2.6.0-cdh5.7.0.tar.gz |
CentOS-6.8-x86_64 | 192.168.10.10 | hadoop-master | apache-hive-1.1.0-cdh5.7.0-bin.tar.gz |
CentOS-6.8-x86_64 | 192.168.10.10 | hadoop-master | mysql-5.6.35-linux-glibc2.5-x86_64.tar.gz |
4. Pre-installation preparation
4.1 Configure the hostname and a static IP address (omitted)
4.2 Configure the Aliyun yum repository (omitted)
4.3 Install Oracle JDK 1.8 (avoid OpenJDK where possible); download link (omitted)
4.4 Set and verify the global Java environment variables (omitted)
4.5 Install, configure, and start MySQL (omitted)
5. Installing the CDH build of Hadoop
5.1 Create the hadoop user and group
[root@hadoop ~]# useradd -u 515 -m hadoop -s /bin/bash
[root@hadoop ~]# vim /etc/sudoers
hadoop ALL=(root) NOPASSWD:ALL
5.2 Extract the compiled Hadoop package
[root@hadoop ~]# tar -xf /tmp/hadoop-2.6.0-cdh5.7.0.tar.gz -C /usr/local/
[root@hadoop ~]# cd /usr/local/
[root@hadoop local]# ln -s hadoop-2.6.0-cdh5.7.0/ hadoop
[root@hadoop local]# chown hadoop.hadoop -R /usr/local/hadoop*
5.3 Set the Hadoop home directory and global environment variables (omitted)
5.4 Configure the HDFS filesystem
#### core-site.xml: the Hadoop core configuration file
[root@hadoop hadoop]# vim etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
#### hdfs-site.xml: the HDFS service configuration
[root@hadoop hadoop]# vim etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
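Note that with only the settings above, HDFS keeps its data under `/tmp/hadoop-${user.name}` (the `hadoop.tmp.dir` default), a location many systems clear on reboot. If the pseudo-distributed data should persist, a property like the following can be added to core-site.xml; the `/data/hadoop/tmp` path is only an example, pick any directory owned by the hadoop user:

```xml
<!-- Optional: move HDFS data off /tmp so it survives reboots.
     /data/hadoop/tmp is an example path, not a required value. -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/tmp</value>
</property>
```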
#### hadoop-env.sh: the Hadoop environment settings
[root@hadoop hadoop]# vim etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk
5.5 Set up passwordless SSH for the hadoop user
[root@hadoop hadoop]# su - hadoop
[hadoop@hadoop ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
41:f4:8c:83:77:7f:1b:05:59:7f:cb:dd:11:7f:0b:e0 hadoop@hadoop
The key's randomart image is:
+--[ RSA 2048]----+
| .o . .+.|
| o +. . ..+|
| . = +E . .*|
| . + . o.O|
| S . o+o|
| . o |
| . |
| |
| |
+-----------------+
[hadoop@hadoop ~]$ cd .ssh/
[hadoop@hadoop .ssh]$ mv id_rsa.pub authorized_keys
[hadoop@hadoop .ssh]$ chmod 600 authorized_keys
#### Verify passwordless login
[hadoop@hadoop .ssh]$ ssh hadoop
RSA key fingerprint is 8b:a9:2e:6a:0c:5a:db:b0:d9:26:65:36:39:fd:8c:6f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop,192.168.10.10' (RSA) to the list of known hosts.
[hadoop@hadoop ~]$ exit
logout
Connection to 10.70.193.215 closed.
[hadoop@hadoop .ssh]$
5.6 Format the HDFS filesystem
[hadoop@hadoop .ssh]$ hdfs namenode -format
17/12/23 14:57:08 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
17/12/23 14:57:10 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1241967460-127.0.0.1-1514012230107
17/12/23 14:57:10 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
17/12/23 14:57:10 INFO namenode.FSImageFormatProtobuf: Saving image file /tmp/hadoop-hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
17/12/23 14:57:10 INFO namenode.FSImageFormatProtobuf: Image file /tmp/hadoop-hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 323 bytes saved in 0 seconds.
17/12/23 14:57:10 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
17/12/23 14:57:10 INFO util.ExitUtil: Exiting with status 0
17/12/23 14:57:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost/127.0.0.1
************************************************************/
5.7 Start HDFS and verify
[hadoop@hadoop ~]$ start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop-2.8.3/logs/hadoop-hadoop-namenode-hadoop.out
localhost: starting datanode, logging to /usr/local/hadoop-2.8.3/logs/hadoop-hadoop-datanode-hadoop.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.8.3/logs/hadoop-hadoop-secondarynamenode-hadoop.out
[hadoop@hadoop ~]$ jps
2615 Jps
1817 NameNode
2153 SecondaryNameNode
1962 DataNode
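With the three daemons up, a quick smoke test confirms that the filesystem actually accepts writes and reads. Run it as the hadoop user; the `/user/hadoop` path and the uploaded file are only examples:

```shell
# Create a home directory on HDFS, upload a file, and read it back
hdfs dfs -mkdir -p /user/hadoop
hdfs dfs -put /etc/hosts /user/hadoop/
hdfs dfs -ls /user/hadoop
hdfs dfs -cat /user/hadoop/hosts
```

The NameNode web UI should also be reachable at http://192.168.10.10:50070 (50070 is the default HTTP port in Hadoop 2.x).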
5.8 Configure MapReduce and YARN
#### mapred-site.xml: the configuration file required for MapReduce jobs
[root@hadoop hadoop]# vim etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
#### yarn-site.xml: the YARN service configuration
[root@hadoop hadoop]# vim etc/hadoop/yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
5.9 Start YARN and verify
[hadoop@hadoop-master ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-resourcemanager-hadoop-master.out
localhost: starting nodemanager, logging to /usr/local/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-hadoop-master.out
[hadoop@hadoop-master ~]$ jps
15141 NodeManager
5592 SecondaryNameNode
15177 Jps
5321 NameNode
15050 ResourceManager
5406 DataNode
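A simple end-to-end check is to run one of the MapReduce examples that ship with Hadoop on YARN. The jar path below assumes the `/usr/local/hadoop` symlink created earlier; the jar filename must match your installed version:

```shell
# Estimate pi with 2 map tasks and 5 samples each -- a small job
# that exercises the ResourceManager, NodeManagers and HDFS together
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar pi 2 5
```

The job should also appear in the ResourceManager web UI at http://192.168.10.10:8088 (the YARN 2.x default port).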
6. Deploying the CDH build of Hive
6.1 Download the matching CDH Hive release
[root@hadoop-master tmp]# wget http://archive.cloudera.com/cdh5/cdh/5/apache-hive-1.1.0-cdh5.7.0-bin.tar.gz
6.2 Extract the package, set global environment variables, and fix permissions
[root@hadoop-master tmp]# tar xf apache-hive-1.1.0-cdh5.7.0-bin.tar.gz -C /usr/local
[root@hadoop-master tmp]# cd /usr/local
#### Create a symlink
[root@hadoop-master local]# ln -s apache-hive-1.1.0-cdh5.7.0-bin/ hive
[root@hadoop-master local]# vim /etc/profile
JAVA_HOME=/usr/java/jdk
MAVEN_HOME=/usr/local/apache-maven
HADOOP_HOME=/usr/local/hadoop
HIVE_HOME=/usr/local/hive
MYSQL_BASE=/usr/local/mysql
PATH=/usr/local/python/bin:/usr/local/python/sbin:$MYSQL_BASE/bin/:$JAVA_HOME/bin:$MAVEN_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$PATH
CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME
export MAVEN_HOME
export HADOOP_HOME
export HIVE_HOME
export MYSQL_BASE
export PATH
export CLASSPATH
[root@hadoop local]# source /etc/profile
#### Verify that the environment variables took effect
[root@hadoop-master local]# which hive
/usr/local/hive/bin/hive
#### Fix ownership
[root@hadoop-master local]# chown hadoop.hadoop -R hive
[root@hadoop-master local]# chown hadoop.hadoop -R apache-hive-1.1.0-cdh5.7.0-bin/
6.3 Modify the Hive configuration files
#### hive-env.sh: Hive environment variables
[root@hadoop-master local]# vim hive/conf/hive-env.sh
HADOOP_HOME=/usr/local/hadoop
#### hive-site.xml: the Hive core configuration file (note: the CDH package does not ship this file, so create it yourself; the Apache Hive distribution includes one)
[root@hadoop-master local]# vim hive/conf/hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/test_basic?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
</configuration>
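Because the JDBC URL above carries `createDatabaseIfNotExist=true`, Hive creates the `test_basic` metastore database in MySQL on first use. Table data itself, by contrast, lives on HDFS under `/user/hive/warehouse` by default. If a different warehouse location is wanted, a property such as the following can be added to the same file; the value shown is simply the default, included here as an illustration:

```xml
<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
</property>
```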
6.4 Download the MySQL JDBC driver and copy it into $HIVE_HOME/lib
[root@hadoop-master tmp]# wget http://download.softagency.net/MySQL/Downloads/Connector-J/mysql-connector-java-5.1.41.zip
[root@hadoop-master tmp]# unzip mysql-connector-java-5.1.41.zip
[root@hadoop-master tmp]# cd mysql-connector-java-5.1.41
[root@hadoop-master mysql-connector-java-5.1.41]# cp -apR mysql-connector-java-5.1.41-bin.jar /usr/local/hive/lib/
[root@hadoop-master mysql-connector-java-5.1.41]# chown hadoop.hadoop -R /usr/local/hive/lib/
6.5 Start Hive and verify
[root@hadoop-master mysql-connector-java-5.1.41]# su - hadoop
[hadoop@hadoop-master ~]$ hive
which: no hbase in (/usr/local/python/bin:/usr/local/python/sbin:/usr/local/mysql/bin/:/usr/java/jdk/bin:/usr/local/apache-maven/bin:/usr/local/hadoop/bin:/usr/local/hadoop/sbin:/usr/local/hive/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hadoop/bin)
Logging initialized using configuration in jar:file:/usr/local/apache-hive-1.1.0-cdh5.7.0-bin/lib/hive-common-1.1.0-cdh5.7.0.jar!/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive> show databases;
OK
default
Time taken: 3.327 seconds, Fetched: 1 row(s)
[hadoop@hadoop-master ~]$ mysql -uroot -p123456 -S /usr/local/mysql/data/mysql.sock
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 37
Server version: 5.6.35-log MySQL Community Server (GPL)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| test_basic |
| test |
+--------------------+
5 rows in set (0.16 sec)
mysql> use test_basic;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+-----------------------------+
| Tables_in_test_basic |
+-----------------------------+
| bucketing_cols |
| cds |
| columns_v2 |
| database_params |
| dbs |
| func_ru |
| funcs |
| global_privs |
| part_col_stats |
| partition_key_vals |
| partition_keys |
| partition_params |
| partitions |
| roles |
| sd_params |
| sds |
| sequence_table |
| serde_params |
| serdes |
| skewed_col_names |
| skewed_col_value_loc_map |
| skewed_string_list |
| skewed_string_list_values |
| skewed_values |
| sort_cols |
| tab_col_stats |
| table_params |
| tbls |
| version |
+-----------------------------+
29 rows in set (0.00 sec)
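These tables make up Hive's metastore schema: databases are recorded in `dbs` and tables in `tbls`. As a sketch of how the metadata maps onto them, after creating a table in Hive you could list every Hive table together with its database using a join like the following (identifier case may differ depending on MySQL's `lower_case_table_names` setting):

```sql
-- List each Hive table with the database it belongs to
SELECT d.NAME AS db_name, t.TBL_NAME, t.TBL_TYPE
FROM tbls t
JOIN dbs d ON t.DB_ID = d.DB_ID;
```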