Setting Up the Hive Data Warehouse

Prerequisites

Before installing Hive, Hadoop must already be set up, and HDFS and YARN must be started successfully (a quick jps check is sketched after the commands below).
hadoop-daemon.sh start namenode 
hadoop-daemon.sh start datanode
hadoop-daemon.sh start secondarynamenode
yarn-daemon.sh start resourcemanager
yarn-daemon.sh start nodemanager
mr-jobhistory-daemon.sh start historyserver
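
A quick way to confirm the daemons are up before moving on (a minimal check; the exact process list depends on your cluster layout):
jps
// expected, roughly: NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, JobHistoryServer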

Hive version used for this installation: apache-hive-1.2.2-bin
  • Before installing Hive, install MySQL first: Hive depends not only on HDFS but also on MySQL (for its metastore).

Installing MySQL

1. Check whether MySQL is already installed
   yum list installed |grep mysql

2. Remove any existing MySQL
   yum -y remove mysql-libs.x86_64
   rpm -qa | grep mysql

3. Install MySQL
   yum install -y mysql-server mysql  mysql-devel

4. Check the MySQL version
   rpm -qi mysql-server

5. Start the MySQL service
   service mysqld start
   service mysqld restart

6. Enable start on boot
   chkconfig --list | grep mysqld
   chkconfig mysqld on

7. Set the MySQL root password
   // root user, password: 123456
   mysqladmin -u root password '123456' 
   
8. Log in
   mysql -u root -p

9. List the databases
  show databases;

10. Create the database that will store Hive's metadata
    create database hive; // keep this name consistent with the database name after the port in javax.jdo.option.ConnectionURL

11. Verify that it was created
    show databases;
   
12. Grant privileges
    // allow user root to connect to MySQL only from host 192.168.47.188; '123456' is root's password
    GRANT ALL PRIVILEGES ON *.* TO 'root'@'192.168.47.188' IDENTIFIED BY '123456' WITH GRANT OPTION;
    // allow the root user to connect to the MySQL server from any host
    GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123456' WITH GRANT OPTION;

13. Flush privileges
    flush privileges;

Note: ERROR 1045 (28000): Access denied for user 'root'@'node01' (using password: YES)
    <!-- Reference: https://blog.csdn.net/chengyuqiang/article/details/61213396 -->
    // query the user table
    select user,host,password from mysql.user;
    // delete the user-table entry with host=node01
    delete from mysql.user where host='node01';
    // delete entries with a null user or an empty password
    delete from mysql.user where user is null;
    delete from mysql.user where password='';
   
    grant all privileges on *.* to root@'%' identified by '123456' with grant option;
    flush privileges;
    
    mysql -h node01 -uroot -p123456 // if you can log in directly, the privilege problem is fixed
   
    Delete the leftover user entries to prevent login failures in a distributed environment:
    delete from user where host='127.0.0.1';

14. MySQL is now installed; make sure the service is running
    service mysqld start

Installing and Configuring Hive

1. Extract the Hive tarball.
    tar -zxvf apache-hive-1.2.2-bin.tar.gz -C /opt/app

2. Configure the Hive environment variables:
   vi /etc/profile
   Add:
       export HIVE_HOME=/opt/app/hive-0.13.1-cdh5.3.6
       export PATH=$HIVE_HOME/bin:$PATH

3. Apply the changes
   source /etc/profile
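   A quick sanity check that the profile change took effect (optional):
   echo $HIVE_HOME
   which hive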

4. Go to ./hive-0.13.1-cdh5.3.6/conf
   cp hive-default.xml.template hive-site.xml
   cp hive-env.sh.template hive-env.sh

5. Edit hive-env.sh
   For example:
       # Set HADOOP_HOME to point to a specific hadoop install directory
        HADOOP_HOME=/opt/app/hadoop-2.5.0
       # Hive Configuration Directory can be controlled by:
        export HIVE_CONF_DIR=/opt/app/hive-0.13.1-cdh5.3.6/conf

6. Edit hive-site.xml (a quick check of the JDBC connection settings is sketched after the property list)
   a. Keep hive.metastore.warehouse.dir at its default value, /user/hive/warehouse
   <property>
   <name>hive.metastore.warehouse.dir</name>
   <value>/user/hive/warehouse</value> <!-- a directory on HDFS, not on the local filesystem -->
   <description>location of default database for the warehouse</description>
   </property>

   b. javax.jdo.option.ConnectionURL -> jdbc:mysql://192.168.47.188:3306/hive?createDatabaseIfNotExist=true
   <property>
   <name>javax.jdo.option.ConnectionURL</name>
   <value>jdbc:mysql://192.168.47.188:3306/hive?createDatabaseIfNotExist=true</value> <!-- 'hive' here is the database created in MySQL earlier -->
   <description>JDBC connect string for a JDBC metastore</description>
   </property>
  
   c. javax.jdo.option.ConnectionDriverName -> com.mysql.jdbc.Driver (the MySQL JDBC driver)
   <property>
   <name>javax.jdo.option.ConnectionDriverName</name>
   <value>com.mysql.jdbc.Driver</value>
   <description>Driver class name for a JDBC metastore</description>
   </property>

   d. javax.jdo.option.ConnectionUserName -> the MySQL username
   <property>
   <name>javax.jdo.option.ConnectionUserName</name>
   <value>root</value>
   <description>username to use against metastore database</description>
   </property>
  
   e. javax.jdo.option.ConnectionPassword -> the MySQL password
   <property>
   <name>javax.jdo.option.ConnectionPassword</name>
   <value>123456</value>
   <description>password to use against metastore database</description>
   </property>

   f. hive.metastore.uris -> the Thrift URI of the metastore
   <property>
   <name>hive.metastore.uris</name>
   <value>thrift://node01:9083</value>
   <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
   </property>
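
   Before moving on, it can be worth confirming that the host, port, database name and credentials in javax.jdo.option.ConnectionURL actually work. A minimal check with the mysql client, using the values from this guide:
   mysql -h 192.168.47.188 -P 3306 -uroot -p123456 -e "show databases like 'hive';"
   // the output should list the hive database created earlier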


7. Create a tmp directory
   cd /opt/app/hive-0.13.1-cdh5.3.6
   mkdir tmp

8. hive-site.xml ->  hive.exec.local.scratchdir -> /opt/app/hive-0.13.1-cdh5.3.6/tmp/${user.name}
  <property>
  <name>hive.exec.local.scratchdir</name>
  <value>/opt/app/hive-0.13.1-cdh5.3.6/tmp/${user.name}</value>
  <description>Local scratch space for Hive jobs</description>
  </property>

9. (Optional) Set hive.metastore.schema.verification to false: if initializing the metastore fails with a schema-verification error, set this property and run the initialization again (snippet below).
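
   If needed, the property goes into hive-site.xml like the others; a minimal sketch (only add it when initialization actually fails on schema verification):
   <property>
   <name>hive.metastore.schema.verification</name>
   <value>false</value>
   <description>Skip metastore schema version verification</description>
   </property>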


10. Set up the HDFS directories
   hdfs dfs -mkdir /tmp     // on HDFS, not local: Hive's default temporary directory
   hdfs dfs -mkdir -p /user/hive/warehouse  // on HDFS, not local: Hive's default warehouse directory
   hadoop fs -chmod g+w   /tmp   // grant group write on the HDFS /tmp directory
   hadoop fs -chmod g+w   /user/hive/warehouse  // grant group write on the warehouse directory

11. Copy the MySQL JDBC driver jar into Hive's lib directory
    cp mysql-connector-java-5.1.45-bin.jar /opt/app/hive-0.13.1-cdh5.3.6/lib/

12. Initialize the Hive metastore schema (a quick verification is sketched below)
    schematool -dbType mysql -initSchema  --verbose
    // If two Hive installations on one machine share the same MySQL, each needs its own hive database (see step 10 of the MySQL installation); otherwise the second Hive will fail at this step.
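
    To confirm that schematool created the metastore schema, list the tables in the hive database (a rough check; exact table names vary slightly between Hive versions):
    mysql -uroot -p123456 -e "use hive; show tables;"
    // expect metastore tables such as DBS, TBLS, SDS and COLUMNS_V2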

13. Check that Hive installed correctly
    Simply type hive on the command line (a short smoke test follows).
    Note: HDFS and YARN must be up and running at this point.
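
    A short smoke test once the CLI starts (test_db and t1 are throwaway names used only for this check):
    hive> show databases;
    hive> create database test_db;
    hive> use test_db;
    hive> create table t1 (id int);
    hive> show tables;
    hive> drop database test_db cascade;
    hive> quit;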


Additional settings:
  <property>
    <name>hive.cli.print.header</name>
    <value>true</value> <!-- print the column names in query output -->
    <description>Whether to print the names of the columns in query output.</description>
  </property>

  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>  <!-- show the current database in the Hive prompt, much like the shell prompt shows the current directory -->
    <description>Whether to include the current database in the Hive prompt.</description>
  </property>

Problems you may run into at startup

Problem 1
[ERROR] Terminal initialization failed; falling back to unsupported
java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected
        at jline.TerminalFactory.create(TerminalFactory.java:101)
        at jline.TerminalFactory.get(TerminalFactory.java:158)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:229)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:221)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:209)
        at org.apache.hadoop.hive.cli.CliDriver.getConsoleReader(CliDriver.java:773)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:715)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
            
An old version of jline exists under the hadoop directory:
/hadoop-2.6.0/share/hadoop/yarn/lib:
-rw-r--r-- 1 root root   87325 Mar 10 18:10 jline-0.9.94.jar
Copy Hive's jline jar into that directory, then back up and remove the jline-0.9.94.jar under $HADOOP_HOME/share/hadoop/yarn/lib/ (commands sketched below).
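
A sketch of the fix above. The jline file name in Hive's lib depends on the release (apache-hive-1.2.2 typically bundles jline-2.12.jar); adjust the paths to your install directories:
// back up and remove the old jline shipped with hadoop
mv $HADOOP_HOME/share/hadoop/yarn/lib/jline-0.9.94.jar /tmp/jline-0.9.94.jar.bak
// copy the newer jline bundled with hive into hadoop's yarn lib
cp $HIVE_HOME/lib/jline-2.12.jar $HADOOP_HOME/share/hadoop/yarn/lib/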
            
Problem 2
Exception in thread "main"java.lang.RuntimeException: java.lang.IllegalArgumentException:java.net.URISyntaxException: Relative path in absolute URI:${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        atorg.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:444)
        atorg.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:672)
        atorg.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
        atsun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        atsun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        atsun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        atjava.lang.reflect.Method.invoke(Method.java:606)
        atorg.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.IllegalArgumentException:java.net.URISyntaxException: Relative path in absolute URI:${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        atorg.apache.hadoop.fs.Path.initialize(Path.java:148)
        atorg.apache.hadoop.fs.Path.<init>(Path.java:126)
        atorg.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:487)
        atorg.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:430)
        ... 7more
Caused by: java.net.URISyntaxException:Relative path in absolute URI:${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        atjava.net.URI.checkPath(URI.java:1804)
        atjava.net.URI.<init>(URI.java:752)
        atorg.apache.hadoop.fs.Path.initialize(Path.java:145)
        ... 10more
            
1. In hive-site.xml, several properties have values containing "system:java.io.tmpdir".
2. Create a local directory, e.g. /home/grid/hive-0.14.0-bin/iotmp.
3. Change every property whose value contains "system:java.io.tmpdir" to point to that directory (example below).
Reference: https://blog.csdn.net/zwx19921215/article/details/42776589
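
For example, each property whose default value contains system:java.io.tmpdir (hive.exec.local.scratchdir, hive.downloaded.resources.dir, and so on) gets a concrete local path; a sketch using the directory from step 2:
<property>
<name>hive.downloaded.resources.dir</name>
<value>/home/grid/hive-0.14.0-bin/iotmp</value>
<description>Local directory for resources added during the session</description>
</property>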
            
 
Problem 3
How to fix the error "Name node is in safe mode"  ->   bin/hadoop dfsadmin -safemode leave
            
      
Problem 4
Exception in thread "main" java.lang.RuntimeException: java.net.ConnectException: Call From node01/192.168.47.188 to node01:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.net.ConnectException: Call From node01/192.168.47.188 to node01:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
	at org.apache.hadoop.ipc.Client.call(Client.java:1415)
	at org.apache.hadoop.ipc.Client.call(Client.java:1364)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
	at com.sun.proxy.$Proxy19.getFileInfo(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at com.sun.proxy.$Proxy19.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:707)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1785)
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1068)
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1064)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1064)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1398)
	at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:596)
	at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
	... 7 more
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:606)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:700)
	at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1463)
	at org.apache.hadoop.ipc.Client.call(Client.java:1382)
	... 27 more

Run jps to check whether the NameNode has gone down. After a restart the NameNode may report "Name node is in safe mode" -> hadoop dfsadmin -safemode leave (a recovery sketch follows).
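
A rough recovery sequence, reusing commands from the prerequisite section:
jps                                 // check whether NameNode is in the process list
hadoop-daemon.sh start namenode     // restart it if it is missing
hadoop dfsadmin -safemode leave     // only if it then complains about safe mode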


Problem 5
Logging initialized using configuration in jar:file:/opt/apps/hive122/lib/hive-common-1.2.2.jar!/hive-log4j.properties
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
	at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
	... 8 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
	... 14 more
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 (Connection refused)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:187)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:421)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
	at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
Caused by: java.net.ConnectException: 拒绝连接 (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
	... 22 more
)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:468)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
	... 19 more

In this case the Hive services are probably not running; they need to be started each time (a connectivity check is sketched after the commands):
./bin/hive --service metastore &
./bin/hive --service hiveserver2 -p 10012 &   // 10012 is the port DBeaver is set to connect to here
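
Once both services are up, a quick way to confirm that the metastore and HiveServer2 accept connections (the URL assumes the node01 host and the 10012 port chosen above):
netstat -nltp | grep -E '9083|10012'     // both ports should be listening
beeline -u jdbc:hive2://node01:10012 -n root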

 
