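1. Concepts
Hive is a data warehouse tool built on Hadoop. It maps structured data files to database tables and provides SQL-like querying.
In essence, it translates SQL into MapReduce programs.
- Hive metadata
- Hive stores its metadata in a relational database (the metastore); MySQL, Derby, Oracle and other databases are supported.
- The metadata includes table names, each table's columns and partitions with their properties, table properties (such as whether the table is external), and the directory that holds the table's data.
- HQL execution process
The interpreter, compiler, and optimizer take an HQL statement through lexical analysis, syntax analysis, compilation, and optimization, and generate a query plan (PLAN). The generated plan is stored in HDFS and is subsequently invoked and executed by MapReduce.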
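The plan produced by the steps above can be inspected by prefixing a statement with EXPLAIN. The following is a minimal sketch, assuming the hive CLI is on PATH; the helper name show_plan and the table demo_table are hypothetical and used only for illustration.

```shell
#!/usr/bin/env bash
# Print the query plan Hive generates for a statement instead of running it.
# "demo_table" below is a hypothetical table name, not one from these notes.
show_plan() {
  local query="$1"
  if command -v hive >/dev/null 2>&1; then
    # hive -e runs the quoted HiveQL string and exits.
    hive -e "EXPLAIN ${query}"
  else
    echo "hive CLI not found on PATH; would run: EXPLAIN ${query}"
  fi
}

show_plan "SELECT count(*) FROM demo_table"
```

The output lists the stages of the plan, which makes it easier to see how a query is broken into MapReduce work.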
- Hive architecture
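2. Installation and configuration
- Hive can be installed in three modes: embedded, local, and remote
- Embedded mode: Hive stores its metadata in a Derby database. Only one connection can be opened, so only one person can use Hive at a time.
- Local mode: Hive stores its metadata in a MySQL database that runs on the same physical machine as Hive. Multiple connections are supported; this mode is mostly used for development and testing.
- Remote mode: Hive stores its metadata in a MySQL database that runs on a different physical machine from Hive. This mode is mostly used in production.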
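- Preparation
Get the apache-hive-1.2.1-bin.tar.gz package. Download address: http://apache.fayea.com/hive/hive-1.2.1/
Upload the Hive package to the /root directory on the Linux machine
- Installation
Extract the package: tar zxvf apache-hive-1.2.1-bin.tar.gz
Create a symbolic link: ln -sf /root/apache-hive-1.2.1-bin /home/hive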
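The extract and link steps can be wrapped in one small function so they are repeatable. This is only a sketch: the function name install_hive is an assumption, and it relies on the archive unpacking into a directory named after the tarball without its .tar.gz suffix, which holds for the package named in the preparation step.

```shell
#!/usr/bin/env bash
# Unpack a Hive binary tarball and point a stable symlink at the result.
# Arguments: tarball path, directory to unpack into, symlink path.
install_hive() {
  local tarball="$1" dest_dir="$2" link_path="$3"
  tar -zxf "$tarball" -C "$dest_dir" || return 1
  # Assumes the top-level directory is the tarball name minus ".tar.gz".
  local unpacked="$dest_dir/$(basename "$tarball" .tar.gz)"
  # -n replaces an existing link instead of creating a link inside its target.
  ln -sfn "$unpacked" "$link_path"
}

# Paths from the steps above; runs only if the tarball has been uploaded.
if [ -f /root/apache-hive-1.2.1-bin.tar.gz ]; then
  install_hive /root/apache-hive-1.2.1-bin.tar.gz /root /home/hive
fi
```

Linking /home/hive to the versioned directory means a later upgrade only needs the link repointed, not every path that refers to Hive.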
- Configuration
cd /home/hive/conf
cp hive-default.xml.template hive-site.xml
Configure the environment variables
vim /etc/profile
- export HIVE_INSTALL=/home/hive
- export PATH=$PATH:$HIVE_INSTALL/bin
source /etc/profile
- Start hive. Success!
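Before launching Hive it can help to confirm that the environment variables took effect. The sketch below repeats the two exports from /etc/profile and checks them; the helper name path_contains is hypothetical.

```shell
#!/usr/bin/env bash
# The same two exports that were added to /etc/profile.
export HIVE_INSTALL=/home/hive
export PATH=$PATH:$HIVE_INSTALL/bin

# Succeed if the given directory is one of the entries in PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if path_contains "$HIVE_INSTALL/bin"; then
  echo "PATH includes $HIVE_INSTALL/bin"
fi
# Print the resolved launcher, or a hint when the install is not in place.
command -v hive || echo "hive not found; check that $HIVE_INSTALL/bin/hive exists"
```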
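3. Installing the relational database MySQL
- Looking at hive/conf/hive-site.xml shows that Hive's built-in relational database is Derby. Derby is not stable enough for this role, which is why MySQL is installed to hold the metadata instead.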
- <property>
- <name>javax.jdo.option.ConnectionURL</name>
- <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
- <description>JDBC connect string for a JDBC metastore</description>
- </property>
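For comparison, a MySQL-backed metastore replaces the Derby connection string with MySQL connection settings. The sketch below prints the four javax.jdo.option properties that would go into hive-site.xml. The function name, host, database name, user, and password are all placeholders, and the driver class shown is the one used by the MySQL Connector/J 5.x series, so it may differ for other connector versions.

```shell
#!/usr/bin/env bash
# Emit metastore connection properties for a MySQL-backed metastore.
# Host, database, user and password are placeholders, not values from these notes.
mysql_metastore_props() {
  local host="$1" db="$2" user="$3" password="$4"
  cat <<EOF
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://${host}:3306/${db}?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>${user}</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>${password}</value>
</property>
EOF
}

mysql_metastore_props localhost hive hiveuser hivepassword
```

These values would replace the Derby entries of the same names in hive-site.xml. The MySQL JDBC driver jar also has to be available in Hive's lib directory for the connection to work.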
4. Common exceptions
- Exception 1
- [root@node1 bin]# hive
-
- Logging initialized using configuration in jar:file:/root/apache-hive-1.2.1-bin/lib/hive-common-1.2.1.jar!/hive-log4j.properties
- Exception in thread "main" java.lang.RuntimeException: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
- at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
- at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
- at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
- at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
- at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
- at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
- at java.lang.reflect.Method.invoke(Method.java:606)
- at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
- Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
- at org.apache.hadoop.fs.Path.initialize(Path.java:206)
- at org.apache.hadoop.fs.Path.<init>(Path.java:172)
- at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:563)
- at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
- ... 7 more
- Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
- at java.net.URI.checkPath(URI.java:1804)
- at java.net.URI.<init>(URI.java:752)
- at org.apache.hadoop.fs.Path.initialize(Path.java:203)
- ... 10 more
Solution:
vim conf/hive-site.xml and add the following two properties
- <property>
- <name>system:java.io.tmpdir</name>
- <value>/opt/hive/tmpdir</value>
- </property>
-
- <property>
- <name>system:user.name</name>
- <value>username</value>
- </property>
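The directory named in system:java.io.tmpdir has to exist and be writable, and it can be useful to see which entries in hive-site.xml rely on these variables. A small sketch follows, assuming the /opt/hive/tmpdir value from the configuration above and the /home/hive symlink from the installation step; both helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Create the scratch directory and confirm the current user can write to it.
ensure_tmpdir() {
  local dir="$1"
  mkdir -p "$dir" && [ -w "$dir" ]
}

# Show the lines of a config file that reference ${system:...} variables.
find_placeholders() {
  grep -nF '${system:' "$1"
}

ensure_tmpdir /opt/hive/tmpdir || echo "could not create /opt/hive/tmpdir; check permissions"

if [ -f /home/hive/conf/hive-site.xml ]; then
  find_placeholders /home/hive/conf/hive-site.xml
fi
```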
- Exception 2
- [root@node1 lib]# hive
-
- Logging initialized using configuration in jar:file:/root/apache-hive-1.2.1-bin/lib/hive-common-1.2.1.jar!/hive-log4j.properties
- [ERROR] Terminal initialization failed; falling back to unsupported
- java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected
- at jline.TerminalFactory.create(TerminalFactory.java:101)
- at jline.TerminalFactory.get(TerminalFactory.java:158)
- at jline.console.ConsoleReader.<init>(ConsoleReader.java:229)
- at jline.console.ConsoleReader.<init>(ConsoleReader.java:221)
- at jline.console.ConsoleReader.<init>(ConsoleReader.java:209)
- at org.apache.hadoop.hive.cli.CliDriver.setupConsoleReader(CliDriver.java:787)
- at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:721)
- at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
- at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
- at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
- at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
- at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
- at java.lang.reflect.Method.invoke(Method.java:606)
- at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
-
- Exception in thread "main" java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected
- at jline.console.ConsoleReader.<init>(ConsoleReader.java:230)
- at jline.console.ConsoleReader.<init>(ConsoleReader.java:221)
- at jline.console.ConsoleReader.<init>(ConsoleReader.java:209)
- at org.apache.hadoop.hive.cli.CliDriver.setupConsoleReader(CliDriver.java:787)
- at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:721)
- at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
- at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
- at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
- at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
- at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
- at java.lang.reflect.Method.invoke(Method.java:606)
- at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
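This error is commonly reported when an older jline jar bundled with Hadoop is loaded ahead of the newer one that Hive ships. A first diagnostic step is to list the jline jars on both sides. The sketch below assumes a HADOOP_HOME variable and uses /usr/local/hadoop purely as a placeholder default; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# List every jline jar below the given directories so version clashes are visible.
list_jline_jars() {
  local dir
  for dir in "$@"; do
    [ -d "$dir" ] && find "$dir" -name 'jline-*.jar' -print
  done
  return 0
}

# HADOOP_HOME is assumed; /home/hive is the symlink created during installation.
list_jline_jars "${HADOOP_HOME:-/usr/local/hadoop}" /home/hive/lib
```

If two different jline versions show up, one widely used workaround is to run export HADOOP_USER_CLASSPATH_FIRST=true before starting hive, so that Hive's jars take precedence. Treat this as a suggestion to verify on your own setup rather than as the fix recorded in these notes.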