Hive's HiveServer2 Server and the Beeline Client
Starting the HiveServer2 server:
[hadoop@hadoop apache-hive-0.13.1-bin]$ bin/hiveserver2
HiveServer2 listens on port 10000 by default.
Starting the Beeline client:
[hadoop@hadoop apache-hive-0.13.1-bin]$ bin/beeline
Type help at the Beeline prompt to see the available commands:
0: jdbc:hive2://hadoop.madmant.com:10000> help
Connect to a HiveServer2 instance with:
!connect jdbc:hive2://hadoop.madmant.com:10000
The connection details can also be passed directly on the bin/beeline command line, e.g.:
bin/beeline -u jdbc:hive2://hadoop.madmant.com:10000 -n hadoop -p aaa111
Once connected through Beeline you can run HiveQL just as in the plain hive CLI, and the results come back as a neatly formatted table:
0: jdbc:hive2://hadoop.madmant.com:10000> select * from dept;
+--------------+-------------+-----------+
| dept.deptno | dept.dname | dept.loc |
+--------------+-------------+-----------+
| 10 | ACCOUNTING | NEW YORK |
| 20 | RESEARCH | DALLAS |
| 30 | SALES | CHICAGO |
| 40 | OPERATIONS | BOSTON |
+--------------+-------------+-----------+
4 rows selected (1.198 seconds)
Hive's fetch task setting
The hive.fetch.task.conversion property accepts two values, minimal and more (see the description below). It controls which simple queries Hive serves with a direct fetch and which ones must run as MapReduce jobs.
<property>
<name>hive.fetch.task.conversion</name>
<value>minimal</value>
<description>
Some select queries can be converted to single FETCH task minimizing latency.
Currently the query should be single sourced not having any subquery and should not have
any aggregations or distincts (which incurs RS), lateral views and joins.
1. minimal : SELECT STAR, FILTER on partition columns, LIMIT only
2. more : SELECT, FILTER, LIMIT only (TABLESAMPLE, virtual columns)
</description>
</property>
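Besides editing hive-site.xml, the property can be overridden per session from the Beeline prompt with a standard set statement (the query shown here is only an illustration):

```sql
-- Switch the current session to the more aggressive fetch mode.
set hive.fetch.task.conversion=more;
-- A plain SELECT with a filter can now be answered as a single
-- fetch task, without launching a MapReduce job.
select deptno, dname from dept where deptno = 10;
```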
虚拟列
One is INPUT__FILE__NAME, which is the input file's name for a mapper task. The other is BLOCK__OFFSET__INSIDE__FILE, which is the current global file position.
The two virtual columns:
- 【INPUT__FILE__NAME】
- the file a given row was read from
- 【BLOCK__OFFSET__INSIDE__FILE】
- the byte offset at which the row starts within that file
【INPUT__FILE__NAME】identifies the file each row was read from:
select deptno,dname,INPUT__FILE__NAME from dept;
【BLOCK__OFFSET__INSIDE__FILE】gives each row's byte offset within its file:
select deptno,dname,BLOCK__OFFSET__INSIDE__FILE from dept;
Sample output:
0: jdbc:hive2://hadoop.madmant.com:10000> select deptno,dname,INPUT__FILE__NAME from dept;
+---------+-------------+-------------------------------------------------------------------+
| deptno | dname | input__file__name |
+---------+-------------+-------------------------------------------------------------------+
| 10 | ACCOUNTING | hdfs://hadoop.madmant.com:8020/user/hive/warehouse/dept/dept.txt |
| 20 | RESEARCH | hdfs://hadoop.madmant.com:8020/user/hive/warehouse/dept/dept.txt |
| 30 | SALES | hdfs://hadoop.madmant.com:8020/user/hive/warehouse/dept/dept.txt |
| 40 | OPERATIONS | hdfs://hadoop.madmant.com:8020/user/hive/warehouse/dept/dept.txt |
+---------+-------------+-------------------------------------------------------------------+
4 rows selected (24.772 seconds)
0: jdbc:hive2://hadoop.madmant.com:10000> select deptno,dname,BLOCK__OFFSET__INSIDE__FILE from dept;
+---------+-------------+------------------------------+
| deptno | dname | block__offset__inside__file |
+---------+-------------+------------------------------+
| 10 | ACCOUNTING | 0 |
| 20 | RESEARCH | 23 |
| 30 | SALES | 42 |
| 40 | OPERATIONS | 59 |
+---------+-------------+------------------------------+
4 rows selected (21.119 seconds)
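For an uncompressed text table, BLOCK__OFFSET__INSIDE__FILE is simply the byte offset at which each row begins in the file. The offsets 0, 23, 42, 59 above can be reproduced with a short Python sketch; the tab-delimited, newline-terminated layout of dept.txt is an assumption inferred from those offsets:

```python
# Sketch: reproduce BLOCK__OFFSET__INSIDE__FILE for a plain-text table.
# Assumption: dept.txt stores one row per line, fields separated by tabs,
# each line ending in "\n" (this matches the offsets Hive reported above).
rows = [
    ("10", "ACCOUNTING", "NEW YORK"),
    ("20", "RESEARCH", "DALLAS"),
    ("30", "SALES", "CHICAGO"),
    ("40", "OPERATIONS", "BOSTON"),
]

offset = 0
offsets = []
for deptno, dname, loc in rows:
    offsets.append(offset)                       # offset where this row starts
    line = "\t".join((deptno, dname, loc)) + "\n"
    offset += len(line.encode("utf-8"))          # offsets are counted in bytes

print(offsets)  # → [0, 23, 42, 59]
```

For example, the first line "10\tACCOUNTING\tNEW YORK\n" is 23 bytes long, so the second row starts at offset 23, matching the query result.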
0: jdbc:hive2://hadoop.madmant.com:10000>