Hive - CREATE INDEX fails, cause not yet known

puffsun

Categories: Hadoop, Hive. Tags: big data, database, java

Article link: https://blog.csdn.net/puffsun/article/details/84467808

Environment: Cloudera Hive 0.10-CDH4

The Hive installation on my machine contains the following table:

hive (human_resources)> describe formatted employees;

col_name data_type comment

# col_name data_type comment

name string None

salary float None

subordinates array<string> None

deductions map<string,float> None

address struct<country:string,city:string,zip:int> None

# Partition Information

# col_name data_type comment

country string None

state string None

# Detailed Table Information

Database: human_resources

Owner: root

CreateTime: Mon Jul 22 23:05:47 CST 2013

LastAccessTime: UNKNOWN

Protect Mode: None

Retention: 0

Location: hdfs://n8.example.com:8020/user/hive/warehouse/human_resources.db/employees

Table Type: MANAGED_TABLE

Table Parameters:

numFiles 1

numPartitions 1

numRows 0

rawDataSize 0

totalSize 784

transient_lastDdlTime 1375942564

# Storage Information

SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

InputFormat: org.apache.hadoop.mapred.TextInputFormat

OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

Compressed: No

Num Buckets: -1

Bucket Columns: []

Sort Columns: []

Storage Desc Params:

serialization.format 1

Time taken: 0.132 seconds

The employees table holds the following data (Hive automatically turns a plain select * into a direct filesystem read, so no MapReduce job runs here):

hive (human_resources)> select * from employees;

name salary subordinates deductions address country state

John Doe 100000.0 ["Mary Smith","Todd Jones"] {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1} {"country":"1 Michigan Ave.","city":"Chicago","zip":null} US CA

Mary Smith 80000.0 ["Bill King"] {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1} {"country":"100 Ontario St.","city":"Chicago","zip":null} US CA

Todd Jones 70000.0 [] {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} {"country":"200 Chicago Ave.","city":"Oak Park","zip":null} US CA

Bill King 60000.0 [] {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} {"country":"300 Obscure Dr.","city":"Obscuria","zip":null} US CA

Boss Man 200000.0 ["John Doe","Fred Finance"] {"Federal Taxes":0.3,"State Taxes":0.07,"Insurance":0.05} {"country":"1 Pretentious Drive.","city":"Chicago","zip":null} US CA

Fred Finance 150000.0 ["Stacy Accountant"] {"Federal Taxes":0.3,"State Taxes":0.07,"Insurance":0.05} {"country":"2 Pretentious Drive.","city":"Chicago","zip":null} US CA

Stacy Accountant 60000.0 [] {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} {"country":"300 Main St.","city":"Naperville","zip":null} US CA

Time taken: 0.164 seconds
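As a rough illustration of the remark about select *: on Hive 0.10, my understanding is that this fetch-only shortcut is governed by the hive.fetch.task.conversion setting, whose default of minimal covers SELECT *, filters on partition columns, and LIMIT. The queries below are a sketch under that assumption, not output captured from this cluster.

```sql
-- Expected to be served by a direct file read (no MapReduce job) under the
-- assumed default hive.fetch.task.conversion=minimal:
SELECT * FROM employees;
SELECT * FROM employees WHERE country = 'US' AND state = 'CA' LIMIT 3;

-- Expected to launch a MapReduce job under the same setting, because it
-- projects specific columns instead of returning whole rows:
SELECT name, salary FROM employees;
```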

Now I try to create an index on the employees table with the statement below. It fails with the following message:

hive (human_resources)> CREATE INDEX employees_index

> ON TABLE employees (country, name)

> AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD

> IDXPROPERTIES ('creator' = 'me', 'created_at' = 'some_time')

> IN TABLE employees_index_table

> PARTITIONED BY (country)

> COMMENT 'Employees indexed by country and name.';

FAILED: ParseException line 6:0 missing EOF at 'PARTITIONED' near 'employees_index_table'
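For comparison, here is the CREATE INDEX syntax as I understand it from the Hive DDL language manual. It lists no PARTITIONED BY clause, which would be consistent with the parser reporting a missing EOF at that token. The outline is given as comments, followed by the same statement reshaped to fit it; the index and table names are the ones used above, and this sketch makes no claim that the reshaped statement succeeds on this installation.

```sql
-- Documented clause order (optional parts in brackets):
--   CREATE INDEX index_name
--     ON TABLE base_table_name (col_name, ...)
--     AS index_type
--     [WITH DEFERRED REBUILD]
--     [IDXPROPERTIES (property_name=property_value, ...)]
--     [IN TABLE index_table_name]
--     [[ROW FORMAT ...] STORED AS ... | STORED BY ...]
--     [LOCATION hdfs_path]
--     [TBLPROPERTIES (...)]
--     [COMMENT "index comment"];

-- The statement above with the PARTITIONED BY clause dropped:
CREATE INDEX employees_index
ON TABLE employees (country, name)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD
IDXPROPERTIES ('creator' = 'me', 'created_at' = 'some_time')
IN TABLE employees_index_table
COMMENT 'Employees indexed by country and name.';
```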

If I remove the PARTITIONED BY clause, the following error appears instead: