<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.232.8:3306/hadoop?createDatabaseIfNotExist=true&amp;autoReconnect=true&amp;useSSL=false</value>
</property>
VERSION
Based on the above configuration, you can inspect the metadata. [This table must contain exactly one row, otherwise Hive will not start.]
mysql> select * from VERSION;
+--------+----------------+---------------------------------------+
| VER_ID | SCHEMA_VERSION | VERSION_COMMENT                       |
+--------+----------------+---------------------------------------+
|      1 | 1.1.0          | Set by MetaStore hadoop@192.168.232.8 |
+--------+----------------+---------------------------------------+
1 row in set (0.00 sec)
DBS
Database information
mysql> select * from DBS;
+-------+-----------------------+----------------------------------------+---------+------------+------------+
| DB_ID | DESC                  | DB_LOCATION_URI                        | NAME    | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+----------------------------------------+---------+------------+------------+
|     1 | Default Hive database | hdfs://hadoop:8020/user/hive/warehouse | default | public     | ROLE       |
+-------+-----------------------+----------------------------------------+---------+------------+------------+
1 row in set (0.00 sec)
mysql> select * from DBS \G
*************************** 1. row ***************************
DB_ID: 1
DESC: Default Hive database
DB_LOCATION_URI: hdfs://hadoop:8020/user/hive/warehouse [database location path]
NAME: default [database name]
OWNER_NAME: public
OWNER_TYPE: ROLE
1 row in set (0.00 sec)
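To look up a single database's warehouse path instead of dumping the whole table, filter DBS by name (the name 'default' here is taken from the output above):

```sql
-- Look up the HDFS location of one database by name
SELECT NAME, DB_LOCATION_URI
FROM DBS
WHERE NAME = 'default';
```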
TBLS
Table information
mysql> select * from TBLS \G
*************************** 1. row ***************************
TBL_ID: 11
CREATE_TIME: 1555234071 [table creation time, as a Unix timestamp]
DB_ID: 1 [DB_ID of the database in DBS that this table belongs to]
LAST_ACCESS_TIME: 0
OWNER: hadoop
RETENTION: 0
SD_ID: 11 [corresponds to SD_ID in the SDS table below]
TBL_NAME: hive_wordcount [the table name we created]
TBL_TYPE: MANAGED_TABLE [table type: managed (internal) or external]
VIEW_EXPANDED_TEXT: NULL
VIEW_ORIGINAL_TEXT: NULL
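Since TBLS carries the owning database only as a numeric DB_ID, a join back to DBS shows each table together with its database name; a small sketch of that join:

```sql
-- List every table together with the database it belongs to,
-- joining TBLS.DB_ID to DBS.DB_ID
SELECT d.NAME AS db_name, t.TBL_NAME, t.TBL_TYPE
FROM TBLS t
JOIN DBS d ON t.DB_ID = d.DB_ID;
```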
SDS
Table storage information
mysql> select * from SDS \G
*************************** 13. row ***************************
SD_ID: 45
CD_ID: 40 [corresponds to CD_ID in the CDS table]
INPUT_FORMAT: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat [the input storage format we set]
IS_COMPRESSED:
IS_STOREDASSUBDIRECTORIES:
LOCATION: hdfs://hadoop:8020/user/hive/warehouse/page_views_parquet_gzip
NUM_BUCKETS: -1
OUTPUT_FORMAT: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat [the output storage format we set]
SERDE_ID: 45
13 rows in set (0.01 sec)
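TBLS and SDS connect through SD_ID, so one join is enough to see each table's storage location and input/output formats side by side; a sketch:

```sql
-- Show each table's storage location and I/O formats,
-- joining TBLS.SD_ID to SDS.SD_ID
SELECT t.TBL_NAME, s.LOCATION, s.INPUT_FORMAT, s.OUTPUT_FORMAT
FROM TBLS t
JOIN SDS s ON t.SD_ID = s.SD_ID;
```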
CDS
COLUMNS_V2
Column information
mysql> select * from COLUMNS_V2 \G
*************************** 61. row ***************************
CD_ID: 40 [corresponds to CD_ID in the CDS table]
COMMENT: NULL
COLUMN_NAME: url [column name]
TYPE_NAME: string
INTEGER_IDX: 1 [column index]
61 rows in set (0.00 sec)
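Column rows attach to a table only indirectly, through TBLS → SDS (SD_ID) → COLUMNS_V2 (CD_ID). A sketch of walking that chain for one table (the table name 'hive_wordcount' is the example from the TBLS output above):

```sql
-- List the columns of one table in declaration order,
-- chaining TBLS -> SDS -> COLUMNS_V2 via SD_ID and CD_ID
SELECT c.COLUMN_NAME, c.TYPE_NAME, c.INTEGER_IDX
FROM TBLS t
JOIN SDS s ON t.SD_ID = s.SD_ID
JOIN COLUMNS_V2 c ON s.CD_ID = c.CD_ID
WHERE t.TBL_NAME = 'hive_wordcount'
ORDER BY c.INTEGER_IDX;
```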
PARTITIONS
Partition information table
mysql> select * from PARTITIONS \G
*************************** 1. row ***************************
PART_ID: 2
CREATE_TIME: 1555406840
LAST_ACCESS_TIME: 0
PART_NAME: day=20190416 [partition name]
SD_ID: 26
TBL_ID: 24
PARTITION_KEYS
Partition key information table
mysql> select * from PARTITION_KEYS \G
*************************** 1. row ***************************
TBL_ID: 24
PKEY_COMMENT: NULL
PKEY_NAME: day
PKEY_TYPE: string
INTEGER_IDX: 0
PARTITION_KEY_VALS
Partition key value information table
mysql> select * from PARTITION_KEY_VALS \G
*************************** 1. row ***************************
PART_ID: 2
PART_KEY_VAL: 20190416
INTEGER_IDX: 0
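The three partition tables fit together as: PARTITION_KEYS holds the key names per table, PARTITION_KEY_VALS the values per partition, and matching INTEGER_IDX pairs them up. A sketch that reconstructs each partition spec (TBL_ID 24 is the example ID from the output above):

```sql
-- Rebuild each partition's key=value pairs for one table,
-- joining PARTITIONS, PARTITION_KEYS and PARTITION_KEY_VALS
SELECT p.PART_NAME, k.PKEY_NAME, v.PART_KEY_VAL
FROM PARTITIONS p
JOIN PARTITION_KEY_VALS v ON p.PART_ID = v.PART_ID
JOIN PARTITION_KEYS k
  ON p.TBL_ID = k.TBL_ID AND k.INTEGER_IDX = v.INTEGER_IDX
WHERE p.TBL_ID = 24;
```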
Practical scenario
When the HDFS version Hadoop provides conflicts with the HDFS version Spark depends on, dropping a table in Hive can hit metadata-recognition problems. The fix in that case is to update the Hive table's metadata by deleting the stale entries manually from MySQL.
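A rough sketch of that manual cleanup, under heavy caveats: back up the metastore database first, and note that the IDs below (TBL_ID 24, SD_ID 26, CD_ID 40) are examples taken from the outputs above, not fixed values. Child rows must go before the rows they reference, because TBLS points at SDS and SDS points at CDS; partitioned tables additionally have their own SDS rows per partition, which this sketch does not cover.

```sql
-- DANGEROUS: back up the metastore first. Example IDs only.
-- Delete partition metadata before the table row that owns it.
DELETE v FROM PARTITION_KEY_VALS v
  JOIN PARTITIONS p ON v.PART_ID = p.PART_ID
  WHERE p.TBL_ID = 24;
DELETE FROM PARTITIONS WHERE TBL_ID = 24;
DELETE FROM PARTITION_KEYS WHERE TBL_ID = 24;
-- Then the table row, then its storage descriptor, then the column set.
DELETE FROM TBLS WHERE TBL_ID = 24;
DELETE FROM SDS WHERE SD_ID = 26;
DELETE FROM COLUMNS_V2 WHERE CD_ID = 40;
DELETE FROM CDS WHERE CD_ID = 40;
```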