HIVE--DDL

                                           HIVE--DDL

 

一.DDL概念介绍

DDL:数据库模式定义语言DDL(Data Definition Language),是用于描述数据库中要存储的现实世界实体的语言。

我个人理解DDL就是对表的Metadata(元数据)操作的语言

HIVE官方的是DDL地址:https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL

 

二.DDL--Database

Create Database:

   CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
  [COMMENT database_comment]
  [LOCATION hdfs_path]
  [WITH DBPROPERTIES (property_name=property_value, ...)];

IF NOT EXISTS:是否存在就不需要建,如果不存在就建数据库

COMMENT database_comment:数据库的备注信息

LOCATION hdfs_path:可以指定数据存储地址,一般数据库默认存放在hdfs:/user/hive/warehouse

WITH DBPROPERTIES (property_name=property_value, ...):可以加一些默认的参数

eg:

hive (default)> CREATE DATABASE IF NOT EXISTS xkwarehouse
              > COMMENT 'xkwarehouse' 
              > LOCATION '/user/hive/xkwarehouse';
OK
Time taken: 0.022 seconds
hive (default)> show databases;
OK
database_name
default
xkdata
xkhive
xkwarehouse
Time taken: 0.025 seconds, Fetched: 4 row(s)
hive (default)> desc database xkwarehouse;
OK
db_name comment location        owner_name      owner_type      parameters
xkwarehouse     xkwarehouse     hdfs://xkhadoop:9000/user/hive/xkwarehouse      hadoop  USER
Time taken: 0.029 seconds, Fetched: 1 row(s)

因为刚创建的数据库没有数据,所以需要把先创建一个table,然后查看下HDFS:/user/hive/xkwarehouse这个路径下是否有数据产生

hive (xkwarehouse)> create table stu (id int ,age int);
OK
Time taken: 0.096 seconds
hive (xkwarehouse)> show tables;
OK
tab_name
stu
Time taken: 0.029 seconds, Fetched: 1 row(s)

 

 

Drop Database

DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];

CASCADE:级联删除,如果删除一个库的时候,数据库里面有表,则会报错叫你先删除表再删除库,加上这个参数,则不管是否有表存在,表+库一起删除(慎用

eg:

hive (xkwarehouse)> show databases;
OK
database_name
default
xkdata
xkhive
xkwarehouse
Time taken: 0.014 seconds, Fetched: 4 row(s)
hive (xkwarehouse)> DROP DATABASE IF EXISTS xkdata;
OK
Time taken: 0.064 seconds
hive (xkwarehouse)> show databases;
OK
database_name
default
xkhive
xkwarehouse
Time taken: 0.018 seconds, Fetched: 3 row(s)

Alter Database

ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...);

eg:

hive (xkwarehouse)> ALTER DATABASE xkwarehouse SET DBPROPERTIES ('creator'='xk');
OK
Time taken: 0.053 seconds
hive (xkwarehouse)> desc database xkwarehouse;
OK
db_name comment location        owner_name      owner_type      parameters
xkwarehouse     xkwarehouse     hdfs://xkhadoop:9000/user/hive/xkwarehouse      hadoop  USER
Time taken: 0.017 seconds, Fetched: 1 row(s)

三.DDL--Create Table

查看表详细数据

hive (default)> desc formatted stu;
OK
col_name        data_type       comment
# col_name              data_type               comment             
                 
id                      int                                         
age                     int                                         
                 
# Detailed Table Information             
Database:               default                  
OwnerType:              USER                     
Owner:                  hadoop                   
CreateTime:             Mon Dec 16 22:20:46 CST 2019     
LastAccessTime:         UNKNOWN                  
Protect Mode:           None                     
Retention:              0                        
Location:               hdfs://xkhadoop:9000/user/hive/warehouse/stu     
Table Type:             MANAGED_TABLE            
Table Parameters:                
        COLUMN_STATS_ACCURATE   true                
        numFiles                1                   
        numRows                 1                   
        rawDataSize             4                   
        totalSize               5                   
        transient_lastDdlTime   1576506895          
                 
# Storage Information            
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe       
InputFormat:            org.apache.hadoop.mapred.TextInputFormat         
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat       
Compressed:             No                       
Num Buckets:            -1                       
Bucket Columns:         []                       
Sort Columns:           []                       
Storage Desc Params:             
        serialization.format    1                   
Time taken: 0.109 seconds, Fetched: 33 row(s)

1.外部表与内部表

外部表:

[hadoop@xkhadoop data]$ cat order_created.txt 
10703007267488  2014-05-01 06:01:12.334+01
10101043505096  2014-05-01 07:28:12.342+01
10103043509747  2014-05-01 07:50:12.33+01
10103043501575  2014-05-01 09:27:12.33+01
10104043514061  2014-05-01 09:03:12.324+01
-------HIVE-------------
hive (xkhive)> CREATE EXTERNAL TABLE IF NOT EXISTS order_created (
             > id string,
             > create_time string
             > )ROW FORMAT delimited FIELDS TERMINATED BY '/t'
             > LOCATION '/test';
OK
Time taken: 0.14 seconds

---------load data ------------
hive (xkhive)> LOAD DATA LOCAL INPATH '/home/hadoop/data/order_created.txt' INTO TABLE order_created;
Loading data to table xkhive.order_created
Table xkhive.order_created stats: [numFiles=0, totalSize=0]
OK
Time taken: 0.583 seconds
hive (xkhive)> select * from order_created;
OK
order_created.id        order_created.create_time
10703007267488  2014-05-01 06:01:12.334+01      NULL
10101043505096  2014-05-01 07:28:12.342+01      NULL
10103043509747  2014-05-01 07:50:12.33+01       NULL
10103043501575  2014-05-01 09:27:12.33+01       NULL
10104043514061  2014-05-01 09:03:12.324+01      NULL
Time taken: 0.253 seconds, Fetched: 5 row(s)
--------MySQL查询metadata-------
mysql> select * from tbls \G;
*************************** 4. row ***************************
            TBL_ID: 11
       CREATE_TIME: 1577457570
             DB_ID: 2
  LAST_ACCESS_TIME: 0
             OWNER: hadoop
        OWNER_TYPE: USER
         RETENTION: 0
             SD_ID: 11
          TBL_NAME: order_created
          TBL_TYPE: EXTERNAL_TABLE
VIEW_EXPANDED_TEXT: NULL
VIEW_ORIGINAL_TEXT: NULL
4 rows in set (0.00 sec)
--------drop table -------------
hive (xkhive)> drop table order_created;
OK
Time taken: 0.552 seconds
hive (xkhive)> show tables;
OK
tab_name
Time taken: 0.021 seconds
------------查询MySQL metadata----
mysql> select * from tbls \G;
-- 没有刚才那个表数据
-----------查询之前table创建指定HDFS地址------
[hadoop@xkhadoop data]$ hadoop fs -ls /test/
19/12/27 22:49:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rwxr-xr-x   1 hadoop supergroup        213 2019-12-27 22:41 /test/order_created.txt

内部表:

hive (xkhive)> CREATE TABLE IF NOT EXISTS order_created_managed (
             > id string,
             > create_time string
             > )ROW FORMAT delimited FIELDS TERMINATED BY '/t'
             > LOCATION '/test';
OK
Time taken: 0.076 seconds
-------load data------------
hive (xkhive)> LOAD DATA LOCAL INPATH '/home/hadoop/data/order_created.txt' INTO TABLE order_created_managed;
Loading data to table xkhive.order_created_managed
Table xkhive.order_created_managed stats: [numFiles=0, numRows=0, totalSize=0, rawDataSize=0]
OK
Time taken: 0.23 seconds
hive (xkhive)> select * from order_created_managed;
OK
order_created_managed.id        order_created_managed.create_time
10703007267488  2014-05-01 06:01:12.334+01      NULL
10101043505096  2014-05-01 07:28:12.342+01      NULL
10103043509747  2014-05-01 07:50:12.33+01       NULL
10103043501575  2014-05-01 09:27:12.33+01       NULL
10104043514061  2014-05-01 09:03:12.324+01      NULL
10703007267488  2014-05-01 06:01:12.334+01      NULL
10101043505096  2014-05-01 07:28:12.342+01      NULL
10103043509747  2014-05-01 07:50:12.33+01       NULL
10103043501575  2014-05-01 09:27:12.33+01       NULL
10104043514061  2014-05-01 09:03:12.324+01      NULL
Time taken: 0.059 seconds, Fetched: 10 row(s)
----------查询HDFS--------
[hadoop@xkhadoop data]$ hadoop fs -ls /test/
19/12/27 22:56:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rwxr-xr-x   1 hadoop supergroup        213 2019-12-27 22:41 /test/order_created.txt
-rwxr-xr-x   1 hadoop supergroup        213 2019-12-27 22:55 /test/order_created_copy_1.txt
--------发现多了一份数据---------
----------查询MySQL metadata-----------
mysql> select * from tbls \G;
*************************** 4. row ***************************
            TBL_ID: 12
       CREATE_TIME: 1577458497
             DB_ID: 2
  LAST_ACCESS_TIME: 0
             OWNER: hadoop
        OWNER_TYPE: USER
         RETENTION: 0
             SD_ID: 12
          TBL_NAME: order_created_managed
          TBL_TYPE: MANAGED_TABLE
VIEW_EXPANDED_TEXT: NULL
VIEW_ORIGINAL_TEXT: NULL
4 rows in set (0.00 sec)

----------drop table--------
hive (xkhive)> drop table order_created_managed;
OK
Time taken: 0.125 seconds
hive (xkhive)> show tables;
OK
tab_name
Time taken: 0.029 seconds
--------查询MySQL metadata-----
---发现没有order_created_managed表元数据信息----
---查询HDFS---------
[hadoop@xkhadoop data]$ hadoop fs -ls /test/
19/12/27 23:06:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ls: `/test/': No such file or directory

内部表与外部表区别:

内部表drop table的时候会删除元数据信息和HDFS里面数据文件,外部表drop table 的时候只删除元数据信息,生产推荐使用外部表

 

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值