Table creation syntax:
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name  -- (Note: TEMPORARY available in Hive 0.14.0 and later)
  [(col_name data_type [COMMENT col_comment], ... [constraint_specification])]
  [COMMENT table_comment]
  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
  [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
  [SKEWED BY (col_name, col_name, ...)  -- (Note: Available in Hive 0.10.0 and later)
    ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
    [STORED AS DIRECTORIES]]
  [
   [ROW FORMAT row_format]
   [STORED AS file_format]
     | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  -- (Note: Available in Hive 0.6.0 and later)
  ]
Syntax for loading data into a table:
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)] [INPUTFORMAT 'inputformat' SERDE 'serde']  -- (Hive 3.0 or later)

LOCAL: load from the local Linux filesystem
Without LOCAL: load from the Hadoop filesystem (HDFS)
OVERWRITE: replace the table's existing data
Without OVERWRITE: append to the existing data
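The four combinations can be sketched as follows (using the `emp03` table and `/home/hadoop/emp.txt` file that appear later in these notes; the HDFS path `/tmp/emp.txt` is a hypothetical example):

```sql
-- Local file + OVERWRITE: copy from the Linux filesystem, replacing table data
LOAD DATA LOCAL INPATH '/home/hadoop/emp.txt' OVERWRITE INTO TABLE emp03;

-- HDFS file, no OVERWRITE: append to the table; note that without LOCAL
-- the source file is MOVED from its HDFS location into the table directory
LOAD DATA INPATH '/tmp/emp.txt' INTO TABLE emp03;
```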
Exporting data to the local filesystem or HDFS:
INSERT OVERWRITE [LOCAL] DIRECTORY directory1
  [ROW FORMAT row_format] [STORED AS file_format] SELECT ... FROM ...
hive (default)> insert overwrite local directory '/home/hadoop/emp3.txt' row format DELIMITED FIELDS TERMINATED BY '\t' select * from emp03;
The exported result files can then be found in the corresponding directory.
1. Internal (managed) tables
By default, CREATE TABLE creates an internal (managed) table:
create table emp03(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int)
row format delimited fields terminated by ',';
After creating the table, load data into it:
hive (default)> load data local inpath '/home/hadoop/emp.txt' overwrite into table emp03;
Loading data to table default.emp03
Table default.emp03 stats: [numFiles=1, totalSize=845]
OK
Time taken: 0.248 seconds
Note that dropping an internal table deletes both the metadata in the metastore and the data on HDFS.
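A sketch of that behavior, assuming the default warehouse location `/user/hive/warehouse`:

```sql
DROP TABLE emp03;
-- After this, the metastore entry is gone AND
-- 'hdfs dfs -ls /user/hive/warehouse/emp03' no longer finds the table directory:
-- the loaded emp.txt data is deleted along with the table.
```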
2. External tables
To create an external table, add the EXTERNAL keyword; it is best to specify the table's HDFS directory explicitly with LOCATION:
create external table emp_external(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)row format delimited fields terminated by ','
LOCATION '/d7_externel/emp/';
An external table does not need LOAD DATA; simply putting the data files into the table's HDFS directory is enough:
[hadoop@hadoop001 ~]$ hdfs dfs -put emp.txt /d7_externel/emp/
[hadoop@hadoop001 ~]$ hdfs dfs -ls /d7_externel/emp
Found 1 items
-rw-r--r-- 1 hadoop supergroup 845 2019-07-18 11:22 /d7_externel/emp/emp.txt
Dropping an external table deletes only the metadata in the metastore; the data files remain on HDFS. Using external tables is therefore recommended.
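A sketch of the contrast with the managed-table case:

```sql
DROP TABLE emp_external;
-- Only the metastore entry is removed; 'hdfs dfs -ls /d7_externel/emp'
-- still shows emp.txt, so recreating the table with the same
-- LOCATION '/d7_externel/emp/' immediately restores access to the data.
```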