1. Creating and dropping databases
CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
[COMMENT database_comment]
[LOCATION hdfs_path]
[WITH DBPROPERTIES (property_name=property_value, ...)];
create database if not exists dbname;
drop database dbname;
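A plain DROP DATABASE fails if the database still contains tables. A short sketch of the safer forms (IF EXISTS suppresses the error when the database is missing; CASCADE drops it together with its tables):

```sql
-- Only drop if the database exists, avoiding an error otherwise
DROP DATABASE IF EXISTS dbname;

-- Drop a database together with any tables it still contains
DROP DATABASE IF EXISTS dbname CASCADE;
```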
2. Creating a table
create table if not exists student(
num int,
name string
)
row format delimited fields terminated by '\t'
stored as textfile;
The default storage format is textfile.
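Besides textfile, Hive supports columnar formats such as ORC and Parquet via STORED AS. A sketch, reusing the same schema (the table name student_orc is illustrative):

```sql
-- Same schema as above, stored as ORC instead of the default textfile
CREATE TABLE IF NOT EXISTS student_orc (
  num  INT,
  name STRING
)
STORED AS ORC;
```

Delimiter clauses are unnecessary here, since ORC manages its own binary layout.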
3. Loading data:
Source data:
1 tom
2 peter
3 jack
4 leo
5 lucy
6 james
7 Michael
9 Eric
Load commands:
a. Load from the local filesystem:
load data local inpath '/opt/data/student' into table student;
b. Load from HDFS:
load data inpath '/student' into table student;
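Note the difference between the two forms: with LOCAL the source file is copied into the warehouse, while without LOCAL the HDFS file is moved there. INTO TABLE appends to existing contents; OVERWRITE replaces them, as sketched here:

```sql
-- Replace the table's current contents instead of appending to them
LOAD DATA LOCAL INPATH '/opt/data/student' OVERWRITE INTO TABLE student;
```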
4. Querying the table:
select * from student;
Result:
hive (dbschool)> select * from student;
OK
student.num student.name
1 tom
2 peter
3 jack
4 leo
5 lucy
6 james
7 Michael
9 Eric
Time taken: 0.101 seconds, Fetched: 8 row(s)
5. Describing a table:
Method 1:
desc tablename;
Result:
hive (dbschool)> desc student;
OK
col_name data_type comment
num int
name string
Time taken: 0.148 seconds, Fetched: 2 row(s)
Method 2:
desc extended student;
Result:
hive (dbschool)> desc extended student;
OK
col_name data_type comment
num int
name string
Detailed Table Information Table(tableName:student, dbName:dbschool, owner:root, createTime:1532007488, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:num, type:int, comment:null), FieldSchema(name:name, type:string, comment:null)], location:hdfs://ns1/user/hive/warehouse/dbschool.db/student, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format= , field.delim=
Time taken: 0.117 seconds, Fetched: 4 row(s)
Method 3:
desc formatted student;
Result:
hive (dbschool)> desc formatted student;
OK
col_name data_type comment
# col_name data_type comment
num int
name string
# Detailed Table Information
Database: dbschool
Owner: root
CreateTime: Thu Jul 19 21:38:08 CST 2018
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://ns1/user/hive/warehouse/dbschool.db/student
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE true
numFiles 1
totalSize 65
transient_lastDdlTime 1532064277
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim \t
serialization.format \t
Time taken: 0.101 seconds, Fetched: 31 row(s)
Note: by default, Hive creates managed tables (table type MANAGED_TABLE).
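In contrast to managed tables, an EXTERNAL table leaves its data files in place when the table is dropped; Hive only removes the metadata. A sketch, assuming an existing HDFS directory /data/student holding tab-delimited files:

```sql
-- Dropping an external table removes only the metadata; the files stay
CREATE EXTERNAL TABLE IF NOT EXISTS student_ext (
  num  INT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/student';
```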
6. Listing built-in functions:
show functions;
Result:
hive (dbschool)> show functions;
OK
tab_name
!
!=
%
&
*
+
-
/
<
<=
<=>
<>
=
==
>
>=
^
abs
acos
add_months
and
array
array_contains
ascii
...
Describe a single function:
desc function upper;
Result:
hive (dbschool)> desc function upper;
OK
tab_name
upper(str) - Returns str with all characters changed to uppercase
Time taken: 0.061 seconds, Fetched: 1 row(s)
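For more detail, DESC FUNCTION EXTENDED also prints usage examples when the function defines them:

```sql
-- EXTENDED adds usage examples where the function provides them
DESC FUNCTION EXTENDED upper;
```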
7. Viewing Hive's help information
[root@master hive-1.1.0]# bin/hive -help
which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/java/jdk1.7.0_80/bin:/opt/java/apache-maven-3.5.3/bin:/root/bin)
18/07/21 15:07:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
usage: hive
-d,--define <key=value> Variable subsitution to apply to hive
commands. e.g. -d A=B or --define A=B
--database <databasename> Specify the database to use
-e <quoted-query-string> SQL from command line
-f <filename> SQL from files
-H,--help Print help information
--hiveconf <property=value> Use value for given property
--hivevar <key=value> Variable subsitution to apply to hive
commands. e.g. --hivevar A=B
-i <filename> Initialization SQL file
-S,--silent Silent mode in interactive shell
-v,--verbose Verbose mode (echo executed SQL to the
console)
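The -e and -f options from the help text above run SQL non-interactively, which is useful in scripts. A sketch (the script path /opt/scripts/query.hql is illustrative):

```sh
# Run a single query from the command line
bin/hive -e "select * from dbschool.student;"

# Run all statements in a script file (path is an example)
bin/hive -f /opt/scripts/query.hql

# -S suppresses progress logs, useful when piping output to a file
bin/hive -S -e "select count(*) from dbschool.student;" > count.txt
```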
8. Operating on HDFS from the Hive shell:
hive (default)> dfs -ls /;
Found 2 items
drwxrwxr-x - root supergroup 0 2018-07-19 00:45 /tmp
drwxr-xr-x - root supergroup 0 2018-07-19 01:07 /use
9. Creating a database and table at a specified location:
a. First create the directory:
hive (default)> dfs -mkdir /locate;
b. Create the database:
hive (default)> create database if not exists db_loc LOCATION '/locate';
OK
Time taken: 0.17 seconds
c. Checking /locate shows no new db_loc subdirectory: the database uses /locate itself as its location. Create a table in db_loc:
hive (db_loc)> create table tb01(
> name string
> )row format delimited fields terminated by '\t';
OK
Time taken: 0.462 seconds
d. Check the /locate directory again:
e. Drop a non-empty database:
hive (db_loc)> drop database db_loc cascade;
OK
Time taken: 0.84 seconds
10. Truncating a table:
truncate table tableName;
Result:
hive (dbschool)> select * from student;
OK
student.num student.name
1 tom
2 peter
3 jack
4 leo
5 lucy
6 james
7 Michael
9 Eric
Time taken: 1.205 seconds, Fetched: 8 row(s)
hive (dbschool)> truncate table student;
OK
Time taken: 0.206 seconds
hive (dbschool)> select * from student;
OK
student.num student.name
Time taken: 0.241 seconds
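TRUNCATE deletes the table's data files but keeps its definition, and it only applies to managed tables (external tables cannot be truncated). For comparison:

```sql
-- Remove all rows but keep the table definition
TRUNCATE TABLE student;

-- Remove the table definition and, for managed tables, its data files
DROP TABLE IF EXISTS student;
```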
11. Creating a table from a query (CTAS):
create table stu_info as select name from student;
Example:
hive (dbschool)> create table stu_info as select name from student;
Query ID = root_20180725013939_e7f0d776-85a7-4dc1-90fe-acfba92b6913
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1532453698068_0001, Tracking URL = http://master.cdh.com:8088/proxy/application_1532453698068_0001/
Kill Command = /opt/cdh5.14.2/hadoop-2.6.0/bin/hadoop job -kill job_1532453698068_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2018-07-25 01:39:52,648 Stage-1 map = 0%, reduce = 0%
2018-07-25 01:40:04,795 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.95 sec
MapReduce Total cumulative CPU time: 1 seconds 950 msec
Ended Job = job_1532453698068_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://ns1/user/hive/warehouse/dbschool.db/.hive-staging_hive_2018-07-25_01-39-30_053_8181825053540619786-1/-ext-10001
Moving data to: hdfs://ns1/user/hive/warehouse/dbschool.db/stu_info
Table dbschool.stu_info stats: [numFiles=1, numRows=8, totalSize=43, rawDataSize=35]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 1.95 sec HDFS Read: 3282 HDFS Write: 116 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 950 msec
OK
name
Time taken: 38.344 seconds
hive (dbschool)> select * from stu_info;
OK
stu_info.name
tom
peter
jack
leo
lucy
james
Michael
Eric
Time taken: 0.163 seconds, Fetched: 8 row(s)
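The SELECT in a CTAS statement can also carry projections and filters, so the new table holds only the rows and columns you ask for. A sketch against the student table above (the table name stu_small is illustrative):

```sql
-- Keep both columns but only the rows with num < 5
CREATE TABLE stu_small AS
SELECT num, name
FROM student
WHERE num < 5;
```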
12. Creating a table with LIKE (copies only the table structure, not the data):
create table stu_like like student;
Result:
hive (dbschool)> create table stu_like like student;
OK
Time taken: 0.096 seconds
hive (dbschool)> select * from stu_like;
OK
stu_like.num stu_like.name
Time taken: 0.127 seconds