背景:我的虚拟机hadoop是单机版,所以hive在hdfs上生成的目录,均显性呈现在系统上。
一、HIVE建库:
create database fastdo_lte location "/user/hive/warehouse/fastdo_lte.db";
hdfs节点上会自动创建数据库名的文件夹
二、建表:
CREATE TABLE `fastdo_lte.tdlte_mro_locate_hour`(
`groupid` string,
`n1_cell_id` string,
`n1_rsrp` string,
`n2_cell_id` string,
`n2_rsrp` string,
`n3_cell_id` string,
`n3_rsrp` string,
`n4_cell_id` string,
`n4_rsrp` string,
`n5_cell_id` string,
`n5_rsrp` string,
`n6_cell_id` string,
`n6_rsrp` string,
`n7_cell_id` string,
`n7_rsrp` string,
`s_cell_id` string,
`aoa` string,
`ta` string,
`mroid` string,
`s_rsrp` string,
`mro_ts` string,
`mro_mmeues1apid` string,
`mro_mmegroupid` string,
`mro_mmecode` string,
`s_rsrq` string,
`ltescsinrul` string,
`sdate` string,
`city_name` string,
`obj_timestamp` string,
`reserved1` string,
`reserved2` string,
`callid` string,
`imsi` string,
`mro_error` string,
`findncell_v` string,
`sum_v` string,
`n1_pci` string,
`n1_earfcn` string,
`n2_pci` string,
`n2_earfcn` string,
`n3_pci` string,
`n3_earfcn` string,
`n4_pci` string,
`n4_earfcn` string,
`n5_pci` string,
`n5_earfcn` string,
`n6_pci` string,
`n6_earfcn` string,
`n7_pci` string,
`n7_earfcn` string,
`scell_pci` string,
`scell_earfcn` string,
`ct_maxrsrp` string,
`ct_rsrq` string,
`ct_pci` string,
`ct_earfcn` string,
`cm_maxrsrp` string,
`cm_rsrq` string,
`cm_pci` string,
`cm_earfcn` string,
`cu_maxrsrp` string,
`cu_rsrq` string,
`cu_pci` string,
`cu_earfcn` string,
`mr_type` string,
`orig_lon` string,
`orig_lat` string,
`reserved3` string,
`reserved4` string,
`longitude` string,
`latitude` string,
`planid` string,
`height` string,
`err` string,
`confidence_degree` string,
`position_type` string,
`m_x` string,
`m_y` string)
PARTITIONED BY (
`day` string,
`hour` string,
`city` string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
NULL DEFINED AS '';
CREATE TABLE `fastdo_lte.cfg_siteinfo_tdlte`(
`sdate` string COMMENT 'from deserializer',
`city` string COMMENT 'from deserializer',
`area` string COMMENT 'from deserializer',
`region1` string COMMENT 'from deserializer',
`region2` string COMMENT 'from deserializer',
`region3` string COMMENT 'from deserializer',
`region4` string COMMENT 'from deserializer',
`btsname_cn` string COMMENT 'from deserializer',
`btsname_en` string COMMENT 'from deserializer',
`cellname_cn` string COMMENT 'from deserializer',
`cellname_en` string COMMENT 'from deserializer',
`vendorid` string COMMENT 'from deserializer',
`vendorname` string COMMENT 'from deserializer',
`sitetype` string COMMENT 'from deserializer',
`scenario` string COMMENT 'from deserializer',
`longitude` string COMMENT 'from deserializer',
`latitude` string COMMENT 'from deserializer',
`height` string COMMENT 'from deserializer',
`downtilt` string COMMENT 'from deserializer',
`azimuth` string COMMENT 'from deserializer',
`enbid` string COMMENT 'from deserializer',
`lcrid` string COMMENT 'from deserializer',
`cellid` string COMMENT 'from deserializer',
`eci` string COMMENT 'from deserializer',
`tac` string COMMENT 'from deserializer',
`pci` string COMMENT 'from deserializer',
`band` string COMMENT 'from deserializer',
`band_id` string COMMENT 'from deserializer',
`earfcn` string COMMENT 'from deserializer',
`vmap` string COMMENT 'from deserializer',
`centerx` string COMMENT 'from deserializer',
`centery` string COMMENT 'from deserializer',
`scencategory` string COMMENT 'from deserializer')
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' ;
在新建库中创建表结构,此时只是空表。
三、上传结构化数据到hdfs中&load数据
这里估计根据hive设置不同,有的环境直接put文件进入hdfs就会被识别,有的还需要加进行load。我认为本质都是读数据到hdfs中,并形成数据和结构的逻辑映射
Hadoop fs -put tdlte_mro_locate_hour /user/hive/warehouse/fastdo_lte_lte/tdlte_mro_locate_hour
Hadoop fs -put SJZ&BD.csv /user/hive/warehouse/fastdo_lte_lte/cfg_stieinfo_tdlte
手动指定分区导入
LOAD DATA LOCAL INPATH '/Lee/tdlte_mro_locate_hour'
OVERWRITE INTO TABLE fastdo_lte.tdlte_mro_locate_hour
PARTITION ( day='20191011',hour='00',city='SHIJIAZHUANG');
无分区导入
LOAD DATA LOCAL INPATH '/Lee/SJZ&BD.csv'
OVERWRITE INTO TABLE fastdo_lte.cfg_siteinfo_tdlte;
五、查看数据:
select * from fastdo_lte.tdlte_mro_locate_hour;
select * from fastdo_lte.cfg_siteinfo_tdlte;
六、建表语法
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
-- (Note: TEMPORARY available in Hive 0.14.0 and later)
[(col_name data_type [COMMENT col_comment], ... [constraint_specification])]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
[SKEWED BY (col_name, col_name, ...) -- (Note: Available in Hive 0.10.0 and later)]
ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
[STORED AS DIRECTORIES]
[
[ROW FORMAT row_format]
[STORED AS file_format]
| STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later)
]
[LOCATION hdfs_path]
[TBLPROPERTIES (property_name=property_value, ...)] -- (Note: Available in Hive 0.6.0 and later)
a)指定分隔符: [ROW FORMAT row_format]
: DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]
[MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
[NULL DEFINED AS char] -- (Note: Available in Hive 0.13 and later)
内置指定分隔符
Fields列之间分隔符