HIVE建表&load数据实录

背景:我的虚拟机hadoop是单机版,所以hive在hdfs上生成的目录,均显性呈现在系统上。

一、HIVE建库:

create database fastdo_lte location "/user/hive/warehouse/fastdo_lte.db";

hdfs节点上会自动创建数据库名的文件夹
二、建表:

CREATE TABLE `fastdo_lte.tdlte_mro_locate_hour`(
  `groupid` string, 
  `n1_cell_id` string, 
  `n1_rsrp` string, 
  `n2_cell_id` string, 
  `n2_rsrp` string, 
  `n3_cell_id` string, 
  `n3_rsrp` string, 
  `n4_cell_id` string, 
  `n4_rsrp` string, 
  `n5_cell_id` string, 
  `n5_rsrp` string, 
  `n6_cell_id` string, 
  `n6_rsrp` string, 
  `n7_cell_id` string, 
  `n7_rsrp` string, 
  `s_cell_id` string, 
  `aoa` string, 
  `ta` string, 
  `mroid` string, 
  `s_rsrp` string, 
  `mro_ts` string, 
  `mro_mmeues1apid` string, 
  `mro_mmegroupid` string, 
  `mro_mmecode` string, 
  `s_rsrq` string, 
  `ltescsinrul` string, 
  `sdate` string, 
  `city_name` string, 
  `obj_timestamp` string, 
  `reserved1` string, 
  `reserved2` string, 
  `callid` string, 
  `imsi` string, 
  `mro_error` string, 
  `findncell_v` string, 
  `sum_v` string, 
  `n1_pci` string, 
  `n1_earfcn` string, 
  `n2_pci` string, 
  `n2_earfcn` string, 
  `n3_pci` string, 
  `n3_earfcn` string, 
  `n4_pci` string, 
  `n4_earfcn` string, 
  `n5_pci` string, 
  `n5_earfcn` string, 
  `n6_pci` string, 
  `n6_earfcn` string, 
  `n7_pci` string, 
  `n7_earfcn` string, 
  `scell_pci` string, 
  `scell_earfcn` string, 
  `ct_maxrsrp` string, 
  `ct_rsrq` string, 
  `ct_pci` string, 
  `ct_earfcn` string, 
  `cm_maxrsrp` string, 
  `cm_rsrq` string, 
  `cm_pci` string, 
  `cm_earfcn` string, 
  `cu_maxrsrp` string, 
  `cu_rsrq` string, 
  `cu_pci` string, 
  `cu_earfcn` string, 
  `mr_type` string, 
  `orig_lon` string, 
  `orig_lat` string, 
  `reserved3` string, 
  `reserved4` string, 
  `longitude` string, 
  `latitude` string, 
  `planid` string, 
  `height` string, 
  `err` string, 
  `confidence_degree` string, 
  `position_type` string, 
  `m_x` string, 
  `m_y` string)
PARTITIONED BY ( 
  `day` string, 
  `hour` string, 
  `city` string)
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY '\t' 
  NULL DEFINED AS '';

CREATE TABLE `fastdo_lte.cfg_siteinfo_tdlte`(
  `sdate` string COMMENT 'from deserializer', 
  `city` string COMMENT 'from deserializer', 
  `area` string COMMENT 'from deserializer', 
  `region1` string COMMENT 'from deserializer', 
  `region2` string COMMENT 'from deserializer', 
  `region3` string COMMENT 'from deserializer', 
  `region4` string COMMENT 'from deserializer', 
  `btsname_cn` string COMMENT 'from deserializer', 
  `btsname_en` string COMMENT 'from deserializer', 
  `cellname_cn` string COMMENT 'from deserializer', 
  `cellname_en` string COMMENT 'from deserializer', 
  `vendorid` string COMMENT 'from deserializer', 
  `vendorname` string COMMENT 'from deserializer', 
  `sitetype` string COMMENT 'from deserializer', 
  `scenario` string COMMENT 'from deserializer', 
  `longitude` string COMMENT 'from deserializer', 
  `latitude` string COMMENT 'from deserializer', 
  `height` string COMMENT 'from deserializer', 
  `downtilt` string COMMENT 'from deserializer', 
  `azimuth` string COMMENT 'from deserializer', 
  `enbid` string COMMENT 'from deserializer', 
  `lcrid` string COMMENT 'from deserializer', 
  `cellid` string COMMENT 'from deserializer', 
  `eci` string COMMENT 'from deserializer', 
  `tac` string COMMENT 'from deserializer', 
  `pci` string COMMENT 'from deserializer', 
  `band` string COMMENT 'from deserializer', 
  `band_id` string COMMENT 'from deserializer', 
  `earfcn` string COMMENT 'from deserializer', 
  `vmap` string COMMENT 'from deserializer', 
  `centerx` string COMMENT 'from deserializer', 
  `centery` string COMMENT 'from deserializer', 
  `scencategory` string COMMENT 'from deserializer')
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY ',' 
  LINES TERMINATED BY '\n' ;

在新建库中创建表结构,此时只是空表。

三、上传结构化数据到hdfs中&load数据
这里估计根据hive设置不同,有的环境直接put文件进入hdfs就会被识别,有的还需要加进行load。我认为本质都是读数据到hdfs中,并形成数据和结构的逻辑映射

Hadoop fs -put tdlte_mro_locate_hour /user/hive/warehouse/fastdo_lte_lte/tdlte_mro_locate_hour
Hadoop fs -put SJZ&BD.csv  /user/hive/warehouse/fastdo_lte_lte/cfg_stieinfo_tdlte

手动指定分区导入

LOAD DATA LOCAL INPATH '/Lee/tdlte_mro_locate_hour'
OVERWRITE  INTO TABLE fastdo_lte.tdlte_mro_locate_hour
PARTITION ( day='20191011',hour='00',city='SHIJIAZHUANG')

无分区导入

LOAD DATA LOCAL INPATH '/Lee/SJZ&BD.csv'
OVERWRITE  INTO TABLE fastdo_lte.cfg_siteinfo_tdlte;

五、查看数据:

select * from fastdo_lte.tdlte_mro_locate_hour; 
select * from fastdo_lte.cfg_siteinfo_tdlte;

六、建表语法

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name    
-- (Note: TEMPORARY available in Hive 0.14.0 and later)
  [(col_name data_type [COMMENT col_comment], ... [constraint_specification])]
  [COMMENT table_comment]
  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
  [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
  [SKEWED BY (col_name, col_name, ...)                  -- (Note: Available in Hive 0.10.0 and later)]
     ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
     [STORED AS DIRECTORIES]
  [
   [ROW FORMAT row_format] 
   [STORED AS file_format]
     | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  -- (Note: Available in Hive 0.6.0 and later)
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)]   -- (Note: Available in Hive 0.6.0 and later)
a)指定分隔符:   [ROW FORMAT row_format] 
  : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]
        [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
        [NULL DEFINED AS char]   -- (Note: Available in Hive 0.13 and later)
内置指定分隔符
Fields列之间分隔符
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
Hive支持复杂的数据类型,用于存储不同类型和结构的数据建表时使用这些复杂数据类型可以帮助组织和管理数据。以下是一些主要的复杂数据类型: 1. **Struct(结构体)**: 它定义了一个带有序列字段的数据类型,每个字段有自己的名称和类型。例如: ``` CREATE TABLE my_table (struct_field struct<field1: string, field2: int>); ``` 2. **Map(映射)**: 存储键值对的数据结构,键和值可以是任意数据类型。例如: ``` CREATE TABLE my_map_table (map_field map<string, string>); ``` 3. **Array(数组)**: 存储同类型的元素序列。比如: ``` CREATE TABLE my_array_table (array_field array<int>); ``` 4. **Union类型(联合类型)**: 可以包含多个类型之一,如: ``` CREATE TABLE my_union_table (union_field uniontype(string, int)); ``` 5. **Decimal和Double(精确数字和浮点数)**: 用于存储带有更高精度的数字,与标准的int和double不同。 ``` CREATE TABLE my_decimal_table (decimal_field decimal(10,2), double_field double); ``` 6. **Timestamp和Date(日期和时间)**: 专门用于存储日期和时间信息。 ``` CREATE TABLE my_timestamp_table (timestamp_field timestamp, date_field date); ``` 7. **Binary和varbinary(二进制和可变长度二进制)**: 存储二进制数据。 ``` CREATE TABLE my_binary_table (binary_field binary, varbinary_field varbinary); ``` 使用这些数据类型时,要根据实际业务需求和数据特性来选择和设计表结构。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值