Business scenario 1:
Loading JSON data into a Hive table:
Create the file people.json on the local Linux filesystem.
Its contents are:
{"name":"Michael"}
{"name":"Andy", "age":30}
{"name":"Justin", "age":19}
Create the table:
CREATE TABLE ods.spark_people_json (
  `name` STRING,
  `age` INT
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE;
Load the data into the Hive table:
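A load statement for this step could look like the sketch below. The local path /tmp/people.json is an assumption (use wherever people.json was actually created), and the statements presume that the table above exists and that the SerDe jar is available in the current session.

```sql
-- Assumed path: replace /tmp/people.json with the real location of the file.
LOAD DATA LOCAL INPATH '/tmp/people.json'
INTO TABLE ods.spark_people_json;

-- Quick check: each JSON line should become one row.
SELECT name, age FROM ods.spark_people_json;
```

The first record has no "age" key, so its age column should come back as NULL.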
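Note that an error may be raised here:
Using the JsonSerDe class directly fails, because the class is not loaded into the Hive environment at startup.
The error looks like this: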
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe
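The fix is to download the jar that contains this class and register it in Hive with add jar, after which the SerDe can be used: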
https://mvnrepository.com/artifact/org.apache.hive.hcatalog/hive-hcatalog-core/0.12.0-cdh5.1.4
Jar name: hive-hcatalog-core-0.12.0-cdh5.1.4.jar
hive (default)> add jar /var/lib/hadoop-hdfs/spride_sqoop_beijing/bi_table/tang/hive-hcatalog-core-0.12.0-cdh5.1.4.jar;
Added [/var/lib/hadoop-hdfs/spride_sqoop_beijing/bi_table/tang/hive-hcatalog-core-0.12.0-cdh5.1.4.jar] to class path
Added resources: [/var/lib/hadoop-hdfs/spride_sqoop_beijing/bi_table/tang/hive-hcatalog-core-0.12.0-cdh5.1.4.