hive的udf开发
继承udf这个类,方法重载evaluate
1. add jar /opt/udftest.jar
2. create temporary function 功能名as “主类路径”;
使用python脚本transform开发:
1. add FILE weekday_mapper.py;
2. SELECT TRANSFORM (必须是所有的字段,或者*)USING ‘python 脚本’ AS (生成的字段名)FROM t_rating;
数据:
{"movie":"3664","rate":"4","timeStamp":"961685303","uid":"5225"}
新建一张表加载进去数据
create table temp(line string);
脚本:
#!/bin/python
import sys
import datetime
import json
for line in sys.stdin:
if line!='\n':
result=json.loads(line.strip())
movie=result['movie']
rate=result['rate']
uid=result['uid']
weekday =datetime.datetime.fromtimestamp(float(result['timeStamp'])).