原文地址:http://www.07net01.com/linux/hivexingliezhuanhuanzongjie_654296_1381813421.html
1、单列转换成多行
比如:
pageid paged
page1 a,b,c
要转换成
page1 a
page1 b
page1 c
select pageid,p from test lateral view explode(split(paged,',')) adtable as p;
通过split拆成多个元素的集合,再通过split打散成多行,lateral view作用是解决explode这种UDTF函数拆成的数据不能与其他字段共同服务问题
2、多行转换成单列
即上面的结果又反推回去,想到oracle的sys_connect_path,便写个UDF实现类似功能
import java.util.ArrayList; import org.apache.hadoop.hive.ql.exec.UDF; public class RowToCol extends UDF { private static int MAX_VALUE = 50; private static String prerows[] = new String[MAX_VALUE]; // 全局变量,保留上一条记录的值 private static StringBuilder val = null; public String evaluate (Object ...args) { ArrayList<String> thisrowlist=new ArrayList<String>(); for(int i=0;i<args.length;i++) thisrowlist.add(args[i].toString()) ; if (val==null) { for(int i=0;i<thisrowlist.size();i++){ prerows[i] = thisrowlist.get(i); val=new StringBuilder(thisrowlist.get(thisrowlist.size()-1)); //最后一个参数值作为返回值 } return val.toString(); } for(int i=0;i<thisrowlist.size()-1;i++) { if ( !prerows[i].equals(thisrowlist.get(i)) ) //跳到b的时候prerows[0]="a" thisrowlist[0]="b" 满足这种情况重新置1 { for (int j=0;j<thisrowlist.size();j++) { prerows[j] = thisrowlist.get(j); val=new StringBuilder(thisrowlist.get(thisrowlist.size()-1)); } return val.toString(); } } return val.append(",").append(thisrowlist.get(thisrowlist.size()-1)).toString(); } public static void main(String[] args) { RowToCol rt=new RowToCol(); System.out.println(rt.evaluate("page1","a")); System.out.println(rt.evaluate("page2","d")); System.out.println(rt.evaluate("page2","e")); System.out.println(rt.evaluate("page2","x")); System.out.println(rt.evaluate("page2","x")); System.out.println(rt.evaluate("page2","y")); } }
最后生成结果select pageid,max(row_to_col(pageid,paged)) from (select pageid,paged from test2 distribute by pageid sort by pageid,paged) a group by pageid;
3、多行转多列、多列转多行与数据库中SQL写法一样,参考博文:http://blog.csdn.net/u011750989/article/details/11797463