Create the table
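The DDL for `tb_url` is not shown, so the following is only a minimal sketch of a table the inserts could run against. The `STRING` column types are an assumption, inferred from the quoted values in the insert statements.

```sql
-- Assumed schema: the source never shows it. STRING for both columns
-- is a guess based on the quoted literals used in the inserts.
CREATE TABLE IF NOT EXISTS tb_url (
  id  STRING,  -- row identifier, inserted as "1", "2", ...
  url STRING   -- full URL to be parsed later
);
```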
insert into table tb_url select "1" as id,"http://facebook.com/path/p1.php?query=1" as url;
insert into table tb_url select "2" as id,"http://tongji.baidu.com/news/index.jsp?uuid=frank" as url;
insert into table tb_url select "3" as id,"http://www.jdwz.com/index?source=baidu" as url;
insert into table tb_url select "4" as id,"http://www.itcast.cn/index?source=alibaba" as url;
id | url |
---|---|
1 | http://facebook.com/path/p1.php?query=1 |
2 | http://tongji.baidu.com/news/index.jsp?uuid=frank |
3 | http://www.jdwz.com/index?source=baidu |
4 | http://www.itcast.cn/index?source=alibaba |
Hive parse_url
Requirement: parse each URL and extract the HOST, PATH, and QUERY parts for every id.
Code:
select id,
parse_url(url, "HOST") as host,
parse_url(url, "PATH") as path,
parse_url(url, "QUERY") as query
from tb_url;
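Two variations on the query above may be useful; both are sketches against the same `tb_url` table rather than part of the original walkthrough. The first uses the three-argument form of `parse_url` to pull a single key out of the query string (the key name `source` is taken from the sample URLs). The second uses `parse_url_tuple`, a table-generating function, to extract several parts in one call.

```sql
-- Extract one query-string parameter by key.
-- For id 3 (http://www.jdwz.com/index?source=baidu) this yields 'baidu';
-- URLs that do not carry a "source" parameter are expected to give NULL.
SELECT id,
       parse_url(url, 'QUERY', 'source') AS source
FROM tb_url;

-- parse_url_tuple is a UDTF, so it is joined in with LATERAL VIEW.
-- The alias b and its column names (host, path, query) are arbitrary.
SELECT a.id,
       b.host,
       b.path,
       b.query
FROM tb_url a
LATERAL VIEW parse_url_tuple(a.url, 'HOST', 'PATH', 'QUERY') b
  AS host, path, query;
```

When several parts of the same URL are needed, `parse_url_tuple` avoids calling `parse_url` once for each part.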