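Optimization: one map pass feeds several reduce steps, so the source table is scanned once instead of once per column.
Purpose of this script: compute the null rate of each column in a table.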
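Workflow:
1. Get the table structure.
2. Use Excel or Notepad to batch-convert the column names into select clauses.
3. Format the clauses into a from ... insert statement like the one below, then run it.
4. Each insert clause can be followed by its own where clause.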
create table if not exists tmp_null_static (
column_name string ,
ttl_cnt bigint ,
null_rate double
) ;
truncate table tmp_null_static ;
-- count(col) skips NULLs, so count(1) - count(col) is the number of NULL rows.
from (select * from tmp_info where pt='20190217000000' ) a
insert into tmp_null_static select 'id' as column_name,count(1) as ttl_cnt,(count(1)-count(id))/count(1) as null_rate
insert into tmp_null_static select 'certificate_type' as column_name,count(1) as ttl_cnt,(count(1)-count(certificate_type))/count(1) as null_rate ;
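Step 4 of the workflow (a where clause per insert) could look roughly like the sketch below. It reuses tmp_info and tmp_null_static from the script. The filter `id is not null` and the label 'certificate_type (id not null)' are made up for illustration and are not part of the original script. The null rate is written as (count(1) - count(col)) / count(1), because count(col) counts only the non-NULL values.

```sql
-- Sketch only: the filter and the second label are hypothetical.
-- Both inserts still read tmp_info in a single pass.
from (select * from tmp_info where pt='20190217000000' ) a
insert into tmp_null_static
  select 'certificate_type' as column_name,
         count(1) as ttl_cnt,
         (count(1) - count(certificate_type)) / count(1) as null_rate
insert into tmp_null_static
  select 'certificate_type (id not null)' as column_name,
         count(1) as ttl_cnt,
         (count(1) - count(certificate_type)) / count(1) as null_rate
  where id is not null ;

-- Read the result, highest null rate first.
select column_name, ttl_cnt, null_rate
from tmp_null_static
order by null_rate desc ;
```

The second insert aggregates only the rows that pass its own where clause, so its ttl_cnt can be smaller than the first one.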
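They say that writing enough SQL turns you into a god, and that also thinking about how to optimize it turns you into an immortal, haha.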