1. Using distinct
Problem:
Keep each user only once per app, i.e. remove duplicate (userid, apptypeid) rows
Example:
spark-sql> with test1 as
> (select 122 as userid,100024 as apptypeid
> union all
> select 123 as userid,100024 as apptypeid
> union all
> select 123 as userid,100024 as apptypeid)
> select
> distinct userid,apptypeid
> from test1;
2. Using group by
Problem:
Keep each user only once per app, i.e. remove duplicate (userid, apptypeid) rows
Example:
spark-sql> with test1 as
> (select 122 as userid,100024 as apptypeid
> union all
> select 123 as userid,100024 as apptypeid
> union all
> select 123 as userid,100024 as apptypeid)
> select
> userid,
> apptypeid
> from
> (select
> userid,
> apptypeid
> from test1) t1
> group by userid,apptypeid;
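
One practical difference between the two approaches is that group by can carry aggregates alongside the deduplicated columns. The sketch below is a hypothetical variation on the example above, not part of the original notes: it reuses the same test1 data and adds a count(*) column (the alias row_cnt is an illustrative name) to show how many input rows collapsed into each (userid, apptypeid) pair.

```sql
-- Hypothetical variation: same test1 data as above, plus count(*)
-- to show how many duplicate rows each (userid, apptypeid) pair had.
with test1 as
(select 122 as userid, 100024 as apptypeid
 union all
 select 123 as userid, 100024 as apptypeid
 union all
 select 123 as userid, 100024 as apptypeid)
select
  userid,
  apptypeid,
  count(*) as row_cnt  -- illustrative alias, not in the original
from test1
group by userid, apptypeid;
```

With this data the query returns two rows: (122, 100024, 1) and (123, 100024, 2). Row order is not guaranteed without an order by. The plain distinct and group by queries above return the same two (userid, apptypeid) pairs, just without the count.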