将同一id多行数据合并到一行

声明:

数据库为impala

需求:

1、3个数据源有一个相同的字段“batch_no”,根据此相同字段将3个数据源中的数据放到一个报表展示

2、其中需展示的”所属账期“字段数据源中没有,根据数据源中的账期开始日和账期结束日判断,开始日取最早日期,结束日取最晚日期

所属账期:账期最早2020-05-20 00:00:00,账期最晚2020-07-10 00:00:00

所属账期就是202005/202006/202007或者202005~202007

3、其中某系字段例如“发票号”“凭证号”“产业”“渠道名称”等包含多个内容时,在同一行显示。并且,如果发票号或者凭证号是连续的使用“~”,如果不是联系的使用“/”隔开,重复的数据只显示一个(也就是去重)。

例如:发票号:1,2,3,4,2;是连续的且2是重复的,则结果为:1~4

          发票号:1,3,4,8,3;非连续的且3是重复的,则结果为:1/3/4/8,而不是1/3/4/8/3

解决步骤:

第一步:查看源表数据,如下

客户希望展现的结果:

第二步:impala的group_concat(column)方法,将某一个列的数据放到一起,需要配合group by使用。(同组的数据,通过group_concat(column)指定要放到一起的数据是哪列);

首先我们来实现多行合并到一行

select b.batch_no,

GROUP_CONCAT(b.invoice_no, '/') invoice_no,

GROUP_CONCAT(b.voucher_no, '/') voucher_no

from receive_data.CVP_V_GET_BATCH_NBMS a

left join receive_data.cvp_v_get_invoice_nbms b

on a.batch_no=b.batch_no

where a.deleted_flag='0'

and b.deleted_flag='0'

and a.batch_no='INBMS202005290100'

group by b.batch_no;

结果如下:

细心的你会发现,invoice_no是连续的啊,要使用“~”,不能使用“/”拼接啊

第三步:把连续的使用“~”,不连续的使用“/”连接

select b.batch_no,

case when max(invoice_no) = min(invoice_no) then max(invoice_no)

     when cast(max(invoice_no) as int)+1 - cast(min(invoice_no) as int)=count(invoice_no) then concat(min(invoice_no),'~',max(invoice_no))

     else GROUP_CONCAT(b.invoice_no, '/')

     end invoice_no,

case when max(voucher_no) = min(voucher_no) then max(voucher_no)

     when cast(max(voucher_no) as int)+1 - cast(min(voucher_no) as int)=count(voucher_no) then concat(min(voucher_no),'~',max(voucher_no))

     else GROUP_CONCAT(b.voucher_no, '/')

     end voucher_no

from receive_data.CVP_V_GET_BATCH_NBMS a

left join receive_data.cvp_v_get_invoice_nbms b

on a.batch_no=b.batch_no

where a.deleted_flag='0'

and b.deleted_flag='0'

and a.batch_no='INBMS202005290100'

group by b.batch_no;

第四步:现在离我们的目标又进了一步,但是需求还有一点就是要去重。你会想到使用distinct啊。我们来看一下结果:

select b.batch_no,

case when max(invoice_no) = min(invoice_no) then max(invoice_no)

     when cast(max(invoice_no) as int)+1 - cast(min(invoice_no) as int)=count(distinct invoice_no) then concat(min(invoice_no),'~',max(invoice_no))

     else GROUP_CONCAT(distinct b.invoice_no, '/')

     end invoice_no,

case when max(voucher_no) = min(voucher_no) then max(voucher_no)

     when cast(max(voucher_no) as int)+1 - cast(min(voucher_no) as int)=count(distinct voucher_no) then concat(min(voucher_no),'~',max(voucher_no))

     else GROUP_CONCAT(distinct b.voucher_no, '/')

     end voucher_no

from receive_data.CVP_V_GET_BATCH_NBMS a

left join receive_data.cvp_v_get_invoice_nbms b

on a.batch_no=b.batch_no

where a.deleted_flag='0'

and b.deleted_flag='0'

and a.batch_no='INBMS202005290100'

group by b.batch_no;

唉呀,报错了!

这里我把报错信息贴出来:

[Code: 500051, SQL State: HY000]  [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:AnalysisException: all DISTINCT aggregate functions need to have the same set of parameters as count(DISTINCT invoice_no); deviating function: count(DISTINCT voucher_no)

), Query: select b.batch_no,

case when max(invoice_no) = min(invoice_no) then max(invoice_no)

     when cast(max(invoice_no) as int)+1 - cast(min(invoice_no) as int)=count(distinct invoice_no) then concat(min(invoice_no),'~',max(invoice_no))

     else GROUP_CONCAT(distinct b.invoice_no, '/')

     end invoice_no,

case when max(voucher_no) = min(voucher_no) then max(voucher_no)

     when cast(max(voucher_no) as int)+1 - cast(min(voucher_no) as int)=count(distinct voucher_no) then concat(min(voucher_no),'~',max(voucher_no))

     else GROUP_CONCAT(distinct b.voucher_no, '/')

     end voucher_no

from receive_data.CVP_V_GET_BATCH_NBMS a

left join receive_data.cvp_v_get_invoice_nbms b

on a.batch_no=b.batch_no

where a.deleted_flag='0'

and b.deleted_flag='0'

and a.batch_no='INBMS202005290100'

group by b.batch_no.

我们只需要看这一部分:

[Code: 500051, SQL State: HY000]  [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:AnalysisException: all DISTINCT aggregate functions need to have the same set of parameters as count(DISTINCT invoice_no); deviating function: count(DISTINCT voucher_no)

这句话的意思:所有不同的聚合函数都需要有相同的参数集

原因:DISTINCT只允许在一个列组合上使用。

第五步:解决多个GROUP_CONCAT在一个select使用。我们可以使用join啊。

select

i.batch_no,

invoice_no,

voucher_no

from (

select b.batch_no,

max(invoice_no) invoice_no_max,    --查询出最大的invoice_no

min(invoice_no) invoice_no_min,     --查询出最小的invoice_no

count(distinct invoice_no) num,      --计算同一个batch_no下invoice_no总数(去重)

case when max(invoice_no) = min(invoice_no) then max(invoice_no)  --判断是否只有一个invoice_no

     when cast(max(invoice_no) as int)+1 - cast(min(invoice_no) as int)=count(distinct invoice_no) then concat(min(invoice_no),'~',max(invoice_no))  --判断是否是连续的,如果是连续的使用“~”连接invoice_no

     else GROUP_CONCAT(distinct b.invoice_no, '/')  --如果不连续使用“/”拼接invoice_no

     end invoice_no

from receive_data.CVP_V_GET_BATCH_NBMS a

left join receive_data.cvp_v_get_invoice_nbms b

on a.batch_no=b.batch_no

where a.deleted_flag='0'

and b.deleted_flag='0'

and a.batch_no='INBMS202005290100'

group by b.batch_no) i

join

(select b.batch_no,

max(voucher_no) voucher_no_max,

min(voucher_no) voucher_no_min,

count(distinct voucher_no) num,

case when max(voucher_no) = min(voucher_no) then max(voucher_no)

     when cast(max(voucher_no) as int)+1 - cast(min(voucher_no) as int)=count(distinct voucher_no) then concat(min(voucher_no),'~',max(voucher_no))

     else GROUP_CONCAT(distinct b.voucher_no, '/')

        end voucher_no

from receive_data.CVP_V_GET_BATCH_NBMS a

left join receive_data.cvp_v_get_invoice_nbms b

on a.batch_no=b.batch_no

where a.deleted_flag='0'

and b.deleted_flag='0'

and a.batch_no='INBMS202005290100'

group by b.batch_no ) v

on i.batch_no = v.batch_no;

结果如下:

细心的你是不是又发现了,我们还有”所属账期“没处理呢。接下来就是处理“所属账期”。

我们来看一下所属账期的问题。

查看一下源表:

select c.batch_no,

sheet_start_date,    --开始时间

sheet_end_date      --结束时间

from receive_data.CVP_V_GET_BATCH_NBMS a

left join receive_data.cvp_v_get_sheet_nbms c on a.batch_no=c.batch_no 

where a.deleted_flag='0'

and c.deleted_flag='0'

and a.batch_no='INBMS202005290100';

结果如下:共85条,只展现部分数据

客户希望的结果

我这里直接贴出sql,原理跟上面一样,这里就不废话了。

select batch_no,sheet_start_date,sheet_end_date,

case when start_date = end_date then end_date   --判断是否只有一个月份

else concat(start_date,'~',end_date) end account_date --如果不是,使用“~”连接

from (

select c.batch_no,

min(c.sheet_start_date) sheet_start_date,  --获取最早账期

max(c.sheet_end_date) sheet_end_date,   --获取最晚账期

concat(substr(min(c.sheet_start_date),1,4),substr(min(c.sheet_start_date),6,2)) start_date, --取最早账期年月

concat(substr(max(c.sheet_end_date),1,4),substr(max(c.sheet_end_date),6,2)) end_date  --取最晚账期年月

from receive_data.CVP_V_GET_BATCH_NBMS a

left join receive_data.cvp_v_get_sheet_nbms c on a.batch_no=c.batch_no 

where a.deleted_flag='0'

and c.deleted_flag='0'

and a.batch_no='INBMS202005290100'

group by c.batch_no ) t

到此,我们就把客户提的需求完美解决。

前台展现:这里使用的是帆软(FineReport)

顺便秀一下我的个人证书:

关注我,带你体验更多真实案例。

  • 9
    点赞
  • 29
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

AllenGd

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值