大数据千亿级离线数仓项目第三天 维度数据分析与业务开发

86 篇文章 47 订阅

千亿级数仓项目第03天讲义

课程目标

  • 能够使用kettle生成日期维度数据

  • 能够完成订单时间维度分析

  • 能够完成订单地域、分类维度业务开发

  • 能够完成用户订单行为业务开发

指标开发-时间维度分析业务开发

需求分析

  • 统计2019年期间每个季度的销售订单笔数、订单总额
  • 统计2019年期间每个月的销售订单笔数、订单总额
  • 统计2019年期间每周一到周日的销售订单笔数、订单总额
  • 统计2019年期间国家法定节假日、休息日、工作日的订单笔数、订单总额

生成日期维度数据

经过需求分析我们发现指标统计需要的日期类型比较多如果在事实表的时间字段上做文章,比较复杂,甚至有些根本无法获取到比如节假日、休息日、工作日信息。所以我们可以考虑提前准备一张日期维度表,准备好指标统计的日期类型。

利用我们前面学习的kettle工具生成日期维度数据,需要借助于kettle的js脚本组件。

kettle实现思路:

  • 日期使用 Canlendar、SimpleDateFormat,编写Javascript脚本实现
  • 国家法定节假日实现,需要往http://timor.tech/api/holiday/info/2018-03-02年月日,发送请求,获取结果,以下为该REST API接口简介
1、接口地址:http://timor.tech/api/holiday/info/数字日期,支持https协议。
2、返回数据:正常工作日对应结果为 0,,休息日对应结果为 1, 法定节假日对应结果为 2
3、节假日数据说明:本接口包含2017年起的中国法定节假日数据,数据来源国务院发布的公告,每年更新1次,确保数据最新
4、示例:
http://timor.tech/api/holiday/info/2018-03-02
返回数据:
{"code":0,"type":{"type":0,"name":"周五","week":5},"holiday":null}

{
  "code": 0,              // 0服务正常。-1服务出错
  "type": {
    "type": enum(0, 1, 2), // 节假日类型,分别表示 工作日、周末、节日。
    "name": "周六",         // 节假日类型中文名,可能值为 周一 至 周日、假期的名字、某某调休。
    "week": enum(1 - 7)    // 一周中的第几天。值为 1 - 7,分别表示 周一 至 周日。
  },
  "holiday": {
    "holiday": false,     // true表示是节假日,false表示是调休
    "name": "国庆前调休",  // 节假日的中文名。如果是调休,则是调休的中文名,例如'国庆前调休'
    "wage": 1,            // 薪资倍数,1表示是1倍工资
    "after": false,       // 只在调休下有该字段。true表示放完假后调休,false表示先调休再放假
    "target": '国庆节'     // 只在调休下有该字段。表示调休的节假日
  }
}
类型示例值中文名
date_keystring20000101代理键
date_valuestring2000-01-01年-月-日
day_in_yearstring1当年的第几天
day_in_monthstring1当月的第几天
is_first_day_in_monthstringy是否月的第一天
is_last_day_in_monthstringn是否月的最后一天
weekdaystring星期一星期
week_in_monthstring1月的第几个星期
is_first_day_in_weekstringy、n是否周一
is_dayoffstringy、n是否休息日
is_workdaystringy、n是否工作日
is_holidaystringy、n是否国家法定节假日
date_typestringworkday、weekend、holiday
工作日、周末、法定节假日
日期类型
工作日:workday
国家法定节假日:holiday

休息日:weekend
month_numberstring1、2、…、12月份
yearstring2000年份
quarter_namestringQ1季度名称
quarter_numberstring1季度
year_quarterstring2000-Q1年-季度
year_month_numberstring2000-01年-月份

参考数据:

dim_date.date_valuedim_date.day_in_monthdim_date.is_first_day_in_monthdim_date.is_last_day_in_monthdim_date.weekdaydim_date.week_in_monthdim_date.is_first_day_in_weekdim_date.is_dayoffdim_date.is_workdaydim_date.is_holidaydim_date.date_typedim_date.month_numberdim_date.yeardim_date.year_month_numberdim_date.quarter_namedim_date.quarter_numberdim_date.year_quarter
2019-1-11yn21nnnyholiday12019Jan-19Q112019-Q1
2019-1-33nn41nnynworkday12019Jan-19Q112019-Q1
2019-1-44nn51nnynworkday12019Jan-19Q112019-Q1
2019-1-55nn61nynnweekend12019Jan-19Q112019-Q1
2019-1-22nn31nnynworkday12019Jan-19Q112019-Q1
2019-1-66nn72nynnweekend12019Jan-19Q112019-Q1
2019-1-77nn12ynynworkday12019Jan-19Q112019-Q1
2019-1-88nn22nnynworkday12019Jan-19Q112019-Q1
2019-1-99nn32nnynworkday12019Jan-19Q112019-Q1
2019-1-1010nn42nnynworkday12019Jan-19Q112019-Q1
Hive创建日期维度表

参考SQL语句:

DROP TABLE IF EXISTS `itcast_dw`.`dim_date`;
CREATE TABLE `itcast_dw`.`dim_date`(
    `date_key` string,
    `date_value` string,
    `day_in_year` string,
    `day_in_month` string,
    `is_first_day_in_month` string,
    `is_last_day_in_month` string,
    `weekday` string,
    `week_in_month` string,
    `is_first_day_in_week` string,
    `is_dayoff` string,
    `is_workday` string,
    `is_holiday` string,
    `date_type` string,
    `month_number` string,                        
    `year` string,
    `year_month_number` string,
    `quarter_name` string,
    `quarter_number` string,
    `year_quarter` string
)
stored as parquet TBLPROPERTIES('parquet.compression'='SNAPPY');
使用Kettle构建以下组件结构图
  • 生成记录
  • 增加序列
  • Javascript代码
  • Http Client组件(5个并行度)
  • JSON Input
  • Switch/case
  • 字段映射为常量
    • 工作日
    • 法定节假日
    • 节假日调休补班
    • 休息日
  • 字段选择
  • Parquet Output
kettle生成自定义日期维度数据组件图
image-20191224021700500
组件配置

1、生成记录组件

生成记录组件-配置限制为 365
image-20191107151442046
配置增加序列组件

3、Javascript组件,生成基本时间维度数据

 //初始日期
        var initStr = "2019-01-01";
        //转为日期对象
        var dateFormat = new java.text.SimpleDateFormat("yyyy-MM-dd");
        var initDate = dateFormat.parse(initStr);
        //获取日历对象
        var calendar = java.util.Calendar.getInstance();
        //设置日历的时间就是我们的初始时间
        calendar.setTime(initDate);
        //使用日历加减天,变换日期
        calendar.add(java.util.Calendar.DAY_OF_MONTH, setup);
        //获取增加一天之后的日期
        var newDate = calendar.getTime();
        var date_value = dateFormat.format(newDate);
        dateFormat.applyPattern("yyyyMMdd");
        var date_key = dateFormat.format(newDate);
        //一年当中第几天?
       var day_in_year = calendar.get(java.util.Calendar.DAY_OF_YEAR)+"";
        //一月当中第几天
        var day_in_month = calendar.get(java.util.Calendar.DAY_OF_MONTH)+"";
        //是否是月的第一天
        var is_first_day_in_month = "n";
        if (day_in_month.equals("1")) {
            is_first_day_in_month = "y";
        }
        //是否是月的最后一天
        var is_last_day_in_month = "n";
        calendar.add(java.util.Calendar.DAY_OF_MONTH, 1);
        var nextDay = calendar.get(java.util.Calendar.DAY_OF_MONTH) + "";
        if (nextDay.equals("1")) {
            is_last_day_in_month = "y";
        }
        //注意增加之后的日历对象需要再减回去
        calendar.add(java.util.Calendar.DAY_OF_MONTH, -1);

        //星期几
       var weekday= (calendar.get(java.util.Calendar.DAY_OF_WEEK)-1)+"";
       //星期日:1,星期六是7
        if(weekday.equals("0")){
            weekday="7";
        }
        //获取月中第几周,先减一保证时间正确(我们的规则)
        calendar.add(java.util.Calendar.DAY_OF_MONTH, -1);
        var week_in_month = calendar.get(java.util.Calendar.WEEK_OF_MONTH)+"";
        //再增加回去
        calendar.add(java.util.Calendar.DAY_OF_MONTH, 1);
        //是否是周一
        var is_first_day_in_week = "n";
        if (weekday.equals("1")) {
            is_first_day_in_week = "y";
        }

        var is_dayoff = "n";
        var is_workday = "n";
        var is_holiday = "n";
        var date_type = "workday";
        //定义查询的url
        var holidayUrl = "http://timor.tech/api/holiday/info/" + date_value;
        //月份
        dateFormat.applyPattern("MM");
        var month_number = dateFormat.format(newDate);

        dateFormat.applyPattern("yyyy");
        var year = dateFormat.format(newDate);
        var year_month_number = year + "-" + month_number;
        //季度
        var quarter_name = "";
        var quarter_number = "";
        var year_quarter = "";
		//按照我们的规则,否则默认是国外规则,星期日属于下个周
        switch (calendar.get(java.util.Calendar.MONTH) + 1) {
            case 1:
            case 2:
            case 3:
                quarter_name = "Q1";
                quarter_number = "1";
                year_quarter = year + "-" + quarter_name;
                break;
            case 4:
            case 5:
            case 6:
                quarter_name = "Q2";
                quarter_number = "2";
                year_quarter = year + "-" + quarter_name;
                break;
            case 7:
            case 8:
            case 9:
                quarter_name = "Q3";
                quarter_number = "3";
                year_quarter = year + "-" + quarter_name;
                break;
            case 10:
            case 11:
            case 12:
                quarter_name = "Q4";
                quarter_number = "4";
                year_quarter = year + "-" + quarter_name;
                break;

        }

4、配置HttpClient组件

配置HttpClient组件

5、配置JSON Input组件

配置JSON Input组件
image-20191107152424848

6、配置 switch/case 组件

配置 switch/case 组件
image-20191224021208204

7、配置工作日、法定节假日、休息日等组件

配置工作日、法定节假日、休息日等组件

8、配置字段选择组件

配置字段选择组件
image-20191107152616037

9、配置parquet输出

配置parquet输出
image-20191224021025161
验证数据
select * from itcast_dw.dim_date limit 10;

创建ads层数据表

drop table if exists itcast_ads.ads_order_date;
create table itcast_ads.ads_order_date(
    date_type string,   -- 时间维度类型
    date_val string,   -- 时间维度值
    order_count bigint, -- 订单笔数
    order_money double  -- 订单交易额
)
partitioned by (dt string)
row format delimited fields terminated by '\001' stored as TEXTFILE;
统计半年期间每个季度的销售订单笔数、订单总额

时间维度类型:

  • 1 → 季度
  • 2 → 月度
  • 3 → 星期
  • 4 → 日期类型(节假日、休息日、工作日)

统计2019年期间每个季度的销售订单笔数、订单总额

insert overwrite table itcast_ads.ads_order_date partition(dt='20190909')
select
    '1' as date_type,                                      -- 时间维度类型,1表示季度
    t1.year_quarter as date_val,                           -- 季度名称
    count(orderid),                            -- 订单总数
    sum(totalMoney)                         -- 订单总额
from 
    itcast_dw.dim_date t1
    left join
    (select * from itcast_dw.fact_orders where dt >='20190101' and dw_end_date='9999-12-31') t2
    on t1.date_value = date_format(createtime, 'yyyy-MM-dd')
group by t1.year_quarter
order by t1.year_quarter;

-- 测试
select * from itcast_ads.ads_order_date where dt='20190909' and date_type = '1';
统计2019年期间每个月的销售订单笔数、订单总额
insert into table itcast_ads.ads_order_date partition(dt='20190909')
select
    '2' as date_type, --2表示月类型
    t1.year_month_number as date_val,
    count(orderid),                  -- 订单总数
    sum(totalMoney)                         -- 订单总额
from
    itcast_dw.dim_date t1
    left join
     (select * from itcast_dw.fact_orders where dt >='20190101' and dw_end_date='9999-12-31') t2
     on t1.date_value = date_format(createtime, 'yyyy-MM-dd')
 group by 
    t1.year_month_number
order by t1.year_month_number;

-- 测试
select * from itcast_ads.ads_order_date where dt='20190909' and date_type = '2';
统计2019年期间每周一到周日的销售订单笔数、订单总额
insert into table itcast_ads.ads_order_date partition(dt='20190909')
select
    '3' as date_type,
    t1.weekday as date_val,
     count(orderid),                  -- 订单总数
    sum(totalMoney)                         -- 订单总额
from 
    itcast_dw.dim_date t1
    left join
    (select * from itcast_dw.fact_orders where dt >='20190101' and dw_end_date='9999-12-31') t2
     on t1.date_value = date_format(createtime, 'yyyy-MM-dd')
group by
    t1.weekday
order by t1.weekday;

-- 测试
select * from itcast_ads.ads_order_date where dt='20190909' and date_type = '3';
统计2019年期间国家法定节假日、休息日、工作日的订单笔数、订单总额
insert into table itcast_ads.ads_order_date partition(dt='20190909')
select
    '4' as date_type,
    t1.date_type as date_val,
    count(orderid) as order_count,                  -- 订单总数
    sum(totalMoney) as order_money                        -- 订单总额
from 
    itcast_dw.dim_date t1
    left join
    (select * from itcast_dw.fact_orders where dt >='20190101' and dw_end_date='9999-12-31') t2
     on t1.date_value = date_format(createtime, 'yyyy-MM-dd')
group by
    t1.date_type
order by t1.date_type;

-- 测试
select * from itcast_ads.ads_order_date where dt='20190909' and date_type = '4';

指标开发-订单分析地域、分类维度分析业务开发

需求分析

集团总公司分为很多的分公司(销售事业部)

分公司
唐山市分公司
邯郸市分公司
邢台市分公司
保定市分公司
承德市分公司
沧州市分公司
廊坊市分公司
衡水市分公司

为了公司的经营需要,公司需要定期检查各个分公司的经营指标完成情况,运营部门提出数据分析需求:

  • 交易金额
  • 交易笔数
  • 微信交易笔数
  • 支付宝交易笔数

通过以上指标数据,公司的运营人员能够了解到各个分公司的经营情况,在出现经营问题时,能够及时发现,并根据问题,去寻找问题原因。

大区城市一级分类二级分类微信交易笔数支付宝交易笔数交易笔数交易金额
099990010013411238581223992262348998

维度如下:

  • 商品分类
  • 行政区域

要求:最终可以根据不同大区、不同城市、不能分类级别查询交易数据,也就是要求支持不同维度的组合查询。

创建 dw 层数据表

该层主要创建维度表与事实表。为了便于识别,维度表增加 dim_ 前缀。事实表增加 fact_ 前缀:

fact_order_goods_wide
-- 1. 创建订单明细事实表
DROP TABLE IF EXISTS itcast_dw.fact_order_goods_wide;
create table if not exists itcast_dw.fact_order_goods_wide(
    order_id string,            -- 订单id
    goods_cat_3d_id string,     -- 商品三级分类id
    shop_id string,             -- 店铺id
    payment_id string,          -- 订单支付方式
    goods_num bigint,           -- 商品数量
    pay_money double,           -- 订单明细金额
    paytime string              -- 订单时间
)
partitioned by (dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');
dim_goods_cat
-- 表创建
DROP TABLE IF EXISTS itcast_dw.dim_goods_cat;
create table if not exists itcast_dw.dim_goods_cat(
    cat_3d_id string,                   -- 三级商品分类id
    cat_3d_name string,                 -- 三级商品分类名称
    cat_2d_id string,                    -- 二级商品分类Id
    cat_2d_name string,                  -- 二级商品分类名称
    cat_1t_id string,                   -- 一级商品分类id
    cat_1t_name string                  -- 一级商品分类名称
)
partitioned by (dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');
dim_shops
-- 创建dw层商铺维度表
DROP TABLE IF EXISTS itcast_dw.dim_shops;
-- 1. 店铺维度表
create table if not exists itcast_dw.dim_shops(
    shop_id string,             -- 店铺id,
    shop_name string,           -- 店铺名称
    city_id string,             -- 店铺所属城市组织机构id
    city_name string,           -- 店铺所属城市组织机构名称
    region_id string,         -- 店铺所属省份组织机构id
    region_name string        -- 店铺所属省份组织机构名称
)
partitioned by (dt string)
STORED AS PARQUET  TBLPROPERTIES('parquet.compression'='SNAPPY');
dim_payment
-- 1. 创建支付方式维度表
drop table if exists itcast_dw.dim_payment;
create table if not exists itcast_dw.dim_payment(
    payment_id string,          -- 支付方式id
    payment_name string         -- 支付方式名称
)
partitioned by (dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

创建 ads 层数据表

-- 创建ads(数据集市层)订单分析表
DROP TABLE IF EXISTS itcast_ads.ads_trade_order;
create table if not exists itcast_ads.ads_trade_order(
    area_type string,               -- 区域范围:区域类型(全国、大区、城市)
    region_name string,             -- 区域名称
    city_name string,               -- 城市名称
    category_type string,           --分类类型(一级、二级)
    category_1st_name string,       -- 一级分类名称
    category_2d_name string,        -- 二级分类名称
    payment_name string,            -- 支付方式(所有、微信、支付宝、...)
    total_count bigint,             -- 订单数量
    total_goods_num bigint,         -- 商品数量
    total_money double              -- 支付金额
)
partitioned by (dt string)
row format delimited fields terminated by '\001' stored as TEXTFILE;

ods层数据至dw层

fact_orders

直接查询之前准备好的拉链表:

select * from itcast_dw.fact_orders where  dt ='20190909' and dw_end_date = '9999-12-31' limit 5;
dim_goods

直接查询之前准备好的拉链表:

-- 测试
select * from itcast_dw.dim_goods where dw_end_date = '9999-12-31' limit 5;

dw层数据维度拉宽

数据维度表拉宽
区域店铺维度数据拉宽

店铺表(itcast_shops)与组织机构表(itcast_org)进行关联

列名称数据类型数据来源
shop_idstringitcast_shops
shop_namestringitcast_shops
city_idstringitcast_org
city_namestringitcast_org
region_idstringitcast_org
region_namestringitcast_org

参考代码:

-- 2. 加载订单组织机构维度表数据
insert overwrite table itcast_dw.dim_shops partition(dt='20190909')
select
    t1.shopid as shop_id,           -- 店铺id
    t1.shopname as shop_name,       -- 店铺名称
    t2.orgid as city_id,            -- 城市组织机构id
    t2.orgname as city_name,        -- 城市组织机构名称
    t3.orgid as region_id,          -- 区域组织机构id
    t3.orgname as region_name       -- 区域组织机构名称
from
    (select shopid, shopname, areaid from itcast_ods.itcast_shops where dt='20190909') t1                         -- 店铺数据
    left join
    (select orgid, parentid, orgname, orglevel from itcast_ods.itcast_org where orglevel=2 and dt='20190909') t2    -- 城市组织机构数据
    on t1.areaid = t2.orgid
    left join
    (select orgid, parentid, orgname, orglevel from itcast_ods.itcast_org where orglevel=1 and dt='20190909') t3    -- 省份组织机构数据
    on t2.parentid = t3.orgid;
    -- 测试数据
    select * from itcast_dw.dim_org limit 15;
商品分类维度数据拉宽

商品表维度数据拉宽

参考代码:

列名称数据类型数据来源
cat_3d_idstringitcast_goods_cats
cat_3d_namestringitcast_goods_cats
cat_2d_idstringitcast_goods_cats
cat_2d_namestringitcast_goods_cats
cat_1t_idstringitcast_goods_cats
cat_1t_namestringitcast_goods_cats
insert overwrite table itcast_dw.dim_goods_cat partition(dt='20190909')
select 
    t3.catid as cat_3d_id,          -- 三级分类id
    t3.catname as cat_3d_name,      -- 三级分类名称
    t2.catid as cat_2d_id,          -- 二级分类id
    t2.catname as cat_2d_name,      -- 二级分类名称
    t1.catid as cat_1t_id,          -- 一级分类id
    t1.catname as cat_1t_name       -- 一级分类名称
from 
    (select catid, catname, parentid from itcast_ods.itcast_goods_cats where cat_level=3 and dt='20190909') t3 -- 商品三级分类数据
    left join
    (select catid, catname, parentid from itcast_ods.itcast_goods_cats where cat_level=2 and dt='20190909') t2 -- 商品二级分类数据
    on t3.parentid = t2.catid
    left join
    (select catid, catname, parentid from itcast_ods.itcast_goods_cats where cat_level=1 and dt='20190909') t1 -- 商品一级分类数据
    on t2.parentid = t1.catid;
    
-- 测试数据
select * from itcast_dw.dim_goods_cat limit 5;
支付方式维度表拉宽
列名称数据类型数据来源
payment_idstringitcast_payments
payment_namestringitcast_payments
-- 2. 加载支付方式维度数据
-- 需要额外添加一行数据 0 -> 其他
insert overwrite table itcast_dw.dim_payment partition(dt='20190909')
select 
    t1.id as payment_id,        -- 支付方式id
    t1.payName as payment_name  -- 支付方式名称
from 
    (select id, payName from itcast_ods.itcast_payments where dt='20190909') t1;

    -- 测试查询支付方式维度数据
select * from itcast_dw.dim_payment limit 5;
事实表拉宽
列名称数据类型数据来源
order_idstringitcast_orders
goods_cat_3d_idstringdim_goods
shop_idstringdim_goods
payment_idstringitcast_orders
goods_numbigintitcast_order_goods
pay_moneydoubleitcast_orders
paytimestringitcast_orders
-- 2. 拉宽订单明细事实表
insert overwrite table itcast_dw.fact_order_goods_wide partition(dt='20190909')
select
    t1.orderid as order_id,
    t3.goodscatid as goods_cat_3d_id,
    t3.shopid as shop_id,
    t1.paytype as payment_id,
    t2.goodsnum as goods_num,
    t2.goodsnum*t2.payprice  as pay_money,
    t1.paytime as paytime
from
    (select orderid, paytype, paytime from itcast_dw.fact_orders where dt='20190909') t1                     -- 订单表数据
    left join
    (select orderid, goodsid, goodsnum, payprice from itcast_ods.itcast_order_goods where dt='20190909') t2     -- 订单明细数
    on t1.orderid = t2.orderid
    left join
    (select goodsid, shopid, goodscatid from itcast_dw.dim_goods where dw_end_date = '9999-12-31') t3           -- 商品数量
    on t2.goodsid = t3.goodsid;
    
-- 测试数据
select * from itcast_dw.fact_order_goods_wide limit 5;

指标计算

-- 指标统计
-- 创建 ads 层结果表
create table if not exists itcast_ads.ads_trade_order(
    area_type string,               -- 区域范围:区域类型(全国、大区、城市)
    region_name string,             -- 区域名称
    city_name string,               -- 城市名称
    category_type string,           --分类类型(一级、二级)
    category_1st_name string,       -- 一级分类名称
    category_2d_name string,        -- 二级分类名称
    payment_name string,            -- 支付方式(所有、微信、支付宝、...)
    total_count bigint,             -- 订单数量
    total_goods_num bigint,         -- 商品数量
    total_money double              -- 支付金额
)
partitioned by (dt string)
row format delimited fields terminated by '\001' stored as TEXTFILE;

指标计算汇总计算

维度数据:

  • 区域维度数据
  • 商品分类维度数据。

根据需求,需要依据这两个维度进行汇总。最终在业务系统中展示如下:

数据展示示例

公司高管拥有查询所有数据的权限,省份不选择时,看到的为全国范围内的数据,如果指定省份(本次开发的系统为大区)时,看到的为指定省份数据,如果指定某个城市,则看到的是指定城市的数据,同理商品分类维度类似推算。

由于区域维度为分级维度,分为全国、大区、城市三级。通常情况下,我们会计算出城市维度,然后对城市维度数据进行汇总计算得出大区维度和全国维度数据,实际上,由于一个订单,由于会有多个商品,这些商品属于不同的商店,这些商店又属于不同的区域,故不能简单累加计算。

ogidorderidshopidcityidregionidgoodsiddt
120242532100050北京华北10011120190905
120243532100058石家庄华北10017620190905
120244532100062上海华东10010520190905

根据此订单信息分析,orderid为532的订单,实际上仅为一个订单,该订单共计3件商品,分别属于不同的商店,这些商店分属北京、石家庄、上海城市。

在区域维度计算时。精确到城市维度数据计算应该如下:

大区城市订单量
华北北京1
华北石家庄1
华东上海1

精确到大区维度数据计算应该如下:

大区订单量
华北1
华东1

精确到全国维度数据计算时结果数据应该如下:

区域订单量
全国1

通过分析,发现如果使用已经计算过的城市维度的数据汇总至大区时,华北大区的订单量会被计算为2。故不能简单累加。每个区域级别的数据需要单独计算,同理在商品分类维度中,也有同样的情况。

全国、无商品分类维度的交易信息

由于需求中需要列出支付宝支付笔数,微信交易笔数与总交易笔数,根据数据信息,支付共计分为支付宝交易、微信交易、现金交易、其他交易和未知支付类型(支付方式,0:未知;1:支付宝,2:微信;3、现金;4、其他)

故每个维度的数据计算时,除按照支付类型进行分组获取各种支付类型支付的数据外,还要获取不分任何类型的数据。

需求:

– 1、全国、无商品分类不分支付类型维度分析

– 2、 全国、无商品分类分支付类型维度分析

注意事项:

第一个INSERT导入数据到数据集市,使用overwrite,第二个则应该使用into

计算如下:

-- 1、全国、无商品分类不分支付类型维度分析
insert overwrite table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '全国' as area_type,
    '' as region_name,
    '' as city_name,
    '' as category_type,
    '' as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t1.order_id) as total_count,--订单笔数需要去重
    sum(t1.goods_num) as total_goods_num,--总的商品数量
    sum(t1.pay_money ) as total_money
from
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t1;
    
-- 2、测试获取数据
select * from itcast_ads.ads_trade_order t where t.area_type = '全国';
-- 全国、无商品分类分支付类型维度分析
-- 注意:要使用 insert into,否则数据会被覆写
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '全国' as area_type,
    '' as region_name,
    '' as city_name,
    '' as category_type,
    '' as category_1st_name,
    '' as category_2d_name,
    t1.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
(select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
left join
    (select * from itcast_dw.dim_payment where dt='20190909') t1   
    on t1.payment_id = t2.payment_id
group by t1.payment_name;

-- 测试查询数据
 select * from itcast_ads.ads_trade_order t1 where t1.payment_name != '所有' limit 5;

全国、一级商品分类维度交易信息

需求:

1、获取全国、一级商品分类维度的不分支付类型数据

2、获取全国、一级商品分类维度的分支付类型数据

-- 获取全国、一级商品分类维度的分支付类型数据
 -- 3.1 获取全国、一级商品分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '全国' as area_type,
    '' as region_name,
    '' as city_name,
    '一级分类' as category_type,
    t1.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(case when t2.goods_num is null
       then 0
       else t2.goods_num
        end
       ) as total_goods_num,
    sum(case when t2.pay_money is null
        then 0
        else t2.pay_money
        end
        ) as total_money
from
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t1
    left join 
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.cat_3d_id = t2.goods_cat_3d_id
group by t1.cat_1t_name;
-- 测试
select * from itcast_ads.ads_trade_order where dt='20190909' and category_type = '一级分类' and payment_name = '所有';
 -- 3.2 获取全国、一级商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '全国' as area_type,
    '' as region_name,
    '' as city_name,
    '一级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    t1.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_payment where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.payment_id = t2.payment_id
    left join 
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
group by t1.payment_name, t3.cat_1t_name;

-- 测试查询数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '全国' and category_type = '一级分类' and payment_name != '所有';
全国、二级商品分类维度交易信息

需求:

1、获取全国、二级商品分类维度的分支付类型数据

2、获取全国、二级商品不分类维度的分支付类型数据

-- 全国、二级商品分类维度分析
-- 1. 获取全国、二级商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '全国' as area_type,
    '' as region_name,
    '' as city_name,
    '二级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    t3.cat_2d_name as category_2d_name,
    t1.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_payment where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.payment_id = t2.payment_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
group by t1.payment_name, t3.cat_1t_name, t3.cat_2d_name;

-- 测试查询数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '全国' and category_type = '二级分类' limit 10;

-- 2. 获取全国、二级商品不分类维度的不分支付类型数据
insert overwrite table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '全国' as area_type,
    '' as region_name,
    '' as city_name,
    '二级分类' as category_type,
    t2.cat_1t_name as category_1st_name,
    t2.cat_2d_name as category_2d_name,
    '所有' as payment_name,
    count(distinct t1.order_id) as total_count,
    sum(t1.goods_num) as total_goods_num,
    sum(t1.pay_money) as total_money
from
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t1
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t2
    on t1.goods_cat_3d_id = t2.cat_3d_id
group by t2.cat_1t_name, t2.cat_2d_name;

-- 测试查询数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '全国' and category_type = '二级分类' and payment_name = '所有' limit 10;
大区、无商品分类维度的交易信息

需求:

1、获取大区、无商品分类维度的分支付类型数据

2、获取大区、无商品分类维度的不分支付类型数据

参考代码:

-- 大区、无商品分类维度的交易信息
-- 1、获取大区、无商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '大区' as area_type,
    t1.region_name as region_name,
    '' as city_name,
    '所有' as category_type,
    '' as category_1st_name,
    '' as category_2d_name,
    t3.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_payment where dt='20190909') t3
    on t2.payment_id = t3.payment_id
group by t1.region_name, t3.payment_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '大区' and category_type = '所有';

-- 2、获取大区、不分商品不分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '大区' as area_type,
    t1.region_name as region_name,
    '' as city_name,
    '所有' as category_type,
    '' as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
group by t1.region_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '大区' and category_type = '所有' and payment_name = '所有';
大区、一级商品分类维度交易信息

1、获取大区、一级商品分类维度的分支付类型数据

2、获取大区、一级商品分类维度的不分支付类型数据

参考代码:

-- 3. 大区、一级商品分类维度交易信息
-- 3.1、获取大区、一级商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '大区' as area_type,
    t1.region_name as region_name,
    '' as city_name,
    '一级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    t4.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shop where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
    left join
    (select * from itcast_dw.dim_payment where dt='20190909') t4
    on t2.payment_id = t4.payment_id
group by t1.region_name, t3.cat_1t_name, t4.payment_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '大区' and category_type = '一级分类' and payment_name != '所有';


-- 3.2、获取大区、一级商品分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '大区' as area_type,
    t1.region_name as region_name,
    '' as city_name,
    '一级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shop where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
group by t1.region_name, t3.cat_1t_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '大区' and category_type = '一级分类' and payment_name = '所有';
大区、二级商品分类维度交易信息

1、获取大区、二级商品分类维度的分支付类型数据

2、获取大区、二级商品分类维度的不分支付类型数据

参考代码:

-- 大区、二级商品分类维度交易信息
--- 1、获取大区、二级商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '大区' as area_type,
    t1.region_name as region_name,
    '' as city_name,
    '二级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    t4.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
    left join
    (select * from itcast_dw.dim_payment where dt='20190909') t4
    on t2.payment_id = t4.payment_id
group by t1.region_name, t3.cat_1t_name, t3.cat_2d_name, t4.payment_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '大区' and category_type = '二级分类' limit 10;

-- 2、获取大区、二级商品分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '大区' as area_type,
    t1.region_name as region_name,
    '' as city_name,
    '二级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
group by t1.region_name, t3.cat_1t_name, t3.cat_2d_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '大区' and category_type = '二级分类' and payment_name='所有';
城市、无商品分类维度交易信息

1、 获取城市、无商品分类维度的分支付类型数据

2、获取城市、无商品分类维度的不分支付类型数据

参考代码:

-- #################
-- 城市、无商品分类维度交易信息
-- #################
-- 1. 获取城市、无商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '城市' as area_type,
    t1.region_name as region_name,
    t1.city_name as city_name,
    '所有' as category_type,
    '' as category_1st_name,
    '' as category_2d_name,
    t3.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_payment where dt='20190909') t3
    on t2.payment_id = t3.payment_id
group by t1.region_name, t1.city_name, t3.payment_name;

-- 获取测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '城市' and category_type = '所有';

-- 2、获取城市、无商品分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '城市' as area_type,
    t1.region_name as region_name,
    t1.city_name as city_name,
    '所有' as category_type,
    '' as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
group by t1.region_name, t1.city_name;


-- 获取测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '城市' and category_type = '所有' and payment_name = '所有';
城市、一级商品分类维度交易信息

1、获取城市、一级商品分类维度的不分支付类型数据

2、获取城市、一级商品分类维度的分支付类型数据

参考代码:

-- #### 城市、一级商品分类维度交易信息
-- #################
-- 1、获取城市、一级商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '城市' as area_type,
    t1.region_name as region_name,
    t1.city_name as city_name,
    '一级分类' as category_type,
    t4.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    t3.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_payment where dt='20190909') t3
    on t2.payment_id = t3.payment_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t4
    on t2.goods_cat_3d_id = t4.cat_3d_id
group by t1.region_name, t1.city_name, t3.payment_name, t4.cat_1t_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '城市' and category_type = '一级分类';

-- 2、获取城市、一级商品分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '城市' as area_type,
    t1.region_name as region_name,
    t1.city_name as city_name,
    '一级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
group by t1.region_name, t1.city_name, t3.cat_1t_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '城市' and category_type = '一级分类' and payment_name = '所有';
城市、二级商品分类维度交易信息

1、获取城市、二级商品分类维度的不分支付类型数据

2、获取城市、二级商品分类维度的分支付类型数据

参考代码:

-- #### 城市、二级商品分类维度交易信息
-- #################
-- 1、获取城市、二级商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '城市' as area_type,
    t1.region_name as region_name,
    t1.city_name as city_name,
    '二级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    t3.cat_2d_name as category_2d_name,
    t4.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
    left join
    (select * from itcast_dw.dim_payment where dt='20190909') t4
    on t2.payment_id = t4.payment_id
group by t1.region_name, t1.city_name, t3.cat_1t_name, t3.cat_2d_name, t4.payment_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '城市' and category_type = '二级分类';

-- 2、获取城市、二级商品分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '城市' as area_type,
    t1.region_name as region_name,
    t1.city_name as city_name,
    '二级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    t3.cat_2d_name as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
group by t1.region_name, t1.city_name, t3.cat_1t_name, t3.cat_2d_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '城市' and category_type = '二级分类' and payment_name = '所有';

指标开发-用户订单指标业务开发

需求分析

  • 电商平台往往需要根据用户的购买数据来分析用户的行为
  • 根据用户的消费行为习惯,给运营部门提供用户分析数据指标
  • 主要围绕订单表开展分析
基于用户的订单指标
统计指标所需字段
第一次消费时间支付时间
最近一次消费时间支付时间
首单距今时间支付时间
尾单距今时间支付时间
最小消费金额订单支付金额
最大消费金额订单支付金额
累计消费次数(不含退拒)订单id、订单状态
累计消费金额(不含退拒)订单支付金额、订单状态
近30天购买次数(不含退拒)是否为近30天、订单id、订单状态
近30天购买金额(不含退拒)是否为近30天、订单支付金额、订单状态
近30天购买次数(含退拒)是否为近30天、订单id
近30天购买金额(含退拒)是否为近30天、订单支付金额
客单价(含退拒)订单支付金额、客户订单数
客单价(不含退拒)订单支付金额、客户订单数、订单状态
近60天客单价(含退拒)是否为近60天、订单支付金额、客户订单数
近60天客单价(不含退拒)是否为近60天、订单支付金额、订单状态
常用收货地址用户收货地址
常用支付方式支付方式
学校下单总数用户收货地址
单位下单总数用户收货地址
家里下单总数用户收货地址
上午下单总数订单id、是否为上午
下午下单总数订单id、是否为下午
晚上下单总数订单id、是否为晚上

订单状态

  • 交易未成功(拒收/reject)

    • 订单未送达、送达后未签收或签收后办理了退货
    • 订单状态显示「交易未成功」
  • 取消(退货/return)

    • 订单中的商品缺货、订单超出了订单保留期限,或订单进行了取消操作
    • 订单显示「取消」状态
  • 等待支付

    • 用户选择 「网上支付」和 「银行转账」这两种付款方式
    • 在暂未收到款项时,订单会显示「等待支付」
  • 等待预售商品到货

    • 订购的商品为预售商品
    • 商品到货前订单会显示 「等待预售商品到货」
  • 正在配货

    • 订单正在库房配货。
  • 等待移仓

    • 订单中的商品当地库房缺货
    • 将从北京仓库调货至当地仓库,订单显示「等待移仓」
  • 正在移仓

    • 订单中的商品正从北京仓库发往当地仓库,订单会显示「正在移仓」
  • 已配货

    • 此订单已完成配货,正在等待发货
  • 已发货

    • 订单已从库房发出,正在配送途中,订单显示「已发货」
  • 已送达

    1. 已收到商品并在“我的订单”中进行了「收货反馈」,订单会显示「已送达」
    2. 未进行「收货反馈」操作,系统在发货后的20天默认变为「已送达」
    3. 订单状态为「已发货」,但物流配送信息显示「配送成功」时,系统默认态显示为「已送达」
    4. 国内平邮订单,未进行「确认收货」操作,系统在发货25天后默认收到商品,订单状态显示为「已送达」
    5. 海外订单,如果未进行「确认收货」操作,系统会在发货60天后默认您收到商品,订单状态显示为「已送达」
  • 交易成功

    • 若订单状态为」已送达」,且此状态后的15天内未发生退货,系统将默认变为「交易成功」

创建dw层表

1、创建itcast_dw.dim_user表

字段说明
userId用户id
loginName登录名
userSex性别
drop table if exists itcast_dw.dim_user;
create table itcast_dw.dim_user(
    userId          bigint,
    loginName       string,
    userSex         bigint
)
partitioned by (dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

2、创建dw层用户收货地址表 itcast_dw.dim_user_address

--创建dw层dim_user_address表
drop table if exists itcast_dw.dim_user_address;
create table itcast_dw.dim_user_address(
    addressId    bigint,
    userId       bigint,
    userName     string,
    otherName    string,
    userPhone    string,
    areaIdPath   string,
    areaId       bigint,
    userAddress  string,
    isDefault    bigint,
    dataFlag     bigint,
    createTime   string
)
partitioned by (dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

--从ods层itcast_user_address导出数据到dim_user_address表
insert overwrite table itcast_dw.dim_user_address partition(dt="20190909")
select 
addressId,
userId,
userName, 
otherName,
userPhone,
areaIdPath,
areaId,
userAddress,
isDefault,
dataFlag, 
createTime 
from itcast_ods.itcast_user_address where dt="20190909";

3、创建订单时间标志、地址标志宽表 fact_order_wide

字段名称说明
flag30近30天标志
flag60近60天标志
flag90近90天标志
flag180近180天标志
flagTimeBucket一天时间段标志(凌晨、早晨、上午…
othername地址标志(家里、学校、工作单位…)
-- 地址拉宽
drop table itcast_dw.fact_order_wide;
create table itcast_dw.fact_order_wide(
    orderId bigint,
    orderStatus bigint,
    payType bigint,
    userId bigint,
    userAddressId bigint,
    payTime string,
    payMoney double,
    createtime string,
    flag30 bigint,
    flag60 bigint,
    flag90 bigint,
    flag180 bigint,
    flagTimeBucket string,
    othername string
)
partitioned by(dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

订单宽表ETL处理

加载用户维度数据

参考代码:

insert overwrite table itcast_dw.dim_user partition(dt='20190909')
select 
    userId,   
    loginName,
    userSex  
from
    itcast_ods.itcast_users where dt ='20190909' ;
    
--验证
select * from itcast_dw.dim_user limit 10;
时间近30天、90天、180天、订单上午、下午时间拉宽

参考代码:

insert overwrite table itcast_dw.fact_order_wide partition(dt='20190909')
select
    t1.orderId,
    t1.orderStatus,
    t1.payType,
    t1.userId,
    t1.userAddressId,
    t1.payTime,
    t1.totalMoney,
    t1.createtime,
    --近30天
    case when datediff(current_timestamp, t1.createtime) <= 30
        then 1
    else 0
    end as flag_30,
     --近60天
    case when datediff(current_timestamp, t1.createtime) <= 60
        then 1
    else 0
    end as flag_60,
     --近90天
    case when datediff(current_timestamp, t1.createtime) <= 90
        then 1
    else 0
    end as flag_90, 
     --近180天
    case when datediff(current_timestamp, t1.createtime) <= 180
    then 1
    else 0
    end as flag_180,
    --所属时间段
    case when hour(t1.createtime) >= 0 and hour(t1.createtime) < 6
        then '凌晨'
    when hour(t1.createtime) >= 6 and hour(t1.createtime) < 12
        then '上午'
    when hour(t1.createtime) >= 12 and hour(t1.createtime) < 14
        then '中午'
    when hour(t1.createtime) >= 14 and hour(t1.createtime) < 18
        then '下午'
    else '晚上'
    end as flag_time_bucket,
    --家里单位学校
    t2.othername
from 
   (select orderId,orderStatus,payType,userId,userAddressId,payTime,totalMoney,createtime from  itcast_dw.fact_orders
where dw_end_date ='9999-12-31') t1  
left join
    (select * from itcast_dw.dim_user_address where dt='20190909') t2
    on t1.userAddressId = t2.addressId;
        
-- 测试语句
select * from itcast_dw.fact_order_wide limit 5; 

指标开发

指标开发 一
指标字段说明
第一次消费时间支付时间
最近一次消费时间支付时间
首单距今时间支付时间
尾单距今时间------分析用户什么时候来购买商品以及多久没有购买了支付时间
最小消费金额订单支付金额
最大消费金额订单支付金额

参考代码:

select
    t1.userid,
    t1.loginname,
    MIN(t2.payTime) as first_paytime, --首次下单时间
    MAX(t2.payTime) as lastest_paytime, --尾单时间
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t2.payTime)) as first_day_during_days,--首单距今
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t2.payTime)) as lastest_day_durning_days, --尾单距今
    MIN(t2.paymoney) as min_paymoney,
    MAX(t2.paymoney) as max_paymoney
from
   (select * from itcast_dw.fact_order_wide where dt='20190909') as t2
    left join
     (select * from itcast_dw.dim_user where dt='20190909') as t1
    on t1.userId = t2.userId
group by t1.userid,t1.loginname
limit 5;
指标开发二
指标
累计消费次数(不含退拒)
累计消费金额(不含退拒)
近30天购买次数(不含退拒)
近30天购买金额(不含退拒)
近30天购买次数(含退拒)
近30天购买金额(含退拒)----分析用户最近的消费能力

参考代码:

select
   t2.userid,
    t2.loginname,
    MIN(t1.payTime) as first_paytime,
    MAX(t1.payTime) as lastest_paytime,
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t1.payTime)) as first_day_during_days,
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t1.payTime)) as lastest_day_durning_days,
    MIN(t1.paymoney) as min_paymoney,
    MAX(t1.paymoney) as max_paymoney,
sum(
case when t1.orderstatus !=10 and t1.orderstatus !=11
    then 1
    else 0
    end
) as total_count_without_back,--累计消费次数不含退拒,
sum(case when t1.orderstatus !=10 and t1.orderstatus !=11
   then t1.paymoney
   else 0
    end
   ) as total_money_without_back, --累计消费金额不含退拒
   --累计近30天消费次数不含退拒
   sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then 1
       else 0
       end
      ) as total_count_without_back_30,
    --累计近30天消费金额不含退拒
     sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then t1.paymoney
       else 0
       end
      ) as total_money_without_back_30,
        --累计近30天消费次数含退拒
   sum(case when t1.flag30 =1 
      then 1
       else 0
       end
      ) as total_count_without_30,
    --累计近30天消费金额含退拒
     sum(case when t1.flag30 =1 
      then t1.paymoney
       else 0
       end
      ) as total_money_with_back_30
      
from 
(select * from itcast_dw.fact_order_wide where dt="20190909") t1 
left join  
(select * from itcast_dw.dim_user where dt="20190909") t2 on 
t1.userid=t2.userid
group by t2.userid,t2.loginname limit 5;


指标开发三
指标
客单价(含退拒)
客单价(不含退拒)
近60天客单价(含退拒)
近60天客单价(不含退拒)

参考代码:

-- 指标开发三
-- 1. 客单价(含退拒)
-- 2. 客单价(不含退拒)
-- 3. 近60天客单价(含退拒)-----分析用户消费水平
-- 4. 近60天客单价(不含退拒)

select
   t2.userid,
    t2.loginname,
    MIN(t1.payTime) as first_paytime,
    MAX(t1.payTime) as lastest_paytime,
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t1.payTime)) as first_day_during_days,
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t1.payTime)) as lastest_day_durning_days,
    MIN(t1.paymoney) as min_paymoney,
    MAX(t1.paymoney) as max_paymoney,
sum(
case when t1.orderstatus !=10 and t1.orderstatus !=11
    then 1
    else 0
    end
) as total_count_without_back,--累计消费次数不含退拒,
sum(case when t1.orderstatus !=10 and t1.orderstatus !=11
   then t1.paymoney
   else 0
    end
   ) as total_money_without_back, --累计消费金额不含退拒
   --累计近30天消费次数不含退拒
   sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then 1
       else 0
       end
      ) as total_count_without_back_30,
    --累计近30天消费金额不含退拒
     sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then t1.paymoney
       else 0
       end
      ) as total_money_without_back_30,
      
        --累计近30天消费次数含退拒
   sum(case when t1.flag30 =1 
      then 1
       else 0
       end
      ) as total_count_without_30,
    --累计近30天消费金额含退拒
     sum(case when t1.flag30 =1 
      then t1.paymoney
       else 0
       end
      ) as total_money_with_back_30,
      --客单价含退拒
      SUM(t1.paymoney) / SUM(1) AS atv,
       --客单价不含退拒
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 
        then t1.paymoney else 0 end) / 
        SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 then 1 else 0
              end) AS atv_withoutback,
              --近60天客单价含退拒
    SUM(case when t1.flag60 = 1 
        then t1.paymoney else 0 end) / SUM(case when t1.flag60 = 1
            then 1
            else 0
          end) AS atv_60,
          --近60天不含退拒
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 and t1.flag60 = 1
        then t1.paymoney
        else 0
        end) / SUM(case when (t1.orderStatus != 10 or t1.orderStatus != 11) and t1.flag60 = 1
        then 1
        else 0
      end) AS atv_60_withoutback
      
from 
(select * from itcast_dw.fact_order_wide where dt="20190909") t1 
left join  
(select * from itcast_dw.dim_user where dt="20190909") t2 on 
t1.userid=t2.userid
group by t2.userid,t2.loginname limit 5;


指标开发四
指标
学校下单总数
单位下单总数
家里下单总数
上午下单总数
下午下单总数
晚上下单总数

参考代码

select
    t2.userid,
    t2.loginname,
    MIN(t1.payTime) as first_paytime,
    MAX(t1.payTime) as lastest_paytime,
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t1.payTime)) as first_day_during_days,
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t1.payTime)) as lastest_day_durning_days,
    MIN(t1.paymoney) as min_paymoney,
    MAX(t1.paymoney) as max_paymoney,
sum(
case when t1.orderstatus !=10 and t1.orderstatus !=11
    then 1
    else 0
    end
) as total_count_without_back,--累计消费次数不含退拒,
sum(case when t1.orderstatus !=10 and t1.orderstatus !=11
   then t1.paymoney
   else 0
    end
   ) as total_money_without_back, --累计消费金额不含退拒
   --累计近30天消费次数不含退拒
   sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then 1
       else 0
       end
      ) as total_count_without_back_30,
    --累计近30天消费金额不含退拒
     sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then t1.paymoney
       else 0
       end
      ) as total_money_without_back_30,
        --累计近30天消费次数含退拒
   sum(case when t1.flag30 =1 
      then 1
       else 0
       end
      ) as total_count_without_30,
    --累计近30天消费金额含退拒
     sum(case when t1.flag30 =1 
      then t1.paymoney
       else 0
       end
      ) as total_money_with_back_30,
      SUM(t1.paymoney) / SUM(1) AS atv,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 
        then t1.paymoney else 0 end) / 
        SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 then 1 else 0
              end) AS atv_withoutback,
    SUM(case when t1.flag60 = 1 
        then t1.paymoney else 0 end) / SUM(case when t1.flag60 = 1
            then 1
            else 0
          end) AS atv_60,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 and t1.flag60 = 1
        then t1.paymoney
        else 0
        end) / SUM(case when (t1.orderStatus != 10 or t1.orderStatus != 11) and t1.flag60 = 1
        then 1
        else 0
      end) AS atv_60_withoutback,
        SUM(case when t1.otherName = '学校'
        then 1
        else 0
        end) as school_order_count,      -- 学校下单总数
    SUM(case when t1.otherName = '单位'
        then 1
        else 0
        end) as company_order_count,    -- 单位下单总数
    SUM(case when t1.otherName = '家里'
        then 1
        else 0
        end) as home_order_count,        -- 家里下单总数
    SUM(case when t1.flagTimeBucket = '上午'
        then 1
        else 0
        end) as am_order_count,          -- 上午下单总数
    SUM(case when t1.flagTimeBucket = '下午'
        then 1
        else 0
        end) as pm_order_count,          -- 下午下单总数
    SUM(case when t1.flagTimeBucket = '晚上'
        then 1
        else 0
        end) as night_order_count-- 晚上下单总数
from
(select * from itcast_dw.fact_order_wide where dt="20190909") t1 
left join  
(select * from itcast_dw.dim_user where dt="20190909") t2 on 
t1.userid=t2.userid
group by t2.userid,t2.loginname limit 5;

指标开发五
指标
常用收货地址:某个地址对应的订单数量越多则越常用

1、加载订单地址分析表

参考代码:

-- 创建订单地址分析表
drop table if exists itcast_ads.tmp_order_address;
create table itcast_ads.tmp_order_address(
    userId bigint,          -- 用户Id
    otherName string,       -- 地址类型(家里、学校...)
    totalCount bigint,     -- 下单数
    rn bigint       -- 下单排名
)
partitioned by (dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

--从tmp_order_datetag_wide统计最常用地址
insert overwrite table itcast_ads.tmp_order_address partition(dt='20190909')
select
t3.userid,
t3.othername,
t3.ordercount,
 row_number() over( partition by t3.userid order by ordercount desc ) rn  --partiton by userid:按照用户分组,order by ordercount :按照订单数量排序 降序  ,rn:组内的序号
from 
(select
    t1.userId as userid,
    t1.othername as othername,
   count(t1.orderid) as ordercount  -->每个用户每个地址订单的数量
from
    (select * from itcast_dw.fact_order_wide where dt='20190909') t1
   
group by t1.userid,t1.otherName order by t1.userid  ) t3  ;

-- 测试
select * from itcast_ads.tmp_order_address order by userId, rn limit 10;

2、统计常用收货地址指标

参考代码:

select
   t2.userid,
    t2.loginname,
    MIN(t1.payTime) as first_paytime,
    MAX(t1.payTime) as lastest_paytime,
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t1.payTime)) as first_day_during_days,
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t1.payTime)) as lastest_day_durning_days,
    MIN(t1.paymoney) as min_paymoney,
    MAX(t1.paymoney) as max_paymoney,
sum(
case when t1.orderstatus !=10 and t1.orderstatus !=11
    then 1
    else 0
    end
) as total_count_without_back,--累计消费次数不含退拒,
sum(case when t1.orderstatus !=10 and t1.orderstatus !=11
   then t1.paymoney
   else 0
    end
   ) as total_money_without_back, --累计消费金额不含退拒
   --累计近30天消费次数不含退拒
   sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then 1
       else 0
       end
      ) as total_count_without_back_30,
    --累计近30天消费金额不含退拒
     sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then t1.paymoney
       else 0
       end
      ) as total_money_without_back_30,
        --累计近30天消费次数含退拒
   sum(case when t1.flag30 =1 
      then 1
       else 0
       end
      ) as total_count_without_30,
    --累计近30天消费金额含退拒
     sum(case when t1.flag30 =1 
      then t1.paymoney
       else 0
       end
      ) as total_money_with_back_30,
      SUM(t1.paymoney) / SUM(1) AS atv,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 
        then t1.paymoney else 0 end) / 
        SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 then 1 else 0
              end) AS atv_withoutback,
    SUM(case when t1.flag60 = 1 
        then t1.paymoney else 0 end) / SUM(case when t1.flag60 = 1
            then 1
            else 0
          end) AS atv_60,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 and t1.flag60 = 1
        then t1.paymoney
        else 0
        end) / SUM(case when (t1.orderStatus != 10 or t1.orderStatus != 11) and t1.flag60 = 1
        then 1
        else 0
      end) AS atv_60_withoutback,
       SUM(case when t1.otherName = '学校'
        then 1
        else 0
        end) as school_order_count,      -- 学校下单总数
    SUM(case when t1.otherName = '单位'
        then 1
        else 0
        end) as company_order_count,    -- 单位下单总数
    SUM(case when t1.otherName = '家里'
        then 1
        else 0
        end) as home_order_count,        -- 家里下单总数
    SUM(case when t1.flagTimeBucket = '上午'
        then 1
        else 0
        end) as am_order_count,          -- 上午下单总数
    SUM(case when t1.flagTimeBucket = '下午'
        then 1
        else 0
        end) as pm_order_count,          -- 下午下单总数
    SUM(case when t1.flagTimeBucket = '晚上'
        then 1
        else 0
        end) as night_order_count,    -- 晚上下单总数
      --最常用地址
max(case when t3.rn =1
then t3.othername
else ''
end) as most_usual_address 
      
from 
(select * from itcast_dw.fact_order_wide where dt="20190909") t1 
left join  
(select * from itcast_dw.dim_user where dt="20190909") t2 on 
t1.userid=t2.userid
    left join
    (select * from itcast_ads.tmp_order_address where dt='20190909') as t3
    on t1.userId = t3.userId
group by t2.userid,t2.loginname limit 5;

指标开发六
指标
常用支付方式:某种支付方式对应的订单数量越多则越常用

1、加载支付方式排名

-- 创建支付方式分析表
drop table if exists itcast_ads.tmp_order_paytype;
create table itcast_ads.tmp_order_paytype(
    userid bigint,              -- 用户id
    payType bigint,           -- 支付类型id
    payName string,             -- 支付码
    totalCount bigint,          -- 订单总数
    rn bigint                   -- 等级
)
partitioned by (dt string)
STORED AS PARQUET;

-- 加载支付方式分析
insert overwrite table itcast_ads.tmp_order_paytype partition(dt='20190909')
select
    t3.*,
    row_number() over(partition by userId order by totalCount desc) rn
from
    (select
        t1.userId,
        t1.payType,
        t2.payment_name,
        sum(1) as totalCount   --sum(1)等同于count效果
    from
     (select * from itcast_dw.fact_order_wide where dt='20190909') t1
     left join 
        (select * from itcast_dw.dim_payment where dt='20190909') t2
        on t1.payType = t2.payment_id
    group by t1.userId, t1.payType,t2.payment_name) t3;
    
-- 测试
select * from itcast_ads.tmp_order_paytype limit 5;

2、统计常用支付方式指标

参考代码:

select
   t2.userid,
    t2.loginname,
    MIN(t1.payTime) as first_paytime,
    MAX(t1.payTime) as lastest_paytime,
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t1.payTime)) as first_day_during_days,
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t1.payTime)) as lastest_day_durning_days,
    MIN(t1.paymoney) as min_paymoney,
    MAX(t1.paymoney) as max_paymoney,
sum(
case when t1.orderstatus !=10 and t1.orderstatus !=11
    then 1
    else 0
    end
) as total_count_without_back,--累计消费次数不含退拒,
sum(case when t1.orderstatus !=10 and t1.orderstatus !=11
   then t1.paymoney
   else 0
    end
   ) as total_money_without_back, --累计消费金额不含退拒
   --累计近30天消费次数不含退拒
   sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then 1
       else 0
       end
      ) as total_count_without_back_30,
    --累计近30天消费金额不含退拒
     sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then t1.paymoney
       else 0
       end
      ) as total_money_without_back_30,
        --累计近30天消费次数含退拒
   sum(case when t1.flag30 =1 
      then 1
       else 0
       end
      ) as total_count_without_30,
    --累计近30天消费金额含退拒
     sum(case when t1.flag30 =1 
      then t1.paymoney
       else 0
       end
      ) as total_money_with_back_30,
      SUM(t1.paymoney) / SUM(1) AS atv,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 
        then t1.paymoney else 0 end) / 
        SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 then 1 else 0
              end) AS atv_withoutback,
    SUM(case when t1.flag60 = 1 
        then t1.paymoney else 0 end) / SUM(case when t1.flag60 = 1
            then 1
            else 0
          end) AS atv_60,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 and t1.flag60 = 1
        then t1.paymoney
        else 0
        end) / SUM(case when (t1.orderStatus != 10 or t1.orderStatus != 11) and t1.flag60 = 1
        then 1
        else 0
      end) AS atv_60_withoutback,
       SUM(case when t1.otherName = '学校'
        then 1
        else 0
        end) as school_order_count,      -- 学校下单总数
    SUM(case when t1.otherName = '单位'
        then 1
        else 0
        end) as company_order_count,    -- 单位下单总数
    SUM(case when t1.otherName = '家里'
        then 1
        else 0
        end) as home_order_count,        -- 家里下单总数
    SUM(case when t1.flagTimeBucket = '上午'
        then 1
        else 0
        end) as am_order_count,          -- 上午下单总数
    SUM(case when t1.flagTimeBucket = '下午'
        then 1
        else 0
        end) as pm_order_count,          -- 下午下单总数
    SUM(case when t1.flagTimeBucket = '晚上'
        then 1
        else 0
        end) as night_order_count,-- 晚上下单总数
      --最常用地址
max(case when t3.rn =1
then t3.othername
else ''
end) as most_usual_address,
--常用的支付方式
      MAX(case when t4.rn = 1
        then t4.payName
        else ''
        end) as most_usual_paytype
from
(select * from itcast_dw.fact_order_wide where dt="20190909") t1 
left join  
(select * from itcast_dw.dim_user where dt="20190909") t2 on 
t1.userid=t2.userid
    left join
    (select * from itcast_ads.tmp_order_address where dt='20190909') as t3
    on t1.userId = t3.userId
    left join (select * from itcast_ads.tmp_order_paytype where dt='20190909') as t4
    on t1.userId = t4.userId
group by t2.userid,t2.loginname limit 5;


创建ads层表、加载数据
drop table if exists itcast_ads.user_order_measure;
create table itcast_ads.user_order_measure(
    userid string,                          -- 用户id
    username string,                        -- 用户名称
    first_paytime string,                   -- 第一次消费时间
    lastest_paytime string,                 -- 最近一次消费时间
    first_day_during_days bigint,           -- 首单距今时间
    lastest_day_durning_days bigint,        -- 尾单距今时间
    min_paymoney double,                    -- 最小消费金额
    max_paymoney double,                    -- 最大消费金额
    total_count_without_back bigint,        -- 累计消费次数(不含退拒)
    total_money_without_back double,        -- 累计消费金额(不含退拒)
    total_count_without_back_30 bigint,     -- 近30天累计消费次数(不含退拒)
    total_money_without_back_30 double,     -- 近30天累计消费金额(不含退拒)
    total_count_30 bigint,                  -- 近30天累计消费次数(含退拒)
    total_money_30 double,                  -- 近30天累计消费金额(含退拒)
    atv double,                             -- 客单价(含退拒)
    atv_withoutback double,                 -- 客单价(不含退拒)
    atv_60 double,                          -- 近60天客单价(含退拒)
    atv_60_withoutback double,              -- 近60天客单价(不含退拒)
    school_order_count bigint,              -- 学校下单总数
    company_order_count bigint,             -- 单位下单总数
    home_order_count bigint,                -- 家里下单总数
    am_order_count bigint,                  -- 上午下单总数
    pm_order_count bigint,                  -- 下午下单总数
    night_order_count bigint,               -- 晚上下单总数
    most_usual_address string,              -- 常用收货地址
    most_usual_paytype string               -- 常用支付方式
)
partitioned by (dt string)
row format delimited fields terminated by '\001' stored as TEXTFILE;
    

insert overwrite table itcast_ads.user_order_measure partition (dt='20190909')
select
   t2.userid,
    t2.loginname,
    MIN(t1.payTime) as first_paytime,
    MAX(t1.payTime) as lastest_paytime,
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t1.payTime)) as first_day_during_days,
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t1.payTime)) as lastest_day_durning_days,
    MIN(t1.paymoney) as min_paymoney,
    MAX(t1.paymoney) as max_paymoney,
sum(
case when t1.orderstatus !=10 and t1.orderstatus !=11
    then 1
    else 0
    end
) as total_count_without_back,--累计消费次数不含退拒,
sum(case when t1.orderstatus !=10 and t1.orderstatus !=11
   then t1.paymoney
   else 0
    end
   ) as total_money_without_back, --累计消费金额不含退拒
   --累计近30天消费次数不含退拒
   sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then 1
       else 0
       end
      ) as total_count_without_back_30,
    --累计近30天消费金额不含退拒
     sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then t1.paymoney
       else 0
       end
      ) as total_money_without_back_30,
        --累计近30天消费次数含退拒
   sum(case when t1.flag30 =1 
      then 1
       else 0
       end
      ) as total_count_without_30,
    --累计近30天消费金额含退拒
     sum(case when t1.flag30 =1 
      then t1.paymoney
       else 0
       end
      ) as total_money_with_back_30,
      SUM(t1.paymoney) / SUM(1) AS atv,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 
        then t1.paymoney else 0 end) / 
        SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 then 1 else 0
              end) AS atv_withoutback,
    SUM(case when t1.flag60 = 1 
        then t1.paymoney else 0 end) / SUM(case when t1.flag60 = 1
            then 1
            else 0
          end) AS atv_60,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 and t1.flag60 = 1
        then t1.paymoney
        else 0
        end) / SUM(case when (t1.orderStatus != 10 or t1.orderStatus != 11) and t1.flag60 = 1
        then 1
        else 0
      end) AS atv_60_withoutback,
       SUM(case when t1.otherName = '学校'
        then 1
        else 0
        end) as school_order_count,      -- 学校下单总数
    SUM(case when t1.otherName = '单位'
        then 1
        else 0
        end) as company_order_count,    -- 单位下单总数
    SUM(case when t1.otherName = '家里'
        then 1
        else 0
        end) as home_order_count,        -- 家里下单总数
    SUM(case when t1.flagTimeBucket = '上午'
        then 1
        else 0
        end) as am_order_count,          -- 上午下单总数
    SUM(case when t1.flagTimeBucket = '下午'
        then 1
        else 0
        end) as pm_order_count,          -- 下午下单总数
    SUM(case when t1.flagTimeBucket = '晚上'
        then 1
        else 0
        end) as night_order_count,-- 晚上下单总数
      --最常用地址
max(case when t3.rn =1
then t3.othername
else ''
end) as most_usual_address,
--常用的支付方式
      MAX(case when t4.rn = 1
        then t4.payName
        else ''
        end) as most_usual_paytype
from
(select * from itcast_dw.fact_order_wide where dt="20190909") t1 
left join  
(select * from itcast_dw.dim_user where dt="20190909") t2 on 
t1.userid=t2.userid
    left join
    (select * from itcast_ads.tmp_order_address where dt='20190909') as t3
    on t1.userId = t3.userId
    left join (select * from itcast_ads.tmp_order_paytype where dt='20190909') as t4
    on t1.userId = t4.userId
group by t2.userid,t2.loginname ;


-- 测试
select * from itcast_ads.user_order_measure limit 10;
  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

办公模板库 素材蛙

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值