tpcds数据压测

1.tpcds数据压测(支持关系数据库)

TPC资源准备

这个部分主要介绍如何从TPC上获得测试所需资源,入1GB数据、500GB数据的生成,建表语句和sql语句的生成等等。

1.官网下载tpc-ds压缩包
参考 git clone https://github.com/gregrahn/tpcds-kit.git
2.在tools文件中执行make
[root@node1tpc]# cd /tpcds-kit/tools
[root@node1 tools]# make
注意:若没有安装gcc会报错!
3.在tools目录下使用./dsdgen生成数据。
    --help查看帮助
    常用:
    - DIR 生成目录
    - SCALE 数据大小(单位GB)
    - DELIMITER 字段分隔符,默认|
    - TERMINATE 末尾是否有分隔符,参数 Y或者N
    例子:在/usr/datas目录下生成分隔符为|且末尾没有分隔符的1GB数据:
    ./dsdgen -DIR /usr/datas -SCALE 1 -TERMINATE N
dsdgen是个单线程程序,一般在测试过程中,会通过指定表名(“-TABLE”参数)以多个进程并发(每个进程对应1张表)的方式来加快生成数据:
./dsdgen -SCALE 1 -DISTRIBUTIONS tpcds.idx -TERMINATE N -TABLE time_dim
因为事实表普遍比较大,所以考虑使用dsdgen通过分块的方式加速生成数据:
./dsdgen -SCALE 1 -DISTRIBUTIONS tpcds.idx -TERMINATE N -TABLE catalog_sales -PARALLEL 10 -CHILD 1
命令中:“-PARALLEL 10”参数表示整个表分成10块,“-CHILD 1”参数表示生成第1块;同时启动10个dsdgen进程,每个进程CHILD编号递增,加速效果就出来了。
如果数据集比较大,可以考虑把事实表分散到多台服务器上的多个进程来生成,比如20台服务器,每台服务器启动32个进程,这样效果就更明显了:

4.(./dsqgen)生成sql测试用例
    --help 查看帮助
    常用:
    -input 输入,读取测试用例包含的模板,一般使用/query_templates/templates.lst即可。
    -directory 模板所在目录, 一般使用-directory../query_templates即可。
    -dialect 生成某个数据库的语言,可选项可以查看/query_templates目录,有oracle、db2、SqlServer等。
    -scale 数据大小(单位G)
    特别说明:/query_templates目录下的每一个query末尾都要加上:
   	由此,可以在tools目录下生成一个query_0.sql,即为生成的sql测试用例
执行: 
mkdir sql (tools下)
for id in `seq 1 99`; do ./dsqgen -DIRECTORY ../query_templates -TEMPLATE "query$id.tpl" -DIALECT netezza -FILTER Y > ./sql/"query$id.sql"; done


6.建表语句
/tpcds-kit/tools目录下,tpcds.sql文件里。很多数据平台可能不能直接使用,需要修改。建表语句的修改主要是依据不同环境支持的数据类型修改和一些基础语法修正,还需依照生成的数据的分割符在建表时指定分隔符。
7.测试用例结果
在目录/tpcds-kit/answer_sets下。

SQL语句兼容性测试以及语句修改

1.SQL语句兼容性测试
在测试前最好能进行一次99条语句的格式整理,这是有必要的,原因是获取的脚本中可能含有制表符,
完成database的建立,进入这个database进行建表操作。注意,如果不建立新的数据库,操作将在default数据库中进行。
之后可以在少量数据(1GB)条件下,遍历进行SQL语句的支持情况测试(这个测试可以在虚拟机中完成,因为本测试仅测试兼容性,对测试性能要求较低)。
首先,将本地数据load到刚才建立好的表中,如:
Load完成后,检查各表数据是否完整,这之后开始SQL兼容性测试。主要测试方法是,先遍历执行各个测试案例,如果SQL语句能执行完成,进行可完成的记录;若不能完成,则将系统报错记录,直至所有语句测试完成。

2.语句修改
通过上阶段的兼容性测试后,可将无法完成SQL的语句的报错进行汇总,将问题分别归类,同类问题一起修改。
每修改一次语句都需要对修改后的语句进行测试,这部分测试,为了效率的保证,也是在少量数据(1GB)的条件下进行测试的。在测试中可能会出现别的错误,这时需要立刻着手解决这个问题。

3.TPC-DS测试
在完成SQL语句兼容性测试后,将可运行的语句汇总,编写在一个脚本中。在测试环境中重复上述建表和load过程,500GB数据的导入完成后,执行这个脚本,实现TPC-DS测试。这样做的主要原因是,这些SQL案例测试时间较长,可能需要通过利用夜间时间进行不断电测试,否则,也可以采用如同兼容性测试同样的方法,一条一条进行测试。
在执行load以及SQL语句时,需要将log保留,以hive中执行SQL语句为例,首先切换至root用户,新建一个log文档并改变它的读写权限,然后执行脚本:
若想知道语句的执行情况,可以通过cat语句查询log,在脚本出错时需及时停止执行,导出log,分析出错原因,重新执行或是执行下一条语句。
将所有语句执行完成后,将log导出,进行下一步处理。

4.测试结果整理
将log中的各个平台的SQL执行时间数据制成表格,由此可直观的对比各个平台性能的优劣;
要了解具体支持程度,还可以进一步将测试结果正确与否、出错原因进行比对。

本测试主要关心的因素有:
1.大表之间的join
2.数据倾斜问题
3.普通action测试

在这里插入图片描述
tpcds总共24张表,7张事实表(加粗)、17张维度表.

create table dbgen_version
(
    dv_version                varchar(16)                   ,
    dv_create_date            date                          ,
    dv_create_time            TIMESTAMP                     ,
    dv_cmdline_args           varchar(200)
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/dbgen_version.dat' overwrite into table dbgen_version;

create table customer_address
(
    ca_address_sk             int               ,
    ca_address_id             char(16)              ,
    ca_street_number          char(10)                      ,
    ca_street_name            varchar(60)                   ,
    ca_street_type            char(15)                      ,
    ca_suite_number           char(10)                      ,
    ca_city                   varchar(60)                   ,
    ca_county                 varchar(30)                   ,
    ca_state                  char(2)                       ,
    ca_zip                    char(10)                      ,
    ca_country                varchar(20)                   ,
    ca_gmt_offset             decimal(5,2)                  ,
    ca_location_type          char(20)                      
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/customer_address.dat' overwrite into table customer_address;

create table customer_demographics
(
    cd_demo_sk                int               ,
    cd_gender                 char(1)                       ,
    cd_marital_status         char(1)                       ,
    cd_education_status       char(20)                      ,
    cd_purchase_estimate      int                       ,
    cd_credit_rating          char(10)                      ,
    cd_dep_count              int                       ,
    cd_dep_employed_count     int                       ,
    cd_dep_college_count      int                       
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/customer_demographics.dat' overwrite into table customer_demographics;

create table date_dim
(
    d_date_sk                 int               ,
    d_date_id                 char(16)              ,
    d_date                    date                          ,
    d_month_seq               int                       ,
    d_week_seq                int                       ,
    d_quarter_seq             int                       ,
    d_year                    int                       ,
    d_dow                     int                       ,
    d_moy                     int                       ,
    d_dom                     int                       ,
    d_qoy                     int                       ,
    d_fy_year                 int                       ,
    d_fy_quarter_seq          int                       ,
    d_fy_week_seq             int                       ,
    d_day_name                char(9)                       ,
    d_quarter_name            char(6)                       ,
    d_holiday                 char(1)                       ,
    d_weekend                 char(1)                       ,
    d_following_holiday       char(1)                       ,
    d_first_dom               int                       ,
    d_last_dom                int                       ,
    d_same_day_ly             int                       ,
    d_same_day_lq             int                       ,
    d_current_day             char(1)                       ,
    d_current_week            char(1)                       ,
    d_current_month           char(1)                       ,
    d_current_quarter         char(1)                       ,
    d_current_year            char(1)                       
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/date_dim.dat' overwrite into table date_dim;

create table warehouse
(
    w_warehouse_sk            int               ,
    w_warehouse_id            char(16)              ,
    w_warehouse_name          varchar(20)                   ,
    w_warehouse_sq_ft         int                       ,
    w_street_number           char(10)                      ,
    w_street_name             varchar(60)                   ,
    w_street_type             char(15)                      ,
    w_suite_number            char(10)                      ,
    w_city                    varchar(60)                   ,
    w_county                  varchar(30)                   ,
    w_state                   char(2)                       ,
    w_zip                     char(10)                      ,
    w_country                 varchar(20)                   ,
    w_gmt_offset              decimal(5,2)                  
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/warehouse_1_10.dat' overwrite into table warehouse;


create table ship_mode
(
    sm_ship_mode_sk           int               ,
    sm_ship_mode_id           char(16)              ,
    sm_type                   char(30)                      ,
    sm_code                   char(10)                      ,
    sm_carrier                char(20)                      ,
    sm_contract               char(20)                      
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/ship_mode.dat' overwrite into table ship_mode;


create table time_dim
(
    t_time_sk                 int               ,
    t_time_id                 char(16)              ,
    t_time                    int                       ,
    t_hour                    int                       ,
    t_minute                  int                       ,
    t_second                  int                       ,
    t_am_pm                   char(2)                       ,
    t_shift                   char(20)                      ,
    t_sub_shift               char(20)                      ,
    t_meal_time               char(20)                      
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/time_dim_1_10.dat' overwrite into table time_dim;


create table reason
(
    r_reason_sk               int               ,
    r_reason_id               char(16)              ,
    r_reason_desc             char(100)                     
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/reason.dat' overwrite into table reason;


create table income_band
(
    ib_income_band_sk         int               ,
    ib_lower_bound            int                       ,
    ib_upper_bound            int                       
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/income_band.dat' overwrite into table income_band;


create table item
(
    i_item_sk                 int               ,
    i_item_id                 char(16)              ,
    i_rec_start_date          date                          ,
    i_rec_end_date            date                          ,
    i_item_desc               varchar(200)                  ,
    i_current_price           decimal(7,2)                  ,
    i_wholesale_cost          decimal(7,2)                  ,
    i_brand_id                int                       ,
    i_brand                   char(50)                      ,
    i_class_id                int                       ,
    i_class                   char(50)                      ,
    i_category_id             int                       ,
    i_category                char(50)                      ,
    i_manufact_id             int                       ,
    i_manufact                char(50)                      ,
    i_size                    char(20)                      ,
    i_formulation             char(20)                      ,
    i_color                   char(20)                      ,
    i_units                   char(10)                      ,
    i_container               char(10)                      ,
    i_manager_id              int                       ,
    i_product_name            char(50)                      
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/item.dat' overwrite into table item;


create table store
(
    s_store_sk                int               ,
    s_store_id                char(16)              ,
    s_rec_start_date          date                          ,
    s_rec_end_date            date                          ,
    s_closed_date_sk          int                       ,
    s_store_name              varchar(50)                   ,
    s_number_employees        int                       ,
    s_floor_space             int                       ,
    s_hours                   char(20)                      ,
    s_manager                 varchar(40)                   ,
    s_market_id               int                       ,
    s_geography_class         varchar(100)                  ,
    s_market_desc             varchar(100)                  ,
    s_market_manager          varchar(40)                   ,
    s_division_id             int                       ,
    s_division_name           varchar(50)                   ,
    s_company_id              int                       ,
    s_company_name            varchar(50)                   ,
    s_street_number           varchar(10)                   ,
    s_street_name             varchar(60)                   ,
    s_street_type             char(15)                      ,
    s_suite_number            char(10)                      ,
    s_city                    varchar(60)                   ,
    s_county                  varchar(30)                   ,
    s_state                   char(2)                       ,
    s_zip                     char(10)                      ,
    s_country                 varchar(20)                   ,
    s_gmt_offset              decimal(5,2)                  ,
    s_tax_precentage          decimal(5,2)                  
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/store.dat' overwrite into table store;


create table call_center
(
    cc_call_center_sk         int               ,
    cc_call_center_id         char(16)              ,
    cc_rec_start_date         date                          ,
    cc_rec_end_date           date                          ,
    cc_closed_date_sk         int                       ,
    cc_open_date_sk           int                       ,
    cc_name                   varchar(50)                   ,
    cc_class                  varchar(50)                   ,
    cc_employees              int                       ,
    cc_sq_ft                  int                       ,
    cc_hours                  char(20)                      ,
    cc_manager                varchar(40)                   ,
    cc_mkt_id                 int                       ,
    cc_mkt_class              char(50)                      ,
    cc_mkt_desc               varchar(100)                  ,
    cc_market_manager         varchar(40)                   ,
    cc_division               int                       ,
    cc_division_name          varchar(50)                   ,
    cc_company                int                       ,
    cc_company_name           char(50)                      ,
    cc_street_number          char(10)                      ,
    cc_street_name            varchar(60)                   ,
    cc_street_type            char(15)                      ,
    cc_suite_number           char(10)                      ,
    cc_city                   varchar(60)                   ,
    cc_county                 varchar(30)                   ,
    cc_state                  char(2)                       ,
    cc_zip                    char(10)                      ,
    cc_country                varchar(20)                   ,
    cc_gmt_offset             decimal(5,2)                  ,
    cc_tax_percentage         decimal(5,2)                  
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/call_center.dat' overwrite into table call_center;


create table customer
(
    c_customer_sk             int               ,
    c_customer_id             char(16)              ,
    c_current_cdemo_sk        int                       ,
    c_current_hdemo_sk        int                       ,
    c_current_addr_sk         int                       ,
    c_first_shipto_date_sk    int                       ,
    c_first_sales_date_sk     int                       ,
    c_salutation              char(10)                      ,
    c_first_name              char(20)                      ,
    c_last_name               char(30)                      ,
    c_preferred_cust_flag     char(1)                       ,
    c_birth_day               int                       ,
    c_birth_month             int                       ,
    c_birth_year              int                       ,
    c_birth_country           varchar(20)                   ,
    c_login                   char(13)                      ,
    c_email_address           char(50)                      ,
    c_last_review_date_sk     int                       
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/customer.dat' overwrite into table customer;


create table web_site
(
    web_site_sk               int               ,
    web_site_id               char(16)              ,
    web_rec_start_date        date                          ,
    web_rec_end_date          date                          ,
    web_name                  varchar(50)                   ,
    web_open_date_sk          int                       ,
    web_close_date_sk         int                       ,
    web_class                 varchar(50)                   ,
    web_manager               varchar(40)                   ,
    web_mkt_id                int                       ,
    web_mkt_class             varchar(50)                   ,
    web_mkt_desc              varchar(100)                  ,
    web_market_manager        varchar(40)                   ,
    web_company_id            int                       ,
    web_company_name          char(50)                      ,
    web_street_number         char(10)                      ,
    web_street_name           varchar(60)                   ,
    web_street_type           char(15)                      ,
    web_suite_number          char(10)                      ,
    web_city                  varchar(60)                   ,
    web_county                varchar(30)                   ,
    web_state                 char(2)                       ,
    web_zip                   char(10)                      ,
    web_country               varchar(20)                   ,
    web_gmt_offset            decimal(5,2)                  ,
    web_tax_percentage        decimal(5,2)                  
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/web_site_1_10.dat' overwrite into table web_site;


create table store_returns
(
    sr_returned_date_sk       int                       ,
    sr_return_time_sk         int                       ,
    sr_item_sk                int               ,
    sr_customer_sk            int                       ,
    sr_cdemo_sk               int                       ,
    sr_hdemo_sk               int                       ,
    sr_addr_sk                int                       ,
    sr_store_sk               int                       ,
    sr_reason_sk              int                       ,
    sr_ticket_number          int               ,
    sr_return_quantity        int                       ,
    sr_return_amt             decimal(7,2)                  ,
    sr_return_tax             decimal(7,2)                  ,
    sr_return_amt_inc_tax     decimal(7,2)                  ,
    sr_fee                    decimal(7,2)                  ,
    sr_return_ship_cost       decimal(7,2)                  ,
    sr_refunded_cash          decimal(7,2)                  ,
    sr_reversed_charge        decimal(7,2)                  ,
    sr_store_credit           decimal(7,2)                  ,
    sr_net_loss               decimal(7,2)                  
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/store_returns.dat' overwrite into table store_returns;


create table household_demographics
(
    hd_demo_sk                int               ,
    hd_income_band_sk         int                       ,
    hd_buy_potential          char(15)                      ,
    hd_dep_count              int                       ,
    hd_vehicle_count          int                       
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/household_demographics.dat' overwrite into table household_demographics;

create table web_page
(
    wp_web_page_sk            int               ,
    wp_web_page_id            char(16)              ,
    wp_rec_start_date         date                          ,
    wp_rec_end_date           date                          ,
    wp_creation_date_sk       int                       ,
    wp_access_date_sk         int                       ,
    wp_autogen_flag           char(1)                       ,
    wp_customer_sk            int                       ,
    wp_url                    varchar(100)                  ,
    wp_type                   char(50)                      ,
    wp_char_count             int                       ,
    wp_link_count             int                       ,
    wp_image_count            int                       ,
    wp_max_ad_count           int                       
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/web_page_1_10.dat' overwrite into table web_page;

create table promotion
(
    p_promo_sk                int               ,
    p_promo_id                char(16)              ,
    p_start_date_sk           int                       ,
    p_end_date_sk             int                       ,
    p_item_sk                 int                       ,
    p_cost                    decimal(15,2)                 ,
    p_response_target         int                       ,
    p_promo_name              char(50)                      ,
    p_channel_dmail           char(1)                       ,
    p_channel_email           char(1)                       ,
    p_channel_catalog         char(1)                       ,
    p_channel_tv              char(1)                       ,
    p_channel_radio           char(1)                       ,
    p_channel_press           char(1)                       ,
    p_channel_event           char(1)                       ,
    p_channel_demo            char(1)                       ,
    p_channel_details         varchar(100)                  ,
    p_purpose                 char(15)                      ,
    p_discount_active         char(1)                       
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/promotion.dat' overwrite into table promotion;


create table catalog_page
(
    cp_catalog_page_sk        int               ,
    cp_catalog_page_id        char(16)              ,
    cp_start_date_sk          int                       ,
    cp_end_date_sk            int                       ,
    cp_department             varchar(50)                   ,
    cp_catalog_number         int                       ,
    cp_catalog_page_number    int                       ,
    cp_description            varchar(100)                  ,
    cp_type                   varchar(100)                  
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/catalog_page.dat' overwrite into table catalog_page;


create table inventory
(
    inv_date_sk               int               ,
    inv_item_sk               int               ,
    inv_warehouse_sk          int               ,
    inv_quantity_on_hand      int                       
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/inventory.dat' overwrite into table inventory;


create table catalog_returns
(
    cr_returned_date_sk       int                       ,
    cr_returned_time_sk       int                       ,
    cr_item_sk                int               ,
    cr_refunded_customer_sk   int                       ,
    cr_refunded_cdemo_sk      int                       ,
    cr_refunded_hdemo_sk      int                       ,
    cr_refunded_addr_sk       int                       ,
    cr_returning_customer_sk  int                       ,
    cr_returning_cdemo_sk     int                       ,
    cr_returning_hdemo_sk     int                       ,
    cr_returning_addr_sk      int                       ,
    cr_call_center_sk         int                       ,
    cr_catalog_page_sk        int                       ,
    cr_ship_mode_sk           int                       ,
    cr_warehouse_sk           int                       ,
    cr_reason_sk              int                       ,
    cr_order_number           int               ,
    cr_return_quantity        int                       ,
    cr_return_amount          decimal(7,2)                  ,
    cr_return_tax             decimal(7,2)                  ,
    cr_return_amt_inc_tax     decimal(7,2)                  ,
    cr_fee                    decimal(7,2)                  ,
    cr_return_ship_cost       decimal(7,2)                  ,
    cr_refunded_cash          decimal(7,2)                  ,
    cr_reversed_charge        decimal(7,2)                  ,
    cr_store_credit           decimal(7,2)                  ,
    cr_net_loss               decimal(7,2)                  
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/catalog_returns.dat' overwrite into table catalog_returns;


create table web_returns
(
    wr_returned_date_sk       int                       ,
    wr_returned_time_sk       int                       ,
    wr_item_sk                int               ,
    wr_refunded_customer_sk   int                       ,
    wr_refunded_cdemo_sk      int                       ,
    wr_refunded_hdemo_sk      int                       ,
    wr_refunded_addr_sk       int                       ,
    wr_returning_customer_sk  int                       ,
    wr_returning_cdemo_sk     int                       ,
    wr_returning_hdemo_sk     int                       ,
    wr_returning_addr_sk      int                       ,
    wr_web_page_sk            int                       ,
    wr_reason_sk              int                       ,
    wr_order_number           int               ,
    wr_return_quantity        int                       ,
    wr_return_amt             decimal(7,2)                  ,
    wr_return_tax             decimal(7,2)                  ,
    wr_return_amt_inc_tax     decimal(7,2)                  ,
    wr_fee                    decimal(7,2)                  ,
    wr_return_ship_cost       decimal(7,2)                  ,
    wr_refunded_cash          decimal(7,2)                  ,
    wr_reversed_charge        decimal(7,2)                  ,
    wr_account_credit         decimal(7,2)                  ,
    wr_net_loss               decimal(7,2)                  
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/web_returns_1_10.dat' overwrite into table web_returns;

create table web_sales
(
    ws_sold_date_sk           int                       ,
    ws_sold_time_sk           int                       ,
    ws_ship_date_sk           int                       ,
    ws_item_sk                int               ,
    ws_bill_customer_sk       int                       ,
    ws_bill_cdemo_sk          int                       ,
    ws_bill_hdemo_sk          int                       ,
    ws_bill_addr_sk           int                       ,
    ws_ship_customer_sk       int                       ,
    ws_ship_cdemo_sk          int                       ,
    ws_ship_hdemo_sk          int                       ,
    ws_ship_addr_sk           int                       ,
    ws_web_page_sk            int                       ,
    ws_web_site_sk            int                       ,
    ws_ship_mode_sk           int                       ,
    ws_warehouse_sk           int                       ,
    ws_promo_sk               int                       ,
    ws_order_number           int               ,
    ws_quantity               int                       ,
    ws_wholesale_cost         decimal(7,2)                  ,
    ws_list_price             decimal(7,2)                  ,
    ws_sales_price            decimal(7,2)                  ,
    ws_ext_discount_amt       decimal(7,2)                  ,
    ws_ext_sales_price        decimal(7,2)                  ,
    ws_ext_wholesale_cost     decimal(7,2)                  ,
    ws_ext_list_price         decimal(7,2)                  ,
    ws_ext_tax                decimal(7,2)                  ,
    ws_coupon_amt             decimal(7,2)                  ,
    ws_ext_ship_cost          decimal(7,2)                  ,
    ws_net_paid               decimal(7,2)                  ,
    ws_net_paid_inc_tax       decimal(7,2)                  ,
    ws_net_paid_inc_ship      decimal(7,2)                  ,
    ws_net_paid_inc_ship_tax  decimal(7,2)                  ,
    ws_net_profit             decimal(7,2)                  
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/web_sales_1_10.dat' overwrite into table web_sales;

create table catalog_sales
(
    cs_sold_date_sk           int                       ,
    cs_sold_time_sk           int                       ,
    cs_ship_date_sk           int                       ,
    cs_bill_customer_sk       int                       ,
    cs_bill_cdemo_sk          int                       ,
    cs_bill_hdemo_sk          int                       ,
    cs_bill_addr_sk           int                       ,
    cs_ship_customer_sk       int                       ,
    cs_ship_cdemo_sk          int                       ,
    cs_ship_hdemo_sk          int                       ,
    cs_ship_addr_sk           int                       ,
    cs_call_center_sk         int                       ,
    cs_catalog_page_sk        int                       ,
    cs_ship_mode_sk           int                       ,
    cs_warehouse_sk           int                       ,
    cs_item_sk                int               ,
    cs_promo_sk               int                       ,
    cs_order_number           int               ,
    cs_quantity               int                       ,
    cs_wholesale_cost         decimal(7,2)                  ,
    cs_list_price             decimal(7,2)                  ,
    cs_sales_price            decimal(7,2)                  ,
    cs_ext_discount_amt       decimal(7,2)                  ,
    cs_ext_sales_price        decimal(7,2)                  ,
    cs_ext_wholesale_cost     decimal(7,2)                  ,
    cs_ext_list_price         decimal(7,2)                  ,
    cs_ext_tax                decimal(7,2)                  ,
    cs_coupon_amt             decimal(7,2)                  ,
    cs_ext_ship_cost          decimal(7,2)                  ,
    cs_net_paid               decimal(7,2)                  ,
    cs_net_paid_inc_tax       decimal(7,2)                  ,
    cs_net_paid_inc_ship      decimal(7,2)                  ,
    cs_net_paid_inc_ship_tax  decimal(7,2)                  ,
    cs_net_profit             decimal(7,2)                  
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/catalog_sales.dat' overwrite into table catalog_sales;
;

create table store_sales
(
    ss_sold_date_sk           int                       ,
    ss_sold_time_sk           int                       ,
    ss_item_sk                int               ,
    ss_customer_sk            int                       ,
    ss_cdemo_sk               int                       ,
    ss_hdemo_sk               int                       ,
    ss_addr_sk                int                       ,
    ss_store_sk               int                       ,
    ss_promo_sk               int                       ,
    ss_ticket_number          int               ,
    ss_quantity               int                       ,
    ss_wholesale_cost         decimal(7,2)                  ,
    ss_list_price             decimal(7,2)                  ,
    ss_sales_price            decimal(7,2)                  ,
    ss_ext_discount_amt       decimal(7,2)                  ,
    ss_ext_sales_price        decimal(7,2)                  ,
    ss_ext_wholesale_cost     decimal(7,2)                  ,
    ss_ext_list_price         decimal(7,2)                  ,
    ss_ext_tax                decimal(7,2)                  ,
    ss_coupon_amt             decimal(7,2)                  ,
    ss_net_paid               decimal(7,2)                  ,
    ss_net_paid_inc_tax       decimal(7,2)                  ,
    ss_net_profit             decimal(7,2)                  
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
load data local inpath '/glusterfs-user-data/data/store_sales.dat' overwrite into table store_sales;
  • 1
    点赞
  • 5
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值