关闭

大数据平台用于生成数据跑批脚本的脚本(version2.0)

标签: Linuxlinux shell大数据脚本
16243人阅读 评论(0) 收藏 举报
分类:


一、脚本文件路径

[hs@master script_generate]$ pwd

/home/hs/opt/dw-etl/script_generate

[hs@master script_generate]$ tree -f

.

├── ./batch_table.list

├── ./etl_table_list_update.sh

├── ./exec_hivessa_create_tab.sh

├── ./exec_rdsqrt_create_tab.sh

├── ./exec_rdssor_close2source_create_tab.sh

├── ./qrt-create_rdstab_script.sh

├── ./qrt-create_src2qrt_script.sh

├── ./rds-sor-increment_tab_should_create_idx_sql.sh

├── ./real_table.list

├── ./script_dir

   ├──./script_dir/hive-ssa_create_tab_script

      ├──./script_dir/hive-ssa_create_tab_script/99_areajoin.hql

      ├──./script_dir/hive-ssa_create_tab_script/99_city.hql

      └── .........

   ├──./script_dir/list_dir

      ├──./script_dir/list_dir/batch_table_userpass.list

      ├──./script_dir/list_dir/dw_dbuserpass_info.list

      ├──./script_dir/list_dir/real_table_userpass.list

      ├──./script_dir/list_dir/src_userpass_info.list

      └──./script_dir/list_dir/table_instance_info.list

   ├── ./script_dir/rds-qrt_create_tab_script

      ├──./script_dir/rds-qrt_create_tab_script/99_coupon.sql

      ├──./script_dir/rds-qrt_create_tab_script/99_dormdealerpurchaseitem.sql

      └── .........

   ├──./script_dir/rds-sor_close2source_create_tab_script

      ├──./script_dir/rds-sor_close2source_create_tab_script/99_city.sql

      ├──./script_dir/rds-sor_close2source_create_tab_script/99_coupon.sql

      └── .........

   ├──./script_dir/rds-sor_increment_tab_should_create_idx.sql

   ├──./script_dir/rds-ssa_add_comment_script.sql

   ├──./script_dir/rds-ssa_create_tab_script

      ├──./script_dir/rds-ssa_create_tab_script/99_city.sql

      ├──./script_dir/rds-ssa_create_tab_script/99_coupon.sql

      └── .........

   └──./script_dir/ssa-hive_increment_data_rollback.hql

├── ./sor-close2source_create_hive2mysql_script.sh

├── ./sor-close2source_create_rdstab_script.sh

├── ./sor-close2source_data_pull_from_ssa_script.sh

├── ./ssa-create_hivetab_script.sh

├── ./ssa-create_src2ssa_increment_int_script.sh

└── ./ssa-create_src2ssa_script.sh

 

7 directories, 603 files

 

二、配置文件列表清单及处理脚本

1、批量跑批脚本清单

[hs@master script_generate]$ catbatch_table.list

99_dormteam|db99team|ipaddress.mysql.rds.aliyuncs.com|h_99_dormteam|total

99_dormteamchangelog|db99team|ipaddress.mysql.rds.aliyuncs.com|h_99_dormteamchangelog|total

99_dormteamcombatgains|db99team|ipaddress.mysql.rds.aliyuncs.com|h_99_dormteamcombatgains|total

99_dormteamcombatgainsitem|db99team|ipaddress.mysql.rds.aliyuncs.com|h_99_dormteamcombatgainsitem|total

99_coupon|db99hx|ipaddress.mysql.rds.aliyuncs.com|h_99_coupon|increment

 

2、准实时跑批脚本清单

[hs@master script_generate]$ catreal_table.list

99_coupon|db99hx|ipaddress.mysql.rds.aliyuncs.com|real_99_coupon|increment

99_dorm|db99hx|ipaddress.mysql.rds.aliyuncs.com|real_99_dorm|total

99_dormdealerpurchase|db99hx|ipaddress.mysql.rds.aliyuncs.com|real_99_dormdealerpurchase|total

99_dormdealerpurchaseitem|db99hx|ipaddress.mysql.rds.aliyuncs.com|real_99_dormdealerpurchaseitem|increment

 

3、源系统连接用户密码信息

[hs@master list_dir]$ catsrc_userpass_info.list

ipaddress.mysql.rds.aliyuncs.com|dbreader|passwdchar

ipaddress.mysql.rds.aliyuncs.com|dbreader|passwdchar

ipaddress.mysql.rds.aliyuncs.com|dbreader|passwdchar

ipaddress.mysql.rds.aliyuncs.com|dbreader|passwdchar

ipaddress.mysql.rds.aliyuncs.com|dbreader|passwdchar

 

4、数据仓库连接用户密码信息

[hs@master list_dir]$ catdw_dbuserpass_info.list

ipaddress.mysql.rds.aliyuncs.com|datauser|passwdchar|sor

ipaddress.mysql.rds.aliyuncs.com|datauser|passwdchar|qrt

 

5、用户跑批清单处理脚本

[hs@master script_generate]$ catetl_table_list_update.sh

#!/bin/bash

 

#batch per day script proc ...

sort -t "|" -k3,3/home/hs/opt/dw-etl/script_generate/batch_table.list >/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_sort.list

join/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_sort.list/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/src_userpass_info.list-t "|" -1 3 -2 1 -o 1.1,1.2,1.3,1.4,1.5,2.2,2.3 >/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list

rm -rf/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_sort.list

 

#batch real time script proc ...

sort -t "|" -k3,3/home/hs/opt/dw-etl/script_generate/real_table.list >/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_sort.list

join/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_sort.list/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/src_userpass_info.list-t "|" -1 3 -2 1 -o 1.1,1.2,1.3,1.4,1.5,2.2,2.3 >/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list

rm -rf/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_sort.list

 

#get instance info for edit userpass

cat/home/hs/opt/dw-etl/script_generate/batch_table.list/home/hs/opt/dw-etl/script_generate/real_table.list|awk  -F "|"  '{print $3}'| sort | uniq >/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/table_instance_info.list

 

三、数据仓库第一层SSA脚本生成

数据仓库第一层是基于hive库进行建表语句及数据装载语句的生成。

1ssa生成hive建表语句的脚本

[hs@master script_generate]$ catssa-create_hivetab_script.sh

#!/bin/bash

#export yesterday=`date -d last-day+%Y%m%d`

 

rm -rf /home/hs/opt/dw-etl/script_generate/script_dir/hive-ssa_create_tab_script

if [ ! -d/home/hs/opt/dw-etl/script_generate/script_dir/hive-ssa_create_tab_script];then

  mkdir -p/home/hs/opt/dw-etl/script_generate/script_dir/hive-ssa_create_tab_script

fi

 

i=1

 

for tab in $(cat /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

do

 

col_num=$i

tab_name=$(awk -F "|"'NR=='$col_num' {print $1}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

db_name=$(awk -F "|"'NR=='$col_num' {print $2}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

db_ip=$(awk -F "|"'NR=='$col_num' {print $3}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

rds_map_tab_name=$(awk -F "|" 'NR=='$col_num'{print $4}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

load_type=$(awk -F "|"'NR=='$col_num' {print $5}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

user_name=$(awk -F "|"'NR=='$col_num' {print $6}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

passwd=$(awk -F "|"'NR=='$col_num' {print $7}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

 

echo -e `mysql -h$db_ip -u$user_name-p$passwd -N -e"set session group_concat_max_len=20000000;

select CONCAT('use ssa;

drop table if exists\\\`',table_name,'\\\`;

create table \\\`',table_name,'\\\`(',

GROUP_CONCAT('\n\\\`',COLUMN_NAME,'\\\`  ',DATA_TYPE,''),

')

row format delimited

fields terminated by \'|\'

lines terminated by \'"\\n"\'

stored as textfile;')

from (

select

COLUMN_NAME,table_name,

case when DATA_TYPEin('int','bigint','mediumint','smallint','tinyint') and COLUMN_NAME not like'%time' then 'bigint' when DATA_TYPE in('varchar','char') thenCONCAT('varchar','(',CHARACTER_MAXIMUM_LENGTH*3,')') when DATA_TYPEin('decimal') thenCONCAT('decimal','(',NUMERIC_PRECISION*2,',',NUMERIC_SCALE*2,')') whenDATA_TYPE in('text','enum') then 'string' when COLUMN_NAME like '%time' anddata_type = 'int' then 'varchar(19)' else DATA_TYPE end data_type

from information_schema.COLUMNS

where TABLE_SCHEMA='$db_name' andtable_name='$tab_name') a1;"` >/home/hs/opt/dw-etl/script_generate/script_dir/hive-ssa_create_tab_script/$tab_name.hql

 

: $(( i++ ))

 

done

 

sed -i "s/'n'/'\\\n'/g" `grep"lines terminated by" -rl/home/hs/opt/dw-etl/script_generate/script_dir/hive-ssa_create_tab_script`

 

 

2ssa生成从源系统拉取数据并装载到hive上的脚本

[hs@master script_generate]$ catssa-create_src2ssa_script.sh

#!/bin/bash

#export yesterday=`date -d last-day+%Y%m%d`

 

rm -rf/home/hs/opt/dw-etl/etl-script/ssa-increment/home/hs/opt/dw-etl/etl-script/ssa-total

 

i=1

 

for tab in $(cat/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

do

 

col_num=$i

 

tab_name=$(awk -F "|"'NR=='$col_num' {print $1}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

db_name=$(awk -F "|"'NR=='$col_num' {print $2}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

db_ip=$(awk -F "|"'NR=='$col_num' {print $3}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

rds_map_tab_name=$(awk -F "|"'NR=='$col_num' {print $4}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

load_type=$(awk -F "|"'NR=='$col_num' {print $5}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

user_name=$(awk -F "|"'NR=='$col_num' {print $6}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

passwd=$(awk -F "|"'NR=='$col_num' {print $7}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

 

 

if [ ! -d /home/hs/opt/dw-etl/etl-script/ssa-$load_type/$db_name];then

  mkdir -p /home/hs/opt/dw-etl/etl-script/ssa-$load_type/$db_name

fi

 

rm -rf/home/hs/opt/dw-etl/etl-script/ssa-$load_type/$db_name/src2ssa_$tab_name.sh

 

if [ $load_type == 'total' ];then

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

selectCONCAT('#!/bin/bashhuanhang#$tab_name total data proc...huanhang/usr/local/bin/mysql -h$db_ip -u$user_name -p$passwd -N -e\"'

,'select CONCAT(',trim(TRAILING ',\'|\''FROM GROUP_CONCAT(CONCAT('ifnull(',case when data_type in ('varchar') thenconcat('replace(replace(replace(\\\\\`',COLUMN_NAME,'\\\\\`,char(13),\'\'),char(10),\'\'),\'|\',\'\')')when COLUMN_NAME like '%time' and data_type = 'int' then concat('from_unixtime(\\\\\`',COLUMN_NAME,'\\\\\`,\'%Y-%m-%d%H:%i:%s\')') else concat('\\\\\`',COLUMN_NAME,'\\\\\`')end,',\'\')',',\'|\''))) ,') from ',TABLE_SCHEMA,'.',table_name,';'

,'\">/home/hs/opt/dw-etl/data/',TABLE_SCHEMA,'.',table_name,'_total_\$1.dathuanhang'

,'/home/hs/opt/hive-1.2.1/bin/hive -e\"use ssa;load data local inpath\'/home/hs/opt/dw-etl/data/',TABLE_SCHEMA,'.',table_name,'_total_\$1.dat\'overwrite into table ',table_name,';\"') mysql_export_and_hive_load_shell

from information_schema.COLUMNS

where TABLE_SCHEMA='$db_name' andtable_name='$tab_name';"  >>/home/hs/opt/dw-etl/etl-script/ssa-$load_type/$db_name/src2ssa_$tab_name.sh

 

elif [ $load_type == 'increment' ];then

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

selectCONCAT('#!/bin/bashhuanhang#$tab_name data proc ...huanhang/usr/local/bin/mysql-h$db_ip -u$user_name -p$passwd -N -e\"'

,'select CONCAT(',trim(TRAILING ',\'|\''FROM GROUP_CONCAT(CONCAT('ifnull(',case when data_type in ('varchar') then concat('replace(replace(replace(\\\\\`',COLUMN_NAME,'\\\\\`,char(13),\'\'),char(10),\'\'),\'|\',\'\')')when COLUMN_NAME like '%time' and data_type = 'int' thenconcat('from_unixtime(\\\\\`',COLUMN_NAME,'\\\\\`,\'%Y-%m-%d %H:%i:%s\')') elseconcat('\\\\\`',COLUMN_NAME,'\\\\\`') end,',\'\')',',\'|\''))) ,') from',TABLE_SCHEMA,'.',table_name,' where timeline between str_to_date(\'\$103:00:00\',\'%Y-%m-%d %H:%i:%s\') and str_to_date(\'\$2 02:59:59\',\'%Y-%m-%d%H:%i:%s\');'

,'\">/home/hs/opt/dw-etl/data/',TABLE_SCHEMA,'.',table_name,'_\$1.dathuanhang'

,'/home/hs/opt/hive-1.2.1/bin/hive -e\"use ssa;load data local inpath\'/home/hs/opt/dw-etl/data/',TABLE_SCHEMA,'.',table_name,'_\$1.dat\' into table',table_name,';\"') mysql_export_and_hive_load_shell

from information_schema.COLUMNS

where TABLE_SCHEMA='$db_name' andtable_name='$tab_name';"  >>/home/hs/opt/dw-etl/etl-script/ssa-increment/$db_name/src2ssa_$tab_name.sh

 

fi

 

sed -i "s/\\\\\`/\\\`/g"/home/hs/opt/dw-etl/etl-script/ssa-$load_type/$db_name/src2ssa_$tab_name.sh

 

#echo -e \\n >>/home/hs/opt/dw-etl/etl-script/ssa-$load_type/$db_name/src2ssa_$tab_name.sh

 

sed -i "s/huanhang/\n/g" `grep"mysql" -rl /home/hs/opt/dw-etl/etl-script/ssa-$load_type/$db_name/`

 

chmod 777/home/hs/opt/dw-etl/etl-script/ssa-$load_type/$db_name/src2ssa_$tab_name.sh

 

: $(( i++ ))

 

done

 

#sed -i "s/huanhang/\n/g" `grep"mysql" -rl /home/hs/opt/dw-etl/etl-script/ssa-$load_type/$db_name/`

 

 

3ssa源数据增量数据拉取的初始化脚本生成

[hs@master script_generate]$ catssa-create_src2ssa_increment_int_script.sh

#!/bin/bash

#export yesterday=`date -d last-day+%Y%m%d`

 

cat/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list|grep increment|grep -v '59_order|\|59_orderfood|\|59_dormitem|' >/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass-for-ssa-increment-int.list

rm -rf/home/hs/opt/dw-etl/etl-script/ssa-increment-int/

i=1

 

for tab in $(cat/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass-for-ssa-increment-int.list)

do

 

col_num=$i

 

tab_name=$(awk -F "|"'NR=='$col_num' {print $1}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass-for-ssa-increment-int.list)

db_name=$(awk -F "|"'NR=='$col_num' {print $2}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass-for-ssa-increment-int.list)

db_ip=$(awk -F "|"'NR=='$col_num' {print $3}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass-for-ssa-increment-int.list)

rds_map_tab_name=$(awk -F "|"'NR=='$col_num' {print $4}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass-for-ssa-increment-int.list)

load_type=$(awk -F "|"'NR=='$col_num' {print $5}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass-for-ssa-increment-int.list)

user_name=$(awk -F "|"'NR=='$col_num' {print $6}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass-for-ssa-increment-int.list)

passwd=$(awk -F "|"'NR=='$col_num' {print $7}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass-for-ssa-increment-int.list)

 

 

if [ ! -d/home/hs/opt/dw-etl/etl-script/ssa-$load_type-int/$db_name ];then

  mkdir -p /home/hs/opt/dw-etl/etl-script/ssa-$load_type-int/$db_name

fi

 

rm -rf/home/hs/opt/dw-etl/etl-script/ssa-$load_type-int/$db_name/src2ssa_$tab_name.sh

 

if [ $load_type == 'increment' ];then

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

select CONCAT('#!/bin/bashhuanhang#$tab_nametotal data proc ...huanhang/usr/local/bin/mysql -h$db_ip -u$user_name -p$passwd-N -e\"'

,'select CONCAT(',trim(TRAILING ',\'|\''FROM GROUP_CONCAT(CONCAT('ifnull(',case when data_type in ('varchar') thenconcat('replace(replace(replace(\\\\\`',COLUMN_NAME,'\\\\\`,char(13),\'\'),char(10),\'\'),\'|\',\'\')')when COLUMN_NAME like '%time' and data_type = 'int' thenconcat('from_unixtime(\\\\\`',COLUMN_NAME,'\\\\\`,\'%Y-%m-%d %H:%i:%s\')') elseconcat('\\\\\`',COLUMN_NAME,'\\\\\`') end,',\'\')',',\'|\''))) ,') from',TABLE_SCHEMA,'.',table_name,';'

,'\">/home/hs/opt/dw-etl/data/',TABLE_SCHEMA,'.',table_name,'_total_\$1.dathuanhang'

,'/home/hs/opt/hive-1.2.1/bin/hive -e\"use ssa;load data local inpath \'/home/hs/opt/dw-etl/data/',TABLE_SCHEMA,'.',table_name,'_total_\$1.dat\'overwrite into table ',table_name,';\"') mysql_export_and_hive_load_shell

from information_schema.COLUMNS

where TABLE_SCHEMA='$db_name' andtable_name='$tab_name';"  >>/home/hs/opt/dw-etl/etl-script/ssa-$load_type-int/$db_name/src2ssa_$tab_name.sh

 

fi

 

sed -i "s/\\\\\`/\\\`/g"/home/hs/opt/dw-etl/etl-script/ssa-$load_type-int/$db_name/src2ssa_$tab_name.sh

 

#echo -e \\n >>/home/hs/opt/dw-etl/etl-script/ssa-$load_type-int/$db_name/src2ssa_$tab_name.sh

 

sed -i "s/huanhang/\n/g" `grep"mysql" -rl/home/hs/opt/dw-etl/etl-script/ssa-$load_type-int/$db_name/`

 

chmod 777/home/hs/opt/dw-etl/etl-script/ssa-$load_type-int/$db_name/src2ssa_$tab_name.sh

 

: $(( i++ ))

 

done

 

rm -rf/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass-for-ssa-increment-int.list

#sed -i "s/huanhang/\n/g" `grep"mysql" -rl/home/hs/opt/dw-etl/etl-script/ssa-$load_type-int/$db_name/`

 

四、数据仓库第二层SOR脚本生成

数据仓库第二层,是在Hive上进行运算的基础上,再将数据从hive传输到rds-mysql数据库。

 

1sor贴源数据部分从ssa拉取数据(ssa全量加载的全量拉取,ssa增量摘取的生成全量数据拉取)

[hs@master script_generate]$ catsor-close2source_data_pull_from_ssa_script.sh

#/bin/bash

#export yesterday=`date -d last-day+%Y%m%d`

 

rm -rf /home/hs/opt/dw-etl/etl-script/sor/sor_close2source_data_pull_proc_from_ssa.hql

 

i=1

 

for tab in $(cat/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

do

 

col_num=$i

tab_name=$(awk -F "|"'NR=='$col_num' {print $1}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

db_name=$(awk -F "|"'NR=='$col_num' {print $2}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

db_ip=$(awk -F "|"'NR=='$col_num' {print $3}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

rds_map_tab_name=$(awk -F "|"'NR=='$col_num' {print $4}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

load_type=$(awk -F "|"'NR=='$col_num' {print $5}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

user_name=$(awk -F "|"'NR=='$col_num' {print $6}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

passwd=$(awk -F "|"'NR=='$col_num' {print $7}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

 

 

if [ $load_type == 'total' ];then

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

select CONCAT('drop table if existssor.',a1.TABLE_NAME,';create table sor.',a1.TABLE_NAME,' as select * fromssa.',a1.TABLE_NAME,' a1;' ) current_all_sql

from information_schema.TABLES a1

left joininformation_schema.KEY_COLUMN_USAGE a2

on a1.TABLE_NAME=a2.TABLE_NAME anda1.TABLE_SCHEMA=a2.TABLE_SCHEMA and a2.CONSTRAINT_name='PRIMARY'

where a1.TABLE_SCHEMA ='$db_name' anda1.table_name ='$tab_name';" >>/home/hs/opt/dw-etl/etl-script/sor/sor_close2source_data_pull_proc_from_ssa.hql

 

elif [ $load_type == 'increment' ];then

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

select CONCAT('drop table if existssor.',a1.TABLE_NAME,';create table sor.',a1.TABLE_NAME,' as select * from(select a1.*,row_number() over(partition by a1.',a2.column_name,' order bya1.timeline desc)as rn from ssa.',a1.TABLE_NAME,' a1) b1 where b1.rn=1;' )current_all_sql

from information_schema.TABLES a1

left joininformation_schema.KEY_COLUMN_USAGE a2

on a1.TABLE_NAME=a2.TABLE_NAME anda1.TABLE_SCHEMA=a2.TABLE_SCHEMA and a2.CONSTRAINT_name='PRIMARY'

where a1.TABLE_SCHEMA ='$db_name' anda1.table_name ='$tab_name';" >>/home/hs/opt/dw-etl/etl-script/sor/sor_close2source_data_pull_proc_from_ssa.hql

 

fi

 

#echo -e \\n >>/home/hs/opt/dw-etl/etl-script/sor/sor_close2source_data_pull_proc_from_ssa.hql

 

: $(( i++ ))

 

done

 

 

2sorrds库贴源部分建表语句的生成

[hs@master script_generate]$ catsor-close2source_create_rdstab_script.sh

#!/bin/bash

#export yesterday=`date -d last-day+%Y%m%d`

 

rm -rf/home/hs/opt/dw-etl/script_generate/script_dir/rds-sor_close2source_create_tab_script

if [ ! -d/home/hs/opt/dw-etl/script_generate/script_dir/rds-sor_close2source_create_tab_script];then

  mkdir -p /home/hs/opt/dw-etl/script_generate/script_dir/rds-sor_close2source_create_tab_script

fi

 

i=1

 

for tab in $(cat/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

do

 

col_num=$i

tab_name=$(awk -F "|"'NR=='$col_num' {print $1}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

db_name=$(awk -F "|"'NR=='$col_num' {print $2}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

db_ip=$(awk -F "|"'NR=='$col_num' {print $3}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

rds_map_tab_name=$(awk -F "|"'NR=='$col_num' {print $4}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

load_type=$(awk -F "|"'NR=='$col_num' {print $5}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

user_name=$(awk -F "|"'NR=='$col_num' {print $6}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

passwd=$(awk -F "|"'NR=='$col_num' {print $7}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

 

echo -e `mysql -h$db_ip -u$user_name-p$passwd -N -e"set session group_concat_max_len=20000000;

select CONCAT('use sor;

drop table if exists\\\`',table_name,'\\\`;

create table \\\`',table_name,'\\\`(',

GROUP_CONCAT('\n\\\`',COLUMN_NAME,'\\\`  ',DATA_TYPE,''),

') ENGINE=InnoDB DEFAULT CHARSET=utf8;')

from (

select

COLUMN_NAME,'$rds_map_tab_name' table_name,

case when DATA_TYPEin('int','bigint','mediumint','smallint','tinyint') and COLUMN_NAME not like'%time' then 'bigint' when DATA_TYPE in('varchar','char') thenCONCAT('varchar','(',CHARACTER_MAXIMUM_LENGTH*3,')') when DATA_TYPEin('decimal') thenCONCAT('decimal','(',NUMERIC_PRECISION*2,',',NUMERIC_SCALE*2,')') whenDATA_TYPE in('text','enum') then 'text' when COLUMN_NAME like '%time' anddata_type = 'int' then 'timestamp' else DATA_TYPE end data_type

from information_schema.COLUMNS

where TABLE_SCHEMA='$db_name' andtable_name='$tab_name') a1;"` > /home/hs/opt/dw-etl/script_generate/script_dir/rds-sor_close2source_create_tab_script/$tab_name.sql

 

: $(( i++ ))

 

done

 

#sed -i "s/'n'/'\\\n'/g" `grep"lines terminated by" -rl/home/hs/opt/dw-etl/script_generate/script_dir/rds-sor_close2source_create_tab_script`

 

 

3sor贴源数据从hive传送到rds-mysql

[hs@master script_generate]$ catsor-close2source_create_hive2mysql_script.sh

#!/bin/bash

#export yesterday=`date -d last-day+%Y-%m-%d`

 

rm -rf/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/sor-hive2mysql_close2source.sh

echo "#!/bin/bash

#export yesterday=\`date -d last-day+%Y-%m-%d\`

export yesterday=\$1

 

" >/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/sor-hive2mysql_close2source.sh

 

#hive load data sync mysql script generate...

i=1

 

for tab in $(cat/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

do

 

col_num=$i

 

tab_name=$(awk -F "|"'NR=='$col_num' {print $1}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

db_name=$(awk -F "|"'NR=='$col_num' {print $2}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

db_ip=$(awk -F "|"'NR=='$col_num' {print $3}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

rds_map_tab_name=$(awk -F "|"'NR=='$col_num' {print $4}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

load_type=$(awk -F "|"'NR=='$col_num' {print $5}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

user_name=$(awk -F "|"'NR=='$col_num' {print $6}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

passwd=$(awk -F "|"'NR=='$col_num' {print $7}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list)

 

if [ $load_type == 'increment' ];then

 

if [ $tab_name == '59_order' ];then

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

select

concat('#',t.TABLE_NAME,' table current alldata increment download from hive and upload to sorrds...huanhang/home/hs/opt/hive-1.2.1/bin/hive -e \"use sor;insertoverwrite local directory\'/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/data-sor/',t.TABLE_NAME,'\' rowformat delimited fields terminated by \'|\' selectorder_id,status,type,paytype,paystatus,pay_trade_no,source,consumption_type,suspicious,sid,site_id,dorm_id,dormentry_id,shop_id,uid,service_eva,delivery_eva,food_eva,food_num,food_amount,ship_fee,coupon_discount,promotion_discount,discount,order_amount,delivery_id,add_time,confirm_time,send_time,expect_date,delivery_type,expect_time,expect_timeslot,order_mark,uname,portrait,phone,phone_addr,buy_times,address1,address2,dormitory,time_deliver,credit,ip,coupon_code,feature,remark,evaluation,expect_start_time,expect_end_time,timeline,unix_timestamp(add_time)add_unixtime,unix_timestamp(send_time) send_unixtime from ',t.TABLE_NAME,'where timeline >= \'\$yesterday03:00:00\';\"huanhang/usr/local/bin/mysql-hrdsipaddress.mysql.rds.aliyuncs.com -udatauser -ppasswordchar -e \"usetmp;drop table if exists ','$rds_map_tab_name','_today;create table','$rds_map_tab_name','_today as select * from sor.','$rds_map_tab_name','where 1=2;\"huanhang#loop load dir file to rds huanhangfor tabdt_path in/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/data-sor/',t.TABLE_NAME,'/*;huanhangdohuanhang/usr/local/bin/mysql-hrdsipaddress.mysql.rds.aliyuncs.com -udatauser -ppasswordchar -e \"usetmp;load data local infile \'\$tabdt_path\' into table','$rds_map_tab_name','_today fields terminated by \'|\' enclosed by \'\' linesterminated by \'\\n\' ignore 0lines;\"huanhangdonehuanhang/usr/local/bin/mysql-hrdsipaddress.mysql.rds.aliyuncs.com -udatauser -ppasswordchar -e \"usetmp;alter table tmp.','$rds_map_tab_name','_today add indexidx_','$rds_map_tab_name','_',k.COLUMN_NAME,' (',k.COLUMN_NAME,') usingbtree;delete ca.* from sor.','$rds_map_tab_name',' ca left jointmp.','$rds_map_tab_name','_today i on ca.',k.COLUMN_NAME,' =i.',k.COLUMN_NAME,' where i.',k.COLUMN_NAME,' is not null;insert intosor.','$rds_map_tab_name',' select * fromtmp.','$rds_map_tab_name','_today;\"huanhang') sql_text

from information_schema.TABLES t

left joininformation_schema.KEY_COLUMN_USAGE k

on t.TABLE_SCHEMA=k.TABLE_SCHEMA andt.TABLE_NAME=k.TABLE_NAME and k.CONSTRAINT_name='PRIMARY'

where t.TABLE_SCHEMA='$db_name' andt.TABLE_NAME='$tab_name';"  >>/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/sor-hive2mysql_close2source.sh

 

else

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

select

concat('#',t.TABLE_NAME,' table current alldata increment download from hive and upload to sorrds...huanhang/home/hs/opt/hive-1.2.1/bin/hive -e \"use sor;insert overwritelocal directory\'/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/data-sor/',t.TABLE_NAME,'\' rowformat delimited fields terminated by \'|\' select * from ',t.TABLE_NAME,'where timeline >= \'\$yesterday 03:00:00\';\"huanhang/usr/local/bin/mysql-hrdsipaddress.mysql.rds.aliyuncs.com -udatauser -ppasswordchar -e \"usetmp;drop table if exists ','$rds_map_tab_name','_today;create table','$rds_map_tab_name','_today as select * from sor.','$rds_map_tab_name','where 1=2;\"huanhang#loop load dir file to rds huanhangfor tabdt_path in/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/data-sor/',t.TABLE_NAME,'/*;huanhangdohuanhang/usr/local/bin/mysql-hrdsipaddress.mysql.rds.aliyuncs.com -udatauser -ppasswordchar -e \"usetmp;load data local infile \'\$tabdt_path\' into table','$rds_map_tab_name','_today fields terminated by \'|\' enclosed by \'\' linesterminated by \'\\n\' ignore 0lines;\"huanhangdonehuanhang/usr/local/bin/mysql-hrdsipaddress.mysql.rds.aliyuncs.com -udatauser -ppasswordchar -e \"usetmp;alter table tmp.','$rds_map_tab_name','_today add indexidx_','$rds_map_tab_name','_',k.COLUMN_NAME,' (',k.COLUMN_NAME,') usingbtree;delete ca.* from sor.','$rds_map_tab_name',' ca left jointmp.','$rds_map_tab_name','_today i on ca.',k.COLUMN_NAME,' = i.',k.COLUMN_NAME,'where i.',k.COLUMN_NAME,' is not null;insert into sor.','$rds_map_tab_name','select * from tmp.','$rds_map_tab_name','_today;\"huanhang') sql_text

from information_schema.TABLES t

left joininformation_schema.KEY_COLUMN_USAGE k

on t.TABLE_SCHEMA=k.TABLE_SCHEMA andt.TABLE_NAME=k.TABLE_NAME and k.CONSTRAINT_name='PRIMARY'

where t.TABLE_SCHEMA='$db_name' andt.TABLE_NAME='$tab_name';"  >>/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/sor-hive2mysql_close2source.sh

 

fi

 

elif [ $load_type == 'total' ];then

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

select

concat('#',t.TABLE_NAME,' table datadownload from hive and upload to sorrds...huanhang/home/hs/opt/hive-1.2.1/bin/hive -e \"use sor;insertoverwrite local directory\'/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/data-sor/',t.TABLE_NAME,'\' rowformat delimited fields terminated by \'|\' select * from',t.TABLE_NAME,';\"huanhang/usr/local/bin/mysql-hrdsipaddress.mysql.rds.aliyuncs.com -udatauser -ppasswordchar -e \"usesor;truncate table ','$rds_map_tab_name',';\"huanhang#loop load dir fileto rds huanhangfor tabdt_path in/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/data-sor/',t.TABLE_NAME,'/*;huanhangdohuanhang/usr/local/bin/mysql-hrdsipaddress.mysql.rds.aliyuncs.com -udatauser -ppasswordchar -e \"usesor;load data local infile \'\$tabdt_path\' into table ','$rds_map_tab_name','fields terminated by \'|\' enclosed by \'\' lines terminated by \'\\n\' ignore0 lines;\"huanhangdonehuanhang') sql_text

from information_schema.TABLES t

where t.TABLE_SCHEMA='$db_name' andt.TABLE_NAME='$tab_name';"  >>/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/sor-hive2mysql_close2source.sh

 

fi

 

#echo -e \\n >>/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/sor-hive2mysql_close2source.sh

 

: $(( i++ ))

 

done

 

#script file final proc ...

sed -i 's/huanhang/\n/g'/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/sor-hive2mysql_close2source.sh

chmod 777/home/hs/opt/dw-etl/etl-script/sor-hive2mysql/sor-hive2mysql_close2source.sh

 

4sor增量数据rds表应建索引的脚本生成

[hs@master script_generate]$ catrds-sor-increment_tab_should_create_idx_sql.sh

#!/bin/bash

#export yesterday=`date -d last-day+%Y-%m-%d`

 

 

cat/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/batch_table_userpass.list|grep increment >/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/rds-sor-batch_table_userpass-for-closesource_create_index.list

 

rm -rf /home/hs/opt/dw-etl/script_generate/script_dir/rds-sor_increment_tab_should_create_idx.sql

 

#rds sor increment load table should createindex script generate ...

i=1

 

for tab in $(cat/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/rds-sor-batch_table_userpass-for-closesource_create_index.list)

do

 

col_num=$i

 

tab_name=$(awk -F "|"'NR=='$col_num' {print $1}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/rds-sor-batch_table_userpass-for-closesource_create_index.list)

db_name=$(awk -F "|"'NR=='$col_num' {print $2}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/rds-sor-batch_table_userpass-for-closesource_create_index.list)

db_ip=$(awk -F "|"'NR=='$col_num' {print $3}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/rds-sor-batch_table_userpass-for-closesource_create_index.list)

rds_map_tab_name=$(awk -F "|"'NR=='$col_num' {print $4}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/rds-sor-batch_table_userpass-for-closesource_create_index.list)

load_type=$(awk -F "|"'NR=='$col_num' {print $5}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/rds-sor-batch_table_userpass-for-closesource_create_index.list)

user_name=$(awk -F "|"'NR=='$col_num' {print $6}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/rds-sor-batch_table_userpass-for-closesource_create_index.list)

passwd=$(awk -F "|"'NR=='$col_num' {print $7}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/rds-sor-batch_table_userpass-for-closesource_create_index.list)

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

select

concat('use ssa;alter tablessa.',t.TABLE_NAME,' add index idx_',t.TABLE_NAME,'_',k.COLUMN_NAME,'(',k.COLUMN_NAME,') using btree;') sql_text

from information_schema.TABLES t

left joininformation_schema.KEY_COLUMN_USAGE k

on t.TABLE_SCHEMA=k.TABLE_SCHEMA andt.TABLE_NAME=k.TABLE_NAME and k.CONSTRAINT_name='PRIMARY'

where t.TABLE_SCHEMA='$db_name' andt.TABLE_NAME='$tab_name';"  >>/home/hs/opt/dw-etl/script_generate/script_dir/rds-sor_increment_tab_should_create_idx.sql

 

#echo -e \\n >>/home/hs/opt/dw-etl/script_generate/script_dir/rds-sor_increment_tab_should_create_idx.sql

 

: $(( i++ ))

 

done

 

#script file final proc ...

sed -i 's/huanhang/\n/g'/home/hs/opt/dw-etl/script_generate/script_dir/rds-sor_increment_tab_should_create_idx.sql

 

#delete temp list file

rm -rf/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/rds-sor-batch_table_userpass-for-closesource_create_index.list

 

 

五、数据仓库qrt准实时数据脚本生成

1qrt准实时数据rds库建表脚本的生成

[hs@master script_generate]$ catqrt-create_rdstab_script.sh

#!/bin/bash

#export yesterday=`date -d last-day+%Y%m%d`

 

#script dir int ...

rm -rf/home/hs/opt/dw-etl/script_generate/script_dir/rds-qrt_create_tab_script

mkdir -p /home/hs/opt/dw-etl/script_generate/script_dir/rds-qrt_create_tab_script

 

i=1

 

for tab in $(cat/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

do

 

col_num=$i

 

tab_name=$(awk -F "|"'NR=='$col_num' {print $1}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

db_name=$(awk -F "|"'NR=='$col_num' {print $2}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

db_ip=$(awk -F "|"'NR=='$col_num' {print $3}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

rds_map_tab_name=$(awk -F "|"'NR=='$col_num' {print $4}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

load_type=$(awk -F "|" 'NR=='$col_num'{print $5}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

user_name=$(awk -F "|"'NR=='$col_num' {print $6}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

passwd=$(awk -F "|" 'NR=='$col_num'{print $7}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

 

echo -e `mysql -h$db_ip -u$user_name-p$passwd -N -e"set session group_concat_max_len=20000000;

select CONCAT('use qrt;

drop table if exists \\\`',table_name,'\\\`;

create table \\\`',table_name,'\\\`(',

GROUP_CONCAT('\n\\\`',COLUMN_NAME,'\\\`  ',DATA_TYPE,''),

') ENGINE=InnoDB DEFAULT CHARSET=utf8;')

from (

select

COLUMN_NAME,'$rds_map_tab_name' table_name,

case when DATA_TYPE in('int','bigint','mediumint','smallint','tinyint')and COLUMN_NAME not like '%time' then 'bigint' when DATA_TYPEin('varchar','char') then CONCAT('varchar','(',CHARACTER_MAXIMUM_LENGTH*3,')')when DATA_TYPE in('decimal') thenCONCAT('decimal','(',NUMERIC_PRECISION*2,',',NUMERIC_SCALE*2,')') whenDATA_TYPE in('text','enum') then 'text' when COLUMN_NAME like '%time' anddata_type = 'int' then 'timestamp' else DATA_TYPE end data_type

from information_schema.COLUMNS

where TABLE_SCHEMA='$db_name' andtable_name='$tab_name') a1;"` >/home/hs/opt/dw-etl/script_generate/script_dir/rds-qrt_create_tab_script/$tab_name.sql

 

: $(( i++ ))

 

done

 

#sed -i "s/'n'/'\\\n'/g" `grep"lines terminated by" -rl/home/hs/opt/dw-etl/script_generate/script_dir/rds-qrt_create_tab_script`

 

 

2qrt准实时数据从源生产系统拉取数据到rds-qrt数据库

[hs@master script_generate]$ catqrt-create_src2qrt_script.sh

#!/bin/bash

export today=`date +%Y-%m-%d`

export yesterday=`date -d last-day+%Y-%m-%d`

export beforeytd=`date -d "2 daysago" +%Y-%m-%d`

export ytd=${yesterday//-/}

 

rm -rf/home/hs/opt/dw-etl/rds_qrt/src2qrt_script/

 

i=1

 

for tab in $(cat/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

do

 

col_num=$i

 

tab_name=$(awk -F "|"'NR=='$col_num' {print $1}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

db_name=$(awk -F "|"'NR=='$col_num' {print $2}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

db_ip=$(awk -F "|"'NR=='$col_num' {print $3}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

rds_map_tab_name=$(awk -F "|"'NR=='$col_num' {print $4}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

load_type=$(awk -F "|"'NR=='$col_num' {print $5}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

user_name=$(awk -F "|"'NR=='$col_num' {print $6}'/home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

passwd=$(awk -F "|"'NR=='$col_num' {print $7}' /home/hs/opt/dw-etl/script_generate/script_dir/list_dir/real_table_userpass.list)

 

if [ ! -d/home/hs/opt/dw-etl/rds_qrt/src2qrt_script/$db_name ];then

  mkdir -p /home/hs/opt/dw-etl/rds_qrt/src2qrt_script/$db_name

fi

 

rm -rf /home/hs/opt/dw-etl/rds_qrt/src2qrt_script/$db_name/src2qrt_$tab_name.sh

 

if [ $load_type == 'total' ];then

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

selectCONCAT('#!/bin/bashhuanhang#$tab_name total data proc...huanhang/usr/local/bin/mysql -h$db_ip -u$user_name -p$passwd -N -e\"'

,'select CONCAT(',trim(TRAILING ',\'|\''FROM GROUP_CONCAT(CONCAT('ifnull(',case when data_type in ('varchar') thenconcat('replace(replace(replace(\\\\\`',COLUMN_NAME,'\\\\\`,char(13),\'\'),char(10),\'\'),\'|\',\'\')')when COLUMN_NAME like '%time' and data_type = 'int' thenconcat('from_unixtime(\\\\\`',COLUMN_NAME,'\\\\\`,\'%Y-%m-%d %H:%i:%s\')') elseconcat('\\\\\`',COLUMN_NAME,'\\\\\`') end,',\'\')',',\'|\''))) ,') from',TABLE_SCHEMA,'.',table_name,';'

,'\">/home/hs/opt/dw-etl/rds_qrt/real_data_dir/',TABLE_SCHEMA,'.',table_name,'_rt.dathuanhang'

,'') mysql_export_and_hive_load_shell

from information_schema.COLUMNS

where TABLE_SCHEMA='$db_name' andtable_name='$tab_name';"  >>/home/hs/opt/dw-etl/rds_qrt/src2qrt_script/$db_name/src2qrt_$tab_name.sh

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

select

concat('huanhang#','$rds_map_tab_name','table data download from src and upload to qrt rds...huanhang/usr/local/bin/mysql-hrdsipaddress.mysql.rds.aliyuncs.com -udatauser -ppasswordchar -e \"useqrt;truncate table ','$rds_map_tab_name',';\"huanhang#loop load dir fileto rds huanhang/usr/local/bin/mysql -hrdsipaddress.mysql.rds.aliyuncs.com-udatauser -ppasswordchar -e \"use qrt;load data local infile\'/home/hs/opt/dw-etl/rds_qrt/real_data_dir/','$db_name','.','$tab_name','_rt.dat\'into table ','$rds_map_tab_name',' fields terminated by \'|\' enclosed by \'\'lines terminated by \'\\n\' ignore 0 lines;\"huanhang') sql_text

from information_schema.TABLES t

where t.TABLE_SCHEMA='$db_name' andt.TABLE_NAME='$tab_name';"  >>/home/hs/opt/dw-etl/rds_qrt/src2qrt_script/$db_name/src2qrt_$tab_name.sh

 

elif [ $load_type == 'increment' ];then

 

echo $tab_name,data is increment!

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

select CONCAT('#!/bin/bashhuanhangexportlast_maxtime=\`/usr/local/bin/mysql -hrdsipaddress.mysql.rds.aliyuncs.com-udatauser -ppasswordchar -N -e \"use qrt;select max(timeline) max_timefrom qrt.','$rds_map_tab_name',';\"\`huanhang#$tab_name data proc...huanhang/usr/local/bin/mysql -h$db_ip -u$user_name -p$passwd -N -e\"'

,'select CONCAT(',trim(TRAILING ',\'|\''FROM GROUP_CONCAT(CONCAT('ifnull(',case when data_type in ('varchar') thenconcat('replace(replace(replace(\\\\\`',COLUMN_NAME,'\\\\\`,char(13),\'\'),char(10),\'\'),\'|\',\'\')')when COLUMN_NAME like '%time' and data_type = 'int' thenconcat('from_unixtime(\\\\\`',COLUMN_NAME,'\\\\\`,\'%Y-%m-%d %H:%i:%s\')') elseconcat('\\\\\`',COLUMN_NAME,'\\\\\`') end,',\'\')',',\'|\''))) ,') from',TABLE_SCHEMA,'.',table_name,' where timeline >= \'\$last_maxtime\';'

,'\">/home/hs/opt/dw-etl/rds_qrt/real_data_dir/',TABLE_SCHEMA,'.',table_name,'_rt.dathuanhang'

,'') mysql_export_and_hive_load_shell

from information_schema.COLUMNS

where TABLE_SCHEMA='$db_name' and table_name='$tab_name';"  >>/home/hs/opt/dw-etl/rds_qrt/src2qrt_script/$db_name/src2qrt_$tab_name.sh

 

mysql -h$db_ip -u$user_name -p$passwd -N -e"set session group_concat_max_len=20000000;

select

concat('#','$rds_map_tab_name',' tablecurrent all data increment download from hive and upload to qrtrds...huanhang/usr/local/bin/mysql -hrdsipaddress.mysql.rds.aliyuncs.com-udatauser -ppasswordchar -e \"use tmp;drop table if exists','$rds_map_tab_name','_minutes;create table ','$rds_map_tab_name','_minutes asselect * from qrt.','$rds_map_tab_name',' where 1=2;\"huanhang#loop loaddir file to rds huanhang/usr/local/bin/mysql-hrdsipaddress.mysql.rds.aliyuncs.com -udatauser -ppasswordchar -e \"usetmp;load data local infile \'/home/hs/opt/dw-etl/rds_qrt/real_data_dir/','$db_name','.','$tab_name','_rt.dat\'into table ','$rds_map_tab_name','_minutes fields terminated by \'|\' enclosedby \'\' lines terminated by \'\\n\' ignore 0lines;\"huanhang/usr/local/bin/mysql -hrdsipaddress.mysql.rds.aliyuncs.com-udatauser -ppasswordchar -e \"use tmp;alter tabletmp.','$rds_map_tab_name','_minutes add indexidx_','$rds_map_tab_name','_',k.COLUMN_NAME,' (',k.COLUMN_NAME,') usingbtree;delete ca.* from qrt.','$rds_map_tab_name',' ca left jointmp.','$rds_map_tab_name','_minutes i on ca.',k.COLUMN_NAME,' =i.',k.COLUMN_NAME,' where i.',k.COLUMN_NAME,' is not null;insert intoqrt.','$rds_map_tab_name',' select * fromtmp.','$rds_map_tab_name','_minutes;\"huanhang') sql_text

from information_schema.TABLES t

left join information_schema.KEY_COLUMN_USAGEk

on t.TABLE_SCHEMA=k.TABLE_SCHEMA andt.TABLE_NAME=k.TABLE_NAME and k.CONSTRAINT_name='PRIMARY'

where t.TABLE_SCHEMA='$db_name' andt.TABLE_NAME='$tab_name';"  >>/home/hs/opt/dw-etl/rds_qrt/src2qrt_script/$db_name/src2qrt_$tab_name.sh

 

fi

 

sed -i "s/\\\\\`/\\\`/g"/home/hs/opt/dw-etl/rds_qrt/src2qrt_script/$db_name/src2qrt_$tab_name.sh

 

#echo -e \\n >>/home/hs/opt/dw-etl/rds_qrt/src2qrt_script/$db_name/src2qrt_$tab_name.sh

 

sed -i "s/huanhang/\n/g" `grep"mysql" -rl /home/hs/opt/dw-etl/rds_qrt/src2qrt_script/$db_name/`

 

chmod 777/home/hs/opt/dw-etl/rds_qrt/src2qrt_script/$db_name/src2qrt_$tab_name.sh

 

: $(( i++ ))

 

done

 

#sed -i "s/huanhang/\n/g" `grep"mysql" -rl /home/hs/opt/dw-etl/rds_qrt/src2qrt_script/$db_name/`

 

六、数据库建库执行脚本

1hivessa建表语句的执行

[hs@master script_generate]$ catexec_hivessa_create_tab.sh

#/bin/bash

export yesterday=`date -d last-day +%Y%m%d`

 

cd/home/hs/opt/dw-etl/script_generate/script_dir/hive-ssa_create_tab_script

for create_tab_script in *.hql

do

hive -f $create_tab_script

done

 

2rdssor贴源部分建表脚本的执行

[hs@master script_generate]$ catexec_rdssor_close2source_create_tab.sh

#/bin/bash

export yesterday=`date -d last-day +%Y%m%d`

 

cd/home/hs/opt/dw-etl/script_generate/script_dir/rds-sor_close2source_create_tab_script

for create_tab_script in/home/hs/opt/dw-etl/script_generate/script_dir/rds-sor_close2source_create_tab_script/*.sql

do

mysql -hrdsipaddress.mysql.rds.aliyuncs.com-udatauser -ppasswordchar -D sor < $create_tab_script

done

 

3rdsqrt建表脚本的执行

[hs@master script_generate]$ catexec_rdsqrt_create_tab.sh

#/bin/bash

export yesterday=`date -d last-day +%Y%m%d`

 

cd/home/hs/opt/dw-etl/script_generate/script_dir/rds-qrt_create_tab_script

for create_tab_script in/home/hs/opt/dw-etl/script_generate/script_dir/rds-qrt_create_tab_script/*.sql

do

mysql -hrdsipaddress.mysql.rds.aliyuncs.com-udatauser -ppasswordchar -D qrt < $create_tab_script

done

 

【后记】

经过内部多种力量的角逐,数据仓库数据的最终数据存储与传输方案大致如下:

1、将生产系统的各种数据源贴源拉取到hivessa层;

2、数据在hivesor层进行整合,未整合部分也全量拉取到sor

3、将hivesor层数据每天同步到rds-sor库;

4、数据仓库第三层dpa基于rds-sor进行数据运算及操作;

5、同时,为满足一部分的准实时数据处理需要,将real表从生产系统源库直接传送到rds-qrt数据库,供数据分析及查询部门进行使用。

0
0

查看评论
* 以上用户言论只代表其个人观点,不代表CSDN网站的观点或立场
    个人资料
    • 访问:192141次
    • 积分:3346
    • 等级:
    • 排名:第10098名
    • 原创:184篇
    • 转载:3篇
    • 译文:0篇
    • 评论:16条
    文章分类
    最新评论