不想加班到凌晨,又想一键导入昨晚的数据,那咋办呢?
定时启动Shell脚本。
废话少说,直接上:
首先先在mysql创建一个表用来测试:
mysql> create table myorder(ordid int primary key not null auto_increment,orderno varchar(20),orderdate date);
Query OK, 0 rows affected (0.02 sec)
mysql> insert into myorder(orderno,orderdate) values('dd0001','2020-6-25'),('dd0002','2020-6-25'),('dd000)3','2020-6-26'),('dd0004','2020-6-26');
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> select * from myorder;
+-------+---------+------------+
| ordid | orderno | orderdate |
+-------+---------+------------+
| 1 | dd0001 | 2020-06-25 |
| 2 | dd0002 | 2020-06-25 |
| 3 | dd0003 | 2020-06-26 |
| 4 | dd0004 | 2020-06-26 |
+-------+---------+------------+
4 rows in set (0.00 sec)
写这玩意遇到的小难点:
1. shell去除字符串中所有空格:
[root@zjw myshl]# echo '1999-9-9' | sed 's/-//g'
199999
2.Shell怎么获得当前时间 带格式:
nowdate=$(date --date='1 day ago' '+%Y-%m-%d')
3.hdfs怎么取出文件名后几个
[root@zjw myshl]# hdfs dfs -ls /mydata | awk '{print $8}' | cut -d '/' -f3 | grep -v "[0-9]\{4\})"
20/06/26 14:46:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
0610
test
大概就这几个要点,脚本正文:
#! /bin/bash
# get current date
nowdate=$(date --date='1 day ago' '+%Y-%m-%d')
# if exists mydata folder, delete folder where folder'name not confirm 8 number
hdfs dfs -test -e /mydata/
if [ $? -eq 0 ];then
files=`hdfs dfs -ls /mydata | awk '{print $8}' | cut -d '/' -f3 | grep -v "[0-9]\{4\})"`
for fl in $files
do
hdfs dfs -rmr /mydata/$fl
done
else
hdfs dfs -mkdir -p /mydata/
fi
foldername=`echo $nowdate|sed 's/-//g'`
#import mysql data to hdfs
sqoop import --connect jdbc:mysql://192.168.56.100:3306/mydemo --driver com.mysql.jdbc.Driver --username root --password 1234 --query "select * from myorder where orderdate="\'$nowdate\'" and \$CONDITIONS" --target-dir /mydata/$foldername -m 1
授权之后简单运行一下:
直接芜湖。