Testing recovery of a dropped Hive internal (managed) partitioned table:
View the data before dropping:
hive (default)> select * from test3 where statis_date='2020-05-17';
OK
clo1 clo2 clo3 clo4 statis_date
zhangsan jiangsu lisi anhui 2020-05-17
sunce dongwu daqiao dongwu 2020-05-17
zhouyu dongwu xiaoqiao dongwu 2020-05-17
wangwu shanghai zhaoliu sichuang 2020-05-17
zhugeliang sichuang fengjie hunan 2020-05-17
zhaoyun dongbei yangyi dongbei 2020-05-17
guanbu shandong diaochang anhui 2020-05-17
Time taken: 0.3 seconds, Fetched: 7 row(s)
hive (default)> drop table test3;
Moved: 'hdfs://ns/user/hive/warehouse/test3' to trash at: hdfs://ns/user/root/.Trash/Current
OK
Time taken: 1.26 seconds
When the table is dropped, we can see the message:
Moved: 'hdfs://ns/user/hive/warehouse/test3' to trash at: hdfs://ns/user/root/.Trash/Current
This shows that dropping the table actually moves the table's data files to the .Trash/Current directory and deletes the table's metadata, which is why the table can no longer be found afterwards. But if a table is dropped by mistake and not recovered promptly, the .Trash/Current directory is purged on a schedule, and the data is then truly lost.
Additional note: Hadoop's scheduled trash purge interval is configurable.
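The retention period is set in core-site.xml via fs.trash.interval (in minutes; 0 disables trash entirely). A minimal fragment — the 1440 value here is just an example, not the value used on the cluster above:

```xml
<property>
  <!-- How many minutes a deleted file stays in .Trash before being purged; 0 disables trash -->
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>
```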
Based on the location the data files were moved to, browse HDFS to find the files we need.
Run: hadoop fs -ls -R <directory>
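Listing alone does not restore anything; the data files have to be moved back out of the trash into the warehouse. A sketch of the full sequence, assuming the default paths from the drop message above (adjust to your own listing):

```shell
# Recursively list the trash to locate the dropped table's directory
# (-lsr is the older deprecated form; -ls -R is the current one)
hadoop fs -ls -R /user/root/.Trash/Current/user/hive/warehouse/test3

# Move the table directory back to its original warehouse location
hadoop fs -mv /user/root/.Trash/Current/user/hive/warehouse/test3 \
    /user/hive/warehouse/test3

# Confirm the partition directory is back in place
hadoop fs -ls /user/hive/warehouse/test3
```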
Check whether the data files have been moved back:
OK, the data files are back. Now all that remains is to recreate a table with exactly the same structure:
create table if not exists test3(
clo1 string,
clo2 string,
clo3 string,
clo4 string
) comment 'test table' partitioned by (statis_date string)
row format delimited fields terminated by '\t' stored as textfile;
If the table is not partitioned, the data can now be queried directly with SQL. If it is a partitioned table, the partitions still need to be repaired: MSCK REPAIR TABLE test3;
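If the lost partitions are known in advance, adding them explicitly is an alternative to MSCK — shown here for the single partition in this example:

```sql
ALTER TABLE test3 ADD PARTITION (statis_date='2020-05-17');
```

MSCK is more convenient when there are many partition directories on HDFS, since it scans them all and registers any that are missing from the metastore in one pass.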
hive (default)> select * from test3;
OK
clo1 clo2 clo3 clo4 statis_date
Time taken: 0.241 seconds
hive (default)> msck repair table test3;
OK
Partitions not in metastore: test3:statis_date=2020-05-17
Repair: Added partition to metastore test3:statis_date=2020-05-17
Time taken: 0.421 seconds, Fetched: 2 row(s)
hive (default)> select * from test3 where statis_date='2020-05-17';
OK
clo1 clo2 clo3 clo4 statis_date
zhangsan jiangsu lisi anhui 2020-05-17
sunce dongwu daqiao dongwu 2020-05-17
zhouyu dongwu xiaoqiao dongwu 2020-05-17
wangwu shanghai zhaoliu sichuang 2020-05-17
zhugeliang sichuang fengjie hunan 2020-05-17
zhaoyun dongbei yangyi dongbei 2020-05-17
guanbu shandong diaochang anhui 2020-05-17
Time taken: 0.27 seconds, Fetched: 7 row(s)
OK! The data has been recovered!