7. 数据装载

1、创建数据库与表

1、创建库

hive> create database sogou;

2、创建表

hive> create table if not exists sogou.sogou_500w(
    > ts string,
    > uid string,
    > keyword string,
    > rank int,
    > orders int,
    > url string) row format delimited fields terminated by '\t' stored as textfile;
OK
Time taken: 0.385 seconds

2、数据装载

hive> load data local inpath '/root/sogou.500w.utf8' overwrite into table sogou.sogou_500w;
Loading data to table sogou.sogou_500w
OK
Time taken: 5.18 seconds
  • 查看;500w数据真的是眼都花了
    在这里插入图片描述

3、提取数据

1、创建表sogou_xj

hive> create table if not exists sogou.sogou_xj(
    > ts string,
    > uid string,
    > keyword string,
    > rank int,
    > orders int,
    > url string) row format delimited fields terminated by '\t' stored as textfile;
OK
Time taken: 0.455 seconds

2、筛选带有‘仙剑奇侠装’的数据插入上表中

hive> insert overwrite table sogou.sogou_xj select * from sogou.sogou_500w where keyword like '%仙剑奇侠传%';
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20231126183129_57894c50-cf47-4706-b6a8-dca2760e413e
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2023-11-26 18:31:30,701 Stage-1 map = 0%,  reduce = 0%
2023-11-26 18:31:31,761 Stage-1 map = 100%,  reduce = 0%
Ended Job = job_local1259273538_0003
Stage-4 is filtered out by condition resolver.
Stage-3 is selected by condition resolver.
Stage-5 is filtered out by condition resolver.
Launching Job 3 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2023-11-26 18:31:35,004 Stage-3 map = 100%,  reduce = 0%
Ended Job = job_local1385583283_0004
Loading data to table sogou.sogou_xj
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 3100078204 HDFS Write: 411085 SUCCESS
Stage-Stage-3:  HDFS Read: 1147541830 HDFS Write: 338082 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 6.224 seconds
  • 查看
hive> select * from sogou.sogou_xj limit 100;
OK
20111230003647  2e6a9aa494ae241b95664ff56971b72a        仙剑奇侠传三    2       1       http://www.tvmao.com/drama/LWlTVw==/episode
20111230003727  2e6a9aa494ae241b95664ff56971b72a        仙剑奇侠传三    3       2       http://www.youku.com/show_page/id_zcc07361a962411de83b1.html
20111230010508  89072e859c41da47298d967ab8d787c5        仙剑奇侠传1刘亦菲版全集 2       1       http://www.tudou.com/playlist/id11482399.html
20111230012415  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解 游民    2       1       http://bbs.gamersky.com/thread-927101-1-1.html
20111230012515  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解 游民    4       2       http://www.gamersky.com/Soft/200712/10802.shtml
20111230012532  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解教程     2       1       http://zhidao.baidu.com/question/293761707
20111230012544  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解教程     1       2       http://www.duote.com/tech/7/18820.html
20111230012544  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解教程     9       3       http://game.shangdu.com/news/pcgl/2011-08-17/359792.html
20111230013132  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解教程     8       4       http://my.nuaa.edu.cn/thread-152913-1-1.html
20111230015346  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解教程     10      5       http://tieba.baidu.com/p/1240255091
20111230020738  ebaf6800291f038901bfa25bf193262d        仙剑奇侠传5九宫格攻略   1       1       http://cheats.3234.com/danjigonglue/10216_2.htm
20111230024104  0d14ebeaf5c4429a519f0aa8b73e8b93        仙剑奇侠传2     4       1       http://zhidao.baidu.com/question/109364646
20111230024213  0d14ebeaf5c4429a519f0aa8b73e8b93        仙剑奇侠传4电视剧       2       1       http://tv.sogou.com/series/wxt4vu5644qm7sn5updont6awsv4xrgtz3h3pnvpxowq.html?p=40230600
20111230024246  0d14ebeaf5c4429a519f0aa8b73e8b93        仙剑奇侠传4电视剧       1       2       http://hi.baidu.com/sousun/blog/item/1a3ae9180b0253bc4aedbcc1.html

4、数据备份

hive> create table sogou.sogou_xj_backup as select * from sogou.sogou_xj;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20231126184539_8337b51c-413c-4927-a11d-6b80d3bb4bf9
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2023-11-26 18:45:40,986 Stage-1 map = 100%,  reduce = 0%
Ended Job = job_local728323564_0005
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://hadoop:8020/user/hive/warehouse/sogou.db/.hive-staging_hive_2023-11-26_18-45-39_474_8148677371035998524-1/-ext-10002
Moving data to directory hdfs://hadoop:8020/user/hive/warehouse/sogou.db/sogou_xj_backup
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 1147726908 HDFS Write: 506639 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 1.785 seconds

  • 查询
hive> select * from sogou.sogou_xj_backup limit 10;
OK
20111230003647  2e6a9aa494ae241b95664ff56971b72a        仙剑奇侠传三    2       1       http://www.tvmao.com/drama/LWlTVw==/episode
20111230003727  2e6a9aa494ae241b95664ff56971b72a        仙剑奇侠传三    3       2       http://www.youku.com/show_page/id_zcc07361a962411de83b1.html
20111230010508  89072e859c41da47298d967ab8d787c5        仙剑奇侠传1刘亦菲版全集 2       1       http://www.tudou.com/playlist/id11482399.html
20111230012415  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解 游民    2       1       http://bbs.gamersky.com/thread-927101-1-1.html
20111230012515  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解 游民    4       2       http://www.gamersky.com/Soft/200712/10802.shtml
20111230012532  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解教程     2       1       http://zhidao.baidu.com/question/293761707
20111230012544  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解教程     1       2       http://www.duote.com/tech/7/18820.html
20111230012544  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解教程     9       3       http://game.shangdu.com/news/pcgl/2011-08-17/359792.html
20111230013132  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解教程     8       4       http://my.nuaa.edu.cn/thread-152913-1-1.html
20111230015346  3a195290e645f623d13c2b12cce4c273        仙剑奇侠传5破解教程     10      5       http://tieba.baidu.com/p/1240255091
Time taken: 0.11 seconds, Fetched: 10 row(s)

5、导入数据

1、创建表sogou_liangjian

hive> create external table if not exists sogou.sogou_liangjian(
    > ts string,
    > uid string,
    > keyword string,
    > rank int,
    > orders int,
    > url string) row format delimited fields terminated by '\t' stored as textfile location '/sogou/liangjian';
OK
Time taken: 0.076 seconds
hive> show tables;
OK
sogou_500w
sogou_liangjian
sogou_xj
sogou_xj_backup
Time taken: 0.041 seconds, Fetched: 4 row(s)
hive> select  * from sogou_liangjian;
OK
Time taken: 0.143 seconds
hive> 

2、hive创建一个临时目录

hive> dfs -mkdir /tmp/export;
hive> dfs -ls /tmp/;
Found 2 items
drwxr-xr-x   - root supergroup          0 2023-11-26 20:13 /tmp/export
drwx-wx-wx   - root supergroup          0 2023-11-20 21:16 /tmp/hive

3、创建文件liangjian.txt;查询 ‘亮剑’ 数据写入临时目录

hive> insert overwrite directory '/tmp/export'
    > row format delimited
    > fields terminated by '\t'
    > select * from sogou.sogou_500w where keyword like '亮剑';

4、将临时目录中数据导入本地文件

[root@hadoop ~]# hadoop fs -cat /tmp/export/* > liangjian.txt
[root@hadoop ~]# ls
a.txt  e1.txt  e2.txt  e3.txt  e.txt  liangjian.txt  sogou.500w.utf8  start.sh  student.txt
[root@hadoop ~]# cat liangjian.txt 
20111230003230  fa1f5f9f7983637d04179187c9fad3b0        亮剑    1       1       http://tv.sohu.com/s2011/liangjian/
20111230010001  574a4e94e666fce031689064ce6b524e        亮剑    2       1       http://tv.sogou.com/series/wxt4vu5644qmdqn5um.html?p=40230600
20111230012058  2a0f064f2afdf8fc5084f7e57c6ab007        亮剑    1       1       http://tv.sohu.com/s2011/liangjian/

5、将上边到处文件进行移动

hadoop fs -put liangjian.txt /sogou/liangjian

  • 查看表数据
hive> select * from sogou.sogou_liangjian limit 20;
OK
20111230003230  fa1f5f9f7983637d04179187c9fad3b0        亮剑    1       1       http://tv.sohu.com/s2011/liangjian/
20111230010001  574a4e94e666fce031689064ce6b524e        亮剑    2       1       http://tv.sogou.com/series/wxt4vu5644qmdqn5um.html?p=40230600
20111230012058  2a0f064f2afdf8fc5084f7e57c6ab007        亮剑    1       1       http://tv.sohu.com/s2011/liangjian/
20111230012451  2dd3dff74740d3b3916f1e677622d27b        亮剑    2       1       http://tv.sogou.com/series/wxt4vu5644qmdqn5um.html?p=40230600
20111230012529  2dd3dff74740d3b3916f1e677622d27b        亮剑    4       2       http://www.tvmao.com/drama/LW5SVw==/episode
20111230012609  2dd3dff74740d3b3916f1e677622d27b        亮剑    1       3       http://tv.sohu.com/s2011/liangjian/
20111230013423  70ccb6376487856cbe3bb504bffa9c1c        亮剑    1       1       http://tv.sohu.com/s2011/liangjian/
20111230032827  118c57f32d2c053059d6c0690977080a        亮剑    2       1       http://tv.sogou.com/series/wxt4vu5644qmdqn5um.html?p=40230600
20111230040244  de4112c7c970329fb99730cfbf408c62        亮剑    2       1       http://tv.sogou.com/series/wxt4vu5644qmdqn5um.html?p=40230600
20111230040323  295ecf2ec896525e065d755c462d1410        亮剑    1       1       http://tv.sohu.com/s2011/liangjian/
20111230043913  8196855fbb30033cde52aee315b11bbd        亮剑    5       1       http://www.youku.com/show_page/id_zcbfb715e962411de83b1.html
20111230052332  325807155a90a376b4fea826f13282e5        亮剑    2       1       http://tv.sogou.com/series/wxt4vu5644qmdqn5um.html?p=40230600
20111230052335  325807155a90a376b4fea826f13282e5        亮剑    3       2       http://www.iqiyi.com/dianshiju/lj.html?src=alddsj
20111230084909  1870be054489ec858a7694a78c1311e6        亮剑    2       1       http://tv.sogou.com/series/wxt4vu5644qmdqn5um.html?p=40230600
20111230091503  c0fbc36d1c97babcd336b26926709197        亮剑    6       1       http://baike.baidu.com/view/3340.htm
20111230091701  680fac79aab4682d3bb91a0241ee79a6        亮剑    1       1       http://tv.sohu.com/s2011/liangjian/
20111230092558  18a523a8c39c242d44236835a9b3a7b6        亮剑    1       1       http://tv.sohu.com/s2011/liangjian/
20111230093823  a9bfdff7b47717cc3d0d946ebb714f46        亮剑    5       1       http://www.youku.com/show_page/id_zcbfb715e962411de83b1.html
20111230094129  75ff86584a04922da136bc5001a793de        亮剑    3       1       http://www.iqiyi.com/dianshiju/lj.html?src=alddsj
20111230095018  2cbc43d3721f8a76c027b9dd8350979b        亮剑    1       1       http://tv.sohu.com/s2011/liangjian/
Time taken: 0.119 seconds, Fetched: 20 row(s)

6、导出数据

  • 查看文件
    hadoop fs -ls /user/hive/warehouse/sogou.db/sogou_xj_backup/000000_0
  • 导出文件
    hadoop fs -get /user/hive/warehouse/sogou.db/sogou_xj_backup/000000_0
  • 查看数据
[root@hadoop ~]# head 000000_0 
201112300036472e6a9aa494ae241b95664ff56971b72a仙剑奇侠传三21http://www.tvmao.com/drama/LWlTVw==/episode
201112300037272e6a9aa494ae241b95664ff56971b72a仙剑奇侠传三32http://www.youku.com/show_page/id_zcc07361a962411de83b1.html
2011123001050889072e859c41da47298d967ab8d787c5仙剑奇侠传1刘亦菲版全集21http://www.tudou.com/playlist/id11482399.html
201112300124153a195290e645f623d13c2b12cce4c273仙剑奇侠传5破解 游民21http://bbs.gamersky.com/thread-927101-1-1.html
201112300125153a195290e645f623d13c2b12cce4c273仙剑奇侠传5破解 游民42http://www.gamersky.com/Soft/200712/10802.shtml
201112300125323a195290e645f623d13c2b12cce4c273仙剑奇侠传5破解教程21http://zhidao.baidu.com/question/293761707
201112300125443a195290e645f623d13c2b12cce4c273仙剑奇侠传5破解教程12http://www.duote.com/tech/7/18820.html
201112300125443a195290e645f623d13c2b12cce4c273仙剑奇侠传5破解教程93http://game.shangdu.com/news/pcgl/2011-08-17/359792.html
201112300131323a195290e645f623d13c2b12cce4c273仙剑奇侠传5破解教程84http://my.nuaa.edu.cn/thread-152913-1-1.html
201112300153463a195290e645f623d13c2b12cce4c273仙剑奇侠传5破解教程105http://tieba.baidu.com/p/1240255091

7、总结

  1. 需要注意的是hive中主要是操作数据库创建与数据导入,在bash命令中使用hadoop操作HDFS文件
  2. 需要手动创建目录,掌握数据导出到本地文件的方法
  3. 注意数据库备份命令,备份文件位置及写法
  • 9
    点赞
  • 7
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值