Hive,Mysql,Sqoop安装使用数据导入导出

最新推荐文章于 2024-05-10 04:56:33 发布

宁缺100

最新推荐文章于 2024-05-10 04:56:33 发布

阅读量513

点赞数

分类专栏：大数据数据仓库文章标签： Hive Mysql Sqoop

本文链接：https://blog.csdn.net/qq_24434491/article/details/89463542

版权

大数据同时被 2 个专栏收录

36 篇文章 0 订阅

订阅专栏

数据仓库

6 篇文章 0 订阅

订阅专栏

Hive简介

Facebook开发构建与Hadoop之上的数据仓库，可以结构化数据文件映射为一张数据库表
提供完整的SQL查询功能，将SQL语句转换为MapReduce任务进行
适合静态数据分析，数据变化不频繁
不支持记录级别的更新插入或者删除，不支持事务
提供ETL数据提取转化加载，可以维护海量数据，对数据挖掘后形成任意报告表单
Hive SQL 和 Mysql 更接近
分区表按照分区条件在表目录下将数据文件分类归档，目录名是partition条件

Hive安装

1.解压安装包
2.设置环境变量
3.修改hive-site.xml配置文件
4.将mysql连接包放入hive安装路径下的lib包  /usr/local/hive/lib
5.启动hive

mysql安装

1.更新软件源
2.安装mysql
3.修改编码 character_set_server
4.给hive用户赋权限(hive 能接入数据库)

hive内部表和外部表的区别

Hive安装

1.解压安装包

下载安装包可能会用到的镜像
http://mirror.bit.edu.cn/apache/hive/ 
http://mirrors.hust.edu.cn/apache/hive/ 
http://mirrors.shu.edu.cn/apache/hive/ 
http://mirrors.tuna.tsinghua.edu.cn/apache/hive/

tar -zxvf apache-hive-1.2.2-bin.tar.gz -C /usr/local/
hadoop@dhjvirtualmachine:/usr/local$ sudo mv apache-hive-1.2.2-bin/ hive
sudo chown -R hadoop:hadoop hive

2.设置环境变量

编辑~/.bashrc文件
vim ~/.bashrc
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin

vim ~/.bashrc
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin

source ~/.bashrc

3.修改hive-site.xml配置文件

cd /usr/local/hive/conf
vi hive-site.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>

5.启动hive

先启动Hadoop和mysql
启动Hadoop
start-all.sh
启动hive
hive

mysql安装

更改apt-get 源

sudo apt-get update  #更新软件源
关闭Ubuntu的软件和更新中的所有源

删除/var/lib/apt/lists 所有文件 
```
rm -rf /var/lib/apt/lists/*
```
修改/etc/apt/sources.list
（注意版本匹配：16.1->yakkety;16.04->xenial;15.10->willy;14.04->trusty;12.04->precise）
```
deb http://mirrors.163.com/ubuntu/ trusty main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ trusty-security main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ trusty-updates main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ trusty-proposed main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ trusty-backports main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ trusty main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ trusty-security main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ trusty-updates main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ trusty-proposed main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ trusty-backports main restricted universe multiverse
```

2.安装mysql

sudo apt-get install mysql-server  #安装mysql

3.修改编码 character_set_server

启动mysql
service mysql start

检查
sudo netstat -tap | grep mysql

进入mysql shell界面
mysql -u root -p

show variables like "char%"

character_set_server默认为latin1

修改配置文件
vi /etc/mysql/mysql.conf.d/mysqld.cnf

在[mysqld]下添加一行character_set_server=utf8

重启mysql
service mysql restart

进入mysql shell界面
mysql -u root -p

检查character_set_server变量
show variables like "char%"

4.给hive用户赋权限(hive 能接入数据库)

进入mysql shell界面
mysql -u root -p

创建于hive对接数据库
create database hive;

给hive用户赋权限
grant all on *.* to hive@localhost identified by 'hive';

刷新mysq系统权限关系表
flush privileges;

hive内部表和外部表的区别

1.创建内部表

create table if not exists hive.usr(
      name string comment 'username',
      pwd string comment 'password',
      address struct<street:string,city:string,state:string,zip:int>,
      comment  'home address',
      identify map<int,tinyint> comment 'number,sex') 
      comment 'description of the table'  
     tblproperties('creator'='me','time'='2016.1.1');

2.创建外部表

create external table if not exists usr2(
      name string,
      pwd string,
  address struct<street:string,city:string,state:string,zip:int>,
      identify map<int,tinyint>) 
      row format delimited fields terminated by ','
     location '/usr/local/hive/warehouse/hive.db/usr';

区别

当执行加载数据到表后
load data local inpath '/home/duhj/data/duhj.txt' into table usr;

duhj.text会被加载到HDFS分布式文件系统中，hive将hdfs数据放入表中，表的数据放在配置文指定的路径(hive-site.xml配置文件的hive.metastore.warehouse.dir属性)

当删除这张内部表的时候，会将表的元数据清除，而外部表在创建的时候location 指定了表数据存放位置(表数据不由hive管理)，并不会删除该位置下的数据。

数据管理：内部表由hive管理，删表 hdf上数据也删
          外部表由Hadoop管理，删表  hdf 数据还在
对表修改： 内部表 同步元数据
           外部表 需要修复  MSCK REPAIR TABLE tablename;

数据导入导出 Sqoop

hive 导入mysql

bin/sqoop export
    --connect jdbc:mysql://localhost:3306/database
    --username root
    --password root
    --table    user_log
    --export_dir '/user/hive/warehouse/somedb.db/inner_user_log'
    --fields-interminated-by ','
    
注：
string -> vchar

mysql导出hive

bin/sqoop export
    --connect jdbc:mysql://localhost:3306/database
    --username root
    --password root
    --query 'select id,name,age from user_log'
    --target-dir /user/hive/warehouse/user_log
    --delete-target-dir
    --num-mappers
    --compress
    --compression-code org.apache.hadoop.io.compress.snappycodec
    --fields-termied-by '\t'

hive 再将/user/hive/warehouse/user_log 加载到表中

宁缺100

关注

0
点赞
踩
2

收藏

觉得还不错? 一键收藏
0
评论
Hive,Mysql,Sqoop安装使用数据导入导出

Hive简介Facebook开发构建与Hadoop之上的数据仓库，可以结构化数据文件映射为一张数据库表提供完整的SQL查询功能，将SQL语句转换为MapReduce任务进行适合静态数据分析，数据变化不频繁不支持记录级别的更新插入或者删除，不支持事务提供ETL数据提取转化加载，可以维护海量数据，对数据挖掘后形成任意报告表单Hive SQL 和 Mysql 更接近分区表按照分区...
复制链接

扫一扫