sqoop 1.4.5 增量导入hive 0.12.0

1:导入命令

1) bin/sqoop import --connect jdbc:mysql://10.x.x.x:3306/lir --username dss --password xxx123 --table bb_month --hive-import 
2) bin/sqoop import --connect jdbc:mysql://10.x.x.x:3306/lir --table bb_month --username dss -P --hive-import -- --default-character-set=utf-8

第二个命令需要输入密码

上门的命令都需要先在hive中创建表bb_month,否则hive无此表,创建语句:

CREATE TABLE bb_month (
  ID INT,
  Fcom STRING,
  dept1 STRING,
  dept2 STRING,
  Inputdate STRING,
  year INT,
  month INT,
  quarter INT,
  Requestid FLOAT,
  StandPrice FLOAT,
  FactPrice FLOAT,
  HospPrice FLOAT,
  RatePrice FLOAT
) ;

[jifeng@jifeng02 sqoop]$ bin/sqoop import --connect jdbc:mysql://10.X.X.X:3306/lir --username dss --password XXX123 --table bb_month --hive-import 
Warning: /home/jifeng/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/jifeng/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: $HADOOP_HOME is deprecated.

14/09/08 18:02:52 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5
14/09/08 18:02:52 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/09/08 18:02:52 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
14/09/08 18:02:52 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
14/09/08 18:02:52 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/09/08 18:02:52 INFO tool.CodeGenTool: Beginning code generation
14/09/08 18:02:52 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `bb_month` AS t LIMIT 1
14/09/08 18:02:52 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `bb_month` AS t LIMIT 1
14/09/08 18:02:52 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/jifeng/hadoop/hadoop-1.2.1
注: /tmp/sqoop-jifeng/compile/0e2ce2c74924b539ae5ff77b453d708e/bb_month.java使用或覆盖了已过时的 API。
注: 有关详细信息, 请使用 -Xlint:deprecation 重新编译。
14/09/08 18:02:53 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-jifeng/compile/0e2ce2c74924b539ae5ff77b453d708e/bb_month.jar
14/09/08 18:02:53 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/09/08 18:02:53 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/09/08 18:02:53 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/09/08 18:02:53 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/09/08 18:02:53 INFO mapreduce.ImportJobBase: Beginning import of bb_month
14/09/08 18:02:53 INFO db.DBInputFormat: Using read commited transaction isolation
14/09/08 18:02:53 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`ID`), MAX(`ID`) FROM `bb_month`
14/09/08 18:02:54 INFO mapred.JobClient: Running job: job_201409072150_0009
14/09/08 18:02:55 INFO mapred.JobClient:  map 0% reduce 0%
14/09/08 18:03:02 INFO mapred.JobClient:  map 1% reduce 0%
14/09/08 18:03:03 INFO mapred.JobClient:  map 3% reduce 0%
14/09/08 18:03:05 INFO mapred.JobClient:  map 6% reduce 0%
14/09/08 18:03:06 INFO mapred.JobClient:  map 7% reduce 0%
14/09/08 18:03:07 INFO mapred.JobClient:  map 9% reduce 0%
14/09/08 18:03:08 INFO mapred.JobClient:  map 10% reduce 0%
14/09/08 18:03:09 INFO mapred.JobClient:  map 11% reduce 0%
14/09/08 18:03:11 INFO mapred.JobClient:  map 14% reduce 0%
14/09/08 18:03:12 INFO mapred.JobClient:  map 16% reduce 0%
14/09/08 18:03:14 INFO mapred.JobClient:  map 19% reduce 0%
14/09/08 18:03:15 INFO mapred.JobClient:  map 21% reduce 0%
14/09/08 18:03:16 INFO mapred.JobClient:  map 22% reduce 0%
14/09/08 18:03:18 INFO mapred.JobClient:  map 24% reduce 0%
14/09/08 18:03:19 INFO mapred.JobClient:  map 26% reduce 0%
14/09/08 18:03:22 INFO mapred.JobClient:  map 30% reduce 0%
14/09/08 18:03:23 INFO mapred.JobClient:  map 31% reduce 0%
14/09/08 18:03:26 INFO mapred.JobClient:  map 34% reduce 0%
14/09/08 18:03:27 INFO mapred.JobClient:  map 35% reduce 0%
14/09/08 18:03:30 INFO mapred.JobClient:  map 37% reduce 0%
14/09/08 18:03:31 INFO mapred.JobClient:  map 39% reduce 0%
14/09/08 18:03:32 INFO mapred.JobClient:  map 41% reduce 0%
14/09/08 18:03:34 INFO mapred.JobClient:  map 43% reduce 0%
14/09/08 18:03:36 INFO mapred.JobClient:  map 45% reduce 0%
14/09/08 18:03:39 INFO mapred.JobClient:  map 47% reduce 0%
14/09/08 18:03:40 INFO mapred.JobClient:  map 51% reduce 0%
14/09/08 18:03:43 INFO mapred.JobClient:  map 53% reduce 0%
14/09/08 18:03:44 INFO mapred.JobClient:  map 55% reduce 0%
14/09/08 18:03:46 INFO mapred.JobClient:  map 57% reduce 0%
14/09/08 18:03:47 INFO mapred.JobClient:  map 61% reduce 0%
14/09/08 18:03:50 INFO mapred.JobClient:  map 63% reduce 0%
14/09/08 18:03:51 INFO mapred.JobClient:  map 65% reduce 0%
14/09/08 18:03:54 INFO mapred.JobClient:  map 68% reduce 0%
14/09/08 18:03:55 INFO mapred.JobClient:  map 71% reduce 0%
14/09/08 18:03:58 INFO mapred.JobClient:  map 73% reduce 0%
14/09/08 18:03:59 INFO mapred.JobClient:  map 75% reduce 0%
14/09/08 18:04:02 INFO mapred.JobClient:  map 79% reduce 0%
14/09/08 18:04:03 INFO mapred.JobClient:  map 81% reduce 0%
14/09/08 18:04:05 INFO mapred.JobClient:  map 83% reduce 0%
14/09/08 18:04:06 INFO mapred.JobClient:  map 85% reduce 0%
14/09/08 18:04:09 INFO mapred.JobClient:  map 88% reduce 0%
14/09/08 18:04:10 INFO mapred.JobClient:  map 91% reduce 0%
14/09/08 18:04:13 INFO mapred.JobClient:  map 93% reduce 0%
14/09/08 18:04:14 INFO mapred.JobClient:  map 95% reduce 0%
14/09/08 18:04:17 INFO mapred.JobClient:  map 100% reduce 0%
14/09/08 18:04:20 INFO mapred.JobClient: Job complete: job_201409072150_0009
14/09/08 18:04:20 INFO mapred.JobClient: Counters: 18
14/09/08 18:04:20 INFO mapred.JobClient:   Job Counters 
14/09/08 18:04:20 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=460888
14/09/08 18:04:20 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/09/08 18:04:20 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/09/08 18:04:20 INFO mapred.JobClient:     Launched map tasks=103
14/09/08 18:04:20 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=3418
14/09/08 18:04:20 INFO mapred.JobClient:   File Output Format Counters 
14/09/08 18:04:20 INFO mapred.JobClient:     Bytes Written=4238
14/09/08 18:04:20 INFO mapred.JobClient:   FileSystemCounters
14/09/08 18:04:20 INFO mapred.JobClient:     HDFS_BYTES_READ=10201
14/09/08 18:04:20 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=6884950
14/09/08 18:04:20 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=4238
14/09/08 18:04:20 INFO mapred.JobClient:   File Input Format Counters 
14/09/08 18:04:20 INFO mapred.JobClient:     Bytes Read=0
14/09/08 18:04:20 INFO mapred.JobClient:   Map-Reduce Framework
14/09/08 18:04:20 INFO mapred.JobClient:     Map input records=42
14/09/08 18:04:20 INFO mapred.JobClient:     Physical memory (bytes) snapshot=4803080192
14/09/08 18:04:20 INFO mapred.JobClient:     Spilled Records=0
14/09/08 18:04:20 INFO mapred.JobClient:     CPU time spent (ms)=36910
14/09/08 18:04:20 INFO mapred.JobClient:     Total committed heap usage (bytes)=1938554880
14/09/08 18:04:20 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=38168121344
14/09/08 18:04:20 INFO mapred.JobClient:     Map output records=42
14/09/08 18:04:20 INFO mapred.JobClient:     SPLIT_RAW_BYTES=10201
14/09/08 18:04:20 INFO mapreduce.ImportJobBase: Transferred 4.1387 KB in 86.9275 seconds (48.7533 bytes/sec)
14/09/08 18:04:20 INFO mapreduce.ImportJobBase: Retrieved 42 records.
14/09/08 18:04:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `bb_month` AS t LIMIT 1
14/09/08 18:04:20 WARN hive.TableDefWriter: Column Requestid had to be cast to a less precise type in Hive
14/09/08 18:04:20 WARN hive.TableDefWriter: Column StandPrice had to be cast to a less precise type in Hive
14/09/08 18:04:20 WARN hive.TableDefWriter: Column FactPrice had to be cast to a less precise type in Hive
14/09/08 18:04:20 WARN hive.TableDefWriter: Column HospPrice had to be cast to a less precise type in Hive
14/09/08 18:04:20 WARN hive.TableDefWriter: Column RatePrice had to be cast to a less precise type in Hive
14/09/08 18:04:20 INFO hive.HiveImport: Removing temporary files from import process: hdfs://jifeng01:9000/user/jifeng/bb_month/_logs
14/09/08 18:04:20 INFO hive.HiveImport: Loading uploaded data into Hive
14/09/08 18:04:21 INFO hive.HiveImport: 
14/09/08 18:04:21 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/home/jifeng/hadoop/hive-0.12.0-bin/lib/hive-common-0.12.0.jar!/hive-log4j.properties
14/09/08 18:04:25 INFO hive.HiveImport: OK
14/09/08 18:04:25 INFO hive.HiveImport: Time taken: 3.491 seconds
14/09/08 18:04:25 INFO hive.HiveImport: Loading data to table default.bb_month
14/09/08 18:04:25 INFO hive.HiveImport: Table default.bb_month stats: [num_partitions: 0, num_files: 202, num_rows: 0, total_size: 8476, raw_data_size: 0]
14/09/08 18:04:25 INFO hive.HiveImport: OK
14/09/08 18:04:25 INFO hive.HiveImport: Time taken: 0.59 seconds
14/09/08 18:04:25 INFO hive.HiveImport: Hive import complete.
14/09/08 18:04:25 INFO hive.HiveImport: Export directory is empty, removing it.

2: 增量导入命令

1) bin/sqoop import --connect jdbc:mysql://10.X.X.X:3306/lir --username dss --password xxx123 --table bb_month --hive-import --incremental append --check-column ID --last-value 714
2) bin/sqoop import --connect jdbc:mysql://10.X.X.X:3306/lir --table bb_month --username dss -P --hive-import --incremental append --check-column ID --last-value 714 -- --default-character-set=utf-8



[jifeng@jifeng02 sqoop]$ bin/sqoop import --connect jdbc:mysql://10.X.X.X:3306/lir --username dss --password xxx123 --table bb_month --hive-import --incremental append --check-column ID --last-value 714
Warning: /home/jifeng/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/jifeng/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: $HADOOP_HOME is deprecated.

14/09/08 20:58:45 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5
14/09/08 20:58:45 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/09/08 20:58:45 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
14/09/08 20:58:45 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
14/09/08 20:58:45 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/09/08 20:58:45 INFO tool.CodeGenTool: Beginning code generation
14/09/08 20:58:46 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `bb_month` AS t LIMIT 1
14/09/08 20:58:46 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `bb_month` AS t LIMIT 1
14/09/08 20:58:46 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/jifeng/hadoop/hadoop-1.2.1
注: /tmp/sqoop-jifeng/compile/5ce12899d2dc88a968a58234a928a423/bb_month.java使用或覆盖了已过时的 API。
注: 有关详细信息, 请使用 -Xlint:deprecation 重新编译。
14/09/08 20:58:46 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-jifeng/compile/5ce12899d2dc88a968a58234a928a423/bb_month.jar
14/09/08 20:58:47 INFO tool.ImportTool: Maximal id query for free form incremental import: SELECT MAX(`ID`) FROM bb_month
14/09/08 20:58:47 INFO tool.ImportTool: Incremental import based on column `ID`
14/09/08 20:58:47 INFO tool.ImportTool: Lower bound value: 714
14/09/08 20:58:47 INFO tool.ImportTool: Upper bound value: 715
14/09/08 20:58:47 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/09/08 20:58:47 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/09/08 20:58:47 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/09/08 20:58:47 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/09/08 20:58:47 INFO mapreduce.ImportJobBase: Beginning import of bb_month
14/09/08 20:58:47 INFO db.DBInputFormat: Using read commited transaction isolation
14/09/08 20:58:47 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`ID`), MAX(`ID`) FROM `bb_month` WHERE ( `ID` > 714 AND `ID` <= 715 )
14/09/08 20:58:47 INFO mapred.JobClient: Running job: job_201409072150_0016
14/09/08 20:58:48 INFO mapred.JobClient:  map 0% reduce 0%
14/09/08 20:58:56 INFO mapred.JobClient:  map 100% reduce 0%
14/09/08 20:58:57 INFO mapred.JobClient: Job complete: job_201409072150_0016
14/09/08 20:58:57 INFO mapred.JobClient: Counters: 18
14/09/08 20:58:57 INFO mapred.JobClient:   Job Counters 
14/09/08 20:58:57 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=8478
14/09/08 20:58:57 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/09/08 20:58:57 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/09/08 20:58:57 INFO mapred.JobClient:     Launched map tasks=1
14/09/08 20:58:57 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
14/09/08 20:58:57 INFO mapred.JobClient:   File Output Format Counters 
14/09/08 20:58:57 INFO mapred.JobClient:     Bytes Written=78
14/09/08 20:58:57 INFO mapred.JobClient:   FileSystemCounters
14/09/08 20:58:57 INFO mapred.JobClient:     HDFS_BYTES_READ=103
14/09/08 20:58:57 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=69113
14/09/08 20:58:57 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=78
14/09/08 20:58:57 INFO mapred.JobClient:   File Input Format Counters 
14/09/08 20:58:57 INFO mapred.JobClient:     Bytes Read=0
14/09/08 20:58:57 INFO mapred.JobClient:   Map-Reduce Framework
14/09/08 20:58:57 INFO mapred.JobClient:     Map input records=1
14/09/08 20:58:57 INFO mapred.JobClient:     Physical memory (bytes) snapshot=77537280
14/09/08 20:58:57 INFO mapred.JobClient:     Spilled Records=0
14/09/08 20:58:57 INFO mapred.JobClient:     CPU time spent (ms)=790
14/09/08 20:58:57 INFO mapred.JobClient:     Total committed heap usage (bytes)=31916032
14/09/08 20:58:57 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=506822656
14/09/08 20:58:57 INFO mapred.JobClient:     Map output records=1
14/09/08 20:58:57 INFO mapred.JobClient:     SPLIT_RAW_BYTES=103
14/09/08 20:58:57 INFO mapreduce.ImportJobBase: Transferred 78 bytes in 10.5935 seconds (7.363 bytes/sec)
14/09/08 20:58:57 INFO mapreduce.ImportJobBase: Retrieved 1 records.
14/09/08 20:58:57 INFO util.AppendUtils: Creating missing output directory - bb_month
14/09/08 20:58:57 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `bb_month` AS t LIMIT 1
14/09/08 20:58:57 WARN hive.TableDefWriter: Column Requestid had to be cast to a less precise type in Hive
14/09/08 20:58:57 WARN hive.TableDefWriter: Column StandPrice had to be cast to a less precise type in Hive
14/09/08 20:58:57 WARN hive.TableDefWriter: Column FactPrice had to be cast to a less precise type in Hive
14/09/08 20:58:57 WARN hive.TableDefWriter: Column HospPrice had to be cast to a less precise type in Hive
14/09/08 20:58:57 WARN hive.TableDefWriter: Column RatePrice had to be cast to a less precise type in Hive
14/09/08 20:58:57 INFO hive.HiveImport: Removing temporary files from import process: hdfs://jifeng01:9000/user/jifeng/bb_month/_logs
14/09/08 20:58:57 INFO hive.HiveImport: Loading uploaded data into Hive
14/09/08 20:58:58 INFO hive.HiveImport: 
14/09/08 20:58:58 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/home/jifeng/hadoop/hive-0.12.0-bin/lib/hive-common-0.12.0.jar!/hive-log4j.properties
14/09/08 20:59:01 INFO hive.HiveImport: OK
14/09/08 20:59:01 INFO hive.HiveImport: Time taken: 3.076 seconds
14/09/08 20:59:01 INFO hive.HiveImport: Loading data to table default.bb_month
14/09/08 20:59:02 INFO hive.HiveImport: Table default.bb_month stats: [num_partitions: 0, num_files: 12, num_rows: 0, total_size: 4316, raw_data_size: 0]
14/09/08 20:59:02 INFO hive.HiveImport: OK
14/09/08 20:59:02 INFO hive.HiveImport: Time taken: 0.399 seconds
14/09/08 20:59:02 INFO hive.HiveImport: Hive import complete.
14/09/08 20:59:02 INFO hive.HiveImport: Export directory is empty, removing it.
14/09/08 20:59:02 INFO tool.ImportTool: Incremental import complete! To run another incremental import of all data following this import, supply the following arguments:
14/09/08 20:59:02 INFO tool.ImportTool:  --incremental append
14/09/08 20:59:02 INFO tool.ImportTool:   --check-column ID
14/09/08 20:59:02 INFO tool.ImportTool:   --last-value 715
14/09/08 20:59:02 INFO tool.ImportTool: (Consider saving this with 'sqoop job --create')
[jifeng@jifeng02 sqoop]$ 

hive中查看(显示被删除了数据)

hive> select * from bb_month;
OK

296     上海    null    null    2014-02 2014    2       1       13365.0 2205348.5       876260.7        1727203.5       880151.5
714     长沙    null    null    2014-03 2014    3       1       37582.0 5116978.5       3691360.2       4794782.0       3698047.8
Time taken: 0.057 seconds, Fetched: 42 row(s)
hive> select * from bb_month;
OK

296     上海    null    null    2014-02 2014    2       1       13365.0 2205348.5       876260.7        1727203.5       880151.5
715     广州    null    null    2014-04 2014    3       1       22.0    22.0    22.0    22.0    22.0
714     长沙    null    null    2014-03 2014    3       1       37582.0 5116978.5       3691360.2       4794782.0       3698047.8
Time taken: 0.068 seconds, Fetched: 43 row(s)
hive> 

3:免创建表的导入方式:

bin/sqoop import --connect jdbc:mysql://10.6.x.x:3306/lir --username dss --password xxx123 --table DIM_DEPARTMENT --hive-import --create-hive-table -m 1

参数m:mapreduce的任务数



hive导出到mysql
bin/sqoop export --connect jdbc:mysql://10.6.x.x:3306/lir --username dss --password xx123 --table dep_info --export-dir /user/hive/warehouse/dim_department

http://sqoop.apache.org/docs/1.4.5/SqoopUserGuide.html#_importing_data_into_hive

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值