1. Incremental import test
Create the MySQL table:
CREATE TABLE sqoop_test (
  id   int(11)      DEFAULT NULL,
  name varchar(255) DEFAULT NULL,
  age  int(11)      DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Create the Hive table (same structure as the MySQL table):
hive> create external table sqoop_test (id int, name string, age int)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
location '/user/hive/external/sqoop_test';
Insert some rows into MySQL:
insert into sqoop_test values(1,'test1',20),(2,'test2',25);
First import the existing MySQL data:
sqoop import --connect jdbc:mysql://emr-header-1:3306/sqooptest --username root --password EMRroot1234 --table sqoop_test --hive-import --hive-overwrite --hive-table sqoop_test --fields-terminated-by ',' -m 1
After the import succeeds, check the data in the Hive table:
hive> select * from sqoop_test;
Insert a few incremental rows in MySQL (the new ids must be greater than the existing ones so the check column can pick them up):
insert into sqoop_test values(3,'test3',26),(4,'test4',28);
The first attempt used the following incremental import (it does not work):
sqoop import --connect jdbc:mysql://emr-header-1:3306/sqooptest --username root --password EMRroot1234 --table sqoop_test --hive-import --hive-table sqoop_test --check-column id --incremental append --last-value 2 -m 1
Querying the incremental rows in Hive returns NULL. A likely cause: this command omits --fields-terminated-by ',', so Sqoop writes its default Hive delimiter, and the comma-delimited external table cannot parse the appended files.
The working approach writes directly to the external table's HDFS location:
sqoop import --connect jdbc:mysql://emr-header-1:3306/sqooptest --username root --password EMRroot1234 --table sqoop_test --target-dir '/user/hive/external/sqoop_test' --incremental append --check-column id --last-value 2 -m 1
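Passing --last-value by hand on every run is error-prone. Sqoop's saved jobs can track it automatically: after each execution the job's metastore records the largest value seen in the check column. A minimal sketch, reusing the connection details above (the job name inc_sqoop_test is made up; this requires a live cluster to run):

```shell
# Create a saved job; Sqoop stores the job definition (and, after each
# run, the updated --last-value) in its local metastore.
sqoop job --create inc_sqoop_test -- import \
  --connect jdbc:mysql://emr-header-1:3306/sqooptest \
  --username root --password EMRroot1234 \
  --table sqoop_test \
  --target-dir '/user/hive/external/sqoop_test' \
  --incremental append --check-column id --last-value 0 \
  -m 1

# Each execution imports only rows whose id exceeds the stored last-value.
sqoop job --exec inc_sqoop_test
```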
2. Syncing data between Hive 3 and MySQL 8
import example:
sqoop import --connect jdbc:mysql://rm-uf6rvcc7y5x7jhzf190130.mysql.rds.aliyuncs.com:3306/lsr --username linsirong --password VWfv8FRm! --table linsirong --columns "id,name" --num-mappers 1 --hive-import --fields-terminated-by "\t" --hive-overwrite --hive-table testdb.linsirong
export example:
sqoop export --connect jdbc:mysql://rm-uf6rvcc7y5x7jhzf190130.mysql.rds.aliyuncs.com:3306/lsr --username xxx --password xxx --table linsirong --num-mappers 1 --export-dir /user/hive/warehouse/testdb.db/linsirong --fields-terminated-by '\t' --columns id,name
Note: export cannot use --direct here, or it fails with:
2021-04-26 14:12:36,312 INFO [Thread-15] org.apache.sqoop.mapreduce.MySQLExportMapper: mysqlimport: Error: 1227 Access denied;you need (at least one of) the SUPER, SYSTEM_VARIABLES_ADMIN or SESSION_VARIABLES_ADMIN privilege(s) for this operation
The cause: in direct mode Sqoop shells out to the mysqlimport utility, which needs session-variable privileges that managed services such as RDS do not grant. See:
https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_mysql_direct_connector
--target-dir first stages the imported data at the specified HDFS path (a required parameter)
--hive-import then loads the data from --target-dir into the Hive table
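Putting the two flags together: a single command can stage the MySQL rows to HDFS and then load them into Hive. A sketch reusing the section-1 connection details (the staging path /tmp/sqoop_test_stage is made up; runnable only against a live cluster):

```shell
sqoop import \
  --connect jdbc:mysql://emr-header-1:3306/sqooptest \
  --username root --password EMRroot1234 \
  --table sqoop_test \
  --target-dir /tmp/sqoop_test_stage \
  --hive-import --hive-overwrite --hive-table sqoop_test \
  --fields-terminated-by ',' \
  -m 1
```

After the Hive load completes, Sqoop moves the staged files out of the staging directory, so /tmp/sqoop_test_stage does not accumulate data between runs.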