Today I studied the following topics:
Importing MySQL data into HDFS with Sqoop:
./sqoop import --connect jdbc:mysql://[IP]:3306/[database] --username root -P --table [table] -m 1 --target-dir [target path];
-m 1 writes the output to a single file; by default Sqoop runs four map tasks and produces four files. (Note: the password-prompt flag is -P with a single dash, not --P.)
Importing MySQL data into Hive with Sqoop:
./sqoop import --hive-import --connect jdbc:mysql://[IP]:3306/[database] --username root -P --table [table] --hive-table [table];
Exporting data from HDFS to MySQL with Sqoop:
./sqoop export --connect jdbc:mysql://[IP]:3306/[database] --username root -P --table myemp --export-dir '******';
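The flag layout of the import command above can be made explicit with a small string builder. This is only a sketch: SqoopCommandBuilder and its parameter names are hypothetical, invented here to show how the pieces of the command fit together.

```java
public class SqoopCommandBuilder {
    // Assembles a sqoop import command line like the one in the notes:
    // sqoop import --connect jdbc:mysql://host:3306/db --username u -P --table t -m 1 --target-dir dir
    static String importCommand(String host, String db, String user,
                                String table, int mappers, String targetDir) {
        return String.join(" ",
                "sqoop", "import",
                "--connect", "jdbc:mysql://" + host + ":3306/" + db,
                "--username", user,
                "-P",                       // prompt for the password instead of putting it on the command line
                "--table", table,
                "-m", String.valueOf(mappers),
                "--target-dir", targetDir);
    }

    public static void main(String[] args) {
        // Example values are placeholders, not real hosts or tables
        System.out.println(importCommand("192.168.0.1", "mydb", "root", "myemp", 1, "/user/root/myemp"));
    }
}
```

Setting mappers to 1 reproduces the single-output-file behavior described above; leaving it at 4 gives Sqoop's default parallelism.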
Connecting to Hive over JDBC:
1. Create a new Maven project.
2. Add the required dependencies:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.3</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>1.2.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.7.3</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-metastore</artifactId>
    <version>1.2.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>1.2.2</version>
</dependency>
3. Write a HiveJDBCUtils utility class:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJDBCUtils {
    // Hive JDBC driver class and connection URL
    private static String driver = "org.apache.hive.jdbc.HiveDriver";
    private static String url = "jdbc:hive2://10.25.134.142:10000/default";

    // Register the driver once, when the class is loaded
    static {
        try {
            Class.forName(driver);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }

    // Get a connection
    public static Connection getConnection() throws SQLException {
        return DriverManager.getConnection(url, "root", "144214");
    }

    // Close resources in reverse order of creation: Statement first, then Connection
    public static void close(Connection connection, Statement statement) throws SQLException {
        if (statement != null) {
            statement.close();
        }
        if (connection != null) {
            connection.close();
        }
    }

    // ResultSet first, then Statement, then Connection
    public static void close(Connection connection, Statement statement, ResultSet resultSet) throws SQLException {
        if (resultSet != null) {
            resultSet.close();
        }
        if (statement != null) {
            statement.close();
        }
        if (connection != null) {
            connection.close();
        }
    }
}
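The close methods release resources in reverse order of creation (ResultSet, then Statement, then Connection); closing the Connection first would invalidate the objects that depend on it. A minimal runnable sketch of that convention, using a hypothetical FakeResource stand-in so it works without a Hive server:

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOrderDemo {
    // Hypothetical stand-in for a JDBC resource; records the order in which close() runs
    static class FakeResource implements AutoCloseable {
        final String name;
        final List<String> log;
        FakeResource(String name, List<String> log) { this.name = name; this.log = log; }
        @Override
        public void close() { log.add(name); }
    }

    static List<String> closeOrder() {
        List<String> log = new ArrayList<>();
        // try-with-resources closes in reverse declaration order automatically
        try (FakeResource connection = new FakeResource("connection", log);
             FakeResource statement = new FakeResource("statement", log);
             FakeResource resultSet = new FakeResource("resultSet", log)) {
            // ... use the resources here ...
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(closeOrder()); // prints [resultSet, statement, connection]
    }
}
```

With real JDBC objects, try-with-resources gives this ordering for free and also handles exceptions, so it is usually preferable to hand-written close helpers.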
4. Write a test class:
@Test
public void testConnection() {
    try {
        Connection connection = HiveJDBCUtils.getConnection();
        // a non-null Connection printed here means the connection worked
        System.out.println(connection);
    } catch (SQLException e) {
        e.printStackTrace();
    }
}
Note: start the Hive service before running the test. On Linux, go to Hive's bin directory and run: hive --service hiveserver2
Run the test class; if it prints a non-null Connection object, the connection succeeded.