1. In the shell, create the directory home/sym to hold the files
2. Upload the file from the server to the cluster (HDFS):
hadoop fs -put data-dns.csv /sym/
Check the upload result:
hadoop fs -ls -R /sym
3. Load the file from the cluster (HDFS) into the Hive warehouse
hive -e "load data inpath 'hdfs:///sym/data-dns.csv' into table log_509.df_dns_log_a partition(dd='2021-04-01',hh='12');"
4. Check whether the load succeeded (query the same partition that was loaded, dd='2021-04-01', hh='12')
select * from log_509.df_dns_log_a where dd='2021-04-01' and hh='12' limit 10;
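Steps 2 to 4 can be collected into one shell function. This is only a sketch: the helper names `run` and `load_partition` and the `DRY_RUN` switch are made up for illustration, and it assumes the `hadoop` and `hive` CLIs are on the PATH and that the target table and HDFS directory already exist.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; set DRY_RUN=1 to print the commands instead of running them.

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Upload a local file to HDFS, load it into one Hive partition,
# then sample that same partition.
# Arguments: local file, HDFS directory, table, dd value, hh value.
load_partition() {
  local file="$1" hdfs_dir="$2" table="$3" dd="$4" hh="$5"
  local name
  name="$(basename "$file")"
  run hadoop fs -put "$file" "$hdfs_dir/" || return 1
  run hadoop fs -ls -R "$hdfs_dir" || return 1
  run hive -e "load data inpath 'hdfs://$hdfs_dir/$name' into table $table partition(dd='$dd',hh='$hh');" || return 1
  run hive -e "select * from $table where dd='$dd' and hh='$hh' limit 10;"
}
```

Example call: load_partition data-dns.csv /sym log_509.df_dns_log_a 2021-04-01 12. Because the load and the check share the same dd and hh arguments, the query always reads the partition that was just loaded.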
Steps to inspect a table in Hive
- 1. Start the Hive CLI: hive
- 2. Select the database: use log_509;
- 3. List the tables: show tables;
- 4. Show the table schema: desc df_dns_log_a;
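The same inspection can also be run without entering the interactive CLI, by passing the statements to hive -e. A small sketch, where `inspect_sql` is a hypothetical helper that only builds the statement string:

```shell
# Hypothetical helper: build the HiveQL that selects a database,
# lists its tables, and describes one table.
# Arguments: database name, table name.
inspect_sql() {
  printf 'use %s; show tables; desc %s;' "$1" "$2"
}

# Usage (needs the hive CLI on the PATH):
#   hive -e "$(inspect_sql log_509 df_dns_log_a)"
```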
Packaging a jar in IDEA (pom.xml):
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.example</groupId>
<artifactId>helloworldTest</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
<properties>
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
</properties>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>3.2.0</version>
<configuration>
<archive>
<manifest>
<mainClass>hello.fileTest</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
</plugins>
</build>
</project>
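With this POM, the jar can also be built from the command line instead of the IDEA Maven panel. A sketch, assuming `mvn` and `java` are installed and that the project really contains a class hello.fileTest with a main method; the function names are made up for illustration:

```shell
# Maven names the jar target/<artifactId>-<version>.jar.
# Arguments: artifactId, version.
jar_path() {
  printf 'target/%s-%s.jar' "$1" "$2"
}

# Build the project, then run the jar. The mainClass entry written into
# the manifest by maven-jar-plugin lets java -jar find the entry point.
build_and_run() {
  mvn clean package || return 1
  java -jar "$(jar_path helloworldTest 1.0-SNAPSHOT)"
}
```

Note that maven-jar-plugin packages only the project's own classes. If the code depends on other libraries, they have to be supplied on the classpath separately.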
Common HDFS commands: