① If IDEA did not generate a resources folder when creating the Maven project, create one under the main folder.
If IDEA does not recognize it: File -> Project Structure -> Modules -> click your module -> Sources (tab) -> select your resources folder and mark it as Resources.
② Add log4j.properties to the resources folder.
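A minimal log4j.properties sketch for this setup. It assumes the Log4jAppender that ships with flume-ng-log4jappender; the Hostname and Port must match the Avro source configured in avro2hdfs.conf in step ④ below:

# Log to the console and ship every event to Flume
log4j.rootLogger = INFO, stdout, flume

log4j.appender.stdout = org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout = org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern = %d [%t] %-5p %c - %m%n

# Flume appender: forwards each log event to the Avro source at master:41414
log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname = master
log4j.appender.flume.Port = 41414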
③ Create the Java code under the main folder:
package pers.zzp.Demo;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.log4j.Logger;

public class Java_Flume_PutFile_Hdfs {
    private static Logger log = Logger.getLogger(Java_Flume_PutFile_Hdfs.class.getName());

    public static void main(String[] args) throws IOException {
        // Local file whose contents will be shipped to HDFS via Flume
        File file = new File("F:\\Maven\\HadoopOperation\\Test_txt\\Demo.txt");
        // Open the file as a UTF-8 character stream
        InputStreamReader inputStreamReader = new InputStreamReader(new FileInputStream(file), "utf-8");
        // Wrap it for line-by-line reading
        BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
        String line = null;
        while ((line = bufferedReader.readLine()) != null) {
            // Each log call is forwarded by the log4j Flume appender to the Avro source
            log.info(line);
        }
        // Closing the BufferedReader also closes the wrapped streams
        bufferedReader.close();
    }
}
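As an aside, the same read loop can be written with try-with-resources (Java 7+), which closes both readers automatically even if reading fails partway:

try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(new FileInputStream(file), "utf-8"))) {
    String line;
    while ((line = reader.readLine()) != null) {
        log.info(line);
    }
}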
④ Place the file avro2hdfs.conf ( https://pan.baidu.com/s/1NU1hPtEeZ86vQ_bqzLwpPA extraction code: d2pl ) into the conf folder under the Flume installation directory on the Linux machine ahead of time.
Contents of avro2hdfs.conf:
#Define the agent name and the source, channel, and sink names
a.sources = r1
a.channels = c1
a.sinks = k1
#Configure the source
a.sources.r1.type = avro
#Bind to your own hostname; mine is master
a.sources.r1.bind = master
a.sources.r1.port = 41414
#Configure the channel
a.channels.c1.type = memory
a.channels.c1.capacity = 10000
a.channels.c1.transactionCapacity = 100
#Configure the sink
a.sinks.k1.type = hdfs
#Output path in your own HDFS
a.sinks.k1.hdfs.path = hdfs://master:9000/flume_ToHdfs
a.sinks.k1.hdfs.filePrefix = events-
a.sinks.k1.hdfs.minBlockReplicas = 1
a.sinks.k1.hdfs.fileType = DataStream
#a.sinks.k1.hdfs.fileType = CompressedStream
#a.sinks.k1.hdfs.codeC = gzip
#Roll the file after every 10 events; rollSize = 0 disables size-based rolling
a.sinks.k1.hdfs.rollCount = 10
a.sinks.k1.hdfs.rollSize = 0
#rollInterval rolls the temp file into a target file every N seconds; 0 disables time-based rolling
a.sinks.k1.hdfs.rollInterval = 0
a.sinks.k1.hdfs.idleTimeout = 0
#Wire the source and sink to the channel
a.sources.r1.channels = c1
a.sinks.k1.channel = c1
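If you would rather roll by time than by event count, the two roll settings can be swapped, for example (60 s is an illustrative value, not from the original config):

a.sinks.k1.hdfs.rollCount = 0
a.sinks.k1.hdfs.rollInterval = 60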
⑤ Start Flume with the command: flume-ng agent -n a -c ../conf -f ./avro2hdfs.conf -Dflume.root.logger=DEBUG,console
Once a line like "Avro source r1 started" appears in the log, the agent is ready.
⑥ Run the code we wrote in IDEA in step ③.
⑦ Check HDFS; you will see that the test file's contents have been uploaded.
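A quick way to verify from the command line (the path assumes the sink configuration above):

hdfs dfs -ls /flume_ToHdfs
hdfs dfs -cat /flume_ToHdfs/events-*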
Notes:
(1) Flume must be started first; otherwise the IDEA console will print connection errors in red.
(2) The corresponding dependencies must be added to the Maven pom.xml beforehand:
<dependency>
    <groupId>org.apache.flume</groupId>
    <artifactId>flume-ng-core</artifactId>
    <version>1.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.flume.flume-ng-clients</groupId>
    <artifactId>flume-ng-log4jappender</artifactId>
    <version>1.6.0</version>
</dependency>
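(3) log4j itself is normally pulled in transitively by the Flume artifacts above; if it is missing from your classpath, declare it explicitly (1.2.17 is an assumed version, not from the original post):

<dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.17</version>
</dependency>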