Sink: Using StreamingFileSink

Previous post: Sink: Using RedisSink

Note: StreamingFileSink replaces the old WriterSink.

WriterSink: this sink has been deprecated in Flink (its source is annotated @Deprecated).
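
To make the switch concrete, here is a minimal before/after sketch, assuming the deprecated sink in question is the legacy BucketingSink shipped in flink-connector-filesystem (the artifact added to the POM below); the paths are illustrative:

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;

public class SinkMigrationSketch {
    public static void attachSinks(DataStream<String> lines) {
        // Old way: BucketingSink is annotated @Deprecated in the source
        lines.addSink(new BucketingSink<String>("hdfs://Master:9999/out-old"));

        // New way: StreamingFileSink with a row-format encoder
        lines.addSink(StreamingFileSink
                .forRowFormat(new Path("hdfs://Master:9999/out-new"),
                              new SimpleStringEncoder<String>("UTF-8"))
                .build());
    }
}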

Preparation:

Add the flink-connector-filesystem and hadoop-client dependencies to the POM file:

<!-- Flink filesystem connector -->
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-connector-filesystem -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-filesystem_2.12</artifactId>
    <version>1.11.2</version>
</dependency>

<!-- needed for writing data to HDFS -->
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>3.2.0</version>
</dependency>


Start the Hadoop daemons (start-all.sh brings up HDFS and YARN):

[root@Master ~]# start-all.sh
Starting namenodes on [Master]
Last login: Sun Jul  4 00:33:17 CST 2021 from 192.168.242.1 on pts/2
Starting datanodes
Last login: Sun Jul  4 00:33:59 CST 2021 on pts/2
Starting secondary namenodes [Slave02]
Last login: Sun Jul  4 00:34:04 CST 2021 on pts/2
Starting resourcemanager
Last login: Sun Jul  4 00:34:21 CST 2021 on pts/2
Starting nodemanagers
Last login: Sun Jul  4 00:34:49 CST 2021 on pts/2

The code:

package cn._51doit.flink.day01;

import org.apache.flink.api.common.functions.RuntimeContext;
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.DefaultRollingPolicy;

/**
 * Using Sink-StreamingFileSink [unbounded stream (runs continuously)]
 */
public class StreamingFileSinkDemo {
    public static void main(String[] args) throws Exception {

        System.setProperty("HADOOP_USER_NAME", "root");   // access HDFS as the root user
        // In local mode the default parallelism is the number of logical cores on this node
        Configuration configuration = new Configuration();
        StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(configuration);

        // Checkpoint every 5 seconds; StreamingFileSink finalizes part files on checkpoints,
        // so without checkpointing row-format files never move past in-progress/pending
        env.enableCheckpointing(5000);
        // parallelism of the execution environment
        int parallelism01 = env.getParallelism();
        System.out.println("Default parallelism of the execution environment: " + parallelism01);
        DataStreamSource<String> lines = env.socketTextStream("Master", 9999); // port of the nc server (nc -lk 9999)
        // parallelism of the DataStream (a socket source is always 1)
        int parallelism = lines.getParallelism();
        System.out.println("Parallelism of the socket source: " + parallelism);
        DefaultRollingPolicy<String, String> rollingPolicy = DefaultRollingPolicy.builder()
                .withRolloverInterval(30 * 1000L)       // roll a new file every 30 seconds
                .withMaxPartSize(1024L * 1024L * 100L)  // roll a new file once a part file reaches 100 MB
                .build();
        // Create a StreamingFileSink that writes data in row format
        StreamingFileSink<String> sink = StreamingFileSink.forRowFormat(
                new Path("hdfs://Master:9999/out84"),      // base directory for the output files; this port is
                                                           // the NameNode RPC port and must match fs.defaultFS
                new SimpleStringEncoder<String>("UTF-8"))  // encoder used to serialize each record
                .withRollingPolicy(rollingPolicy)          // pass in the rolling policy built above
                .build();
        // Attach the sink via DataStream.addSink
        lines.addSink(sink);
        env.execute();
    }

    // Custom sink left over from the previous lesson; it is not used in main above
    public static class MyPrintSink extends RichSinkFunction<String> {

        // invoke() is where each record is finally written out (e.g. to MySQL via JDBC)
        @Override
        public void invoke(String value, Context context) throws Exception {
            // index of this parallel subtask (starting at 0)
            RuntimeContext runtimeContext = getRuntimeContext();
            int indexOfThisSubtask = runtimeContext.getIndexOfThisSubtask();

            System.out.println(indexOfThisSubtask + "> " + value);
        }
    }
}
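
The DefaultRollingPolicy built above sets only two of the three roll conditions the builder supports. Here is a small standalone sketch of a policy that additionally rolls after a period of inactivity, via the withInactivityInterval setter on the same builder (the 10-second value is just an illustration):

import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.DefaultRollingPolicy;

public class RollingPolicySketch {
    public static void main(String[] args) {
        DefaultRollingPolicy<String, String> policy = DefaultRollingPolicy.builder()
                .withRolloverInterval(30 * 1000L)       // roll after a part file has been open for 30 s
                .withInactivityInterval(10 * 1000L)     // also roll after 10 s without new records
                .withMaxPartSize(1024L * 1024L * 100L)  // and once a part file reaches 100 MB
                .build();
        System.out.println("policy built: " + policy);
    }
}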

Type some data at the nc -lk 9999 prompt (any input you like), e.g. hadoop.
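
For example (the exact input is up to you):

[root@Master ~]# nc -lk 9999
hadoop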

Run the program:

View the job in the web UI: http://localhost:8081/#/job/f17f4e4399519b4ae2adfceca390fca0/overview (the job ID differs on every run)
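
Once a checkpoint completes you should find finished part files under /out84 in HDFS. With the default bucket assigner, buckets are named after the hour (yyyy-MM-dd--HH) and part files are named part-<subtaskIndex>-<counter>, so the layout looks roughly like this (exact names are illustrative):

/out84/2021-07-04--01/part-0-0
/out84/2021-07-04--01/part-1-0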
