map() and flatMap() in Java 8

This article looks at the difference between map() and flatMap() in Java 8. The map() method transforms each input element into exactly one output element, while flatMap() transforms each input element into zero or more output elements. Code examples show how, when processing nested lists, flatMap() flattens the output while map() keeps the nested structure.

Background of the Two Methods

These two methods look as if they do the same thing, but they actually behave somewhat differently. The relevant declarations, in package java.util.stream, look like this:

map() method:

/**
 * @param <R> the element type of the new stream
 * @param mapper a non-interfering, stateless
 *               function to apply to each element
 * @return the new stream
 */
<R> Stream<R> map(Function<? super T, ? extends R> mapper);

flatMap() method:

/**
 * @param <R> the element type of the new stream
 * @param mapper a non-interfering, stateless
 *               function to apply to each element which produces a stream
 *               of new values
 * @return the new stream
 */
<R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper);
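A side note (my own sketch, not part of the original post): the "? super T" and "? extends R" wildcards in these signatures are what allow a mapper with a broader input type or narrower output type to be plugged in directly.

// needs: import java.util.function.Function; import java.util.stream.Stream;
// A Function<Object, Integer> is accepted where a
// Function<? super String, ? extends Number> is expected:
Function<Object, Integer> hash = Object::hashCode;
Stream<Number> hashes = Stream.of("a", "b").map(hash);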

Stream map() Method

Judging from the source, map() is an intermediate operation, and it returns a Stream<R>.
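A quick sketch (my own addition, not from the original post) that makes the "intermediate operation" point visible: nothing in the pipeline runs until a terminal operation such as forEach() is attached.

// needs: import java.util.stream.Stream;
Stream<String> mapped = Stream.of("a", "b")
        .map(s -> {
            System.out.println("mapping " + s); // side effect only to show laziness
            return s.toUpperCase();
        });
// Nothing has printed yet: map() only builds the pipeline.
mapped.forEach(System.out::println); // prints: mapping a, A, mapping b, B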

Testing with code

map() method:

// needs: import java.util.ArrayList; import java.util.Arrays; import java.util.List;
public static void main(String[] args) {
    System.out.println("Output with simple list");
    List<String> vowels = Arrays.asList("A", "E", "I", "O", "U");
    vowels.stream().map(vowel -> vowel.toLowerCase())
            .forEach(value -> System.out.println(value));

    List<String> haiList = new ArrayList<>();
    haiList.add("hello");
    haiList.add("hai");
    haiList.add("hehe");
    haiList.add("hi");

    System.out.println("Output with nested List of List");
    List<String> welcomeList = new ArrayList<>();
    welcomeList.add("You got it");
    welcomeList.add("Don't mention it");
    welcomeList.add("No worries.");
    welcomeList.add("Not a problem");

    List<List<String>> nestedList = Arrays.asList(haiList, welcomeList);
    // map() turns each inner List into a Stream<String>, so the outer
    // stream becomes a Stream<Stream<String>> -- the nesting is preserved.
    nestedList.stream().map(list -> {
        return list.stream().map(value -> value.toUpperCase());
    }).forEach(value -> System.out.println(value));
}

Output:

Output with simple list
a
e
i
o
u
Output with nested List of List
java.util.stream.ReferencePipeline$3@3b9a45b3
java.util.stream.ReferencePipeline$3@7699a589

For the nested list, map() produced a Stream<Stream<String>>, so forEach() ends up printing each inner Stream's toString() rather than its values.
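If you stay with map(), you have to drain the inner streams yourself. A minimal sketch of that workaround (reusing nestedList from the example above), before we get to the cleaner flatMap() solution below:

nestedList.stream()
        .map(list -> list.stream().map(String::toUpperCase))
        .forEach(innerStream -> innerStream.forEach(System.out::println));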

flatMap() method:

// needs: import java.util.ArrayList; import java.util.Arrays; import java.util.List;
public static void main(String[] args) {
    List<String> haiList = new ArrayList<>();
    haiList.add("hello");
    haiList.add("hai");
    haiList.add("hehe");
    haiList.add("hi");

    System.out.println("Output with nested List of List");
    List<String> welcomeList = new ArrayList<>();
    welcomeList.add("You got it");
    welcomeList.add("Don't mention it");
    welcomeList.add("No worries.");
    welcomeList.add("Not a problem");

    List<List<String>> nestedList = Arrays.asList(haiList, welcomeList);
    // flatMap() merges all the inner streams into one flat Stream<String>.
    nestedList.stream()
            .flatMap(list -> list.stream())
            .map(value -> value.toUpperCase())
            .forEach(value -> System.out.println(value));
}

Output:

Output with nested List of List
HELLO
HAI
HEHE
HI
YOU GOT IT
DON'T MENTION IT
NO WORRIES.
NOT A PROBLEM

Java 8 map() vs flatMap()

Both map() and flatMap() can be applied to a Stream<T> and to an Optional<T>, and both return a Stream<R> or an Optional<R> respectively.

The difference is that the map operation produces exactly one output value for each input value, whereas the flatMap operation produces zero or more output values for each input value. In the flatMap() examples here, each input element is itself a collection, such as a List or a Set. The map operation takes a function that is invoked for each value in the input stream and produces one result value, which is sent to the output stream. The function passed to flatMap conceptually consumes one value and produces an arbitrary number of values. In Java, however, it is awkward for a method to return an arbitrary number of values, since a method can return only zero or one value; that is why the flatMap mapper returns a Stream, which can carry any number of values, and flatMap flattens all of those streams into a single output stream.
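The same distinction carries over to Optional. A minimal sketch (the "duke" value is made up for illustration; needs import java.util.Optional):

Optional<String> name = Optional.of("duke");

// map(): the mapper returns a plain value, and Optional does the wrapping.
Optional<Integer> length = name.map(String::length);                        // Optional[4]

// flatMap(): the mapper itself returns an Optional, so no extra wrapping.
Optional<Integer> lengthFlat = name.flatMap(n -> Optional.of(n.length())); // Optional[4]

// With map(), the same Optional-returning mapper would nest the result instead:
Optional<Optional<Integer>> nested = name.map(n -> Optional.of(n.length()));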

Back to streams, the code:

// needs: import java.util.Arrays; import java.util.List;
// import java.util.stream.Collectors; import java.util.stream.Stream;
public static void main(String[] args) {
    // map(List::stream) leaves us with a list of Stream objects.
    List<Stream<Integer>> together = Stream.of(Arrays.asList(1, 2), Arrays.asList(3, 4)) // Stream of List<Integer>
            .map(List::stream)
            .collect(Collectors.toList());
    System.out.println("Output with map() -> " + together);

    // flatMap(List::stream) flattens the inner lists into one Stream<Integer>.
    List<Integer> togetherFlatMap = Stream.of(Arrays.asList(1, 2), Arrays.asList(3, 4)) // Stream of List<Integer>
            .flatMap(List::stream)
            .map(integer -> integer + 1)
            .collect(Collectors.toList());
    System.out.println("Output with flatMap() -> " + togetherFlatMap);
}

Output:

Output with map() -> [java.util.stream.ReferencePipeline$Head@16b98e56, java.util.stream.ReferencePipeline$Head@7ef20235]
Output with flatMap() -> [2, 3, 4, 5]
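Finally, to make the "zero or more outputs per input" point concrete: the mapper passed to flatMap() may return an empty stream for some elements. A small sketch (the sample lines are made up):

// needs: import java.util.Arrays; import java.util.List;
// import java.util.stream.Collectors; import java.util.stream.Stream;
List<String> lines = Arrays.asList("hello world", "", "java streams");

// An empty line contributes zero words; the others contribute several each.
List<String> words = lines.stream()
        .flatMap(line -> line.isEmpty()
                ? Stream.<String>empty()
                : Arrays.stream(line.split(" ")))
        .collect(Collectors.toList());

System.out.println(words); // [hello, world, java, streams]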
