Full error message: Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: This type (GenericType<cn.yan.streaming.TestSocketWordCount.WordCount>) cannot be used as key.
Here is my JavaBean:
// Define a static nested class
public static class WordCount {
    public String word;
    public long count;

    public WordCount(String word, long count) {
        this.word = word;
        this.count = count;
    }

    @Override
    public String toString() {
        return "WordCount{" +
                "word='" + word + '\'' +
                ", count=" + count +
                '}';
    }
}
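Note that the code looks fine at first glance, but while writing some Flink test code today I ran into this problem: I had left out the no-argument constructor, and that is what caused the error. Flink only treats a class as a POJO when it has a public no-argument constructor; without one it falls back to a GenericType (as the error message shows), and such a type cannot be keyed by field name. After adding the no-argument constructor the problem went away. A painful lesson, all because of one missing constructor.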
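To make the cause easier to see, here is a minimal sketch in plain Java (no Flink dependency) that checks whether a class has a public no-argument constructor. The class names `NoArgConstructorCheck`, `WithoutNoArg`, `WithNoArg` and the helper `hasPublicNoArgConstructor` are made up for illustration, and the check only approximates one of Flink's POJO requirements; it is not Flink's actual type-extraction logic.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

public class NoArgConstructorCheck {

    // Hypothetical bean mirroring the failing WordCount: only a two-argument constructor.
    public static class WithoutNoArg {
        public String word;
        public long count;

        public WithoutNoArg(String word, long count) {
            this.word = word;
            this.count = count;
        }
    }

    // Hypothetical bean mirroring the fixed WordCount: adds a public no-argument constructor.
    public static class WithNoArg {
        public String word;
        public long count;

        public WithNoArg() {
        }

        public WithNoArg(String word, long count) {
            this.word = word;
            this.count = count;
        }
    }

    // Returns true only if the class declares a public constructor that takes no parameters.
    public static boolean hasPublicNoArgConstructor(Class<?> clazz) {
        try {
            Constructor<?> constructor = clazz.getDeclaredConstructor();
            return Modifier.isPublic(constructor.getModifiers());
        } catch (NoSuchMethodException e) {
            // Declaring any other constructor suppresses Java's implicit default constructor.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("WithoutNoArg: " + hasPublicNoArgConstructor(WithoutNoArg.class));
        System.out.println("WithNoArg: " + hasPublicNoArgConstructor(WithNoArg.class));
    }
}
```

Running `main` reports `false` for the first bean and `true` for the second. The point is that once you write a constructor with arguments, Java no longer generates the default one for you, so it has to be added by hand.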
My full code (the version that fails) is as follows:
package cn.yan.streaming;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

/**
 * Word count with Flink
 */
public class TestSocketWordCount {
    public static void main(String[] args) throws Exception {
        // Get the port number from the program arguments
        int port;
        try {
            ParameterTool tool = ParameterTool.fromArgs(args);
            port = tool.getInt("port");
        } catch (Exception e) {
            // Fall back to 9000 if no valid port was passed
            port = 9000;
        }
        // Get the Flink execution environment
        StreamExecutionEnvironment environment = StreamExecutionEnvironment.getExecutionEnvironment();
        // Host name (IP of the server where Flink runs)
        String hostname = "192.168.1.13";
        // Delimiter
        String delimiter = "\n";
        // Get the data source: read data from a socket
        DataStreamSource<String> text = environment.socketTextStream(hostname, port, delimiter);
        // Apply the operators
        DataStream<WordCount> results = text.flatMap(new FlatMapFunction<String, WordCount>() {
            @Override
            public void flatMap(String value, Collector<WordCount> collector) {
                String[] splits = value.split("\\s");
                for (String word : splits) {
                    collector.collect(new WordCount(word, 1L));
                }
            }
        }).keyBy("word")
                // Sliding window: every 1 second, aggregate the last 2 seconds of data
                .timeWindow(Time.seconds(2), Time.seconds(1))
                .sum("count"); // Either sum or reduce can be used for the aggregation
//                .reduce(new ReduceFunction<WordCount>() {
//                    @Override
//                    public WordCount reduce(WordCount t1, WordCount t2) throws Exception {
//                        return new WordCount(t1.word, t1.count + t2.count);
//                    }
//                });
        // Print the results to the console with parallelism 1
        results.print().setParallelism(1);
        // Call execute to run the program
        environment.execute("socket");
    }

    // Define a static nested class
    // (no no-argument constructor here, which is what triggers the error)
    public static class WordCount {
        public String word;
        public long count;

        public WordCount(String word, long count) {
            this.word = word;
            this.count = count;
        }

        @Override
        public String toString() {
            return "WordCount{" +
                    "word='" + word + '\'' +
                    ", count=" + count +
                    '}';
        }
    }
}
The corrected code is:
package cn.yan.streaming;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

/**
 * Word count with Flink
 */
public class TestSocketWordCount {
    public static void main(String[] args) throws Exception {
        // Get the port number from the program arguments
        int port;
        try {
            ParameterTool tool = ParameterTool.fromArgs(args);
            port = tool.getInt("port");
        } catch (Exception e) {
            // Fall back to 9000 if no valid port was passed
            port = 9000;
        }
        // Get the Flink execution environment
        StreamExecutionEnvironment environment = StreamExecutionEnvironment.getExecutionEnvironment();
        // Host name (IP of the server where Flink runs)
        String hostname = "192.168.1.13";
        // Delimiter
        String delimiter = "\n";
        // Get the data source: read data from a socket
        DataStreamSource<String> text = environment.socketTextStream(hostname, port, delimiter);
        // Apply the operators
        DataStream<WordCount> results = text.flatMap(new FlatMapFunction<String, WordCount>() {
            @Override
            public void flatMap(String value, Collector<WordCount> collector) {
                String[] splits = value.split("\\s");
                for (String word : splits) {
                    collector.collect(new WordCount(word, 1L));
                }
            }
        }).keyBy("word")
                // Sliding window: every 1 second, aggregate the last 2 seconds of data
                .timeWindow(Time.seconds(2), Time.seconds(1))
                .sum("count"); // Either sum or reduce can be used for the aggregation
//                .reduce(new ReduceFunction<WordCount>() {
//                    @Override
//                    public WordCount reduce(WordCount t1, WordCount t2) throws Exception {
//                        return new WordCount(t1.word, t1.count + t2.count);
//                    }
//                });
        // Print the results to the console with parallelism 1
        results.print().setParallelism(1);
        // Call execute to run the program
        environment.execute("socket");
    }

    // Define a static nested class
    public static class WordCount {
        public String word;
        public long count;

        // Public no-argument constructor, required for Flink to treat this class as a POJO
        public WordCount() {
        }

        public WordCount(String word, long count) {
            this.word = word;
            this.count = count;
        }

        @Override
        public String toString() {
            return "WordCount{" +
                    "word='" + word + '\'' +
                    ", count=" + count +
                    '}';
        }
    }
}
As the screenshot shows, the program starts and runs normally.
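Always add the no-argument constructor!
Always add the no-argument constructor!
Always add the no-argument constructor!
Important things are worth saying three times!!!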