Source: https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/java8.html
Java 8 introduces new features that allow for faster and clearer coding. The most important of these is the Lambda expression, which opens the door to functional programming in Java: Lambda expressions allow anonymous functions to be implemented and passed around.
For example:
In words.map{ x => (x, 1) }, the expression x => (x, 1) is an anonymous function that maps a single variable to a 2-tuple.
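The same idea can be sketched outside Flink with plain Java 8 streams. This is only an illustration of the lambda concept, not the Flink API; Java has no built-in tuple type, so java.util.Map.Entry stands in for the 2-tuple here:

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class LambdaPairs {
    // x -> (x, 1): an anonymous function from a single variable to a 2-tuple
    static List<Map.Entry<String, Integer>> toPairs(List<String> words) {
        return words.stream()
                .map(x -> new AbstractMap.SimpleEntry<String, Integer>(x, 1))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(toPairs(Arrays.asList("to", "be", "or")));
        // [to=1, be=1, or=1]
    }
}
```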
Recent Flink versions support programming with Java Lambda expressions. This document explains how to use Lambda expressions and describes their current limitations. For a general introduction, see the Programming Guide.
Examples
The following example uses a Lambda expression in map to square each element and print the result. The function passed to map is an anonymous function whose parameter type does not need to be declared, because Java 8 can infer it:
env.fromElements(1, 2, 3)
    // returns the squared i
    .map(i -> i*i)
    .print();
In the following example, the type of the Collector cannot be inferred, so it must be declared explicitly:
DataSet<Integer> input = env.fromElements(1, 2, 3);

// collector type must be declared
input.flatMap((Integer number, Collector<String> out) -> {
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < number; i++) {
        builder.append("a");
        out.collect(builder.toString());
    }
})
// returns (on separate lines) "a", "a", "aa", "a", "aa", "aaa"
.print();
In the following example, the types can be inferred from the DataSet, so no declaration is needed:
DataSet<Integer> input = env.fromElements(1, 2, 3);

// collector type must not be declared, it is inferred from the type of the dataset
DataSet<String> manyALetters = input.flatMap((number, out) -> {
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < number; i++) {
        builder.append("a");
        out.collect(builder.toString());
    }
});
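This kind of target-type inference is part of the Java 8 language itself, not specific to Flink. As a minimal plain-Java sketch of the two styles shown above:

```java
import java.util.function.BiFunction;

public class TypeInference {
    public static void main(String[] args) {
        // Parameter types declared explicitly ...
        BiFunction<Integer, Integer, Integer> add1 = (Integer a, Integer b) -> a + b;
        // ... or inferred from the target type, just as Flink infers them
        // from the type of the DataSet
        BiFunction<Integer, Integer, Integer> add2 = (a, b) -> a + b;
        System.out.println(add1.apply(1, 2)); // 3
        System.out.println(add2.apply(3, 4)); // 7
    }
}
```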
The following code shows a WordCount program implemented with Lambda expressions:
DataSet<String> input = env.fromElements("Please count", "the words", "but not this");

// filter out strings that contain "not"
input.filter(line -> !line.contains("not"))
    // split each line by space
    .map(line -> line.split(" "))
    // emit a pair <word,1> for each array element
    .flatMap((String[] wordArray, Collector<Tuple2<String, Integer>> out)
        -> Arrays.stream(wordArray).forEach(t -> out.collect(new Tuple2<>(t, 1)))
    )
    // group and sum up
    .groupBy(0).sum(1)
    // print
    .print();
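To see what each step of the pipeline computes, the same WordCount can be sketched with plain Java 8 streams. This is an illustration only, not the Flink API; it counts occurrences with groupingBy/counting instead of Flink's groupBy(0).sum(1):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class StreamWordCount {
    static Map<String, Long> count(List<String> lines) {
        return lines.stream()
                // filter out strings that contain "not"
                .filter(line -> !line.contains("not"))
                // split each line by space and flatten into words
                .flatMap(line -> Arrays.stream(line.split(" ")))
                // group by word and count occurrences
                .collect(Collectors.groupingBy(Function.identity(),
                                               Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts =
                count(Arrays.asList("Please count", "the words", "but not this"));
        System.out.println(counts.get("words")); // 1
    }
}
```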
The sections below describe the limitations and constraints of using Java 8 Lambda expressions with Flink.
Compiler Limitations
Currently, Flink only supports jobs containing Lambda Expressions completely if they are compiled with the Eclipse JDT compiler contained in Eclipse Luna 4.4.2 (and above). Only the Eclipse JDT compiler preserves the generic type information necessary to use the entire Lambda Expressions feature type-safely. Other compilers such as the OpenJDK's and Oracle JDK's javac throw away all generic parameters related to Lambda Expressions. This means that types such as Tuple2<String, Integer> or Collector<String> declared as a Lambda function input or output parameter will be pruned to Tuple2 or Collector in the compiled .class files, which is too little information for the Flink compiler.
How to compile a Flink job that contains Lambda Expressions with the JDT compiler will be covered in the next section. However, it is possible to implement functions such as map() or filter() with Lambda Expressions in Java 8 compilers other than the Eclipse JDT compiler, as long as the function has no Collectors or Iterables and only if the function handles unparameterized types such as Integer, Long, String, or MyOwnClass (types without generics!).
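The difference javac makes can be observed with plain reflection: an anonymous class records its parameterized supertype in the .class file, while a javac-compiled lambda keeps only the raw interface. A small sketch, independent of Flink (the exact behavior is compiler-dependent):

```java
import java.lang.reflect.ParameterizedType;
import java.util.function.Function;

public class LambdaErasure {
    public static void main(String[] args) {
        // Anonymous class: the parameterized supertype survives compilation
        Function<String, Integer> anon = new Function<String, Integer>() {
            @Override
            public Integer apply(String s) {
                return s.length();
            }
        };
        // Lambda: javac records only the raw interface Function
        Function<String, Integer> lambda = s -> s.length();

        System.out.println(anon.getClass().getGenericInterfaces()[0]
                instanceof ParameterizedType);   // true
        System.out.println(lambda.getClass().getGenericInterfaces()[0]
                instanceof ParameterizedType);   // false with javac
    }
}
```

This is exactly the information (`Function<String, Integer>` vs. raw `Function`) that Flink's type extraction needs and that non-JDT compilers discard for lambdas.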
Compile Flink jobs with the Eclipse JDT compiler and Maven
If you are using the Eclipse IDE, you can run and debug your Flink code within the IDE without any problems after some configuration steps. The Eclipse IDE by default compiles its Java sources with the Eclipse JDT compiler. The next section describes how to configure the Eclipse IDE.
If you are using a different IDE such as IntelliJ IDEA, or you want to package your jar file with Maven to run your job on a cluster, you need to modify your project's pom.xml file and build your program with Maven. The quickstart contains preconfigured Maven projects which can be used for new projects or as a reference. Uncomment the mentioned lines in your generated quickstart pom.xml file if you want to use Java 8 with Lambda Expressions.
Alternatively, you can manually insert the following lines into your Maven pom.xml file. Maven will then use the Eclipse JDT compiler for compilation.
<!-- put these lines under "project/build/pluginManagement/plugins" of your pom.xml -->
<plugin>
  <!-- Use compiler plugin with tycho as the adapter to the JDT compiler. -->
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <source>1.8</source>
    <target>1.8</target>
    <compilerId>jdt</compilerId>
  </configuration>
  <dependencies>
    <!-- This dependency provides the implementation of compiler "jdt": -->
    <dependency>
      <groupId>org.eclipse.tycho</groupId>
      <artifactId>tycho-compiler-jdt</artifactId>
      <version>0.21.0</version>
    </dependency>
  </dependencies>
</plugin>
If you are using Eclipse for development, the m2e plugin might complain about the inserted lines above and mark your pom.xml as invalid. If so, insert the following lines into your pom.xml.
<!-- put these lines under "project/build/pluginManagement/plugins/plugin[groupId="org.eclipse.m2e", artifactId="lifecycle-mapping"]/configuration/lifecycleMappingMetadata/pluginExecutions" of your pom.xml -->
<pluginExecution>
  <pluginExecutionFilter>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-compiler-plugin</artifactId>
    <versionRange>[3.1,)</versionRange>
    <goals>
      <goal>testCompile</goal>
      <goal>compile</goal>
    </goals>
  </pluginExecutionFilter>
  <action>
    <ignore></ignore>
  </action>
</pluginExecution>
Run and debug Flink jobs within the Eclipse IDE
First of all, make sure you are running a current version of the Eclipse IDE (4.4.2 or later). Also make sure that you have a Java 8 Runtime Environment (JRE) installed in the Eclipse IDE (Window -> Preferences -> Java -> Installed JREs).
Create/Import your Eclipse project.
If you are using Maven, you also need to change the Java version in your pom.xml for the maven-compiler-plugin. Otherwise, right-click the JRE System Library section of your project and open the Properties window in order to switch to a Java 8 JRE (or above) that supports Lambda Expressions.
The Eclipse JDT compiler needs a special compiler flag in order to store type information in .class files. Open the JDT configuration file at {project directory}/.settings/org.eclipse.jdt.core.prefs with your favorite text editor and add the following line:

org.eclipse.jdt.core.compiler.codegen.lambda.genericSignature=generate
If not already done, also modify the Java versions of the following properties to 1.8 (or above):
org.eclipse.jdt.core.compiler.codegen.targetPlatform=1.8
org.eclipse.jdt.core.compiler.compliance=1.8
org.eclipse.jdt.core.compiler.source=1.8
After you have saved the file, perform a complete project refresh in the Eclipse IDE. If you are using Maven, right-click your Eclipse project and select Maven -> Update Project....
You have configured everything correctly if the following Flink program runs without exceptions:
final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
// print() triggers execution on its own; a separate env.execute() afterwards
// would fail because no new sinks are defined after the print
env.fromElements(1, 2, 3).map((in) -> new Tuple1<String>(" " + in)).print();