Flink and Java 8

Source: https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/java8.html


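Java 8 introduces several new language features designed for faster and clearer coding. The most important one is the lambda expression, which opens the door to functional programming in Java. Lambda expressions allow anonymous functions to be implemented and passed around.

For example:

In words.map{x=>(x,1)} (written in Scala syntax), x=>(x,1) is an anonymous function that maps a single variable to a 2-tuple. The Java 8 form of the same lambda is x -> new Tuple2<>(x, 1).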
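
To make the idea concrete without any Flink dependency, here is a minimal plain-Java sketch of the same anonymous function. The class name WordPairs and the use of Map.Entry as a stand-in for a 2-tuple are assumptions made for illustration; they are not part of the Flink API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordPairs {
    // Anonymous function from one word to a (word, 1) pair.
    // Map.Entry is only a stand-in for a 2-tuple here.
    static final Function<String, Map.Entry<String, Integer>> TO_PAIR =
            x -> new SimpleEntry<>(x, 1);

    // Applies the lambda to every word, like words.map(...) would.
    static List<Map.Entry<String, Integer>> pairs(List<String> words) {
        return words.stream().map(TO_PAIR).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [to=1, be=1]
        System.out.println(pairs(Arrays.asList("to", "be")));
    }
}
```

The lambda is stored in a variable and handed to map like any other value, which is exactly what "implementing and passing an anonymous function" means.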


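Recent Flink versions support programming with Java lambda expressions. This document explains how to use them and describes their current limitations. For a general introduction to the Flink API, refer to the Programming Guide.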


Examples

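The following example is a program that uses a lambda expression: the map function squares each element and the result is printed. The function passed to map is an anonymous function whose types do not need to be declared, because Java 8 can infer them.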

env.fromElements(1, 2, 3)
// returns the squared i
.map(i -> i*i)
.print();

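In the following example the Collector type cannot be inferred from the surrounding context, so it must be declared in the lambda's parameter list: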

DataSet<Integer> input = env.fromElements(1, 2, 3);

// collector type must be declared
input.flatMap((Integer number, Collector<String> out) -> {
    StringBuilder builder = new StringBuilder();
    for(int i = 0; i < number; i++) {
        builder.append("a");
        out.collect(builder.toString());
    }
})
// returns (on separate lines) "a", "a", "aa", "a", "aa", "aaa"
.print();

In the following example the type can be inferred from the DataSet that the result is assigned to:

DataSet<Integer> input = env.fromElements(1, 2, 3);

// collector type need not be declared; it is inferred from the type of the dataset
DataSet<String> manyALetters = input.flatMap((number, out) -> {
    StringBuilder builder = new StringBuilder();
    for(int i = 0; i < number; i++) {
       builder.append("a");
       out.collect(builder.toString());
    }
});
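
The difference between the two flatMap variants comes down to how Java infers a generic method's type argument. The sketch below reproduces that in plain Java under stated assumptions: the static flatMap method, the class CollectorInference and the helper emitPrefixes are hypothetical stand-ins, with java.util.function.Consumer playing the role of Flink's Collector. It is not Flink code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Consumer;
import java.util.stream.Collectors;

public class CollectorInference {
    // Hypothetical stand-in for DataSet.flatMap: R is the collector's element type.
    static <R> List<R> flatMap(List<Integer> input, BiConsumer<Integer, Consumer<R>> fn) {
        List<R> result = new ArrayList<>();
        for (Integer element : input) {
            fn.accept(element, result::add);
        }
        return result;
    }

    // Same logic as the examples above: emits "a", "aa", ... up to `number` letters.
    static void emitPrefixes(int number, Consumer<String> out) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < number; i++) {
            builder.append("a");
            out.accept(builder.toString());
        }
    }

    // Chained call with no assignment target: the declared Consumer<String>
    // parameter is what fixes R to String.
    static String explicitJoined(List<Integer> input) {
        return flatMap(input, (Integer number, Consumer<String> out) -> emitPrefixes(number, out))
                .stream().collect(Collectors.joining(","));
    }

    // Undeclared parameter types: R is inferred from the target type List<String>.
    static List<String> inferred(List<Integer> input) {
        List<String> result = flatMap(input, (number, out) -> emitPrefixes(number, out));
        return result;
    }

    public static void main(String[] args) {
        List<Integer> input = java.util.Arrays.asList(1, 2, 3);
        System.out.println(explicitJoined(input)); // prints a,a,aa,a,aa,aaa
        System.out.println(inferred(input));       // prints [a, a, aa, a, aa, aaa]
    }
}
```

Removing the declared parameter types from explicitJoined would leave the compiler with nothing to infer R from, which mirrors why the first Flink example has to spell out Collector<String>.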

The following code shows a word count program that uses lambda expressions:


DataSet<String> input = env.fromElements("Please count", "the words", "but not this");

// filter out strings that contain "not"
input.filter(line -> !line.contains("not"))
// split each line by space
.map(line -> line.split(" "))
// emit a pair <word,1> for each array element
.flatMap((String[] wordArray, Collector<Tuple2<String, Integer>> out)
    -> Arrays.stream(wordArray).forEach(t -> out.collect(new Tuple2<>(t, 1)))
    )
// group and sum up
.groupBy(0).sum(1)
// print
.print();

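The sections below describe limitations and setup issues when using Java 8 with Flink; refer to the linked documentation for details.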

Compiler Limitations

Currently, Flink only supports jobs containing Lambda Expressions completely if they are compiled with the Eclipse JDT compiler contained in Eclipse Luna 4.4.2 (and above).

Only the Eclipse JDT compiler preserves the generic type information necessary to use the entire Lambda Expressions feature type-safely. Other compilers such as the OpenJDK's and Oracle JDK's javac throw away all generic parameters related to Lambda Expressions. This means that types such as Tuple2<String, Integer> or Collector<String> declared as a Lambda function input or output parameter will be pruned to Tuple2 or Collector in the compiled .class files, which is too little information for the Flink Compiler.
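
A related effect can be observed with plain JDK reflection. The sketch below is only an illustration of missing generic information on lambdas; it is not how Flink extracts types, and the class LambdaSignatureDemo and its method names are hypothetical. It compares what an anonymous class and a lambda expose about the interface they implement.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.function.Function;

public class LambdaSignatureDemo {
    // True if the object's class records a parameterized interface
    // such as Function<Integer, String> rather than the raw Function.
    static boolean keepsGenericInterface(Object function) {
        for (Type type : function.getClass().getGenericInterfaces()) {
            if (type instanceof ParameterizedType) {
                return true;
            }
        }
        return false;
    }

    // Anonymous inner class: compiled to its own .class file with a generic signature.
    static Function<Integer, String> anonymousClass() {
        return new Function<Integer, String>() {
            @Override
            public String apply(Integer i) {
                return "n" + i;
            }
        };
    }

    // Lambda: its class is created at runtime and implements the raw interface.
    static Function<Integer, String> lambda() {
        return i -> "n" + i;
    }

    public static void main(String[] args) {
        System.out.println(keepsGenericInterface(anonymousClass())); // prints true
        System.out.println(keepsGenericInterface(lambda()));         // prints false
    }
}
```

This is also why replacing a lambda with an anonymous class is a common fallback when type information is needed and the JDT compiler is not available.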

How to compile a Flink job that contains Lambda Expressions with the JDT compiler will be covered in the next section.

However, it is possible to implement functions such as map() or filter() with Lambda Expressions in Java 8 compilers other than the Eclipse JDT compiler as long as the function has no Collectors or Iterables and only if the function handles unparameterized types such as Integer, Long, String, MyOwnClass (types without Generics!).

If you are using the Eclipse IDE, you can run and debug your Flink code within the IDE without any problems after some configuration steps. The Eclipse IDE by default compiles its Java sources with the Eclipse JDT compiler. The next section describes how to configure the Eclipse IDE.

Compile Flink jobs with Eclipse JDT compiler and Maven

If you are using a different IDE such as IntelliJ IDEA, or you want to package your JAR file with Maven to run your job on a cluster, you need to modify your project's pom.xml file and build your program with Maven. The quickstart contains preconfigured Maven projects which can be used for new projects or as a reference. Uncomment the mentioned lines in your generated quickstart pom.xml file if you want to use Java 8 with Lambda Expressions.

Alternatively, you can manually insert the following lines into your Maven pom.xml file. Maven will then use the Eclipse JDT compiler for compilation.

<!-- put these lines under "project/build/pluginManagement/plugins" of your pom.xml -->

<plugin>
    <!-- Use compiler plugin with tycho as the adapter to the JDT compiler. -->
    <artifactId>maven-compiler-plugin</artifactId>
    <configuration>
        <source>1.8</source>
        <target>1.8</target>
        <compilerId>jdt</compilerId>
    </configuration>
    <dependencies>
        <!-- This dependency provides the implementation of compiler "jdt": -->
        <dependency>
            <groupId>org.eclipse.tycho</groupId>
            <artifactId>tycho-compiler-jdt</artifactId>
            <version>0.21.0</version>
        </dependency>
    </dependencies>
</plugin>

If you are using Eclipse for development, the m2e plugin might complain about the lines inserted above and mark your pom.xml as invalid. If so, insert the following lines into your pom.xml.

<!-- put these lines under "project/build/pluginManagement/plugins/plugin[groupId="org.eclipse.m2e", artifactId="lifecycle-mapping"]/configuration/lifecycleMappingMetadata/pluginExecutions" of your pom.xml -->

<pluginExecution>
    <pluginExecutionFilter>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <versionRange>[3.1,)</versionRange>
        <goals>
            <goal>testCompile</goal>
            <goal>compile</goal>
        </goals>
    </pluginExecutionFilter>
    <action>
        <ignore></ignore>
    </action>
</pluginExecution>

Run and debug Flink jobs within the Eclipse IDE

First of all, make sure you are running a current version of Eclipse IDE (4.4.2 or later). Also make sure that you have a Java 8 Runtime Environment (JRE) installed in Eclipse IDE (Window -> Preferences -> Java -> Installed JREs).

Create/Import your Eclipse project.

If you are using Maven, you also need to change the Java version in your pom.xml for the maven-compiler-plugin. Otherwise right click the JRE System Library section of your project and open the Properties window in order to switch to a Java 8 JRE (or above) that supports Lambda Expressions.

The Eclipse JDT compiler needs a special compiler flag in order to store type information in .class files. Open the JDT configuration file at {project directory}/.settings/org.eclipse.jdt.core.prefs with your favorite text editor and add the following line:

org.eclipse.jdt.core.compiler.codegen.lambda.genericSignature=generate

If not already done, also modify the Java versions of the following properties to 1.8 (or above):

org.eclipse.jdt.core.compiler.codegen.targetPlatform=1.8
org.eclipse.jdt.core.compiler.compliance=1.8
org.eclipse.jdt.core.compiler.source=1.8
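
Since the prefs file uses the key=value format that java.util.Properties can read, the four settings above can be checked programmatically. The following is a minimal sketch; the class JdtPrefsCheck and its methods are hypothetical helpers written for this article and are not part of Eclipse or Flink.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class JdtPrefsCheck {
    static final String PREFIX = "org.eclipse.jdt.core.compiler.";
    static final String[] VERSION_KEYS = {
        PREFIX + "codegen.targetPlatform", PREFIX + "compliance", PREFIX + "source"
    };

    // True for "1.8", "9", "11", ...; false for null, "1.7" or unparsable text.
    static boolean isJava8OrAbove(String version) {
        if (version == null) {
            return false;
        }
        try {
            String v = version.trim();
            return v.startsWith("1.")
                    ? Integer.parseInt(v.substring(2)) >= 8
                    : Integer.parseInt(v) >= 9;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Returns the keys that are missing or do not have the value described above.
    static List<String> problems(String prefsText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(prefsText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> problems = new ArrayList<>();
        String signatureKey = PREFIX + "codegen.lambda.genericSignature";
        if (!"generate".equals(props.getProperty(signatureKey))) {
            problems.add(signatureKey);
        }
        for (String key : VERSION_KEYS) {
            if (!isJava8OrAbove(props.getProperty(key))) {
                problems.add(key);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Only compliance is present, and it is too old: all four keys are reported.
        System.out.println(problems(PREFIX + "compliance=1.7\n"));
    }
}
```

To check a real project, read the contents of .settings/org.eclipse.jdt.core.prefs into a string and pass it to problems; an empty list means all four settings match.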

After you have saved the file, perform a complete project refresh in Eclipse IDE.

If you are using Maven, right click your Eclipse project and select Maven -> Update Project....

You have configured everything correctly if the following Flink program runs without exceptions:

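final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
// print() triggers execution in the DataSet API, so a separate env.execute()
// call is not needed; calling it afterwards fails because no new sinks are defined
env.fromElements(1, 2, 3).map((in) -> new Tuple1<String>(" " + in)).print();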






