2018/11/11
Java template
Docs: https://ci.apache.org/projects/flink/flink-docs-release-1.6/quickstart/java_api_quickstart.html
- Create a new directory and, inside it, run:
- curl https://flink.apache.org/q/quickstart.sh | bash -s 1.6.1
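- After importing it into IDEA, I was a bit thrown: it really is just a skeleton.
- The template's pom file has quite a few practices worth borrowing, including ways to shrink the packaged jar and to reduce the chance of future dependency conflicts.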
ExecutionEnvironment
ExecutionEnvironment has two main subclasses (quoting its Javadoc):
- {@link LocalEnvironment} will cause execution in the current JVM;
- {@link RemoteEnvironment} will cause execution on a remote setup.
Environment initialization:
/** The environment of the context (local by default, cluster if invoked through command line). */
private static ExecutionEnvironmentFactory contextEnvironmentFactory;
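The factory field above is what lets the same program run either locally or on a cluster. Below is a minimal plain-Java sketch of that fallback, under a simplified model. `ContextFallbackSketch`, `Env`, `LocalEnv`, `RemoteEnv` and `getEnv` are hypothetical stand-ins, not Flink classes, and the host and port are placeholders. If no factory has been installed (the IDE case), a local environment comes back; if a launcher has installed one (the command-line case), the factory decides.

```java
import java.util.function.Supplier;

public class ContextFallbackSketch {

    /** Hypothetical stand-in for an execution environment; not a Flink class. */
    interface Env {
        String describe();
    }

    /** Stand-in for LocalEnvironment: executes in the current JVM. */
    static final class LocalEnv implements Env {
        public String describe() {
            return "local";
        }
    }

    /** Stand-in for RemoteEnvironment: executes on a remote setup. */
    static final class RemoteEnv implements Env {
        private final String host;
        private final int port;

        RemoteEnv(String host, int port) {
            this.host = host;
            this.port = port;
        }

        public String describe() {
            return "remote:" + host + ":" + port;
        }
    }

    /** Stays null until a launcher installs a factory. */
    private static Supplier<Env> contextEnvironmentFactory;

    static void initializeContextEnvironment(Supplier<Env> factory) {
        contextEnvironmentFactory = factory;
    }

    static void resetContextEnvironment() {
        contextEnvironmentFactory = null;
    }

    /** Local by default, otherwise whatever the installed factory produces. */
    static Env getEnv() {
        return contextEnvironmentFactory == null
                ? new LocalEnv()
                : contextEnvironmentFactory.get();
    }

    public static void main(String[] args) {
        System.out.println(getEnv().describe()); // prints "local"
        // Placeholder host and port, for illustration only.
        initializeContextEnvironment(() -> new RemoteEnv("jobmanager-host", 6123));
        System.out.println(getEnv().describe()); // prints "remote:jobmanager-host:6123"
        resetContextEnvironment();
    }
}
```

The point of the design is that user code only ever calls one entry point and never has to know which environment it got.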
Creating the default local execution environment:
/** The local execution environment will run the program in a
 * multi-threaded fashion in the same JVM as the environment was created in. The default
 * parallelism of the local environment is the number of hardware contexts
 */
public static LocalEnvironment createLocalEnvironment() {
    return createLocalEnvironment(defaultLocalDop);
}
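As far as I can tell, the "number of hardware contexts" in the comment is what the JVM reports through `Runtime.getRuntime().availableProcessors()`. Below is a minimal sketch assuming that mapping. `LocalParallelismSketch`, `defaultLocalParallelism` and `runSubtasks` are hypothetical names, and the thread pool only imitates running parallel subtasks inside one JVM; it is not how Flink schedules work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LocalParallelismSketch {

    /** Assumed default: one parallel subtask per hardware context seen by the JVM. */
    static int defaultLocalParallelism() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Runs the given number of subtasks on threads in this JVM; returns how many completed. */
    static int runSubtasks(int parallelism) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < parallelism; i++) {
                final int subtaskIndex = i;
                // Each subtask just returns its own index.
                results.add(pool.submit(() -> subtaskIndex));
            }
            int completed = 0;
            for (Future<Integer> result : results) {
                result.get(); // wait for the subtask to finish
                completed++;
            }
            return completed;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int dop = defaultLocalParallelism();
        System.out.println("default local parallelism: " + dop);
        System.out.println("subtasks completed: " + runSubtasks(dop));
    }
}
```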
Some member variables:
Batch examples
WordCount
Java version
This example illustrates some uses of Flink's DataSet API.
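For orientation before reading the Flink version, here is a minimal plain-Java sketch of what a word count computes: split each line into lowercase words, group by word, and sum the counts. It uses only `java.util` and no Flink classes. `PlainWordCount` and `countWords` are hypothetical names, and the splitting rule (`\W+`) is an assumption made for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PlainWordCount {

    /** Counts lowercase words across all lines; TreeMap keeps the keys sorted. */
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // Tokenize: lowercase, then split on runs of non-word characters.
            for (String token : line.toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    // Group by word and sum: add 1 to the running count.
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("To be, or not to be", "that is the question");
        // prints {be=2, is=1, not=1, or=1, question=1, that=1, the=1, to=2}
        System.out.println(countWords(lines));
    }
}
```

The same three steps (tokenize, group, sum) are what a distributed version has to express as dataset transformations instead of a loop over an in-memory map.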
After downloading the source from GitHub, the code is under: flink161_src/flink-release-1.6.1/flink-examples/flink-examples-batch/src
public class WordCount {
    public static void main(String[] args) throws Exception {
        final ParameterTool params = ParameterTool.fromArgs(args);
        // set up the execution environment
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        // make parameters available in the web interface
        env.getConfig().setGlobalJobParameters(params);
        // get input data
        DataSet<String> text;
        if (params.has("input")) {
            // read the text file from given input path
            text = env.readTextFile(params.get("input"));
        } else {
            // get default test text data
            System.out.println("Executing WordCount example with default input data set.");
            System.out.println("Use --input to specify file input.");
            text = WordCountData.getDefaultTextLineDataSet(env);
        }
        DataSet<Tuple2<String, Integer>> counts =
<