Flink 1.19.1 Standalone Cluster Deployment and Configuration

Starting with Flink 1.19, the default configuration file changes from conf/flink-conf.yaml to the new conf/config.yaml.
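As a rough illustration of the difference between the two files, the sketch below shows the same two settings in both styles. The values dev001 and 6123 are taken from the configuration used later in this post; the flat form is shown commented out for comparison only.

```yaml
# Legacy conf/flink-conf.yaml style: one flat "key: value" pair per line
# jobmanager.rpc.address: dev001
# jobmanager.rpc.port: 6123

# conf/config.yaml style: standard YAML, with keys nested by prefix
jobmanager:
  rpc:
    address: dev001
    port: 6123
```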

Standalone cluster hosts: dev001, dev002, dev003

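config.yaml: on every node, set the JobManager address (jobmanager.rpc.address) to dev001 and set the bind-host options (jobmanager.bind-host, taskmanager.bind-host, rest.bind-address) to 0.0.0.0. Set taskmanager.host to the hostname of the node the file lives on.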
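The changes described above can be condensed into the following sketch. It is not a complete file: it lists only the keys named above, using dev002 as the example node. The complete per-host files follow.

```yaml
# Condensed sketch of the edited keys, as they would appear on dev002
jobmanager:
  bind-host: 0.0.0.0
  rpc:
    # Same on every node: the JobManager runs on dev001
    address: dev001

taskmanager:
  bind-host: 0.0.0.0
  # Differs per node: dev001 on dev001, dev002 on dev002, dev003 on dev003
  host: dev002

rest:
  bind-address: 0.0.0.0
```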

dev001 config.yaml:
################################################################################
#  Licensed to the Apache Software Foundation (ASF) under one
#  or more contributor license agreements.  See the NOTICE file
#  distributed with this work for additional information
#  regarding copyright ownership.  The ASF licenses this file
#  to you under the Apache License, Version 2.0 (the
#  "License"); you may not use this file except in compliance
#  with the License.  You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
#  Unless required by applicable law or agreed to in writing, software
#  distributed under the License is distributed on an "AS IS" BASIS,
#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#  See the License for the specific language governing permissions and
#  limitations under the License.
################################################################################

# These parameters are required for Java 17 support.
# They can be safely removed when using Java 8/11.
env:
  java:
    opts:
      all: --add-exports=java.base/sun.net.util=ALL-UNNAMED --add-exports=java.rmi/sun.rmi.registry=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-exports=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.time=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED

#==============================================================================
# Common
#==============================================================================

jobmanager:
  # The host interface the JobManager will bind to. By default, this is localhost, and will prevent
  # the JobManager from communicating outside the machine/container it is running on.
  # On YARN this setting will be ignored if it is set to 'localhost', defaulting to 0.0.0.0.
  # On Kubernetes this setting will be ignored, defaulting to 0.0.0.0.
  #
  # To enable this, set the bind-host address to one that has access to an outside facing network
  # interface, such as 0.0.0.0.
  bind-host: 0.0.0.0
  rpc:
    # The external address of the host on which the JobManager runs and can be
    # reached by the TaskManagers and any clients which want to connect. This setting
    # is only used in Standalone mode and may be overwritten on the JobManager side
    # by specifying the --host <hostname> parameter of the bin/jobmanager.sh executable.
    # In high availability mode, if you use the bin/start-cluster.sh script and setup
    # the conf/masters file, this will be taken care of automatically. YARN
    # automatically configures the host name based on the hostname of the node where the
    # JobManager runs.
    address: dev001
    # The RPC port where the JobManager is reachable.
    port: 6123
  memory:
    process:
      # The total process memory size for the JobManager.
      # Note this accounts for all memory usage within the JobManager process, including JVM metaspace and other overhead.
      size: 1600m
  execution:
    # The failover strategy, i.e., how the job computation recovers from task failures.
    # Only restart tasks that may have been affected by the task failure, which typically includes
    # downstream tasks and potentially upstream tasks if their produced data is no longer available for consumption.
    failover-strategy: region

taskmanager:
  # The host interface the TaskManager will bind to. By default, this is localhost, and will prevent
  # the TaskManager from communicating outside the machine/container it is running on.
  # On YARN this setting will be ignored if it is set to 'localhost', defaulting to 0.0.0.0.
  # On Kubernetes this setting will be ignored, defaulting to 0.0.0.0.
  #
  # To enable this, set the bind-host address to one that has access to an outside facing network
  # interface, such as 0.0.0.0.
  bind-host: 0.0.0.0
  # The address of the host on which the TaskManager runs and can be reached by the JobManager and
  # other TaskManagers. If not specified, the TaskManager will try different strategies to identify
  # the address.
  #
  # Note this address needs to be reachable by the JobManager and forward traffic to one of
  # the interfaces the TaskManager is bound to (see 'taskmanager.bind-host').
  #
  # Note also that unless all TaskManagers are running on the same machine, this address needs to be
  # configured separately for each TaskManager.
  host: dev001
  # The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
  numberOfTaskSlots: 2
  memory:
    process:
      # The total process memory size for the TaskManager.
      #
      # Note this accounts for all memory usage within the TaskManager process, including JVM metaspace and other overhead.
      # To exclude JVM metaspace and overhead, please, use total Flink memory size instead of 'taskmanager.memory.process.size'.
      # It is not recommended to set both 'taskmanager.memory.process.size' and Flink memory.
      size: 1728m

parallelism:
  # The parallelism used for programs that do not specify any other parallelism.
  default: 1

# # The default file system scheme and authority.
# # By default file paths without scheme are interpreted relative to the local
# # root file system 'file:///'. Use this to override the default and interpret
# # relative paths relative to a different file system,
# # for example 'hdfs://mynamenode:12345'
# fs:
#   default-scheme: hdfs://mynamenode:12345

#==============================================================================
# High Availability
#==============================================================================

# high-availability:
#   # The high-availability mode. Possible options are 'NONE' or 'zookeeper'.
#   type: zookeeper
#   # The path where metadata for master recovery is persisted. While ZooKeeper stores
#   # the small ground truth for checkpoint and leader election, this location stores
#   # the larger objects, like persisted dataflow graphs.
#   #
#   # Must be a durable file system that is accessible from all nodes
#   # (like HDFS, S3, Ceph, nfs, ...)
#   storageDir: hdfs:///flink/ha/
#   zookeeper:
#     # The list of ZooKeeper quorum peers that coordinate the high-availability
#     # setup. This must be a list of the form:
#     # "host1:clientPort,host2:clientPort,..." (default clientPort: 2181)
#     quorum: localhost:2181
#     client:
#       # ACL options are based on https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_BuiltinACLSchemes
#       # It can be either "creator" (ZOO_CREATE_ALL_ACL) or "open" (ZOO_OPEN_ACL_UNSAFE)
#       # The default value is "open" and it can be changed to "creator" if ZK security is enabled
#       acl: open

#==============================================================================
# Fault tolerance and checkpointing
#==============================================================================

# The backend that will be used to store operator state checkpoints if
# checkpointing is enabled. Checkpointing is enabled when execution.checkpointing.interval > 0.

# # Execution checkpointing related parameters. Please refer to CheckpointConfig and ExecutionCheckpointingOptions for more details.
# execution:
#   checkpointing:
#     interval: 3min
#     externalized-checkpoint-retention: [DELETE_ON_CANCELLATION, RETAIN_ON_CANCELLATION]
#     max-concurrent-checkpoints: 1
#     min-pause: 0
#     mode: [EXACTLY_ONCE, AT_LEAST_ONCE]
#     timeout: 10min
#     tolerable-failed-checkpoints: 0
#     unaligned: false

# state:
#   backend:
#     # Supported backends are 'hashmap', 'rocksdb', or the
#     # <class-name-of-factory>.
#     type: hashmap
#     # Flag to enable/disable incremental checkpoints for backends that
#     # support incremental checkpoints (like the RocksDB state backend).
#     incremental: false
#   checkpoints:
#       # Directory for checkpoints filesystem, when using any of the default bundled
#       # state backends.
#       dir: hdfs://namenode-host:port/flink-checkpoints
#   savepoints:
#       # Default target directory for savepoints, optional.
#       dir: hdfs://namenode-host:port/flink-savepoints

#==============================================================================
# Rest & web frontend
#==============================================================================

rest:
  # The address to which the REST client will connect
  address: dev001
  # The address that the REST & web server binds to
  # By default, this is localhost, which prevents the REST & web server from
  # being able to communicate outside of the machine/container it is running on.
  #
  # To enable this, set the bind address to one that has access to outside-facing
  # network interface, such as 0.0.0.0.
  bind-address: 0.0.0.0
  # # The port to which the REST client connects to. If rest.bind-port has
  # # not been specified, then the server will bind to this port as well.
  # port: 8081
  # # Port range for the REST and web server to bind to.
  # bind-port: 8080-8090

# web:
#   submit:
#     # Flag to specify whether job submission is enabled from the web-based
#     # runtime monitor. Uncomment to disable.
#     enable: false
#   cancel:
#     # Flag to specify whether job cancellation is enabled from the web-based
#     # runtime monitor. Uncomment to disable.
#     enable: false

#==============================================================================
# Advanced
#==============================================================================

# io:
#   tmp:
#     # Override the directories for temporary files. If not specified, the
#     # system-specific Java temporary directory (java.io.tmpdir property) is taken.
#     #
#     # For framework setups on Yarn, Flink will automatically pick up the
#     # containers' temp directories without any need for configuration.
#     #
#     # Add a delimited list for multiple directories, using the system directory
#     # delimiter (colon ':' on unix) or a comma, e.g.:
#     # /data1/tmp:/data2/tmp:/data3/tmp
#     #
#     # Note: Each directory entry is read from and written to by a different I/O
#     # thread. You can include the same directory multiple times in order to create
#     # multiple I/O threads against that directory. This is for example relevant for
#     # high-throughput RAIDs.
#     dirs: /tmp

# classloader:
#   resolve:
#     # The classloading resolve order. Possible values are 'child-first' (Flink's default)
#     # and 'parent-first' (Java's default).
#     #
#     # Child first classloading allows users to use different dependency/library
#     # versions in their application than those in the classpath. Switching back
#     # to 'parent-first' may help with debugging dependency issues.
#     order: child-first

# The amount of memory going to the network stack. These numbers usually need
# no tuning. Adjusting them may be necessary in case of an "Insufficient number
# of network buffers" error. The default min is 64MB, the default max is 1GB.
#
# taskmanager:
#   memory:
#     network:
#       fraction: 0.1
#       min: 64mb
#       max: 1gb

#==============================================================================
# Flink Cluster Security Configuration
#==============================================================================

# Kerberos authentication for various components - Hadoop, ZooKeeper, and connectors -
# may be enabled in four steps:
# 1. configure the local krb5.conf file
# 2. provide Kerberos credentials (either a keytab or a ticket cache w/ kinit)
# 3. make the credentials available to various JAAS login contexts
# 4. configure the connector to use JAAS/SASL

# # The below configure how Kerberos credentials are provided. A keytab will be used instead of
# # a ticket cache if the keytab path and principal are set.
# security:
#   kerberos:
#     login:
#       use-ticket-cache: true
#       keytab: /path/to/kerberos/keytab
#       principal: flink-user
#       # The configuration below defines which JAAS login contexts
#       contexts: Client,KafkaClient

#==============================================================================
# ZK Security Configuration
#==============================================================================

# zookeeper:
#   sasl:
#     # Below configurations are applicable if ZK ensemble is configured for security
#     #
#     # Override below configuration to provide custom ZK service name if configured
#     # zookeeper.sasl.service-name: zookeeper
#     #
#     # The configuration below must match one of the values set in "security.kerberos.login.contexts"
#     login-context-name: Client

#==============================================================================
# HistoryServer
#==============================================================================

# The HistoryServer is started and stopped via bin/historyserver.sh (start|stop)
#
# jobmanager:
#   archive:
#     fs:
#       # Directory to upload completed jobs to. Add this directory to the list of
#       # monitored directories of the HistoryServer as well (see below).
#       dir: hdfs:///completed-jobs/

# historyserver:
#   web:
#     # The address under which the web-based HistoryServer listens.
#     address: 0.0.0.0
#     # The port under which the web-based HistoryServer listens.
#     port: 8082
#   archive:
#     fs:
#       # Comma separated list of directories to monitor for completed jobs.
#       dir: hdfs:///completed-jobs/
#       # Interval in milliseconds for refreshing the monitored directories.
#       fs.refresh-interval: 10000


dev002 config.yaml:
################################################################################
#  Licensed to the Apache Software Foundation (ASF) under one
#  or more contributor license agreements.  See the NOTICE file
#  distributed with this work for additional information
#  regarding copyright ownership.  The ASF licenses this file
#  to you under the Apache License, Version 2.0 (the
#  "License"); you may not use this file except in compliance
#  with the License.  You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
#  Unless required by applicable law or agreed to in writing, software
#  distributed under the License is distributed on an "AS IS" BASIS,
#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#  See the License for the specific language governing permissions and
#  limitations under the License.
################################################################################

# These parameters are required for Java 17 support.
# They can be safely removed when using Java 8/11.
env:
  java:
    opts:
      all: --add-exports=java.base/sun.net.util=ALL-UNNAMED --add-exports=java.rmi/sun.rmi.registry=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-exports=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.time=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED

#==============================================================================
# Common
#==============================================================================

jobmanager:
  # The host interface the JobManager will bind to. By default, this is localhost, and will prevent
  # the JobManager from communicating outside the machine/container it is running on.
  # On YARN this setting will be ignored if it is set to 'localhost', defaulting to 0.0.0.0.
  # On Kubernetes this setting will be ignored, defaulting to 0.0.0.0.
  #
  # To enable this, set the bind-host address to one that has access to an outside facing network
  # interface, such as 0.0.0.0.
  bind-host: 0.0.0.0
  rpc:
    # The external address of the host on which the JobManager runs and can be
    # reached by the TaskManagers and any clients which want to connect. This setting
    # is only used in Standalone mode and may be overwritten on the JobManager side
    # by specifying the --host <hostname> parameter of the bin/jobmanager.sh executable.
    # In high availability mode, if you use the bin/start-cluster.sh script and setup
    # the conf/masters file, this will be taken care of automatically. YARN
    # automatically configures the host name based on the hostname of the node where the
    # JobManager runs.
    address: dev001
    # The RPC port where the JobManager is reachable.
    port: 6123
  memory:
    process:
      # The total process memory size for the JobManager.
      # Note this accounts for all memory usage within the JobManager process, including JVM metaspace and other overhead.
      size: 1600m
  execution:
    # The failover strategy, i.e., how the job computation recovers from task failures.
    # Only restart tasks that may have been affected by the task failure, which typically includes
    # downstream tasks and potentially upstream tasks if their produced data is no longer available for consumption.
    failover-strategy: region

taskmanager:
  # The host interface the TaskManager will bind to. By default, this is localhost, and will prevent
  # the TaskManager from communicating outside the machine/container it is running on.
  # On YARN this setting will be ignored if it is set to 'localhost', defaulting to 0.0.0.0.
  # On Kubernetes this setting will be ignored, defaulting to 0.0.0.0.
  #
  # To enable this, set the bind-host address to one that has access to an outside facing network
  # interface, such as 0.0.0.0.
  bind-host: 0.0.0.0
  # The address of the host on which the TaskManager runs and can be reached by the JobManager and
  # other TaskManagers. If not specified, the TaskManager will try different strategies to identify
  # the address.
  #
  # Note this address needs to be reachable by the JobManager and forward traffic to one of
  # the interfaces the TaskManager is bound to (see 'taskmanager.bind-host').
  #
  # Note also that unless all TaskManagers are running on the same machine, this address needs to be
  # configured separately for each TaskManager.
  host: dev002
  # The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
  numberOfTaskSlots: 2
  memory:
    process:
      # The total process memory size for the TaskManager.
      #
      # Note this accounts for all memory usage within the TaskManager process, including JVM metaspace and other overhead.
      # To exclude JVM metaspace and overhead, please, use total Flink memory size instead of 'taskmanager.memory.process.size'.
      # It is not recommended to set both 'taskmanager.memory.process.size' and Flink memory.
      size: 1728m

parallelism:
  # The parallelism used for programs that do not specify any other parallelism.
  default: 1

# # The default file system scheme and authority.
# # By default file paths without scheme are interpreted relative to the local
# # root file system 'file:///'. Use this to override the default and interpret
# # relative paths relative to a different file system,
# # for example 'hdfs://mynamenode:12345'
# fs:
#   default-scheme: hdfs://mynamenode:12345

#==============================================================================
# High Availability
#==============================================================================

# high-availability:
#   # The high-availability mode. Possible options are 'NONE' or 'zookeeper'.
#   type: zookeeper
#   # The path where metadata for master recovery is persisted. While ZooKeeper stores
#   # the small ground truth for checkpoint and leader election, this location stores
#   # the larger objects, like persisted dataflow graphs.
#   #
#   # Must be a durable file system that is accessible from all nodes
#   # (like HDFS, S3, Ceph, nfs, ...)
#   storageDir: hdfs:///flink/ha/
#   zookeeper:
#     # The list of ZooKeeper quorum peers that coordinate the high-availability
#     # setup. This must be a list of the form:
#     # "host1:clientPort,host2:clientPort,..." (default clientPort: 2181)
#     quorum: localhost:2181
#     client:
#       # ACL options are based on https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_BuiltinACLSchemes
#       # It can be either "creator" (ZOO_CREATE_ALL_ACL) or "open" (ZOO_OPEN_ACL_UNSAFE)
#       # The default value is "open" and it can be changed to "creator" if ZK security is enabled
#       acl: open

#==============================================================================
# Fault tolerance and checkpointing
#==============================================================================

# The backend that will be used to store operator state checkpoints if
# checkpointing is enabled. Checkpointing is enabled when execution.checkpointing.interval > 0.

# # Execution checkpointing related parameters. Please refer to CheckpointConfig and ExecutionCheckpointingOptions for more details.
# execution:
#   checkpointing:
#     interval: 3min
#     externalized-checkpoint-retention: [DELETE_ON_CANCELLATION, RETAIN_ON_CANCELLATION]
#     max-concurrent-checkpoints: 1
#     min-pause: 0
#     mode: [EXACTLY_ONCE, AT_LEAST_ONCE]
#     timeout: 10min
#     tolerable-failed-checkpoints: 0
#     unaligned: false

# state:
#   backend:
#     # Supported backends are 'hashmap', 'rocksdb', or the
#     # <class-name-of-factory>.
#     type: hashmap
#     # Flag to enable/disable incremental checkpoints for backends that
#     # support incremental checkpoints (like the RocksDB state backend).
#     incremental: false
#   checkpoints:
#       # Directory for checkpoints filesystem, when using any of the default bundled
#       # state backends.
#       dir: hdfs://namenode-host:port/flink-checkpoints
#   savepoints:
#       # Default target directory for savepoints, optional.
#       dir: hdfs://namenode-host:port/flink-savepoints

#==============================================================================
# Rest & web frontend
#==============================================================================

rest:
  # The address to which the REST client will connect
  address: dev002
  # The address that the REST & web server binds to
  # By default, this is localhost, which prevents the REST & web server from
  # being able to communicate outside of the machine/container it is running on.
  #
  # To enable this, set the bind address to one that has access to outside-facing
  # network interface, such as 0.0.0.0.
  bind-address: 0.0.0.0
  # # The port to which the REST client connects to. If rest.bind-port has
  # # not been specified, then the server will bind to this port as well.
  # port: 8081
  # # Port range for the REST and web server to bind to.
  # bind-port: 8080-8090

# web:
#   submit:
#     # Flag to specify whether job submission is enabled from the web-based
#     # runtime monitor. Uncomment to disable.
#     enable: false
#   cancel:
#     # Flag to specify whether job cancellation is enabled from the web-based
#     # runtime monitor. Uncomment to disable.
#     enable: false

#==============================================================================
# Advanced
#==============================================================================

# io:
#   tmp:
#     # Override the directories for temporary files. If not specified, the
#     # system-specific Java temporary directory (java.io.tmpdir property) is taken.
#     #
#     # For framework setups on Yarn, Flink will automatically pick up the
#     # containers' temp directories without any need for configuration.
#     #
#     # Add a delimited list for multiple directories, using the system directory
#     # delimiter (colon ':' on unix) or a comma, e.g.:
#     # /data1/tmp:/data2/tmp:/data3/tmp
#     #
#     # Note: Each directory entry is read from and written to by a different I/O
#     # thread. You can include the same directory multiple times in order to create
#     # multiple I/O threads against that directory. This is for example relevant for
#     # high-throughput RAIDs.
#     dirs: /tmp

# classloader:
#   resolve:
#     # The classloading resolve order. Possible values are 'child-first' (Flink's default)
#     # and 'parent-first' (Java's default).
#     #
#     # Child first classloading allows users to use different dependency/library
#     # versions in their application than those in the classpath. Switching back
#     # to 'parent-first' may help with debugging dependency issues.
#     order: child-first

# The amount of memory going to the network stack. These numbers usually need
# no tuning. Adjusting them may be necessary in case of an "Insufficient number
# of network buffers" error. The default min is 64MB, the default max is 1GB.
#
# taskmanager:
#   memory:
#     network:
#       fraction: 0.1
#       min: 64mb
#       max: 1gb

#==============================================================================
# Flink Cluster Security Configuration
#==============================================================================

# Kerberos authentication for various components - Hadoop, ZooKeeper, and connectors -
# may be enabled in four steps:
# 1. configure the local krb5.conf file
# 2. provide Kerberos credentials (either a keytab or a ticket cache w/ kinit)
# 3. make the credentials available to various JAAS login contexts
# 4. configure the connector to use JAAS/SASL

# # The below configure how Kerberos credentials are provided. A keytab will be used instead of
# # a ticket cache if the keytab path and principal are set.
# security:
#   kerberos:
#     login:
#       use-ticket-cache: true
#       keytab: /path/to/kerberos/keytab
#       principal: flink-user
#       # The configuration below defines which JAAS login contexts
#       contexts: Client,KafkaClient

#==============================================================================
# ZK Security Configuration
#==============================================================================

# zookeeper:
#   sasl:
#     # Below configurations are applicable if ZK ensemble is configured for security
#     #
#     # Override below configuration to provide custom ZK service name if configured
#     # zookeeper.sasl.service-name: zookeeper
#     #
#     # The configuration below must match one of the values set in "security.kerberos.login.contexts"
#     login-context-name: Client

#==============================================================================
# HistoryServer
#==============================================================================

# The HistoryServer is started and stopped via bin/historyserver.sh (start|stop)
#
# jobmanager:
#   archive:
#     fs:
#       # Directory to upload completed jobs to. Add this directory to the list of
#       # monitored directories of the HistoryServer as well (see below).
#       dir: hdfs:///completed-jobs/

# historyserver:
#   web:
#     # The address under which the web-based HistoryServer listens.
#     address: 0.0.0.0
#     # The port under which the web-based HistoryServer listens.
#     port: 8082
#   archive:
#     fs:
#       # Comma separated list of directories to monitor for completed jobs.
#       dir: hdfs:///completed-jobs/
#       # Interval in milliseconds for refreshing the monitored directories.
#       fs.refresh-interval: 10000


dev003 config.yaml:
################################################################################
#  Licensed to the Apache Software Foundation (ASF) under one
#  or more contributor license agreements.  See the NOTICE file
#  distributed with this work for additional information
#  regarding copyright ownership.  The ASF licenses this file
#  to you under the Apache License, Version 2.0 (the
#  "License"); you may not use this file except in compliance
#  with the License.  You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
#  Unless required by applicable law or agreed to in writing, software
#  distributed under the License is distributed on an "AS IS" BASIS,
#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#  See the License for the specific language governing permissions and
#  limitations under the License.
################################################################################

# These parameters are required for Java 17 support.
# They can be safely removed when using Java 8/11.
env:
  java:
    opts:
      all: --add-exports=java.base/sun.net.util=ALL-UNNAMED --add-exports=java.rmi/sun.rmi.registry=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-exports=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.time=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED

#==============================================================================
# Common
#==============================================================================

jobmanager:
  # The host interface the JobManager will bind to. By default, this is localhost, and will prevent
  # the JobManager from communicating outside the machine/container it is running on.
  # On YARN this setting will be ignored if it is set to 'localhost', defaulting to 0.0.0.0.
  # On Kubernetes this setting will be ignored, defaulting to 0.0.0.0.
  #
  # To enable this, set the bind-host address to one that has access to an outside facing network
  # interface, such as 0.0.0.0.
  bind-host: 0.0.0.0
  rpc:
    # The external address of the host on which the JobManager runs and can be
    # reached by the TaskManagers and any clients which want to connect. This setting
    # is only used in Standalone mode and may be overwritten on the JobManager side
    # by specifying the --host <hostname> parameter of the bin/jobmanager.sh executable.
    # In high availability mode, if you use the bin/start-cluster.sh script and setup
    # the conf/masters file, this will be taken care of automatically. YARN
    # automatically configures the host name based on the hostname of the node where the
    # JobManager runs.
    address: dev001
    # The RPC port where the JobManager is reachable.
    port: 6123
  memory:
    process:
      # The total process memory size for the JobManager.
      # Note this accounts for all memory usage within the JobManager process, including JVM metaspace and other overhead.
      size: 1600m
  execution:
    # The failover strategy, i.e., how the job computation recovers from task failures.
    # Only restart tasks that may have been affected by the task failure, which typically includes
    # downstream tasks and potentially upstream tasks if their produced data is no longer available for consumption.
    failover-strategy: region

taskmanager:
  # The host interface the TaskManager will bind to. By default, this is localhost, and will prevent
  # the TaskManager from communicating outside the machine/container it is running on.
  # On YARN this setting will be ignored if it is set to 'localhost', defaulting to 0.0.0.0.
  # On Kubernetes this setting will be ignored, defaulting to 0.0.0.0.
  #
  # To enable this, set the bind-host address to one that has access to an outside facing network
  # interface, such as 0.0.0.0.
  bind-host: 0.0.0.0
  # The address of the host on which the TaskManager runs and can be reached by the JobManager and
  # other TaskManagers. If not specified, the TaskManager will try different strategies to identify
  # the address.
  #
  # Note this address needs to be reachable by the JobManager and forward traffic to one of
  # the interfaces the TaskManager is bound to (see 'taskmanager.bind-host').
  #
  # Note also that unless all TaskManagers are running on the same machine, this address needs to be
  # configured separately for each TaskManager.
  host: dev003
  # The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
  numberOfTaskSlots: 2
  memory:
    process:
      # The total process memory size for the TaskManager.
      #
      # Note this accounts for all memory usage within the TaskManager process, including JVM metaspace and other overhead.
      # To exclude JVM metaspace and overhead, please, use total Flink memory size instead of 'taskmanager.memory.process.size'.
      # It is not recommended to set both 'taskmanager.memory.process.size' and Flink memory.
      size: 1728m

parallelism:
  # The parallelism used for programs that did not specify any other parallelism.
  default: 1

# # The default file system scheme and authority.
# # By default file paths without scheme are interpreted relative to the local
# # root file system 'file:///'. Use this to override the default and interpret
# # relative paths relative to a different file system,
# # for example 'hdfs://mynamenode:12345'
# fs:
#   default-scheme: hdfs://mynamenode:12345

#==============================================================================
# High Availability
#==============================================================================

# high-availability:
#   # The high-availability mode. Possible options are 'NONE' or 'zookeeper'.
#   type: zookeeper
#   # The path where metadata for master recovery is persisted. While ZooKeeper stores
#   # the small ground truth for checkpoint and leader election, this location stores
#   # the larger objects, like persisted dataflow graphs.
#   #
#   # Must be a durable file system that is accessible from all nodes
#   # (like HDFS, S3, Ceph, nfs, ...)
#   storageDir: hdfs:///flink/ha/
#   zookeeper:
#     # The list of ZooKeeper quorum peers that coordinate the high-availability
#     # setup. This must be a list of the form:
#     # "host1:clientPort,host2:clientPort,..." (default clientPort: 2181)
#     quorum: localhost:2181
#     client:
#       # ACL options are based on https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_BuiltinACLSchemes
#       # It can be either "creator" (ZOO_CREATE_ALL_ACL) or "open" (ZOO_OPEN_ACL_UNSAFE)
#       # The default value is "open" and it can be changed to "creator" if ZK security is enabled
#       acl: open

#==============================================================================
# Fault tolerance and checkpointing
#==============================================================================

# The backend that will be used to store operator state checkpoints if
# checkpointing is enabled. Checkpointing is enabled when execution.checkpointing.interval > 0.

# # Execution checkpointing related parameters. Please refer to CheckpointConfig and ExecutionCheckpointingOptions for more details.
# execution:
#   checkpointing:
#     interval: 3min
#     externalized-checkpoint-retention: [DELETE_ON_CANCELLATION, RETAIN_ON_CANCELLATION]
#     max-concurrent-checkpoints: 1
#     min-pause: 0
#     mode: [EXACTLY_ONCE, AT_LEAST_ONCE]
#     timeout: 10min
#     tolerable-failed-checkpoints: 0
#     unaligned: false

# state:
#   backend:
#     # Supported backends are 'hashmap', 'rocksdb', or the
#     # <class-name-of-factory>.
#     type: hashmap
#     # Flag to enable/disable incremental checkpoints for backends that
#     # support incremental checkpoints (like the RocksDB state backend).
#     incremental: false
#   checkpoints:
#       # Directory for checkpoints filesystem, when using any of the default bundled
#       # state backends.
#       dir: hdfs://namenode-host:port/flink-checkpoints
#   savepoints:
#       # Default target directory for savepoints, optional.
#       dir: hdfs://namenode-host:port/flink-savepoints

#==============================================================================
# Rest & web frontend
#==============================================================================

rest:
  # The address that the REST client connects to.
  address: dev003
  # The address that the REST & web server binds to
  # By default, this is localhost, which prevents the REST & web server from
  # being able to communicate outside of the machine/container it is running on.
  #
  # To enable this, set the bind address to one that has access to outside-facing
  # network interface, such as 0.0.0.0.
  bind-address: 0.0.0.0
  # # The port that the REST client connects to. If rest.bind-port has
  # # not been specified, then the server will bind to this port as well.
  # port: 8081
  # # Port range for the REST and web server to bind to.
  # bind-port: 8080-8090

# web:
#   submit:
#     # Flag to specify whether job submission is enabled from the web-based
#     # runtime monitor. Uncomment to disable.
#     enable: false
#   cancel:
#     # Flag to specify whether job cancellation is enabled from the web-based
#     # runtime monitor. Uncomment to disable.
#     enable: false

#==============================================================================
# Advanced
#==============================================================================

# io:
#   tmp:
#     # Override the directories for temporary files. If not specified, the
#     # system-specific Java temporary directory (java.io.tmpdir property) is taken.
#     #
#     # For framework setups on Yarn, Flink will automatically pick up the
#     # containers' temp directories without any need for configuration.
#     #
#     # Add a delimited list for multiple directories, using the system directory
#     # delimiter (colon ':' on unix) or a comma, e.g.:
#     # /data1/tmp:/data2/tmp:/data3/tmp
#     #
#     # Note: Each directory entry is read from and written to by a different I/O
#     # thread. You can include the same directory multiple times in order to create
#     # multiple I/O threads against that directory. This is for example relevant for
#     # high-throughput RAIDs.
#     dirs: /tmp

# classloader:
#   resolve:
#     # The classloading resolve order. Possible values are 'child-first' (Flink's default)
#     # and 'parent-first' (Java's default).
#     #
#     # Child first classloading allows users to use different dependency/library
#     # versions in their application than those in the classpath. Switching back
#     # to 'parent-first' may help with debugging dependency issues.
#     order: child-first

# The amount of memory going to the network stack. These numbers usually need
# no tuning. Adjusting them may be necessary in case of an "Insufficient number
# of network buffers" error. The default min is 64MB, the default max is 1GB.
#
# taskmanager:
#   memory:
#     network:
#       fraction: 0.1
#       min: 64mb
#       max: 1gb

#==============================================================================
# Flink Cluster Security Configuration
#==============================================================================

# Kerberos authentication for various components - Hadoop, ZooKeeper, and connectors -
# may be enabled in four steps:
# 1. configure the local krb5.conf file
# 2. provide Kerberos credentials (either a keytab or a ticket cache w/ kinit)
# 3. make the credentials available to various JAAS login contexts
# 4. configure the connector to use JAAS/SASL

# # The settings below configure how Kerberos credentials are provided. A keytab will be used instead of
# # a ticket cache if the keytab path and principal are set.
# security:
#   kerberos:
#     login:
#       use-ticket-cache: true
#       keytab: /path/to/kerberos/keytab
#       principal: flink-user
#       # The configuration below defines which JAAS login contexts receive the Kerberos credentials.
#       contexts: Client,KafkaClient

#==============================================================================
# ZK Security Configuration
#==============================================================================

# zookeeper:
#   sasl:
#     # Below configurations are applicable if ZK ensemble is configured for security
#     #
#     # Override below configuration to provide custom ZK service name if configured
#     # zookeeper.sasl.service-name: zookeeper
#     #
#     # The configuration below must match one of the values set in "security.kerberos.login.contexts"
#     login-context-name: Client

#==============================================================================
# HistoryServer
#==============================================================================

# The HistoryServer is started and stopped via bin/historyserver.sh (start|stop)
#
# jobmanager:
#   archive:
#     fs:
#       # Directory to upload completed jobs to. Add this directory to the list of
#       # monitored directories of the HistoryServer as well (see below).
#       dir: hdfs:///completed-jobs/

# historyserver:
#   web:
#     # The address under which the web-based HistoryServer listens.
#     address: 0.0.0.0
#     # The port under which the web-based HistoryServer listens.
#     port: 8082
#   archive:
#     fs:
#       # Comma separated list of directories to monitor for completed jobs.
#       dir: hdfs:///completed-jobs/
#       # Interval in milliseconds for refreshing the monitored directories.
#       refresh-interval: 10000
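
As the comment on taskmanager.host says, that key has to be set separately on every node. A minimal sketch of how the per-node edit could be scripted is below; the function name set_taskmanager_host is hypothetical, and it assumes the key is written as a two-space-indented "host:" line exactly as in the file above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Flink): rewrite taskmanager.host in a config.yaml.
# Assumes the key appears as a two-space-indented "host:" line, as in the file above.
set_taskmanager_host() {
  local conf_file="$1" host="$2"
  # The leading-space anchor keeps "bind-host:" lines and "#" comments untouched.
  sed "s/^  host: .*/  host: ${host}/" "$conf_file" > "${conf_file}.tmp" &&
    mv "${conf_file}.tmp" "$conf_file"
}

# Example (run on dev002, path is an assumption):
#   set_taskmanager_host "$FLINK_HOME/conf/config.yaml" dev002
```

The sketch only works for plain hostnames; a value containing "/" or "&" would need escaping before being passed to sed.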

Distribute conf/masters and conf/workers to every node with the xsync distribution script:


-- masters
dev001:8081

-- workers
dev001
dev002
dev003
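
xsync is a site-specific wrapper script rather than a standard command, so the following is only a rough stand-in for it, assuming rsync over passwordless SSH and the same Flink install path on every node. The function name sync_conf and the DRY_RUN switch are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of an xsync-like step: copy conf/masters and conf/workers to each host
# listed in a workers file. Assumes rsync, passwordless SSH, and identical paths.
sync_conf() {
  local flink_home="$1" workers_file="$2" host
  # Read the host list on fd 3 so that rsync/ssh cannot swallow the loop's input.
  while IFS= read -r host <&3; do
    # Skip blank lines and comment lines.
    case "$host" in ''|'#'*) continue ;; esac
    # With DRY_RUN set, the command is printed instead of executed.
    ${DRY_RUN:+echo} rsync -av "$flink_home/conf/masters" "$flink_home/conf/workers" \
      "$host:$flink_home/conf/"
  done 3< "$workers_file"
}

# Example (paths are assumptions):
#   DRY_RUN=1 sync_conf /opt/flink /opt/flink/conf/workers
```

Once the files are in place, running bin/start-cluster.sh on dev001 starts the JobManager and then the TaskManagers on the hosts listed in conf/workers over SSH.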

Additional notes:

Check whether a port is in use on Linux: netstat -apn | grep 8081

List the Flink processes on each node: jps
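
The two checks above can be tightened a little. The sketch below filters the command output instead of eyeballing it; both function names are hypothetical. It relies on the JVM main-class names that a standalone session cluster shows in jps (StandaloneSessionClusterEntrypoint for the JobManager, TaskManagerRunner for a TaskManager).

```shell
#!/usr/bin/env bash
# Reads `netstat -ltn` (or `ss -ltn`) output on stdin and succeeds only if a
# listening socket is bound to exactly the given port. Unlike a plain
# `grep 8081`, this does not match port 18081 or a PID containing 8081.
port_in_listing() {
  awk -v p="$1" '
    /LISTEN/ { for (i = 1; i <= NF; i++) if ($i ~ (":" p "$")) found = 1 }
    END { exit !found }'
}

# Reads `jps` output on stdin and prints "<pid> <role>" for each Flink JVM.
flink_roles() {
  local pid name
  while read -r pid name; do
    case "$name" in
      StandaloneSessionClusterEntrypoint) echo "$pid JobManager" ;;
      TaskManagerRunner) echo "$pid TaskManager" ;;
    esac
  done
}

# Examples:
#   netstat -ltn | port_in_listing 8081 && echo "8081 is in use"
#   jps | flink_roles
```

With the masters and workers files above, dev001 would be expected to list both a JobManager and a TaskManager, while dev002 and dev003 list only a TaskManager.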
