How the problem appeared:
/opt/module/flink-1.17.0/bin/yarn-session.sh -d
Error message:
2024-06-24 18:58:56,437 ERROR org.apache.flink.yarn.cli.FlinkYarnSessionCli [] - Error while running the Flink session.
org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:437) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:608) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$4(FlinkYarnSessionCli.java:869) ~[flink-dist-1.17.0.jar:1.17.0]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_221]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_221]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) ~[hadoop-common-3.1.3.jar:?]
at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:869) [flink-dist-1.17.0.jar:1.17.0]
Caused by: org.apache.flink.configuration.IllegalConfigurationException: The number of requested virtual cores for application master 1 exceeds the maximum number of virtual cores 0 available in the Yarn Cluster.
at org.apache.flink.yarn.YarnClusterDescriptor.isReadyForDeployment(YarnClusterDescriptor.java:338) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:567) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:430) ~[flink-dist-1.17.0.jar:1.17.0]
... 7 more
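Troubleshooting (first guess):
- The YARN containers did not have enough virtual memory. The cluster runs on virtual machines with limited resources, so YARN's virtual memory allowance is small, and YARN ran out of memory when creating the ApplicationMaster after startup.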
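Note that the exception talks about 0 available virtual cores, not memory. As far as I can tell, Flink derives that number from the NodeManagers currently running, so a value of 0 usually suggests that no NodeManager is registered with the ResourceManager. One way to check is the ResourceManager REST endpoint `/ws/v1/cluster/metrics`. Below is a minimal sketch of that check; the function names and the `rm_address` value are my own assumptions for illustration, not something from the original setup.

```python
import json
from urllib.request import urlopen


def fetch_cluster_metrics(rm_address: str) -> dict:
    """Fetch cluster metrics from the ResourceManager REST API.

    rm_address is a hypothetical "host:port" of the RM web UI,
    for example "paimon02:8088" (8088 is the default web UI port).
    """
    url = f"http://{rm_address}/ws/v1/cluster/metrics"
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)


def summarize_cluster_metrics(payload: dict) -> str:
    """Turn the REST payload into a one-line diagnosis."""
    metrics = payload["clusterMetrics"]
    nodes = metrics["activeNodes"]
    vcores = metrics["totalVirtualCores"]
    if nodes == 0:
        # No NodeManager has registered, so YARN has nothing to offer.
        return "no active NodeManagers registered"
    return f"{nodes} active node(s), {vcores} total vcores"
```

Calling `summarize_cluster_metrics(fetch_cluster_metrics("paimon02:8088"))` on a healthy cluster should report at least one active node. If the request itself fails, the ResourceManager is probably not running where you think it is.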
Solution (first attempt):
vi /usr/local/hadoop3/etc/hadoop/yarn-site.xml
# Add the following inside <configuration></configuration>
<!-- Disable YARN memory checks -->
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
You think the steps above fixed it?
NONONO!!!
The fix that finally worked:
vi /usr/local/hadoop3/etc/hadoop/yarn-site.xml
# Change
<property>
<name>yarn.resourcemanager.hostname</name>
<value>paimon01</value>
</property>
# to
<property>
<name>yarn.resourcemanager.hostname</name>
<value>paimon02</value>
</property>
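Since every node has its own copy of yarn-site.xml, it is worth confirming which value each node actually ends up with. Here is a small sketch that reads one property out of a Hadoop-style config file; the helper name `read_hadoop_property` is hypothetical, and it only looks at the file you give it (it does not apply Hadoop's defaults or overrides from other files).

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text: str, name: str):
    """Return the value of a <property> by name, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None


# Inline sample so the sketch runs without a Hadoop installation.
SAMPLE = """
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>paimon02</value>
  </property>
</configuration>
"""

print(read_hadoop_property(SAMPLE, "yarn.resourcemanager.hostname"))
```

On a real node you would pass in the contents of /usr/local/hadoop3/etc/hadoop/yarn-site.xml instead of `SAMPLE`, and compare the result across all machines.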
How I tracked it down: check the NodeManager log
The NodeManager reported this error at startup:
Caused by: java.net.BindException: Port in use: paimon02:8088
at org.apache.hadoop.http.HttpServer2.constructBindException(HttpServer2.java:1213)
at org.apache.hadoop.http.HttpServer2.bindForSinglePort(HttpServer2.java:1235)
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:1294)
at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:1149)
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:439)
... 4 more
Caused by: java.net.BindException: Cannot assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
That made me wonder whether a port conflict was the cause.
After changing the ResourceManager hostname, the problem was resolved.
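Looking back, the inner "Cannot assign requested address" is the more telling line. It normally means the process tried to bind to an address that does not belong to the local machine, which is different from a port that is genuinely occupied. Port 8088 is the default ResourceManager web UI port, so this fits a ResourceManager being started on a host other than the one named in yarn.resourcemanager.hostname. The sketch below separates the two cases by attempting a bind; `diagnose_bind` is a name I made up for illustration, and it should be run on the machine that is supposed to host the ResourceManager.

```python
import errno
import socket


def diagnose_bind(host: str, port: int) -> str:
    """Try to bind a TCP socket to host:port and classify the outcome."""
    try:
        # Resolve first so a name-resolution failure is reported separately.
        addr = socket.gethostbyname(host)
    except socket.gaierror:
        return "hostname does not resolve"

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((addr, port))
        except OSError as exc:
            if exc.errno == errno.EADDRNOTAVAIL:
                # Same condition as "Cannot assign requested address".
                return "address is not local to this machine"
            if exc.errno == errno.EADDRINUSE:
                # A real port conflict.
                return "port is already in use"
            return f"other error: {exc}"
    return "bind ok"
```

For example, running `diagnose_bind("paimon02", 8088)` on each node while YARN is stopped should return "bind ok" only on the machine that really owns the paimon02 address.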