nacos注册报错排查
结论
nacos2版本客户端无法连接nacos1版本的服务端,原因是Nacos2.0版本相比1.X新增了gRPC的通信方式,因此需要增加2个端口
问题背景
本机启动若依系统路由模块,nacos配置为公司服务器提供nacos,发现启动报错;
配置的端口为30848,但是日志显示的是31848端口不可用,以下为相关报错日志和nacos配置
--- #################### 注册中心相关配置 ####################
spring:
cloud:
nacos:
server-addr: 172.25.21.186:30848
discovery:
namespace: dev # 命名空间。这里使用 dev 开发环境
register-enabled: false
--- #################### 配置中心相关配置 ####################
spring:
cloud:
nacos:
# Nacos Config 配置项,对应 NacosConfigProperties 配置属性类
config:
server-addr: 172.25.21.186:30848 # Nacos 服务器地址
namespace: dev # 命名空间。这里使用 dev 开发环境
group: DEFAULT_GROUP # 使用的 Nacos 配置分组,默认为 DEFAULT_GROUP
name: # 使用的 Nacos 配置集的 dataId,默认为 spring.application.name
file-extension: yaml # 使用的 Nacos 配置集的 dataId 的文件拓展名,同时也是 Nacos 配置集的配置格式,默认为 properties
2023-02-02 11:05:37.429 ERROR 12708 --- [main] [TID: N/A] c.a.n.c.remote.client.grpc.GrpcClient : Server check fail, please check server 172.25.21.186 ,port 31848 is available , error ={}
java.util.concurrent.ExecutionException: com.alibaba.nacos.shaded.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at com.alibaba.nacos.shaded.com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:566)
at com.alibaba.nacos.shaded.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:445)
at com.alibaba.nacos.common.remote.client.grpc.GrpcClient.serverCheck(GrpcClient.java:148)
at com.alibaba.nacos.common.remote.client.grpc.GrpcClient.connectToServer(GrpcClient.java:264)
at com.alibaba.nacos.common.remote.client.RpcClient.start(RpcClient.java:390)
at com.alibaba.nacos.client.config.impl.ClientWorker$ConfigRpcTransportClient.ensureRpcClient(ClientWorker.java:885)
at com.alibaba.nacos.client.config.impl.ClientWorker$ConfigRpcTransportClient.getOneRunningClient(ClientWorker.java:1044)
at com.alibaba.nacos.client.config.impl.ClientWorker$ConfigRpcTransportClient.queryConfig(ClientWorker.java:940)
at com.alibaba.nacos.client.config.impl.ClientWorker.getServerConfig(ClientWorker.java:397)
at com.alibaba.nacos.client.config.NacosConfigService.getConfigInner(NacosConfigService.java:166)
at com.alibaba.nacos.client.config.NacosConfigService.getConfig(NacosConfigService.java:94)
at com.alibaba.cloud.nacos.client.NacosPropertySourceBuilder.loadNacosData(NacosPropertySourceBuilder.java:85)
at com.alibaba.cloud.nacos.client.NacosPropertySourceBuilder.build(NacosPropertySourceBuilder.java:73)
at com.alibaba.cloud.nacos.client.NacosPropertySourceLocator.loadNacosPropertySource(NacosPropertySourceLocator.java:199)
at com.alibaba.cloud.nacos.client.NacosPropertySourceLocator.loadNacosDataIfPresent(NacosPropertySourceLocator.java:186)
at com.alibaba.cloud.nacos.client.NacosPropertySourceLocator.loadApplicatio,onfiguration(NacosPropertySourceLocator.java:141)
at com.alibaba.cloud.nacos.client.NacosPropertySourceLocator.locate(NacosPropertySourceLocator.java:103)
at org.springframework.cloud.bootstrap.config.PropertySourceLocator.locateCollection(PropertySourceLocator.java:51)
at org.springframework.cloud.bootstrap.config.PropertySourceLocator.locateCollection(PropertySourceLocator.java:47)
at org.springframework.cloud.bootstrap.config.PropertySourceBootstrapConfiguration.initialize(PropertySourceBootstrapConfiguration.java:95)
at org.springframework.boot.SpringApplication.applyInitializers(SpringApplication.java:604)
at org.springframework.boot.SpringApplication.prepareContext(SpringApplication.java:373)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:306)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1303)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1292)
at cn.iocoder.yudao.gateway.GatewayServerApplication.main(GatewayServerApplication.java:11)
Caused by: com.alibaba.nacos.shaded.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at com.alibaba.nacos.shaded.io.grpc.Status.asRuntimeException(Status.java:533)
at com.alibaba.nacos.shaded.io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:490)
at com.alibaba.nacos.shaded.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at com.alibaba.nacos.shaded.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at com.alibaba.nacos.shaded.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at com.alibaba.nacos.shaded.io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:700)
at com.alibaba.nacos.shaded.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at com.alibaba.nacos.shaded.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at com.alibaba.nacos.shaded.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at com.alibaba.nacos.shaded.io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:399)
at com.alibaba.nacos.shaded.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:510)
at com.alibaba.nacos.shaded.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:66)
at com.alibaba.nacos.shaded.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:630)
at com.alibaba.nacos.shaded.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:518)
at com.alibaba.nacos.shaded.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:692)
at com.alibaba.nacos.shaded.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:681)
at com.alibaba.nacos.shaded.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at com.alibaba.nacos.shaded.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
问题跟踪
排查思路
通过debug找出31848端口这个端口具体出现的位置,排查相关逻辑,确认问题
排查过程
- 排查类GrpcClient的connectToServer方法
@Override
public Connection connectToServer(ServerInfo serverInfo) {
try {
if (grpcExecutor == null) {
int threadNumber = ThreadUtils.getSuitableThreadCount(8);
grpcExecutor = new ThreadPoolExecutor(threadNumber, threadNumber, 10L, TimeUnit.SECONDS,
new LinkedBlockingQueue<>(10000),
new ThreadFactoryBuilder().daemon(true).nameFormat("nacos-grpc-client-executor-%d")
.build());
grpcExecutor.allowCoreThreadTimeOut(true);
}
int port = serverInfo.getServerPort() + rpcPortOffset();
RequestGrpc.RequestFutureStub newChannelStubTemp = createNewChannelStub(serverInfo.getServerIp(), port);
if (newChannelStubTemp != null) {
Response response = serverCheck(serverInfo.getServerIp(), port, newChannelStubTemp);
if (response == null || !(response instanceof ServerCheckResponse)) {
shuntDownChannel((ManagedChannel) newChannelStubTemp.getChannel());
return null;
}
BiRequestStreamGrpc.BiRequestStreamStub biRequestStreamStub = BiRequestStreamGrpc
.newStub(newChannelStubTemp.getChannel());
GrpcConnection grpcConn = new GrpcConnection(serverInfo, grpcExecutor);
grpcConn.setConnectionId(((ServerCheckResponse) response).getConnectionId());
//create stream request and bind connection event to this connection.
StreamObserver<Payload> payloadStreamObserver = bindRequestStream(biRequestStreamStub, grpcConn);
// stream observer to send response to server
grpcConn.setPayloadStreamObserver(payloadStreamObserver);
grpcConn.setGrpcFutureServiceStub(newChannelStubTemp);
grpcConn.setChannel((ManagedChannel) newChannelStubTemp.getChannel());
//send a setup request.
ConnectionSetupRequest conSetupRequest = new ConnectionSetupRequest();
conSetupRequest.setClientVersion(VersionUtils.getFullClientVersion());
conSetupRequest.setLabels(super.getLabels());
conSetupRequest.setAbilities(super.clientAbilities);
conSetupRequest.setTenant(super.getTenant());
grpcConn.sendRequest(conSetupRequest);
//wait to register connection setup
Thread.sleep(100L);
return grpcConn;
}
return null;
} catch (Exception e) {
LOGGER.error("[{}]Fail to connect to server!,error={}", GrpcClient.this.getName(), e);
}
return null;
}
-
问题聚焦到下面的代码**int port = serverInfo.getServerPort() + rpcPortOffset()**上,debug发现此处的rpcPortOffset()方法返回1000的int型数据,该方法导致启动连接了nacos的31848端口
-
rpcPortOffset为RpcClient类的抽象方法,debug跳转到子类GrpcClusterClient上,以下为相关实现
public class GrpcClusterClient extends GrpcClient {
private static final String NACOS_SERVER_CLUSTER_GRPC_PORT_DEFAULT_OFFSET = "1001";
/**
* Empty constructor.
*
* @param name name of client.
*/
public GrpcClusterClient(String name) {
super(name);
}
@Override
public int rpcPortOffset() {
return Integer.parseInt(System.getProperty(
NACOS_SERVER_GRPC_PORT_OFFSET_KEY, NACOS_SERVER_CLUSTER_GRPC_PORT_DEFAULT_OFFSET));
}
}
-
可以确定,此处读取了配置信息NACOS_SERVER_GRPC_PORT_OFFSET_KEY(nacos.server.grpc.port.offset),该配置默认为NACOS_SERVER_CLUSTER_GRPC_PORT_DEFAULT_OFFSET(1001)
-
以下为若依系统的nacos版本截图
<dependency>
<groupId>com.alibaba.nacos</groupId>
<artifactId>nacos-client</artifactId>
<version>2.0.4</version>
</dependency>
- 查看nacos的官方文档,发现该配置是nacos2版本的新特性,新版本新增了grpc的通信方式,因此nacos服务端新增了2个端口用于grpc通信,而服务器提供的nacos版本1.4.2,版本不对应
- Nacos2.0的服务端完全兼容1.X客户端。Nacos2.0客户端由于使用了gRPC,无法兼容Nacos1.X服务端,请勿使用2.0以上版本客户端连接Nacos1.X服务端。
- Nacos2.0版本相比1.X新增了gRPC的通信方式,因此需要增加2个端口。新增端口是在配置的主端口(server.port)基础上,进行一定偏移量自动生成。
端口 | 与主端口的偏移量 | 描述 |
---|---|---|
9848 | 1000 | 客户端gRPC请求服务端端口,用于客户端向服务端发起连接和请求 |
9849 | 1001 | 服务端gRPC请求服务端端口,用于服务间同步等 |
相关链接
若依系统:https://github.com/YunaiV/yudao-cloud
Nacos2指导文档:https://nacos.io/zh-cn/docs/v2/what-is-nacos.html