Flink's RPC service is implemented on top of Akka Remote.
A simple configuration of an Akka Remoting ActorSystem looks like this (based on Akka 2.6.0):
Server-side configuration and code: application_server.conf
akka {
  actor {
    provider = remote
    allow-java-serialization = on
    serializers {
      jackson-json = "akka.serialization.jackson.JacksonJsonSerializer"
    }
  }
  remote {
    artery {
      enabled = on
      transport = tcp
      canonical.hostname = "127.0.0.1"
      canonical.port = 25522
    }
  }
}
import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import com.typesafe.config.ConfigFactory;

public class RemoteServerActor extends AbstractActor {
    @Override
    public Receive createReceive() {
        return receiveBuilder().match(String.class, msg -> {
            System.out.println("Received a String message: " + msg);
        }).matchAny(msg -> {
            System.out.println("Received a message of another type: " + msg);
        }).build();
    }

    public static void main(String[] args) {
        // Create the server-side ActorSystem from the given configuration file
        ActorSystem actorSystem = ActorSystem.create("RemoteService", ConfigFactory.load("application_server.conf"));
        ActorRef actorRef = actorSystem.actorOf(Props.create(RemoteServerActor.class), "actorServer");
        // Prints the resulting actor path: akka://RemoteService/user/actorServer
        System.out.println(actorRef.path().toString());
    }
}
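This configuration shows that creating an ActorSystem requires the address and port of the machine it runs on, plus the transport protocol (TCP or UDP). The configuration above also registers the Jackson JSON serializer so that POJOs can be sent; note that it declares no serialization-bindings, so the Serializable Message class used by the client below is presumably still handled by Java serialization, which allow-java-serialization = on permits. The code then creates an ActorRef, which acts as a handle (proxy) for the real actor object.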
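Client-side configuration and code: application_client.conf
According to the Akka Remoting documentation, there are two ways to obtain an actor on a remote node:
- Look it up with actorSelection(path), which requires knowing the remote node's address. Once you hold the ActorSelection you can already send messages to it, and you can obtain the actor's ActorRef from a reply.
- Create it through configuration: the remote system's daemon is asked to create the actor, and the ActorRef is returned directly by system.actorOf(new Props(...)).
Here we use the first approach, lookup by remote node address, to build the client: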
akka {
  actor {
    provider = remote
    allow-java-serialization = on
    serializers {
      jackson-json = "akka.serialization.jackson.JacksonJsonSerializer"
    }
  }
  remote {
    artery {
      enabled = on
      transport = tcp
      canonical.hostname = "127.0.0.1"
      canonical.port = 25523
    }
  }
}
import java.io.Serializable;

import akka.actor.ActorSelection;
import akka.actor.ActorSystem;
import com.typesafe.config.ConfigFactory;
import lombok.Builder;
import lombok.Data;

public class RemoteClientActor {
    public static void main(String[] args) throws Exception {
        ActorSystem actorSystem = ActorSystem.create("app", ConfigFactory.load("application_client.conf"));
        // Address of the actor on the remote node
        ActorSelection actorSelection = actorSystem.actorSelection("akka://RemoteService@127.0.0.1:25522/user/actorServer");
        int i = 0;
        while (true) {
            try { // periodically send test data
                actorSelection.tell("tcp test: " + i, actorSelection.anchor());
                actorSelection.tell("I am testing", actorSelection.anchor());
                actorSelection.tell(Message.builder().name("meton" + i).age(i).build(), actorSelection.anchor());
            } catch (Exception e) {
                e.printStackTrace();
            }
            i++;
            Thread.sleep(5 * 1000);
        }
    }

    @Data
    @Builder
    static class Message implements Serializable {
        private String name;
        private Integer age;
    }
}
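Flink RPC analysis:
1. RPC protocols defined in Flink
An RPC protocol is the communication interface between client and server. The BaseGateway interface below is one such definition.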
public interface BaseGateway extends RpcGateway {
CompletableFuture<Integer> foobar();
}
In Flink, an RPC protocol is defined by extending RpcGateway.
/**
* Rpc gateway interface which has to be implemented by Rpc gateways.
*/
public interface RpcGateway {
/**
* Returns the fully qualified address under which the associated rpc endpoint is reachable.
*
* @return Fully qualified (RPC) address under which the associated rpc endpoint is reachable
*/
String getAddress();
/**
* Returns the fully qualified hostname under which the associated rpc endpoint is reachable.
*
* @return Fully qualified hostname under which the associated rpc endpoint is reachable
*/
String getHostname();
}
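Implementations of this interface must provide two methods, getAddress and getHostname, for the following reasons:
- As described above, obtaining a remote actor through an ActorSystem requires its address.
- In Flink, for example in YARN mode, the JobMaster creates its ActorSystem first. At that point the containers for the TaskExecutors have not been allocated yet, so the remote actors' addresses cannot be specified in the configuration. Each remote node therefore has to provide its own address.
2. Implementing the RPC protocol
A Flink RPC protocol is generally defined as a Java interface that the server side must implement. Below is an implementation of the BaseGateway defined above.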
public static class BaseEndpoint extends RpcEndpoint implements BaseGateway {
private final int foobarValue;
protected BaseEndpoint(RpcService rpcService, int foobarValue) {
super(rpcService);
this.foobarValue = foobarValue;
}
@Override
public CompletableFuture<Integer> foobar() {
return CompletableFuture.completedFuture(foobarValue);
}
@Override
public CompletableFuture<Void> postStop() {
return CompletableFuture.completedFuture(null);
}
}
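RpcEndpoint is the base class for the receiving side of RPC requests. An RpcEndpoint is started through an RpcService.
3. Constructing and starting the RpcService
- An RpcService is initialized and started during the startup of every ClusterEntrypoint (JobMaster) and TaskManagerRunner (TaskExecutor).
- The RpcService is mainly responsible for starting RpcEndpoints (the server side), and for connecting to remote RpcEndpoints and providing a proxy for them (the client side).
- In addition, to prevent concurrent modification of state, all RPC calls on an RpcEndpoint run only on its main thread; the RpcService provides methods for running work on other threads.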
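The main-thread rule in the last bullet can be illustrated with the JDK alone. The following is a minimal sketch, not Flink code: MainThreadSketch, increment, current and runDemo are hypothetical names, and a single-threaded executor stands in for the endpoint's main thread. Callers on several threads trigger state changes, yet the counter is only ever touched by one thread, so no locking is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical endpoint: every state change is funneled through one "main thread".
class MainThreadSketch {
    private final ExecutorService mainThread = Executors.newSingleThreadExecutor();
    private int counter; // only read or written on mainThread, so no lock is needed

    // An "RPC": the caller gets a future, the body runs on the main thread.
    CompletableFuture<Integer> increment() {
        return CompletableFuture.supplyAsync(() -> ++counter, mainThread);
    }

    // Reads the state, again on the main thread.
    CompletableFuture<Integer> current() {
        return CompletableFuture.supplyAsync(() -> counter, mainThread);
    }

    void shutdown() {
        mainThread.shutdown();
    }

    // Calls increment() from several caller threads and returns the final counter.
    static int runDemo(int callers, int callsPerCaller) {
        MainThreadSketch endpoint = new MainThreadSketch();
        ExecutorService callerPool = Executors.newFixedThreadPool(callers);
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (int c = 0; c < callers; c++) {
            pending.add(CompletableFuture.runAsync(() -> {
                for (int i = 0; i < callsPerCaller; i++) {
                    endpoint.increment();
                }
            }, callerPool));
        }
        // Wait until every caller has submitted all of its calls.
        pending.forEach(CompletableFuture::join);
        // Queued behind all increments, so it observes the final value.
        int result = endpoint.current().join();
        callerPool.shutdown();
        endpoint.shutdown();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 500));
    }
}
```

Because the executor processes its queue in submission order on one thread, no increment is lost even though the field is neither volatile nor guarded by a lock.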
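4. Constructing and starting the RpcEndpoint (server side)
During initialization, every RpcEndpoint sets up its server through the startServer method of the node's RpcService. The main code is in the RpcEndpoint constructor, which calls rpcService#startServer.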
/**
* Initializes the RPC endpoint.
*
* @param rpcService The RPC server that dispatches calls to this RPC endpoint.
* @param endpointId Unique identifier for this endpoint
*/
protected RpcEndpoint(final RpcService rpcService, final String endpointId) {
this.rpcService = checkNotNull(rpcService, "rpcService");
this.endpointId = checkNotNull(endpointId, "endpointId");
this.rpcServer = rpcService.startServer(this);
this.mainThreadExecutor = new MainThreadExecutor(rpcServer);
}
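- startServer creates an Akka actor, which is the actual receiver of RPC calls; on the client side an RPC request is wrapped in an RpcInvocation object and sent as an Akka message.
- Next it creates a local InvocationHandler, which converts calls into messages and sends them to the corresponding RpcEndpoint (the details are covered in section 6 on sending RPC requests).
- It then builds a proxy object from the RPC interfaces and the InvocationHandler. This proxy is stored in the RpcEndpoint's rpcServer field and serves local calls within the RpcEndpoint's own JVM.
- Starting the RpcEndpoint amounts to calling the start method of the RpcServer created during construction. This method is implemented by AkkaInvocationHandler and simply sends a START message to the actor bound to the RpcEndpoint, notifying it that the service has started.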
@Override
public <C extends RpcEndpoint & RpcGateway> RpcServer startServer(C rpcEndpoint) {
...
// Create an AkkaRpcActor wrapping the rpcEndpoint; it receives the RPC requests that arrive as Akka messages
if (rpcEndpoint instanceof FencedRpcEndpoint) {
akkaRpcActorProps = Props.create(FencedAkkaRpcActor.class, rpcEndpoint, terminationFuture, getVersion());
} else {
akkaRpcActorProps = Props.create(AkkaRpcActor.class, rpcEndpoint, terminationFuture, getVersion());
}
ActorRef actorRef;
synchronized (lock) {
checkState(!stopped, "RpcService is stopped");
actorRef = actorSystem.actorOf(akkaRpcActorProps, rpcEndpoint.getEndpointId());
actors.put(actorRef, rpcEndpoint);
}
LOG.info("Starting RPC endpoint for {} at {} .", rpcEndpoint.getClass().getName(), actorRef.path());
final String akkaAddress = AkkaUtils.getAkkaURL(actorSystem, actorRef);
final String hostname;
Option<String> host = actorRef.path().address().host();
if (host.isEmpty()) {
hostname = "localhost";
} else {
hostname = host.get();
}
// Collect all gateways of this endpoint, i.e. all of its RPC protocol interfaces
Set<Class<?>> implementedRpcGateways = new HashSet<>(RpcUtils.extractImplementedRpcGateways(rpcEndpoint.getClass()));
implementedRpcGateways.add(RpcServer.class);
implementedRpcGateways.add(AkkaBasedEndpoint.class);
// Create an InvocationHandler that wraps RPC requests into LocalRpcInvocation messages and sends them to the RpcServer (local)
final InvocationHandler akkaInvocationHandler;
if (rpcEndpoint instanceof FencedRpcEndpoint) {
// a FencedRpcEndpoint needs a FencedAkkaInvocationHandler
akkaInvocationHandler = new FencedAkkaInvocationHandler<>(
akkaAddress,
hostname,
actorRef,
timeout,
maximumFramesize,
terminationFuture,
((FencedRpcEndpoint<?>) rpcEndpoint)::getFencingToken);
implementedRpcGateways.add(FencedMainThreadExecutable.class);
} else {
akkaInvocationHandler = new AkkaInvocationHandler(
akkaAddress,
hostname,
actorRef,
timeout,
maximumFramesize,
terminationFuture);
}
// Rather than using the System ClassLoader directly, we derive the ClassLoader
// from this class . That works better in cases where Flink runs embedded and all Flink
// code is loaded dynamically (for example from an OSGI bundle) through a custom ClassLoader
ClassLoader classLoader = getClass().getClassLoader();
// Create a proxy that implements these interfaces and forwards every call to the InvocationHandler
@SuppressWarnings("unchecked")
RpcServer server = (RpcServer) Proxy.newProxyInstance(
classLoader,
implementedRpcGateways.toArray(new Class<?>[implementedRpcGateways.size()]),
akkaInvocationHandler);
return server;
}
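The proxy construction at the end of startServer relies on the JDK's java.lang.reflect.Proxy. As a rough illustration, the sketch below reproduces the pattern without Akka or Flink: GreeterGateway, createProxy and the list used as a mailbox are hypothetical stand-ins for an RpcGateway, the handler creation and the actor's mailbox. Each call on the proxy is turned into a "message" instead of executing any logic directly.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ProxySketch {
    // Hypothetical gateway interface, playing the role of an RpcGateway.
    interface GreeterGateway {
        void greet(String name);
    }

    // Builds a proxy for GreeterGateway whose calls are recorded in the given mailbox.
    static GreeterGateway createProxy(List<String> mailbox) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Turn the call into a "message": method name plus arguments.
            mailbox.add(method.getName() + Arrays.toString(args));
            return null; // greet returns void, so there is nothing to hand back
        };
        return (GreeterGateway) Proxy.newProxyInstance(
                ProxySketch.class.getClassLoader(),
                new Class<?>[] {GreeterGateway.class},
                handler);
    }

    // Makes two calls through the proxy and returns the recorded messages.
    static List<String> runDemo() {
        List<String> mailbox = new ArrayList<>();
        GreeterGateway gateway = createProxy(mailbox);
        gateway.greet("flink");
        gateway.greet("akka");
        return mailbox;
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

The caller only sees the gateway interface; what happens to a call (local dispatch, a remote message, and so on) is decided entirely inside the handler.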
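// Starting the RpcEndpoint amounts to calling the start method of the RpcServer created during construction (see above).
// The call goes through the proxy to AkkaInvocationHandler, which invokes its own start method.
// In the end this just sends a START message to the actor bound to the RpcEndpoint, notifying it that the service has started.
The AkkaInvocationHandler#invoke() method: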
public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
Class<?> declaringClass = method.getDeclaringClass();
Object result;
if (declaringClass.equals(AkkaBasedEndpoint.class) ||
declaringClass.equals(Object.class) ||
declaringClass.equals(RpcGateway.class) ||
declaringClass.equals(StartStoppable.class) ||
declaringClass.equals(MainThreadExecutable.class) ||
declaringClass.equals(RpcServer.class)) {
result = method.invoke(this, args); // this branch is taken here: it invokes start() on the handler itself
} else if (declaringClass.equals(FencedRpcGateway.class)) {
throw new UnsupportedOperationException("AkkaInvocationHandler does not support the call FencedRpcGateway#" +
method.getName() + ". This indicates that you retrieved a FencedRpcGateway without specifying a " +
"fencing token. Please use RpcService#connect(RpcService, F, Time) with F being the fencing token to " +
"retrieve a properly FencedRpcGateway.");
} else {
result = invokeRpc(method, args);
}
return result;
}
@Override
public void start() {
rpcEndpoint.tell(Processing.START, ActorRef.noSender());
}
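5. Constructing the RPC client
- The RPC client is really a proxy object. Constructing it requires the interfaces to implement and an InvocationHandler; Flink provides the AkkaInvocationHandler implementation.
- Constructing an RpcEndpoint already creates an RPC client for local use, and this proxy can be obtained directly through RpcEndpoint's getSelfGateway method.
- For remote calls, the client for a remote RpcEndpoint (also a proxy) is obtained through RpcService's connect method, which needs the actor's address.
(The address can be obtained through the LeaderRetrievalService, which is not covered here.)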
// this method does not mutate state and is thus thread-safe
@Override
public <C extends RpcGateway> CompletableFuture<C> connect(
final String address,
final Class<C> clazz) {
return connectInternal(
address,
clazz,
(ActorRef actorRef) -> {
Tuple2<String, String> addressHostname = extractAddressHostname(actorRef);
return new AkkaInvocationHandler(
addressHostname.f0,
addressHostname.f1,
actorRef,
timeout,
maximumFramesize,
null);
});
}
private <C extends RpcGateway> CompletableFuture<C> connectInternal(
final String address,
final Class<C> clazz,
Function<ActorRef, InvocationHandler> invocationHandlerFactory) {
checkState(!stopped, "RpcService is stopped");
LOG.debug("Try to connect to remote RPC endpoint with address {}. Returning a {} gateway.",
address, clazz.getName());
// First obtain an ActorSelection from the address (one of the ways to reach a remote actor),
// then resolve it to an ActorRef with an Identify request
final ActorSelection actorSel = actorSystem.actorSelection(address);
final Future<ActorIdentity> identify = Patterns
.ask(actorSel, new Identify(42), timeout.toMilliseconds())
.<ActorIdentity>mapTo(ClassTag$.MODULE$.<ActorIdentity>apply(ActorIdentity.class));
final CompletableFuture<ActorIdentity> identifyFuture = FutureUtils.toJava(identify);
final CompletableFuture<ActorRef> actorRefFuture = identifyFuture.thenApply(
(ActorIdentity actorIdentity) -> {
if (actorIdentity.getRef() == null) {
throw new CompletionException(new RpcConnectionException("Could not connect to rpc endpoint under address " + address + '.'));
} else {
return actorIdentity.getRef();
}
});
// Once the ActorRef has been resolved, send it the handshake message
final CompletableFuture<HandshakeSuccessMessage> handshakeFuture = actorRefFuture.thenCompose(
(ActorRef actorRef) -> FutureUtils.toJava(
Patterns
.ask(actorRef, new RemoteHandshakeMessage(clazz, getVersion()), timeout.toMilliseconds())
.<HandshakeSuccessMessage>mapTo(ClassTag$.MODULE$.<HandshakeSuccessMessage>apply(HandshakeSuccessMessage.class))));
// Finally, use the invocationHandlerFactory to create an InvocationHandler from the ActorRef and build the proxy
return actorRefFuture.thenCombineAsync(
handshakeFuture,
(ActorRef actorRef, HandshakeSuccessMessage ignored) -> {
InvocationHandler invocationHandler = invocationHandlerFactory.apply(actorRef);
// Rather than using the System ClassLoader directly, we derive the ClassLoader
// from this class . That works better in cases where Flink runs embedded and all Flink
// code is loaded dynamically (for example from an OSGI bundle) through a custom ClassLoader
ClassLoader classLoader = getClass().getClassLoader();
@SuppressWarnings("unchecked")
C proxy = (C) Proxy.newProxyInstance(
classLoader,
new Class<?>[]{clazz},
invocationHandler);
return proxy;
},
actorSystem.dispatcher());
}
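The three asynchronous steps of connectInternal (resolve the address, handshake, build the proxy) form a small CompletableFuture pipeline. The sketch below mirrors only that shape with JDK classes. Everything in it is hypothetical: a static map stands in for the remote actor lookup, an integer version check stands in for the handshake, and a plain string stands in for the proxy.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

class ConnectSketch {
    // Hypothetical registry: address -> protocol version of the endpoint living there.
    private static final Map<String, Integer> REGISTRY = new HashMap<>();
    static {
        REGISTRY.put("endpoint-a", 1);
    }

    static CompletableFuture<String> connect(String address, int clientVersion) {
        // Step 1: resolve the address (stands in for the Identify request).
        CompletableFuture<String> refFuture = CompletableFuture.supplyAsync(() -> {
            if (!REGISTRY.containsKey(address)) {
                throw new CompletionException(
                        new IllegalArgumentException("Could not connect to " + address));
            }
            return address;
        });
        // Step 2: handshake with the resolved reference (stands in for the version check).
        CompletableFuture<Boolean> handshakeFuture = refFuture.thenCompose(ref ->
                CompletableFuture.supplyAsync(() -> {
                    if (REGISTRY.get(ref) != clientVersion) {
                        throw new CompletionException(
                                new IllegalStateException("Version mismatch for " + ref));
                    }
                    return Boolean.TRUE;
                }));
        // Step 3: only when both steps succeeded, build the "proxy" (here just a string).
        return refFuture.thenCombine(handshakeFuture, (ref, ok) -> "gateway->" + ref);
    }

    // Returns true if connecting fails, false if a gateway was produced.
    static boolean fails(String address, int clientVersion) {
        try {
            connect(address, clientVersion).join();
            return false;
        } catch (CompletionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(connect("endpoint-a", 1).join());
        System.out.println(fails("missing", 1));
        System.out.println(fails("endpoint-a", 2));
    }
}
```

A failure in either of the first two steps propagates to the combined future, so the caller never receives a gateway for an endpoint that could not be resolved or that failed the handshake.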
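6. Sending RPC requests
As shown above, the client holds a proxy object. Every call on the proxy goes to AkkaInvocationHandler#invoke(), which receives the method being called and its arguments.
AkkaInvocationHandler's implementation checks which class declares the method; if it is an RPC method, it calls invokeRpc.
The AkkaInvocationHandler#invoke() method: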
public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
Class<?> declaringClass = method.getDeclaringClass();
Object result;
if (declaringClass.equals(AkkaBasedEndpoint.class) ||
declaringClass.equals(Object.class) ||
declaringClass.equals(RpcGateway.class) ||
declaringClass.equals(StartStoppable.class) ||
declaringClass.equals(MainThreadExecutable.class) ||
declaringClass.equals(RpcServer.class)) {
result = method.invoke(this, args); // infrastructure methods such as start() are invoked locally on this handler
} else if (declaringClass.equals(FencedRpcGateway.class)) {
throw new UnsupportedOperationException("AkkaInvocationHandler does not support the call FencedRpcGateway#" +
method.getName() + ". This indicates that you retrieved a FencedRpcGateway without specifying a " +
"fencing token. Please use RpcService#connect(RpcService, F, Time) with F being the fencing token to " +
"retrieve a properly FencedRpcGateway.");
} else {
result = invokeRpc(method, args); // the actual RPC call
}
return result;
}
/**
* Invokes a RPC method by sending the RPC invocation details to the rpc endpoint.
*
* @param method to call
* @param args of the method call
* @return result of the RPC
* @throws Exception if the RPC invocation fails
*/
private Object invokeRpc(Method method, Object[] args) throws Exception {
String methodName = method.getName();
Class<?>[] parameterTypes = method.getParameterTypes();
Annotation[][] parameterAnnotations = method.getParameterAnnotations();
Time futureTimeout = extractRpcTimeout(parameterAnnotations, args, timeout);
// Wrap the method name, parameter types and arguments into an RpcInvocation (LocalRpcInvocation or RemoteRpcInvocation)
final RpcInvocation rpcInvocation = createRpcInvocationMessage(methodName, parameterTypes, args);
Class<?> returnType = method.getReturnType();
final Object result;
// Choose between tell and ask for sending the Akka message, based on the return type
if (Objects.equals(returnType, Void.TYPE)) {
tell(rpcInvocation);
result = null;
} else if (Objects.equals(returnType, CompletableFuture.class)) {
// execute an asynchronous call
result = ask(rpcInvocation, futureTimeout);
} else {
// execute a synchronous call
CompletableFuture<?> futureResult = ask(rpcInvocation, futureTimeout);
result = futureResult.get(futureTimeout.getSize(), futureTimeout.getUnit());
}
return result;
}
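How invokeRpc performs the call: it first wraps the call in an RpcInvocation, which carries the name of the invoked method, its parameter types and the arguments (methodName, parameterTypes, args). RpcInvocation has two implementations; which one is created depends on whether the current AkkaInvocationHandler and the target RpcEndpoint live in the same JVM:
- LocalRpcInvocation, for local calls, which needs no serialization
- RemoteRpcInvocation, for remote calls, whose content has to be serializable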
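The difference between the two variants can be sketched with plain Java serialization. This is an illustration under assumptions rather than Flink's implementation: the Invocation interface, LocalInvocation, RemoteInvocation and the create factory are hypothetical names, and Java serialization merely stands in for whatever wire format the real classes use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class InvocationSketch {
    interface Invocation {
        String methodName();
        Object[] args();
    }

    // Same JVM: the arguments are handed over by reference.
    static final class LocalInvocation implements Invocation {
        private final String methodName;
        private final Object[] args;
        LocalInvocation(String methodName, Object[] args) {
            this.methodName = methodName;
            this.args = args;
        }
        public String methodName() { return methodName; }
        public Object[] args() { return args; }
    }

    // Other JVM: the arguments are serialized up front and deserialized on access.
    static final class RemoteInvocation implements Invocation {
        private final String methodName;
        private final byte[] serializedArgs;
        RemoteInvocation(String methodName, Object[] args) {
            this.methodName = methodName;
            this.serializedArgs = serialize(args);
        }
        public String methodName() { return methodName; }
        public Object[] args() { return (Object[]) deserialize(serializedArgs); }
    }

    // Picks the variant, as the handler would after checking where the endpoint lives.
    static Invocation create(boolean isLocal, String methodName, Object... args) {
        return isLocal ? new LocalInvocation(methodName, args) : new RemoteInvocation(methodName, args);
    }

    static byte[] serialize(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object[] payload = {"a", 42};
        Invocation local = create(true, "foobar", payload);
        Invocation remote = create(false, "foobar", payload);
        System.out.println(local.args() == payload);               // same array instance
        System.out.println(remote.args() == payload);              // a deserialized copy
        System.out.println(Arrays.equals(remote.args(), payload)); // with equal content
    }
}
```

The local variant avoids the serialization cost entirely, which is why choosing between the two based on the endpoint's location is worthwhile.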
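7. Handling RPC requests
RPC messages are sent through the ActorRef of the actor bound to the RpcEndpoint, so the receiver is the AkkaRpcActor created while the RpcEndpoint was being constructed (in RpcService#startServer(rpcEndpoint)):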
public <C extends RpcEndpoint & RpcGateway> RpcServer startServer(C rpcEndpoint) {
checkNotNull(rpcEndpoint, "rpc endpoint");
CompletableFuture<Void> terminationFuture = new CompletableFuture<>();
final Props akkaRpcActorProps;
if (rpcEndpoint instanceof FencedRpcEndpoint) {
akkaRpcActorProps = Props.create(FencedAkkaRpcActor.class, rpcEndpoint, terminationFuture, getVersion());
} else {
akkaRpcActorProps = Props.create(AkkaRpcActor.class, rpcEndpoint, terminationFuture, getVersion()); // build the Props
}
ActorRef actorRef;
synchronized (lock) {
checkState(!stopped, "RpcService is stopped");
actorRef = actorSystem.actorOf(akkaRpcActorProps, rpcEndpoint.getEndpointId()); // create the actual actor and obtain its ActorRef
actors.put(actorRef, rpcEndpoint);
}
......
}
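AkkaRpcActor receives three kinds of messages:
- Handshake messages. As described above, the client sends one while it is being constructed (in connectInternal, to the ActorRef resolved from the ActorSelection). On receipt, the actor checks the interface and the version and replies with success if they match.
- Start and stop messages. For example, when an RpcEndpoint's start method is called, a Processing.START message is sent to its own actor, which switches the actor's state to STARTED. STOP works the same way. RPC requests are only processed while the actor is in the STARTED state.
- RPC request messages. The actor reads the method name and parameter types from the RpcInvocation, looks up the matching Method object on the RpcEndpoint class and invokes it through reflection. If there is a result, it is sent back to the sender as an Akka message.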
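The three message kinds and the STARTED gate can be mimicked in a few lines of plain Java. The sketch below is not AkkaRpcActor: Handshake, Control, Invocation, EchoEndpoint and onMessage are hypothetical names, the return value of onMessage stands in for the reply sent to the sender, and rejecting an early call with an exception is just one possible way to model a discarded message.

```java
import java.lang.reflect.Method;

class ActorDispatchSketch {
    enum Control { START, STOP }

    static final class Handshake {
        final int version;
        Handshake(int version) { this.version = version; }
    }

    static final class Invocation {
        final String methodName;
        final Class<?>[] parameterTypes;
        final Object[] args;
        Invocation(String methodName, Class<?>[] parameterTypes, Object[] args) {
            this.methodName = methodName;
            this.parameterTypes = parameterTypes;
            this.args = args;
        }
    }

    // Hypothetical endpoint with a single RPC method.
    public static class EchoEndpoint {
        public String echo(String text) { return "echo: " + text; }
    }

    private static final int VERSION = 1;
    private final Object endpoint;
    private boolean started;

    ActorDispatchSketch(Object endpoint) { this.endpoint = endpoint; }

    // Returns the reply that would go back to the sender (null if there is none).
    Object onMessage(Object message) {
        if (message instanceof Handshake) {
            return ((Handshake) message).version == VERSION ? "HANDSHAKE_OK" : "VERSION_MISMATCH";
        } else if (message instanceof Control) {
            started = (message == Control.START);
            return null;
        } else if (message instanceof Invocation) {
            if (!started) {
                throw new IllegalStateException("Endpoint not started, message discarded");
            }
            Invocation call = (Invocation) message;
            try {
                // Look up the method by name and parameter types, then call it reflectively.
                Method method = endpoint.getClass().getMethod(call.methodName, call.parameterTypes);
                return method.invoke(endpoint, call.args);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("RPC invocation failed", e);
            }
        }
        throw new IllegalArgumentException("Unknown message: " + message);
    }

    // Sends the same call before and after START and reports both outcomes.
    static String runDemo() {
        ActorDispatchSketch actor = new ActorDispatchSketch(new EchoEndpoint());
        Invocation call = new Invocation("echo", new Class<?>[] {String.class}, new Object[] {"hi"});
        String before;
        try {
            actor.onMessage(call);
            before = "handled";
        } catch (IllegalStateException e) {
            before = "rejected";
        }
        actor.onMessage(Control.START);
        Object after = actor.onMessage(call);
        return before + " / " + after;
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

The same invocation is rejected before START and answered afterwards, which is the behavior the second and third bullets describe.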
A simple demo built on Flink RPC follows (based on Flink 1.7.0):
1. Server side
// First define the gateway interface:
public interface HelloGateway extends RpcGateway {
String hello();
}
// Its implementation, which extends Flink's abstract RPC class RpcEndpoint
public class HelloRpcEndpoint extends RpcEndpoint implements HelloGateway {
public HelloRpcEndpoint(RpcService rpcService) {
super(rpcService);
}
@Override
public String hello() {
System.out.println("Server side: hello() was called");
return "hello";
}
@Override
public CompletableFuture<Void> postStop() {
return CompletableFuture.completedFuture(null);
}
}
// Create the Akka ActorSystem and start the RPC server
public class RpcServer {
public static void main(String[] args) throws Exception {
ActorSystem actorSystem = AkkaUtils.createActorSystem(new Configuration(), "", 25533); // the default ActorSystem name used by Flink RPC is "flink"
RpcService rpcService = new AkkaRpcService(actorSystem, Time.seconds(10L));
// Start the RPC endpoint; the ActorSystem created above listens on local port 25533
HelloRpcEndpoint helloRpcEndpoint = new HelloRpcEndpoint(rpcService);
helloRpcEndpoint.start();
}
}
With log4j logging configured, the RPC server prints the following log on startup:
INFO : akka.event.slf4j.Slf4jLogger - Slf4jLogger started
DEBUG: akka.event.EventStream - logger log1-Slf4jLogger started
DEBUG: akka.event.EventStream - Default Loggers started
INFO : akka.remote.Remoting - Starting remoting
DEBUG: org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.SelectorUtil - Using select timeout of 500
DEBUG: org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.SelectorUtil - Epoll-bug workaround enabled = false
INFO : akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://flink@192.168.100.1:25533]
INFO : org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for com.meton.flink.rpc.RpcInterface$HelloRpcEndpoint at akka://flink/user/eaf7d36f-ab11-48b7-b2a2-eda1b6cee78d .
The lines to look at are the address and port the RPC server listens on and the endpoint of the RPC service it provides:
- listening on addresses :[akka.tcp://flink@192.168.100.1:25533]
- Starting RPC endpoint for com.meton.flink.rpc.RpcInterface$HelloRpcEndpoint at akka://flink/user/eaf7d36f-ab11-48b7-b2a2-eda1b6cee78d .
This shows that the default ActorSystem name is flink and the listening port is 25533; the ActorRef of the started endpoint points to the Akka path /user/eaf7d36f-ab11-48b7-b2a2-eda1b6cee78d .
2. Client side:
// Define the Flink RPC client
public class RpcClient {
public static void main(String[] args) throws Exception {
ActorSystem actorSystem = AkkaUtils.createActorSystem(new Configuration(), "", 25532); // bind port 25532
RpcService rpcService = new AkkaRpcService(actorSystem, Time.seconds(10L));
// akka.tcp://flink@192.168.100.1:25533/user/eaf7d36f-ab11-48b7-b2a2-eda1b6cee78d
HelloGateway proxyHello = rpcService.connect("akka.tcp://flink@192.168.100.1:25533/user/eaf7d36f-ab11-48b7-b2a2-eda1b6cee78d", HelloGateway.class).get();
System.out.println(proxyHello.hello());
rpcService.stopService();
actorSystem.shutdown();
System.exit(0);
}
}
From the server log above, the address of the remote node is: akka.tcp://flink@192.168.100.1:25533/user/eaf7d36f-ab11-48b7-b2a2-eda1b6cee78d;
so when building the client it is enough to pass the remote node address of the server-side actor.