失败更像是分布式系统的一个特性。因此Akka用一个容忍失败的模型,在你的业务逻辑与失败处理逻辑(supervision逻辑)中间你能有一个清晰的边界。只需要一点点工作,这很赞。这就是我们要讨论的主题。
ACTOR SUPERVISION
想象一个方法调用了你栈顶的方法但却出了一个异常。那么在栈下的方法能做什么呢?
- 抓住异常并按顺序处理恢复
- 抓住异常,也许记个日志并保持安静。
- 下层的方法也可以选择无视这个异常(或者抓住并重扔出来)
想象下一直扔到main方法仍然没有处理这个异常。这种情况下,程序肯定会输出一个异常给console后退出。
你可以把同样的情况套用在线程上。如果一个子线程抛了异常而再假设run或**call*方法没有处理它,那么这个异常就会期望放在父线程或主线程中解决,无论哪种情况,如果主线程没有处理他,系统就会退出。
让我们再看看 - 如果被context.actorof创建出来的子Actor因为一个异常失败了。父actor(指supervisor)可以处理子actor的任何失败。如果父actor做了,他可以处理并恢复(Restart/Resume)。另外,把异常传递(Escalate)给父actor。 还有一种做法,可以直接stop掉子actor - 这就是那个子actor结局了。 为什么我说父actor(那个supervisor)?这是因为akka的监护方式为家长监护 - 这意味着只有创建了actor的人才能监护他们。
就这么多了!我们已经覆盖到所有监护指令(Directoives)了。
策略
我忘了说一点: 你已经知道一个Akka Actor可以创建子actor并且子actor也可以随意创建他们自己的子actor。
现在,想下以下两个场景:
1.OneForOneStrategy
你的actor创建了很多子actor并且每一个子actor都连接了不同的数据源。假设你运行的是一个将英语翻译成多种语言的应用。
假设,一个子actor失败了然而你可以接受在最终结果里跳过这个结果,你想怎么做?关掉这个服务?当然不,你可能想要重启/关闭这个有问题的子actor。是吧?现在这个策略在Akka的监护策略中叫OneForOneStrategy策略 - 如果一个actor挂了,只单独处理这个actor。
基于你的业务异常,你可能需要对不同的异常有不同的反应(停止,重启,升级,恢复)。要配置你自己的策略,你只需要override你Actor类中的supervisorStrategy方法。
声明OneForOneStrategy的例子
import akka.actor.Actor
import akka.actor.ActorLogging
import akka.actor.OneForOneStrategy
import akka.actor.SupervisorStrategy.Stop
class TeacherActorOneForOne extends Actor with ActorLogging {
...
...
override val supervisorStrategy=OneForOneStrategy() {
case _: MinorRecoverableException => Restart
case _: Exception => Stop
}
...
...
2.AllForOneStrategy策略
假设你在做一个外部排序 (这是个又证明了我没啥创造力的例子!),你的每个块都被一个不同的actor处理。突然,一个Actor失败了并抛了一个异常。这样再往下处理就没什么意思了因为最终结果肯定是错的。所以,逻辑就是停止stop所有的actor。
我为什么说stop而不是重启?因为在这个例子里重启也没用,每个actor的mailbox在重启时并不会被清理。所以,如果我们重启了,另外的chunk仍然会被处理。这不是我们想要的。重建actor并用新的mailbox在这里是个更合适的策略。
与OneForOneStrategy一样,只需要用AllForOneStrategy的实现覆写supervisorStrategy
下面是例子
import akka.actor.{Actor, ActorLogging}
import akka.actor.AllForOneStrategy
import akka.actor.SupervisorStrategy.Escalate
import akka.actor.SupervisorStrategy.Stop
class TeacherActorAllForOne extends Actor with ActorLogging {
...
override val supervisorStrategy = AllForOneStrategy() {
case _: MajorUnRecoverableException => Stop
case _: Exception => Escalate
}
...
...
指令 DIRECTIVES
AllForOneStrategy和OneForOneStrategy的构造方法都接受一个叫Decider的PartialFunction[Throwable,Directive]方法,他把Throwable与Directive指令做了一个映射:
case _: MajorUnRecoverableException => Stop
这就简单的四个指令 - Stop,Resume,Escalate和Restart
Stop
在异常发生时子actor会停止,任何发给停止的actor的消息都会被转到deadLetter队列。
Resume
子actor会忽略抛出异常的消息并且继续处理队列中的其他消息。
Restart
子actor会停止并且一个新的actor会初始化。继续处理mailbox中其他的消息。世界对这个是无感知的因为同样的ActorRef指向了新的Actor。
Escalate
supervisor复制了失败并让他的supervisor处理这个异常。
缺省策略
如果我们的actor没指定任何策略但是创建了子actor。他们会怎样处理?Actor会有一个缺省的策略:
override val supervisorStrategy=OneForOneStrategy() {
case _: ActorInitializationException=> Stop
case _: ActorKilledException => Stop
case _: DeathPactException => Stop
case _: Exception => Restart
}
所以,缺省策略处理了四个case:
1. ACTORINITIALIZATIONEXCEPTION => STOP
当actor不能初始化,他会抛出一个ActorInitializationException。actor会被停止。让我们在preStart调用中模拟下这个:
package me.rerun.akkanotes.supervision
import akka.actor.{ActorSystem, Props}
import me.rerun.akkanotes.protocols.TeacherProtocol.QuoteRequest
import akka.actor.Actor
import akka.actor.ActorLogging
object ActorInitializationExceptionApp extends App{
val actorSystem=ActorSystem("ActorInitializationException")
val actor=actorSystem.actorOf(Props[ActorInitializationExceptionActor], "initializationExceptionActor")
actor!"someMessageThatWillGoToDeadLetter"
}
class ActorInitializationExceptionActor extends Actor with ActorLogging{
override def preStart={
throw new Exception("Some random exception")
}
def receive={
case _=>
}
}
Ru
运行ActorInitializationExceptionApp会产生一个ActorInitializationException 异常然后所有的消息都会进deadLetterActor的消息队列:
Log
[ERROR] [11/10/2014 16:08:46.569] [ActorInitializationException-akka.actor.default-dispatcher-2] [akka://ActorInitializationException/user/initializationExceptionActor] Some random exception
akka.actor.ActorInitializationException: exception during creation
at akka.actor.ActorInitializationException$.apply(Actor.scala:164)
...
...
Caused by: java.lang.Exception: Some random exception
at me.rerun.akkanotes.supervision.ActorInitializationExceptionActor.preStart(ActorInitializationExceptionApp.scala:17)
...
...
[INFO] [11/10/2014 16:08:46.581] [ActorInitializationException-akka.actor.default-dispatcher-4] [akka://ActorInitializationException/user/initializationExceptionActor] Message [java.lang.String] from Actor[akka://ActorInitializationException/deadLetters] to Actor[akka://ActorInitializationException/user/initializationExceptionActor#-1290470495] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2. ACTORKILLEDEXCEPTION => STOP
当Actor被kill消息关闭后,他会抛出一个ActorKilledException。如果抛这个异常,缺省策略会让子actor停止。看起来停止一个被kill掉的actor没什么意义。但想想这个:
ActorKilledException 会被传递给supervisor。 那么之前我们在DeathWatch里面提到的Actor里的生命周期watch或deathwatchers。 直到Actor被停掉前watcher不会知道任何事情。
给Actor发送kill只会让那个特定的监管actor知道。用stop处理会暂停那个actor的mailbox,暂停了子actor的mailbox,停止了子actor,发送了Terminated给所有子actor的watcher,发送给所有类一个Terminated,然后actor的watcher都会迅速失败最终让Actor自己停止、
package me.rerun.akkanotes.supervision
import akka.actor.{ActorSystem, Props}
import me.rerun.akkanotes.protocols.TeacherProtocol.QuoteRequest
import akka.actor.Actor
import akka.actor.ActorLogging
import akka.actor.Kill
object ActorKilledExceptionApp extends App{
val actorSystem=ActorSystem("ActorKilledExceptionSystem")
val actor=actorSystem.actorOf(Props[ActorKilledExceptionActor])
actor!"something"
actor!Kill
actor!"something else that falls into dead letter queue"
}
class ActorKilledExceptionActor extends Actor with ActorLogging{
def receive={
case message:String=> log.info (message)
}
}
Log
日志说只要ActorKilledException 进来,supervisor就会停掉actor并且消息会进入deadLetter队列
INFO m.r.a.s.ActorKilledExceptionActor - something
ERROR akka.actor.OneForOneStrategy - Kill
akka.actor.ActorKilledException: Kill
INFO akka.actor.RepointableActorRef - Message [java.lang.String] from Actor[akka://ActorKilledExceptionSystem/deadLetters] to Actor[akka://ActorKilledExceptionSystem/user/$a#-1569063462] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
3. DEATHPACTEXCEPTION => STOP
在DeathWatch文中,你可以看到当一个Actor观察一个子Actor时,他期望在他的receive中处理Terminated消息。如果没有呢?你会得到一个DeathPactException
代码演示了supervisorwatch子actor但没有从子actor处理Terminated消息。
package me.rerun.akkanotes.supervision
import akka.actor.{ActorSystem, Props}
import me.rerun.akkanotes.protocols.TeacherProtocol.QuoteRequest
import akka.actor.Actor
import akka.actor.ActorLogging
import akka.actor.Kill
import akka.actor.PoisonPill
import akka.actor.Terminated
object DeathPactExceptionApp extends App{
val actorSystem=ActorSystem("DeathPactExceptionSystem")
val actor=actorSystem.actorOf(Props[DeathPactExceptionParentActor])
actor!"create_child" //Throws DeathPactException
Thread.sleep(2000) //Wait until Stopped
actor!"someMessage" //Message goes to DeadLetters
}
class DeathPactExceptionParentActor extends Actor with ActorLogging{
def receive={
case "create_child"=> {
log.info ("creating child")
val child=context.actorOf(Props[DeathPactExceptionChildActor])
context.watch(child) //Watches but doesnt handle terminated message. Throwing DeathPactException here.
child!"stop"
}
case "someMessage" => log.info ("some message")
//Doesnt handle terminated message
//case Terminated(_) =>
}
}
class DeathPactExceptionChildActor extends Actor with ActorLogging{
def receive={
case "stop"=> {
log.info ("Actor going to stop and announce that it's terminated")
self!PoisonPill
}
}
}
Log
日志告诉我们DeathPactException 进来了,supervisor停止了actor然后消息都进入了deadLetter的队列
INFO m.r.a.s.DeathPactExceptionParentActor - creating child
INFO m.r.a.s.DeathPactExceptionChildActor - Actor going to stop and announce that it's terminated
ERROR akka.actor.OneForOneStrategy - Monitored actor [Actor[akka://DeathPactExceptionSystem/user/$a/$a#-695506341]] terminated
akka.actor.DeathPactException: Monitored actor [Actor[akka://DeathPactExceptionSystem/user/$a/$a#-695506341]] terminated
INFO akka.actor.RepointableActorRef - Message [java.lang.String] from Actor[akka://DeathPactExceptionSystem/deadLetters] to Actor[akka://DeathPactExceptionSystem/user/$a#-1452955980] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
4. EXCEPTION => RESTART
对于其他的异常,缺省的指令是重启Actor。看下这个应用。只是要证明下Actor重启了,OtherExceptionParentActor让child抛出一个异常并立刻发送一条消息。消息在子actor重启的时候到达了mailbox,并被处理了。真不错!
[图片描述][5]
package me.rerun.akkanotes.supervision
import akka.actor.Actor
import akka.actor.ActorLogging
import akka.actor.ActorSystem
import akka.actor.OneForOneStrategy
import akka.actor.Props
import akka.actor.SupervisorStrategy.Stop
object OtherExceptionApp extends App{
val actorSystem=ActorSystem("OtherExceptionSystem")
val actor=actorSystem.actorOf(Props[OtherExceptionParentActor])
actor!"create_child"
}
class OtherExceptionParentActor extends Actor with ActorLogging{
def receive={
case "create_child"=> {
log.info ("creating child")
val child=context.actorOf(Props[OtherExceptionChildActor])
child!"throwSomeException"
child!"someMessage"
}
}
}
class OtherExceptionChildActor extends akka.actor.Actor with ActorLogging{
override def preStart={
log.info ("Starting Child Actor")
}
def receive={
case "throwSomeException"=> {
throw new Exception ("I'm getting thrown for no reason")
}
case "someMessage" => log.info ("Restarted and printing some Message")
}
override def postStop={
log.info ("Stopping Child Actor")
}
}
Log
1.异常抛出了,我们能在trace中看到
- 子类重启了 - stop和start被调用了(我们稍后能看到preRestart和postRestart)
- 消息在重启开始前被发送给子actor。
INFO m.r.a.s.OtherExceptionParentActor - creating child
INFO m.r.a.s.OtherExceptionChildActor - Starting Child Actor
ERROR akka.actor.OneForOneStrategy - I'm getting thrown for no reason
java.lang.Exception: I'm getting thrown for no reason
at me.rerun.akkanotes.supervision.OtherExceptionChildActor$$anonfun$receive$2.applyOrElse(OtherExceptionApp.scala:39) ~[classes/:na]
at akka.actor.Actor$class.aroundReceive(Actor.scala:465) ~[akka-actor_2.11-2.3.4.jar:na]
...
...
INFO m.r.a.s.OtherExceptionChildActor - Stopping Child Actor
INFO m.r.a.s.OtherExceptionChildActor - Starting Child Actor
INFO m.r.a.s.OtherExceptionChildActor - Restarted and printing some Message
ESCALATE AND RESUME
我们在defaultStrategy中看到stop和restart的例子。现在让我们快速看下Escalate。
Resume忽略了异常并处理mailbox中的下条消息。这就像是抓住了异常但什么事也没做。
Escalating更像是异常是致命的而supervisor不能处理它。所以,他要向他的supervisor求救。让我们看个例子。
假设有三个Actor - EscalateExceptionTopLevelActor, EscalateExceptionParentActor 和 EscalateExceptionChildActor。 如果一个子actor抛出一个日常并且父级别actor不能处理它,他可以Escalate这个异常到顶级actor。顶级actor也可以选择对哪些指令做出响应。在我们的例子里,我们只是做了stop。stop会立即停掉child(这里是EscalateExceptionParentActor)。我们知道,当一个actor执行stop时,他的所有子类都会在actor自己停掉前先停止。
package me.rerun.akkanotes.supervision
import akka.actor.Actor
import akka.actor.ActorLogging
import akka.actor.ActorSystem
import akka.actor.OneForOneStrategy
import akka.actor.Props
import akka.actor.SupervisorStrategy.Escalate
import akka.actor.SupervisorStrategy.Stop
import akka.actor.actorRef2Scala
object EscalateExceptionApp extends App {
val actorSystem = ActorSystem("EscalateExceptionSystem")
val actor = actorSystem.actorOf(Props[EscalateExceptionTopLevelActor], "topLevelActor")
actor ! "create_parent"
}
class EscalateExceptionTopLevelActor extends Actor with ActorLogging {
override val supervisorStrategy = OneForOneStrategy() {
case _: Exception => {
log.info("The exception from the Child is now handled by the Top level Actor. Stopping Parent Actor and its children.")
Stop //Stop will stop the Actor that threw this Exception and all its children
}
}
def receive = {
case "create_parent" => {
log.info("creating parent")
val parent = context.actorOf(Props[EscalateExceptionParentActor], "parentActor")
parent ! "create_child" //Sending message to next level
}
}
}
class EscalateExceptionParentActor extends Actor with ActorLogging {
override def preStart={
log.info ("Parent Actor started")
}
override val supervisorStrategy = OneForOneStrategy() {
case _: Exception => {
log.info("The exception is ducked by the Parent Actor. Escalating to TopLevel Actor")
Escalate
}
}
def receive = {
case "create_child" => {
log.info("creating child")
val child = context.actorOf(Props[EscalateExceptionChildActor], "childActor")
child ! "throwSomeException"
}
}
override def postStop = {
log.info("Stopping parent Actor")
}
}
class EscalateExceptionChildActor extends akka.actor.Actor with ActorLogging {
override def preStart={
log.info ("Child Actor started")
}
def receive = {
case "throwSomeException" => {
throw new Exception("I'm getting thrown for no reason.")
}
}
override def postStop = {
log.info("Stopping child Actor")
}
}
Log
可以在log中看到,
- 子actor抛了异常。
- supervisor(EscalateExceptionParentActor)升级了(escalate)异常抛给了他的supervisor(EscalateExceptionTopLevelActor)
- EscalateExceptionTopLevelActor 的指令是关闭actor。在顺序上,子actor先停止。
- 父actor之后再停止(在watcher被通知后)
INFO m.r.a.s.EscalateExceptionTopLevelActor - creating parent
INFO m.r.a.s.EscalateExceptionParentActor - Parent Actor started
INFO m.r.a.s.EscalateExceptionParentActor - creating child
INFO m.r.a.s.EscalateExceptionChildActor - Child Actor started
INFO m.r.a.s.EscalateExceptionParentActor - The exception is ducked by the Parent Actor. Escalating to TopLevel Actor
INFO m.r.a.s.EscalateExceptionTopLevelActor - The exception from the Child is now handled by the Top level Actor. Stopping Parent Actor and its children.
ERROR akka.actor.OneForOneStrategy - I'm getting thrown for no reason.
java.lang.Exception: I'm getting thrown for no reason.
at me.rerun.akkanotes.supervision.EscalateExceptionChildActor$$anonfun$receive$3.applyOrElse(EscalateExceptionApp.scala:71) ~[classes/:na]
...
...
INFO m.r.a.s.EscalateExceptionChildActor - Stopping child Actor
INFO m.r.a.s.EscalateExceptionParentActor - Stopping parent Actor
请记住无论哪个指令发出只会使用在被escalated的子类上。 例如,一个restart指令从顶层发出,只有父类会被重启并且在构造函数中/preStart中的都会被执行。如果一个父actor的子类在构造函数中呗创建,他们就会被创建。然而,在消息中创建的child的父actor仍然会在Terminated状态。
TRIVIA
实际上,你可以控制是否preStart被调用。我们可以在下节看到。如果你好奇,可以看下Actor中的**postRestart*方法
def postRestart(reason: Throwable): Unit = {
preStart()
}
代码
跟往常一样,代码在github
文章来自微信平台「麦芽面包」(微信扫描二维码关注)。未经允许,禁止转载。