Akka Programming (20): Fault Tolerance (Part 1)

引路蜂


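As mentioned earlier when we introduced the actor system, every actor is the supervisor of its child actors, and every actor defines a supervisor strategy for handling failures. Once defined, the strategy cannot be changed afterwards, because it is an integral part of how the actor system is structured.
Fault Handling in Practice
First, let us look at an example that shows one way to handle data store errors, a typical kind of failure in real applications. A real application may of course react differently when the data source is unavailable; here we use a reconnect approach.
The source of the example follows. It is fairly long and needs careful reading; it is best to actually run it and use the log output to understand it: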

 
    import akka.actor._
    import akka.actor.SupervisorStrategy._
    import scala.concurrent.duration._
    import scala.language.postfixOps
    import akka.util.Timeout
    import akka.event.LoggingReceive
    import akka.pattern.{ask, pipe}
    import com.typesafe.config.ConfigFactory

    /**
     * Runs the sample
     */
    object FaultHandlingDocSample extends App {

      import Worker._

      val config = ConfigFactory.parseString("""
          akka.loglevel = "DEBUG"
          akka.actor.debug {
            receive = on
            lifecycle = on
          }
          """)

      val system = ActorSystem("FaultToleranceSample", config)
      val worker = system.actorOf(Props[Worker], name = "worker")
      val listener = system.actorOf(Props[Listener], name = "listener")
      // start the work and listen on progress
      // note that the listener is used as sender of the tell,
      // i.e. it will receive replies from the worker
      worker.tell(Start, sender = listener)
    }

    /**
     * Listens on progress from the worker and shuts down the system when enough
     * work has been done.
     */
    class Listener extends Actor with ActorLogging {

      import Worker._

      // If we don't get any progress within 15 seconds then the service is unavailable
      context.setReceiveTimeout(15 seconds)

      def receive = {
        case Progress(percent) =>
          log.info("Current progress: {} %", percent)
          if (percent >= 100.0) {
            log.info("That's all, shutting down")
            // shutdown() is the API of older Akka releases; newer ones use terminate()
            context.system.shutdown()
          }
        case ReceiveTimeout =>
          // No progress within 15 seconds, ServiceUnavailable
          log.error("Shutting down due to unavailable service")
          context.system.shutdown()
      }
    }

    object Worker {

      case object Start

      case object Do

      final case class Progress(percent: Double)

    }

    /**
     * Worker performs some work when it receives the `Start` message.
     * It will continuously notify the sender of the `Start` message
     * of current `Progress`. The `Worker` supervises the `CounterService`.
     */
    class Worker extends Actor with ActorLogging {

      import Worker._
      import CounterService._

      implicit val askTimeout = Timeout(5 seconds)

      // Stop the CounterService child if it throws ServiceUnavailable
      override val supervisorStrategy = OneForOneStrategy() {
        case _: CounterService.ServiceUnavailable => Stop
      }

      // The sender of the initial Start message will continuously be notified
      // about progress
      var progressListener: Option[ActorRef] = None
      val counterService = context.actorOf(Props[CounterService], name = "counter")
      val totalCount = 51

      // Use this Actor's Dispatcher as ExecutionContext
      import context.dispatcher

      def receive = LoggingReceive {
        case Start if progressListener.isEmpty =>
          progressListener = Some(sender())
          context.system.scheduler.schedule(Duration.Zero, 1 second, self, Do)
        case Do =>
          counterService ! Increment(1)
          counterService ! Increment(1)
          counterService ! Increment(1)
          // Send current progress to the initial sender
          counterService ? GetCurrentCount map {
            case CurrentCount(_, count) => Progress(100.0 * count / totalCount)
          } pipeTo progressListener.get
      }
    }

    object CounterService {

      final case class Increment(n: Int)

      case object GetCurrentCount

      final case class CurrentCount(key: String, count: Long)

      class ServiceUnavailable(msg: String) extends RuntimeException(msg)

      private case object Reconnect

    }

    /**
     * Adds the value received in `Increment` message to a persistent
     * counter. Replies with `CurrentCount` when it is asked for `CurrentCount`.
     * `CounterService` supervises `Storage` and `Counter`.
     */
    class CounterService extends Actor {

      import CounterService._
      import Counter._
      import Storage._

      // Restart the storage child when StorageException is thrown.
      // After 3 restarts within 5 seconds it will be stopped.
      override val supervisorStrategy = OneForOneStrategy(maxNrOfRetries = 3,
        withinTimeRange = 5 seconds) {
        case _: Storage.StorageException => Restart
      }

      val key = self.path.name
      var storage: Option[ActorRef] = None
      var counter: Option[ActorRef] = None
      var backlog = IndexedSeq.empty[(ActorRef, Any)]
      val MaxBacklog = 10000

      // Use this Actor's Dispatcher as ExecutionContext
      import context.dispatcher

      override def preStart(): Unit = {
        initStorage()
      }

      /**
       * The child storage is restarted in case of failure, but after 3 restarts,
       * and still failing it will be stopped. Better to back-off than continuously
       * failing. When it has been stopped we will schedule a Reconnect after a delay.
       * Watch the child so we receive Terminated message when it has been terminated.
       */
      def initStorage(): Unit = {
        storage = Some(context.watch(context.actorOf(Props[Storage], name = "storage")))
        // Tell the counter, if any, to use the new storage
        counter foreach {
          _ ! UseStorage(storage)
        }
        // We need the initial value to be able to operate
        storage.get ! Get(key)
      }

      def receive = LoggingReceive {
        case Entry(k, v) if k == key && counter == None =>
          // Reply from Storage of the initial value, now we can create the Counter
          val c = context.actorOf(Props(classOf[Counter], key, v))
          counter = Some(c)
          // Tell the counter to use current storage
          c ! UseStorage(storage)
          // and send the buffered backlog to the counter
          for ((replyTo, msg) <- backlog) c.tell(msg, sender = replyTo)
          backlog = IndexedSeq.empty

        case msg @ Increment(n) => forwardOrPlaceInBacklog(msg)

        case msg @ GetCurrentCount => forwardOrPlaceInBacklog(msg)

        case Terminated(actorRef) if Some(actorRef) == storage =>
          // After 3 restarts the storage child is stopped.
          // We receive Terminated because we watch the child, see initStorage.
          storage = None
          // Tell the counter that there is no storage for the moment
          counter foreach {
            _ ! UseStorage(None)
          }
          // Try to re-establish storage after a while
          context.system.scheduler.scheduleOnce(10 seconds, self, Reconnect)

        case Reconnect =>
          // Re-establish storage after the scheduled delay
          initStorage()
      }

      def forwardOrPlaceInBacklog(msg: Any): Unit = {
        // We need the initial value from storage before we can start delegating to
        // the counter. Before that we place the messages in a backlog, to be sent
        // to the counter when it is initialized.
        counter match {
          case Some(c) => c forward msg
          case None =>
            if (backlog.size >= MaxBacklog)
              throw new ServiceUnavailable(
                "CounterService not available, lack of initial value")
            backlog :+= (sender() -> msg)
        }
      }
    }

    object Counter {

      final case class UseStorage(storage: Option[ActorRef])

    }

    /**
     * The in memory count variable that will send current
     * value to the `Storage`, if there is any storage
     * available at the moment.
     */
    class Counter(key: String, initialValue: Long) extends Actor {

      import Counter._
      import CounterService._
      import Storage._

      var count = initialValue
      var storage: Option[ActorRef] = None

      def receive = LoggingReceive {
        case UseStorage(s) =>
          storage = s
          storeCount()
        case Increment(n) =>
          count += n
          storeCount()
        case GetCurrentCount =>
          sender() ! CurrentCount(key, count)
      }

      def storeCount(): Unit = {
        // Delegate dangerous work, to protect our valuable state.
        // We can continue without storage.
        storage foreach {
          _ ! Store(Entry(key, count))
        }
      }
    }

    object DummyDB {

      import Storage.StorageException

      private var db = Map[String, Long]()

      // Simulates a broken data store: saving any value from 11 to 14 fails
      @throws(classOf[StorageException])
      def save(key: String, value: Long): Unit = synchronized {
        if (11 <= value && value <= 14)
          throw new StorageException("Simulated store failure " + value)
        db += (key -> value)
      }

      @throws(classOf[StorageException])
      def load(key: String): Option[Long] = synchronized {
        db.get(key)
      }
    }

    object Storage {

      final case class Store(entry: Entry)

      final case class Get(key: String)

      final case class Entry(key: String, value: Long)

      class StorageException(msg: String) extends RuntimeException(msg)

    }

    /**
     * Saves key/value pairs to persistent storage when receiving `Store` message.
     * Replies with current value when receiving `Get` message.
     * Will throw StorageException if the underlying data store is out of order.
     */
    class Storage extends Actor {

      import Storage._

      val db = DummyDB

      def receive = LoggingReceive {
        case Store(Entry(key, count)) => db.save(key, count)
        case Get(key) => sender() ! Entry(key, db.load(key).getOrElse(0L))
      }
    }
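The buffering done by `forwardOrPlaceInBacklog` in the listing can be hard to see among the actor plumbing. The following is a minimal plain-Scala sketch of the same idea, without Akka: messages are queued until a delivery target exists, then flushed in order. The names `BacklogSketch`, `Buffer`, `submit` and `ready` are hypothetical and do not appear in the sample, and the sketch throws `IllegalStateException` where the sample throws `ServiceUnavailable`.

```scala
object BacklogSketch {
  /** Hypothetical helper: buffers messages until a delivery target is known. */
  final class Buffer[A](maxBacklog: Int) {
    private var backlog = Vector.empty[A]
    private var target: Option[A => Unit] = None

    /** Deliver directly if the target exists, otherwise queue the message. */
    def submit(msg: A): Unit = target match {
      case Some(deliver) => deliver(msg)
      case None =>
        // Refuse to grow without bound, like the MaxBacklog check in the sample
        if (backlog.size >= maxBacklog)
          throw new IllegalStateException("backlog full, target not available")
        backlog :+= msg
    }

    /** Called once the target is available: flush the queue in arrival order. */
    def ready(deliver: A => Unit): Unit = {
      target = Some(deliver)
      backlog.foreach(deliver)
      backlog = Vector.empty
    }
  }
}
```

In the sample the role of `ready` is played by the `Entry(k, v)` case in `CounterService.receive`, which creates the `Counter` and replays the backlog to it.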

This example defines five actors: Worker, Listener, CounterService, Counter and Storage. The figure below shows the flow when the system runs normally (no failures occur):

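Worker is the parent (supervisor) of CounterService, and CounterService is the parent (supervisor) of Counter and Storage. In the figure, light red marks supervision and white marks a plain reference. Worker holds a reference to Listener and Listener holds a reference to Worker, but there is no parent-child relationship between them. Likewise, Counter holds a reference to Storage, but Counter is not the supervisor of Storage.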
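The supervision relationships just described can be written down as a small lookup table. This is only an illustrative plain-Scala sketch; `HierarchySketch`, `supervisorOf` and `supervisionChain` are hypothetical names, and top-level actors such as Worker and Listener are shown without a supervisor even though Akka supervises them through its user guardian.

```scala
object HierarchySketch {
  // child -> supervisor (parent), as described in the text
  val supervisorOf: Map[String, String] = Map(
    "CounterService" -> "Worker",
    "Counter"        -> "CounterService",
    "Storage"        -> "CounterService"
  )

  /** Supervisors of an actor, nearest first, within this sample's hierarchy. */
  def supervisionChain(actor: String): List[String] =
    supervisorOf.get(actor) match {
      case Some(parent) => parent :: supervisionChain(parent)
      case None         => Nil
    }
}
```

For example, `HierarchySketch.supervisionChain("Storage")` yields `List("CounterService", "Worker")`: a Storage failure is first handled by CounterService, and only a failure of CounterService itself reaches Worker.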

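The normal flow is as follows:

Step	Description
1	The progress Listener tells the Worker to start working.
2	The Worker does its work by periodically sending a Do message to itself.
3, 4, 5	On receiving Do, the Worker tells its child CounterService to increment the counter three times. CounterService forwards each Increment message to the Counter, which updates its counter variable and sends the current value to Storage to be saved.
6, 7	The Worker asks CounterService for the current counter value and pipes the result to the Listener.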
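The progress value that reaches the Listener follows directly from the numbers in the listing: three increments per Do message and `totalCount = 51`. The plain-Scala sketch below reproduces that arithmetic under the assumptions that the counter starts at 0 and that every increment is applied before the count is read; `ProgressSketch`, `progressAfter` and `ticksToFinish` are hypothetical names.

```scala
object ProgressSketch {
  val totalCount = 51        // same value as Worker.totalCount in the sample
  val incrementsPerTick = 3  // Worker sends Increment(1) three times per Do

  /** Progress in percent after the given number of Do messages. */
  def progressAfter(ticks: Int): Double =
    100.0 * (ticks.toLong * incrementsPerTick) / totalCount

  /** Number of Do messages needed before progress reaches 100 percent. */
  def ticksToFinish: Int =
    Iterator.from(1).find(t => progressAfter(t) >= 100.0).get
}
```

With one Do message per second, `ticksToFinish` is 17, so in a run without long interruptions the Listener sees 100 % after roughly 17 seconds.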

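The next figure shows what happens when the system fails. In the example, Worker and CounterService each define a supervisor strategy: Worker stops CounterService when CounterService throws ServiceUnavailable, and CounterService restarts Storage when Storage throws StorageException.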
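The two strategies boil down to two partial functions from an exception to a directive. The following plain-Scala sketch models that mapping without Akka; the types here are simplified stand-ins defined inside the sketch (hypothetical `DeciderSketch`, `Decider`, `decide`), not Akka's own classes. It also shows the fallback Akka applies to a user-defined decider: an exception the decider does not match is escalated to the next supervisor.

```scala
object DeciderSketch {
  // Simplified stand-ins for Akka's supervision directives
  sealed trait Directive
  case object Restart  extends Directive
  case object Stop     extends Directive
  case object Escalate extends Directive

  // Stand-ins for the exceptions defined in the sample
  class ServiceUnavailable(msg: String) extends RuntimeException(msg)
  class StorageException(msg: String) extends RuntimeException(msg)

  type Decider = PartialFunction[Throwable, Directive]

  // Worker: stop CounterService on ServiceUnavailable
  val workerDecider: Decider = { case _: ServiceUnavailable => Stop }

  // CounterService: restart Storage on StorageException
  val counterServiceDecider: Decider = { case _: StorageException => Restart }

  /** Apply a decider, escalating anything it does not handle. */
  def decide(decider: Decider, cause: Throwable): Directive =
    decider.applyOrElse(cause, (_: Throwable) => Escalate)
}
```

Keeping the deciders this narrow is a deliberate choice in the sample: each supervisor only handles the one failure it knows how to recover from.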

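The failure flow is as follows:

Step	Description
1	Storage throws a StorageException.
2	CounterService, the supervisor of Storage, restarts Storage according to its strategy when it receives the StorageException.
3, 4, 5, 6	Storage keeps failing and is restarted.
7	Once Storage has failed and been restarted three times within 5 seconds, its supervisor (CounterService) stops it.
8	CounterService also watches Storage, so it receives a Terminated message after Storage has been stopped.
9, 10, 11	It then tells the Counter that there is no Storage for the moment.
12	CounterService schedules a Reconnect message to itself after a delay.
13, 14	When it receives the Reconnect message, it creates a new Storage.
15, 16	It then tells the Counter to use the new Storage.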
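The retry limit behind steps 2 to 7 (`maxNrOfRetries = 3` within `5 seconds`) can be modeled as a small counter over a time window. The sketch below is a simplified plain-Scala model, not Akka's implementation: `RestartBudget` and `permitRestart` are hypothetical names, the clock is passed in explicitly, and the timestamps in `demo` are made up for illustration.

```scala
object RestartWindowSketch {
  /** Hypothetical helper: counts failures inside a fixed time window. */
  final class RestartBudget(maxRetries: Int, windowMillis: Long) {
    require(maxRetries >= 1, "this simplified model assumes at least one retry")

    private var windowStart: Option[Long] = None
    private var failuresInWindow = 0

    /** True if the child may be restarted, false if it should be stopped. */
    def permitRestart(nowMillis: Long): Boolean = {
      val insideWindow = windowStart.exists(start => nowMillis - start <= windowMillis)
      if (insideWindow) {
        failuresInWindow += 1
        failuresInWindow <= maxRetries
      } else {
        // First failure, or the previous window has expired: start a new window
        windowStart = Some(nowMillis)
        failuresInWindow = 1
        true
      }
    }
  }

  // DummyDB rejects the values 11 to 14, so Storage fails four times in a row.
  def demo(): List[Boolean] = {
    val budget = new RestartBudget(maxRetries = 3, windowMillis = 5000L)
    List(0L, 300L, 1000L, 1300L).map(budget.permitRestart)
  }
}
```

`demo()` returns `List(true, true, true, false)`: the failures for 11, 12 and 13 lead to restarts, and the failure for 14 exhausts the budget, which corresponds to step 7 where Storage is stopped.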

A log from one run is provided here for reference.
