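Transmitter and receiver modules automatically handle the failure and recovery of their parent node. However, processor and queue modules, in which processes execute, have no built-in capability for handling failure and recovery interrupts. Three typical examples of processes designed to handle failure and recovery interrupts are described below.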
Routing Topology Example
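If a node maintains a routing topology that is updated dynamically as node and link conditions change, a typical way to keep the topology current is to add a module responsible for this task to the node model. You should set the "failure intrpts" and "recovery intrpts" attributes of that module to "network wide". Whenever any node or link in the system changes condition, the root process within the module:
• receives a failure or recovery interrupt;
• calls the kernel procedure op_intrpt_type() to determine which type of interrupt occurred;
• updates the routing topology to reflect the change.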
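The following listing shows a sample code fragment from a process model that updates its routing table based on object failures and recoveries across the entire network.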
/** This state is entered when a link connected to this node or **/
/** a directly connected router fails/recovers. In both cases **/
/** the link status is changed. **/
/** Link failure leads to setting the metric for the entries in **/
/** the RIP routing table for which the next hop was reachable **/
/** through the failed interface to infinite. **/
/** Link recovery inserts the route to the directly connected **/
/** neighbor on the recovered link into the routing table. **/
/** Both cases trigger sending of RIP route update messages. **/
if (intrpt_type == OPC_INTRPT_FAIL)
    {
    /** Link has failed. **/
    /* Handle the failure of the connected link. */
    status_changed = rip_link_fail (table_index);
    }
else
    {
    /** Link has recovered. **/
    /* Handle the link recovery. */
    status_changed = rip_link_recover (table_index);
    }
/* Routing table has changed. Schedule generation of */
/* route update messages. */
·
·
·
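The comments in the listing above describe what the failure and recovery handlers are expected to do, but their bodies are not shown. As a rough illustration only, the following standalone C sketch mimics that behavior on a small in-memory table. Everything in it is an assumption made for the example: the RouteEntry and RouteTable types, the sketch_link_fail() and sketch_link_recover() names and signatures, the table capacity, and the choice of metric 1 for a directly connected neighbor. Only the use of 16 as the RIP "infinity" metric comes from the RIP specification. It is not the code of any shipped RIP process model.

```c
#include <assert.h>
#include <stddef.h>

#define RIP_INFINITY 16  /* RIP treats a metric of 16 as unreachable. */
#define MAX_ROUTES   8   /* Hypothetical table capacity for this sketch. */

/* Hypothetical route entry; not the layout used by any real model. */
typedef struct {
    int dest;      /* destination (here: a neighbor id)            */
    int next_hop;  /* neighbor through which dest is reached       */
    int iface;     /* index of the outgoing interface              */
    int metric;    /* hop count, RIP_INFINITY when unreachable     */
    int in_use;    /* nonzero if this slot holds a route           */
} RouteEntry;

typedef struct {
    RouteEntry routes[MAX_ROUTES];
} RouteTable;

/* Set the metric of every route that leaves through the failed     */
/* interface to infinity. Returns 1 if any entry changed, else 0.   */
int sketch_link_fail(RouteTable *t, int iface)
{
    int changed = 0;
    assert(t != NULL);
    for (size_t i = 0; i < MAX_ROUTES; i++) {
        RouteEntry *r = &t->routes[i];
        if (r->in_use && r->iface == iface && r->metric != RIP_INFINITY) {
            r->metric = RIP_INFINITY;
            changed = 1;
        }
    }
    return changed;
}

/* Insert (or refresh) the direct route to the neighbor on the      */
/* recovered interface. Returns 1 if the table changed, else 0.     */
int sketch_link_recover(RouteTable *t, int iface, int neighbor)
{
    RouteEntry *free_slot = NULL;
    assert(t != NULL);
    for (size_t i = 0; i < MAX_ROUTES; i++) {
        RouteEntry *r = &t->routes[i];
        if (r->in_use && r->dest == neighbor) {
            if (r->iface == iface && r->next_hop == neighbor && r->metric == 1)
                return 0;              /* route is already up to date */
            r->next_hop = neighbor;
            r->iface = iface;
            r->metric = 1;             /* assumed cost of a direct link */
            return 1;
        }
        if (!r->in_use && free_slot == NULL)
            free_slot = r;             /* remember the first empty slot */
    }
    if (free_slot == NULL)
        return 0;                      /* table full: nothing changed */
    free_slot->dest = neighbor;
    free_slot->next_hop = neighbor;
    free_slot->iface = iface;
    free_slot->metric = 1;
    free_slot->in_use = 1;
    return 1;
}
```

In this sketch the return value plays the role of status_changed: a caller would schedule route update messages only when it is nonzero.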
Node Failure/Recovery Example
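A node failure is generally treated as a catastrophic event for the modules within the node. Transmitters automatically abort any packet in transmission and flush their transmission queues. Receivers do not accept packets while the node is disabled. In addition, a process may need to be notified of the node's failure so that it can leave itself in a clean state. For example, a process in a queue module may need to flush its packet buffers, or a process may need to destroy its child processes. If a module needs to be notified of the failure of its parent node, the module's "failure intrpts" attribute can be set to "local only". A process set to receive the interrupt is invoked before the node actually fails, so that it can take whatever action is necessary to handle the node failure. Typically, the process calls the kernel procedure op_intrpt_type() to determine that the interrupt is a failure interrupt. The following listing shows a sample code fragment from a process model that flushes its packet buffers before its parent node fails.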
/* This process just received an interrupt. Based on the type of */
/* interrupt, take the appropriate actions. */
/* Determine which interrupt type just occurred. */
intrpt_type = op_intrpt_type ();
if (intrpt_type == OPC_INTRPT_STREAM)
    {
    /* Received stream intrpt. Call procedure to handle the packet. */
    packet_receive_and_enqueue ();
    }
else if (intrpt_type == OPC_INTRPT_SELF)
    {
    /* Received self interrupt. Call procedure to send a packet. */
    packet_dequeue_and_send ();
    }
else if (intrpt_type == OPC_INTRPT_FAIL)
    {
    /* This process just received a failure interrupt for its */
    /* parent node and must flush the subqueues and cancel self */
    /* interrupts. */
    /* Flush the subqueues. */
    op_q_flush ();
    /* Cancel all self interrupts. */
    op_intrpt_clear_self ();
    }
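The fragment above relies on kernel procedures and cannot be run outside the simulator. To make the intended effect easier to see, here is a minimal standalone C sketch of the same dispatch logic acting on a toy queue. The SketchQueue type, the SketchIntrptType enum, the sketch_handle_intrpt() function, and the single "pending self interrupt" flag are all hypothetical stand-ins invented for this example; they only approximate what the kernel's subqueues and event list do.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4  /* Hypothetical buffer size for this sketch. */

/* Stand-in for the interrupt type codes returned by the kernel. */
typedef enum { INTRPT_STREAM, INTRPT_SELF, INTRPT_FAIL } SketchIntrptType;

/* Toy model of a queue module: a packet buffer plus one flag that  */
/* represents a scheduled self interrupt.                           */
typedef struct {
    int    packets[QUEUE_CAPACITY];
    size_t count;
    int    self_intrpt_pending;
} SketchQueue;

/* Handle one interrupt. Returns 1 if the interrupt caused an       */
/* action (enqueue, send, or flush) and 0 otherwise.                */
int sketch_handle_intrpt(SketchQueue *q, SketchIntrptType type, int packet_id)
{
    assert(q != NULL);
    switch (type) {
    case INTRPT_STREAM:
        if (q->count == QUEUE_CAPACITY)
            return 0;                       /* buffer full: drop packet */
        q->packets[q->count++] = packet_id; /* enqueue at the tail      */
        q->self_intrpt_pending = 1;         /* schedule a later send    */
        return 1;
    case INTRPT_SELF:
        if (q->count == 0) {
            q->self_intrpt_pending = 0;
            return 0;                       /* nothing to send          */
        }
        for (size_t i = 1; i < q->count; i++)
            q->packets[i - 1] = q->packets[i]; /* remove the head       */
        q->count--;
        q->self_intrpt_pending = (q->count > 0);
        return 1;
    case INTRPT_FAIL:
        q->count = 0;                       /* flush the buffer         */
        q->self_intrpt_pending = 0;         /* cancel scheduled sends   */
        return 1;
    }
    return 0;
}
```

The point of the failure branch is that no packet and no scheduled send survives it, so a self interrupt arriving afterwards finds nothing to transmit.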
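Similarly, when a node recovers, the processes within that node usually need to reinitialize themselves. For example, a process might reset its counters, zero out its arrays, and transition to a state in which it expects incoming packets. If a process needs to be notified only when its parent node recovers, the "recovery intrpts" attribute of its containing module is set to "local only", so that a recovery interrupt is delivered to the module when the node recovers. The process calls the kernel procedure op_intrpt_type() to distinguish this interrupt type from others, such as stream or self interrupts. The following listing shows a sample code fragment from a process model that reinitializes its state variables when its parent node recovers.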
/* This process just received an interrupt. Based on the type of */
/* interrupt, take the appropriate actions. */
/* Determine which interrupt type just occurred. */
intrpt_type = op_intrpt_type ();
if (intrpt_type == OPC_INTRPT_SELF)
    {
    /* Received self interrupt. Re-transmit packet. */
    packet_retransmit ();
    }
else if (intrpt_type == OPC_INTRPT_STREAM)
    {
    /* Received stream interrupt. Handle the packet. */
    packet_receive ();
    }
else if (intrpt_type == OPC_INTRPT_RECOVER)
    {
    /* This process just received a recovery interrupt for */
    /* its parent node and must re-initialize its state variables. */
    /* Reset counters to 0. */
    num_packets = 0;
    num_errors = 0;
    /* Set current time. */
    start_time = op_sim_time ();
    /* Reset array values. */
    for (i = 0; i < NUM_ENTRIES; i++)
        entry_array [i] = NULL_ENTRY;
    }
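The recovery branch can be exercised outside the simulator by collecting the state variables in a structure and passing the current time in as a parameter. The sketch below does that. The SketchState type, the sketch_reinit_on_recovery() function, and the values chosen for NUM_ENTRIES and NULL_ENTRY are assumptions made for illustration, and the now argument stands in for the value that op_sim_time() would return. Note that the loop condition is the comparison i < NUM_ENTRIES; an assignment in that position would not bound the loop and would write past the end of the array.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_ENTRIES 4     /* Hypothetical array size for this sketch.  */
#define NULL_ENTRY  (-1)  /* Hypothetical "empty slot" marker.         */

/* State variables of the process, gathered into one structure. */
typedef struct {
    int    num_packets;
    int    num_errors;
    double start_time;
    int    entry_array[NUM_ENTRIES];
} SketchState;

/* Reinitialize the state as the recovery branch does. The caller   */
/* supplies the current simulation time.                            */
void sketch_reinit_on_recovery(SketchState *s, double now)
{
    assert(s != NULL);

    /* Reset counters to 0. */
    s->num_packets = 0;
    s->num_errors = 0;

    /* Record the time at which the node came back up. */
    s->start_time = now;

    /* Reset every array slot; indices run from 0 to NUM_ENTRIES - 1. */
    for (size_t i = 0; i < NUM_ENTRIES; i++)
        s->entry_array[i] = NULL_ENTRY;
}
```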