Next, let's look at how path and association (assoc) failure and recovery are managed.
II. Managing transports and associations
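Multi-homing management for an association is mostly about its transports, but when enough transports/paths fail, the association itself inevitably goes down as well. So tracking path updates, failures, and recoveries cannot be separated from managing association failure and recovery.
Every send failure on a path (that is, no SACK is received) increments the association's error counter in addition to that path's own error counter. Except while the primary transport (primary path) is carrying DATA, link state is generally probed by sending HEARTBEATs, both on the primary transport when it is idle and on the alternate transports (backup paths).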
The error counters for a path and for an association are, respectively:
transport->error_count and asoc->overall_error_count
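The states of a path and of an association are, respectively:
path state: 0-Inactive, 1-Active, 2-Unconfirmed.
assoc state: 0~max, where 4 means established. The "ST" column in /proc/net/sctp/assocs shows the association state.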
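As a reading aid, the numbering above can be written down as a small C sketch. All names prefixed with demo_ / DEMO_ are hypothetical and exist only for this illustration; the numeric values are simply the ones listed above, and other kernel versions may number the states differently.

```c
/* Path states, numbered as listed in the text above (assumption: this
 * matches the kernel version being analyzed). */
enum demo_path_state {
	DEMO_PATH_INACTIVE    = 0,
	DEMO_PATH_ACTIVE      = 1,
	DEMO_PATH_UNCONFIRMED = 2
};

/* The "ST" value that the text above gives for an established association. */
#define DEMO_ASSOC_ESTABLISHED 4

/* Map a numeric path state to a printable name. */
static const char *demo_path_state_name(int state)
{
	switch (state) {
	case DEMO_PATH_INACTIVE:    return "Inactive";
	case DEMO_PATH_ACTIVE:      return "Active";
	case DEMO_PATH_UNCONFIRMED: return "Unconfirmed";
	default:                    return "Unknown";
	}
}

/* Check whether an "ST" value means the association is established. */
static int demo_assoc_is_established(int st)
{
	return st == DEMO_ASSOC_ESTABLISHED;
}
```

A helper like this is handy when decoding numeric state columns by hand.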
1. When paths are updated
Function: sctp_assoc_control_transport  # net/sctp/associola.c
Objects:  asoc->primary_path
          asoc->active_path
          asoc->retran_path
Operation types: up / down
(1). SCTP_TRANSPORT_UP points: sctp_check_transmitted, sctp_cmd_transport_on
(2). SCTP_TRANSPORT_DOWN point: sctp_do_8_2_transport_strike  # may update active_path!
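2. DOWN: when a path is taken down
When the number of retransmissions on a path exceeds the maximum (configurable through /proc/sys/net/sctp/path_max_retrans), the path is marked down.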
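To check the current threshold from user space, the value can be read like any other text file. The following is a minimal sketch; read_int_file is a hypothetical helper name, and the file is assumed to exist only when SCTP support is available on the running kernel.

```c
#include <stdio.h>

/* Read a single decimal integer from a text file such as
 * /proc/sys/net/sctp/path_max_retrans.
 * Returns 0 on success, -1 if the file cannot be opened or does not
 * start with an integer (for example when SCTP is not available). */
static int read_int_file(const char *path, int *value)
{
	FILE *fp = fopen(path, "r");
	int ok;

	if (!fp)
		return -1;
	ok = (fscanf(fp, "%d", value) == 1);
	fclose(fp);
	return ok ? 0 : -1;
}
```

Typical use would be read_int_file("/proc/sys/net/sctp/path_max_retrans", &n), printing n only when the call returns 0.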
Function: sctp_do_8_2_transport_strike. The relevant source is shown below:
/* The check for association's overall error counter exceeding the
 * threshold is done in the state function.
 */
/* We are here due to a timer expiration.  If the timer was
 * not a HEARTBEAT, then normal error tracking is done.
 * If the timer was a heartbeat, we only increment error counts
 * when we already have an outstanding HEARTBEAT that has not
 * been acknowledged.
 * Additionally, some transport states inhibit error increments.
 */
if (!is_hb) {
	asoc->overall_error_count++;
	if (transport->state != SCTP_INACTIVE)
		transport->error_count++;	// count the send failure (same below)
} else if (transport->hb_sent) {
	if (transport->state != SCTP_UNCONFIRMED)
		asoc->overall_error_count++;
	if (transport->state != SCTP_INACTIVE)
		transport->error_count++;
}
// ... (omitted) SCTP_PF state handling
if (transport->state != SCTP_INACTIVE &&
    (transport->error_count > transport->pathmaxrxt)) {	// compare path failure count with the threshold
	SCTP_DEBUG_PRINTK_IPADDR("transport_strike:association %p",
				 " transport IP: port:%d failed.\n",
				 asoc,
				 (&transport->ipaddr),
				 ntohs(transport->ipaddr.v4.sin_port));
	sctp_assoc_control_transport(asoc, transport,
				     SCTP_TRANSPORT_DOWN,	// take the path down
				     SCTP_FAILED_THRESHOLD);
}
When sctp_do_8_2_transport_strike is called (always from sctp_cmd_interpreter):
(1). SCTP_CMD_STRIKE -> sctp_do_8_2_transport_strike
Triggered from: sctp_sf_do_6_3_3_rtx, sctp_sf_t2_timer_expire, sctp_sf_t4_timer_expire
(2). SCTP_CMD_TRANSPORT_RESET -> sctp_cmd_transport_reset -> sctp_do_8_2_transport_strike
Triggered from: sctp_sf_sendbeat_8_3, sctp_sf_do_prm_requestheartbeat
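The counting rules of the strike function can be tried out in user space with a small model. This is only a sketch: the strike_* types and strike_model function are hypothetical stand-ins, not the kernel structures, the omitted SCTP_PF handling is left out, and setting the state to inactive stands in for the call to sctp_assoc_control_transport with SCTP_TRANSPORT_DOWN.

```c
/* Hypothetical user-space model of the counting logic quoted above. */
enum strike_state { STRIKE_INACTIVE, STRIKE_ACTIVE, STRIKE_UNCONFIRMED };

struct strike_assoc {
	int overall_error_count;
};

struct strike_path {
	enum strike_state state;
	int error_count;
	int pathmaxrxt;              /* per-path retransmission threshold */
	int hb_sent;                 /* non-zero: a HEARTBEAT is outstanding */
	struct strike_assoc *asoc;
};

/* Record one timer expiry against a path.
 * is_hb is non-zero when the expired timer was the heartbeat timer.
 * Returns 1 if this strike takes the path down, 0 otherwise. */
static int strike_model(struct strike_path *p, int is_hb)
{
	if (!is_hb) {
		p->asoc->overall_error_count++;
		if (p->state != STRIKE_INACTIVE)
			p->error_count++;
	} else if (p->hb_sent) {
		if (p->state != STRIKE_UNCONFIRMED)
			p->asoc->overall_error_count++;
		if (p->state != STRIKE_INACTIVE)
			p->error_count++;
	}

	if (p->state != STRIKE_INACTIVE && p->error_count > p->pathmaxrxt) {
		p->state = STRIKE_INACTIVE;   /* stands in for SCTP_TRANSPORT_DOWN */
		return 1;
	}
	return 0;
}
```

With pathmaxrxt set to 2, the first two strikes leave the path up and the third takes it down, because the comparison is a strict "greater than". Further strikes on the inactive path still increase the association counter but no longer touch the path counter.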
3. UP: when transport->error_count is cleared, indicating the path has recovered
(1). sctp_cmd_interpreter(SCTP_CMD_UPDATE_ASSOC) -> sctp_assoc_update -> sctp_transport_reset
(2). sctp_cmd_interpreter(SCTP_CMD_PROCESS_SACK) -> sctp_cmd_process_sack -> sctp_outq_sack -> sctp_check_transmitted  // on receiving a SACK
(3). sctp_cmd_interpreter(SCTP_CMD_TRANSPORT_ON) -> sctp_cmd_transport_on
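The recovery side can be modeled in the same spirit. The sketch below assumes a simplified rule: an acknowledgement on a path (a SACK, or a reply to a HEARTBEAT) clears the path counter and the association counter, and an inactive path becomes active again. The rec_* names are hypothetical, and the real functions do more work than this (for example updating the active and retransmission paths).

```c
/* Hypothetical user-space model of counter reset on recovery. */
enum rec_state { REC_INACTIVE, REC_ACTIVE, REC_UNCONFIRMED };

struct rec_assoc {
	int overall_error_count;
};

struct rec_path {
	enum rec_state state;
	int error_count;
	struct rec_assoc *asoc;
};

/* Handle an acknowledgement received on path p.
 * Returns 1 if the path went from inactive to active, 0 otherwise.
 * Assumption: other states are left unchanged in this sketch. */
static int rec_on_ack(struct rec_path *p)
{
	p->error_count = 0;
	p->asoc->overall_error_count = 0;

	if (p->state == REC_INACTIVE) {
		p->state = REC_ACTIVE;   /* stands in for SCTP_TRANSPORT_UP */
		return 1;
	}
	return 0;
}
```

The point of the model is that one successful acknowledgement wipes out all accumulated strikes, so only consecutive failures can take a path down.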
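4. When the association is torn down
When the association's retransmission count reaches the maximum (configurable through /proc/sys/net/sctp/association_max_retrans), the association is torn down.
Functions: sctp_sf_do_6_3_3_rtx, sctp_sf_sendbeat_8_3, sctp_sf_t4_timer_expire, among others.
Taking sctp_sf_do_6_3_3_rtx as an example: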
if (asoc->overall_error_count >= asoc->max_retrans) {	// check the association failure count
	if (asoc->state == SCTP_STATE_SHUTDOWN_PENDING) {
		/*
		 * We are here likely because the receiver had its rwnd
		 * closed for a while and we have not been able to
		 * transmit the locally queued data within the maximum
		 * retransmission attempts limit.  Start the T5
		 * shutdown guard timer to give the receiver one last
		 * chance and some additional time to recover before
		 * aborting.
		 */
		sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_START_ONCE,
			SCTP_TO(SCTP_EVENT_TIMEOUT_T5_SHUTDOWN_GUARD));
	} else {
		sctp_add_cmd_sf(commands, SCTP_CMD_SET_SK_ERR,
				SCTP_ERROR(ETIMEDOUT));
		/* CMD_ASSOC_FAILED calls CMD_DELETE_TCB. */
		sctp_add_cmd_sf(commands, SCTP_CMD_ASSOC_FAILED,	// tear down the association
				SCTP_PERR(SCTP_ERROR_NO_ERROR));
		SCTP_INC_STATS(net, SCTP_MIB_ABORTEDS);
		SCTP_DEC_STATS(net, SCTP_MIB_CURRESTAB);
		return SCTP_DISPOSITION_DELETE_TCB;
	}
}
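The decision in this excerpt boils down to a three-way choice, which the following sketch makes explicit. The af_* names are hypothetical and only two association states are modeled. Note that the quoted code compares with ">=" here, whereas the path check shown earlier uses a strict ">".

```c
/* Hypothetical model of the association-level check quoted above. */
enum af_state { AF_ESTABLISHED, AF_SHUTDOWN_PENDING };

enum af_action {
	AF_CONTINUE,        /* threshold not reached: keep retransmitting */
	AF_START_T5_GUARD,  /* give the peer one last chance before aborting */
	AF_ABORT            /* report ETIMEDOUT and delete the association */
};

/* Decide what to do after a retransmission timeout. */
static enum af_action af_check(int overall_error_count, int max_retrans,
			       enum af_state state)
{
	if (overall_error_count < max_retrans)
		return AF_CONTINUE;
	if (state == AF_SHUTDOWN_PENDING)
		return AF_START_T5_GUARD;
	return AF_ABORT;
}
```

For example, with max_retrans set to 10, a count of 9 keeps the association alive, while a count of 10 aborts an established association or starts the T5 guard timer for one that is already in SHUTDOWN-PENDING.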
5. When asoc->overall_error_count is cleared, indicating the association has recovered
(1). sctp_cmd_interpreter(SCTP_CMD_UPDATE_ASSOC) -> sctp_assoc_update
(2). sctp_cmd_interpreter(SCTP_CMD_PROCESS_SACK) -> sctp_cmd_process_sack -> sctp_outq_sack -> sctp_check_transmitted  // on receiving a SACK
(3). sctp_cmd_interpreter(SCTP_CMD_TRANSPORT_ON) -> sctp_cmd_transport_on
(4). sctp_cmd_interpreter(SCTP_CMD_GEN_SHUTDOWN)
PS: Since the SCTP code is relatively stable, the kernel version analyzed is 2.6.21 unless stated otherwise.