TCP Congestion Control Algorithm Analysis, Part 3 (CUBIC)

卢纳尔多

Article link: https://blog.csdn.net/u013218035/article/details/87874288

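This article takes a close look at the CUBIC congestion control algorithm in the Linux kernel and at how it addresses the slow window growth of traditional algorithms on paths with a large bandwidth-delay product. CUBIC replaces BIC-TCP's logarithmic concave growth function with a cubic function, which makes window growth independent of RTT and improves stability and scalability.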

This article analyzes the CUBIC congestion control algorithm in the linux-4.14.69 source, with reference to the paper "CUBIC: A New TCP-Friendly High-Speed TCP Variant".

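Traditional congestion control algorithms (TCP-Reno, TCP-NewReno and TCP-SACK) grow the window very slowly when the BDP (bandwidth-delay product) is large. A variety of "high-speed" algorithms (for example FAST, HSTCP, STCP, HTCP, SQRT, Westwood and BIC-TCP) emerged in response. After a series of third-party tests and performance validation, BIC-TCP became the default TCP congestion control algorithm in 2004, starting from kernel 2.6.8, while the other algorithms remained available as options.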

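The paper explains why BIC-TCP was chosen: its whole window growth function is simply a logarithmic concave function. This concave function keeps the congestion window at the saturation point, or equilibrium, much longer than convex or linear functions do (convex and linear functions have their largest window increment at the saturation point and therefore the largest overshoot when packet losses occur). These features make BIC-TCP very stable and at the same time highly scalable. In the paper's words: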

The whole window growth function is simply a logarithmic concave function. This
concave function keeps the congestion window much longer at the saturation point
or equilibrium than convex or linear functions where they have the largest window
increment at the saturation point and thus have the largest overshoot at the time
packet losses occur. These features make BIC-TCP very stable and at the same time
highly scalable.

However, when the saturation point rises by a large amount, BIC-TCP takes a long time to find the new maximum window. (CUBIC has the same problem.)

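CUBIC is the successor to BIC-TCP: it replaces BIC-TCP's window growth logic with a cubic function. Its key feature is that window growth depends only on the time between two consecutive loss events, so growth is independent of RTT, which gives CUBIC its RTT-fairness property. In 2006, starting from kernel 2.6.18, CUBIC replaced BIC-TCP as the default TCP congestion control algorithm.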

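The CUBIC window growth function is $W(t) = C(t-K)^3 + W_{max}$, where C is the CUBIC scaling parameter, t is the time elapsed since the most recent window reduction (loss), and K is the time the function needs to grow the window from $W(0)$ back to $W_{max}$. If no further loss occurs, $K = \sqrt[3]{\beta W_{max}/C}$. This follows from the window right after the reduction, $W(0) = (1-\beta)W_{max}$: substituting t = 0 gives $C(-K)^3 = -\beta W_{max}$. If this is hard to follow, see the blog post https://blog.csdn.net/dog250/article/details/53013410.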
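
As a quick numerical check of the formula above, here is a minimal floating-point sketch. The helper names (cube_root, cubic_k, cubic_window) are hypothetical, and the sample values C = 0.4 and beta = 0.2 are only illustrative inputs; this is not the kernel's fixed-point implementation.

```c
#include <assert.h>

/* Cube root of a non-negative value by Newton iteration (no libm needed). */
static double cube_root(double x)
{
	assert(x >= 0.0);
	if (x == 0.0)
		return 0.0;
	double r = x > 1.0 ? x : 1.0;	/* start at or above the root */
	for (int i = 0; i < 200; i++)
		r = (2.0 * r + x / (r * r)) / 3.0;
	return r;
}

/* K = cbrt(beta * wmax / c): seconds needed to climb back to wmax. */
static double cubic_k(double wmax, double beta, double c)
{
	assert(c > 0.0 && beta >= 0.0 && beta < 1.0 && wmax >= 0.0);
	return cube_root(beta * wmax / c);
}

/* W(t) = c * (t - K)^3 + wmax, with t in seconds since the last reduction. */
static double cubic_window(double t, double wmax, double beta, double c)
{
	double d = t - cubic_k(wmax, beta, c);

	return c * d * d * d + wmax;
}
```

With wmax = 100, beta = 0.2 and C = 0.4, cubic_window(0, ...) comes out at about 80, which is (1 - beta) * wmax, and the window returns to 100 at t = K.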

CUBIC in detail

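On receiving an ACK during congestion avoidance, CUBIC uses the cubic function to estimate the window one RTT ahead, i.e. the value of the function at (t+RTT), and takes that value as the target. Here RENOcwnd is the window that Reno would have reached over the same period.
When cwnd <= RENOcwnd, CUBIC is in the TCP-friendly region.
When cwnd > RENOcwnd:
    When cwnd < Wmax, CUBIC is in the concave region.
    When cwnd >= Wmax, CUBIC is in the convex region.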
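
The three regions above can be sketched as a small classifier. The enum and function names are hypothetical and chosen only for illustration; the kernel does not expose the regions as an explicit state.

```c
#include <assert.h>

enum cubic_region {
	CUBIC_TCP_FRIENDLY,	/* cwnd <= RENOcwnd */
	CUBIC_CONCAVE,		/* RENOcwnd < cwnd < Wmax */
	CUBIC_CONVEX		/* cwnd > RENOcwnd and cwnd >= Wmax */
};

/* Apply the region rules in the same order as the list above. */
static enum cubic_region cubic_classify(unsigned int cwnd,
					unsigned int reno_cwnd,
					unsigned int wmax)
{
	if (cwnd <= reno_cwnd)
		return CUBIC_TCP_FRIENDLY;
	if (cwnd < wmax)
		return CUBIC_CONCAVE;
	return CUBIC_CONVEX;
}

/* Human-readable label for a region, e.g. for logging. */
static const char *cubic_region_name(enum cubic_region r)
{
	switch (r) {
	case CUBIC_TCP_FRIENDLY:
		return "TCP-friendly";
	case CUBIC_CONCAVE:
		return "concave";
	case CUBIC_CONVEX:
		return "convex";
	}
	assert(!"unknown region");
	return "";
}
```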

cubic_update

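Before reading the update function, look at the corresponding pseudocode from the paper (origin_point is the new maximum window, and K is the time needed to reach origin_point):
First, update ack_cnt.
If epoch_start <= 0, i.e. a loss has occurred, start a new epoch: set epoch_start to the current time tcp_time_stamp and reset the parameters.
    When cwnd < last_max_cwnd: K = ((last_max_cwnd-cwnd)/C)^(1/3), origin_point = last_max_cwnd.
    When cwnd >= last_max_cwnd: K = 0, origin_point = cwnd.
    ack_cnt = 1, Wtcp = cwnd.
/* The cubic function calculation follows */
Compute the time t: t = tcp_time_stamp + dMin - epoch_start (current time + minimum RTT - start of this epoch).
Compute the target window that the cubic function reaches at time t: target = origin_point + C(t-K)^3.
If the target window is larger than the current window, speed up to reach the target sooner: cnt = cwnd/(target-cwnd). (cwnd grows by one segment for every cnt ACKs and roughly cwnd ACKs arrive per RTT, so this lets cwnd reach the target within one RTT.)
If target <= cwnd, slow down: cnt = 100 * cwnd.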
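
The steps above can be sketched in plain C with floating-point seconds. This is a simplified illustration under stated assumptions, not the kernel code: the struct and function names are hypothetical, one ACK is counted per call, `now` must be greater than 0 because 0 is used as the "epoch reset" marker, and the TCP-friendly adjustment is left out.

```c
#include <assert.h>

struct cubic_state {
	double epoch_start;	/* start of the epoch in seconds; 0 means reset after loss */
	double last_max_cwnd;	/* window just before the last reduction */
	double origin_point;	/* plateau of the cubic curve */
	double k;		/* seconds needed to reach origin_point */
	double w_tcp;		/* Reno-equivalent window estimate */
	double ack_cnt;		/* ACKs counted in this epoch */
};

/* Cube root of a non-negative value by Newton iteration. */
static double cube_root(double x)
{
	if (x <= 0.0)
		return 0.0;
	double r = x > 1.0 ? x : 1.0;
	for (int i = 0; i < 200; i++)
		r = (2.0 * r + x / (r * r)) / 3.0;
	return r;
}

/* Returns cnt: the number of ACKs needed to grow cwnd by one segment. */
static double cubic_update(struct cubic_state *s, double cwnd,
			   double now, double d_min, double c)
{
	assert(c > 0.0 && cwnd > 0.0 && now > 0.0);

	s->ack_cnt += 1.0;
	if (s->epoch_start <= 0.0) {		/* new epoch after a loss */
		s->epoch_start = now;
		if (cwnd < s->last_max_cwnd) {
			s->k = cube_root((s->last_max_cwnd - cwnd) / c);
			s->origin_point = s->last_max_cwnd;
		} else {
			s->k = 0.0;
			s->origin_point = cwnd;
		}
		s->ack_cnt = 1.0;
		s->w_tcp = cwnd;
	}

	double t = now + d_min - s->epoch_start;	/* one RTT ahead */
	double d = t - s->k;
	double target = s->origin_point + c * d * d * d;

	if (target > cwnd)
		return cwnd / (target - cwnd);	/* catch up within one RTT */
	return 100.0 * cwnd;			/* very small increment */
}
```

For example, with last_max_cwnd = 100, cwnd = 80, a minimum RTT of 0.1 s and C = 0.4, the first call after a loss yields a cnt of roughly 50, i.e. cwnd grows by about one segment per 50 ACKs.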

/*
 * Compute congestion window to use.
 */
static inline void bictcp_update(struct bictcp *ca, u32 cwnd, u32 acked)
{
	u32 delta, bic_target, max_cnt;
	u64 offs, t;

	ca->ack_cnt += acked;	/* count the number of ACKed packets */

	if (ca->last_cwnd == cwnd &&
	    (s32)(tcp_jiffies32 - ca->last_time) <= HZ / 32)
		return;

	/* The CUBIC function can update ca->cnt at most once per jiffy.
	 * On all cwnd reduction events, ca->epoch_start is set to 0,
	 * which will force a recalculation of ca->cnt.
	 */
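	/* On every window reduction (loss), ca->epoch_start is set to 0; otherwise ca->cnt is updated at most once per jiffy (this depends on CONFIG_HZ, which seems questionable?) */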
	if (ca->epoch_start && tcp_jiffies32 == ca->last_time)
		goto tcp_friendliness;

	ca->last_cwnd = cwnd;
	ca->last_time = tcp_jiffies32;
	
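	/* A loss occurred and the window was reduced: start a new epoch and recompute the origin of the cubic function. Its y-coordinate is the maximum window bic_origin_point; its x-coordinate is bic_K, the time needed for the window to grow to bic_origin_point. */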
	if (ca->epoch_start == 0) {
		ca->epoch_start = tcp_jiffies32;	/* record beginning */
		ca->ack_cnt = acked;			/* start counting */
		ca->tcp_cwnd = cwnd;			/* syn with cubic */
		/* bic_origin_point = max(last_max_cwnd, cwnd); bic_K is derived from the cubic function */
		if (ca->last_max_cwnd <= cwnd) {
			ca->bic_K = 0;
			ca->bic_origin_point = cwnd;
		} else {
			/* Compute new K based on
			 * (wmax-cwnd) * (srtt>>3 / HZ) / c * 2^(3*bictcp_HZ)
			 */
			ca->bic_K = cubic_root(cube_factor
					       * (ca->last_max_cwnd - cwnd));
			ca->bic_origin_point = ca->last_max_cwnd;
		}
	}
	
	/* cubic function - calc*/
	/* calculate c * time^3 / rtt,
	 *  while considering overflow in calculation of time^3
	 * (so time^3 is done by using 64 bit)
	 * and without the support of division of 64bit numbers
	 * (so all divisions are done by using 32 bit)
	 *  also NOTE the unit of those veriables
	 *	  time  = (t - K) / 2^bictcp_HZ
	 *	  c = bic_scale >> 10
	 * rtt  = (srtt >> 3) / HZ
	 * !!! The following code does not have overflow problems,
	 * if the cwnd < 1 million packets !!!
	 */
	/* t = tcp_jiffies32 - epoch_start + min_rtt: t is the elapsed time one RTT ahead, used to estimate bic_target one RTT from now */
	t = (s32)(tcp_jiffies32 - ca->epoch_start);
	t += msecs_to_jiffies(ca->delay_min >> 3);
	/* change the unit from HZ to bictcp_HZ */
	t <<= BICTCP_HZ;
	do_div(t, HZ);

	if (t < ca->bic_K)		/* t - K */
		offs = ca->bic_K - t;
	else
		offs = t - ca->bic_K;

	/* c/rtt * (t-K)^3 */
	delta = (cube_rtt_scale * offs * offs * offs) >> (10+3*BICTCP_HZ);
	if (t < ca->bic_K)                            /* below origin*/
		bic_target = ca->bic_origin_point - delta;
	else                                          /* above origin*/
		bic_target = ca->bic_origin_point + delta;
	
	/* cubic function - calc bictcp_cnt*/
	if (bic_target > cwnd) {
		ca->cnt = cwnd / (bic_target - cwnd);
	} else {
		ca->cnt = 100 * cwnd;              /* very small increment*/
	}

	/*
	 * The initial growth of cubic function may be too conservative
	 * when the available bandwidth is still unknown.
	 */
	if (ca->last_max_cwnd == 0 && ca->cnt > 20)
		ca->cnt = 20;	/* increase cwnd 5% per RTT */

tcp_friendliness:
	/* TCP Friendly */
	/* Estimate the window Reno would have reached: tcp_cwnd grows by one for every delta ACKed packets, i.e. tcp_cwnd += ack_cnt / delta */
	if (tcp_friendliness) {
		u32 scale = beta_scale;

		delta = (cwnd * scale) >> 3;
		while (ca->ack_cnt > delta) {		/* update tcp cwnd */
			ca->ack_cnt -= delta;
			ca->tcp_cwnd++;
		}

		if (ca->tcp_cwnd > cwnd) {	/* if bic is slower than tcp */
			delta = ca->tcp_cwnd - cwnd;
			max_cnt = cwnd / delta;
			if (ca->cnt > max_cnt)
				ca->cnt = max_cnt;
		}
	}

	/* The maximum rate of cwnd increase CUBIC allows is 1 packet per
	 * 2 packets ACKed, meaning cwnd grows at 1.5x per RTT.
	 */
	ca->cnt = max(ca->cnt, 2U);
}
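
To make the fixed-point arithmetic in bictcp_update easier to follow, here is a small user-space sketch of the two scaling steps: converting jiffies to BICTCP_HZ units, and computing delta from the offset. It assumes BICTCP_HZ = 10 (so one time unit is 1/1024 s) and a cube_rtt_scale of 410 (about 0.4 * 1024); neither value appears in the listing above, so treat both as assumptions. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BICTCP_HZ	10	/* assumed value: time unit is 2^-10 s */

/* Mirrors "t <<= BICTCP_HZ; do_div(t, HZ);": jiffies -> 1/1024 s units. */
static uint64_t jiffies_to_bictcp(uint32_t jiffies, uint32_t hz)
{
	assert(hz > 0);
	return ((uint64_t)jiffies << BICTCP_HZ) / hz;
}

/*
 * Mirrors "delta = (cube_rtt_scale * offs^3) >> (10 + 3*BICTCP_HZ)".
 * offs is |t - K| in 1/1024 s units; the shift removes the 2^10 scaling
 * of the constant and the 2^(3*BICTCP_HZ) scaling of the cubed time.
 */
static uint32_t cubic_delta(uint64_t offs, uint32_t cube_rtt_scale)
{
	uint64_t cubed = offs * offs * offs;

	return (uint32_t)((cube_rtt_scale * cubed) >> (10 + 3 * BICTCP_HZ));
}
```

Under these assumptions, an offset of 10 seconds (offs = 10240) gives a delta of 400 segments, which agrees with C * t^3 = 0.4 * 1000. An offset of 1 second gives 0 because the result is truncated to whole segments.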