[Paper Reading] September 6, 2021

1 Basic information

  • Title: 7 Self-learning Congestion Control of MPTCP in Satellites Communications
  • Simulator: ns-3 (TensorFlow for DDPG)
  • Source code (unofficial):
    https://github.com/JamesRaynor67/mptcp_with_machine_learning
    https://github.com/kallen666/MPTCP-Deep-Reinforcement-Learning
  • Conference: IEEE International Wireless Communications and Mobile Computing Conference (IWCMC) 2019, class B
  • Institution: Beijing University of Posts and Telecommunications
  • Highlights: applied to satellite communications; MPTCP

2 Overview

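  1. The problem to solve:
    Congestion control across the multiple sub-flows of MPTCP in LEO satellite communications relies mainly on a manual process, which performs poorly in dynamic, complex network environments.
    The fast movement of LEO satellites also causes frequent handovers, which must be handled.
  2. The proposed method:
    A congestion control mechanism that applies DRL to the environment above and learns a control policy for each sub-flow.
  3. What I can do:
    Apply RL-based congestion control (RL-CC) to underwater electric-field communication.
  4. Material I still want:
    DDPG tutorial code, stable-baselines usage, and an MDP tutorial.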

3 Details

  1. RL
  • agent

  • action: [cwnd_i] of each sub-flow

  • state: congestion window size cwnd_t, round-trip time rtt_t, ACK number ack_t, and retransmission rate rta_t of each sub-flow.
    i: the i-th sub-flow

  • reward: [figure: reward function]

  • algorithm: DDPG

  • NN: a policy network with two connected hidden layers and a deep Q-value network with three convolutional neural networks.

  • topology: 6 nodes and 11 full-duplex links; the source node has two IP addresses and the destination node has one IP address.
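
The state and action listed above can be sketched in a few lines of Python. This is only an illustration of the shapes involved: the class name `SubflowObs`, the helper functions, and the clipping bounds `min_cwnd` / `max_cwnd` are assumptions made for this note and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SubflowObs:
    """Observation of one sub-flow at time t (field names are illustrative)."""
    cwnd: float  # congestion window size, in packets
    rtt: float   # round-trip time, in seconds
    ack: float   # number of ACKs seen in the interval
    rta: float   # retransmission rate, in [0, 1]


def build_state(subflows: Sequence[SubflowObs]) -> List[float]:
    """Concatenate (cwnd, rtt, ack, rta) of every sub-flow into one vector."""
    state: List[float] = []
    for sf in subflows:
        state.extend([sf.cwnd, sf.rtt, sf.ack, sf.rta])
    return state


def action_to_cwnd(action: Sequence[float],
                   min_cwnd: float = 1.0,
                   max_cwnd: float = 1000.0) -> List[float]:
    """Clip raw actor outputs to a valid window range (bounds are assumed)."""
    return [min(max(a, min_cwnd), max_cwnd) for a in action]


obs = [SubflowObs(10, 0.05, 8, 0.01), SubflowObs(20, 0.12, 15, 0.03)]
state = build_state(obs)                # 4 values per sub-flow
cwnds = action_to_cwnd([0.2, 2500.0])   # one window per sub-flow
```

With two sub-flows the state has eight entries and the action has two, so the actor network maps an 8-vector to a 2-vector in this toy setting.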

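  2. MPTCP (Multipath TCP)
    A transport mechanism that uses multiple paths to raise throughput: one data stream is split into several sub-flows that are sent over different paths. It is an extension of TCP that splits TCP into two parts, the MPTCP layer and the TCP layer (sub-flow layer). The MPTCP layer manages the underlying sub-flows through functions such as path optimization, packet scheduling, and congestion control. It is also transparent: it offers the standard TCP interface to the application layer and hides the complexity of multipath behind it.
  • MPTCP protocol stack:
    [figure: MPTCP protocol stack]
  • The IETF (Internet Engineering Task Force) set three goals for MPTCP: improve throughput (multi-path flows should achieve at least the throughput of a single flow on the best path), do no harm (it must not take so many network resources that it hurts other single-path TCP flows), and balance congestion (avoid sending data on highly congested paths, which supports the first two goals).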
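
To make the "one stream, several sub-flows" idea concrete, here is a toy sketch of the split and the reassembly. It tags each segment with its byte offset as a stand-in for a data-level sequence number; the function names and the round-robin assignment are assumptions for illustration, and a real MPTCP stack carries this mapping in TCP options rather than in Python tuples.

```python
from typing import Dict, List, Tuple

Segment = Tuple[int, bytes]  # (data-level sequence number, payload)


def split_stream(data: bytes, n_subflows: int,
                 seg_size: int) -> Dict[int, List[Segment]]:
    """Cut one byte stream into segments and spread them over sub-flows."""
    queues: Dict[int, List[Segment]] = {i: [] for i in range(n_subflows)}
    for k, offset in enumerate(range(0, len(data), seg_size)):
        queues[k % n_subflows].append((offset, data[offset:offset + seg_size]))
    return queues


def reassemble(queues: Dict[int, List[Segment]]) -> bytes:
    """Merge segments from all sub-flows back into order by sequence number."""
    segments = [seg for q in queues.values() for seg in q]
    return b"".join(payload for _, payload in sorted(segments))


queues = split_stream(b"hello multipath tcp", n_subflows=2, seg_size=4)
restored = reassemble(queues)  # the application sees one ordered stream
```

The application only ever sees `restored`, which is the transparency property described above.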
  3. LEO (low earth orbit) satellites communications and networking
    Low Earth orbit satellite communications.
  • Pros: low orbit (500-1500 km) and short range, high throughput, lower propagation delay, and less energy to deploy (compared with Medium Earth Orbit (MEO) and Geostationary Earth Orbit (GEO)).
  • Cons: high moving speed (a handover every 8-10 minutes) and frequent handovers (which can cause routing failures, channel quality changes, and packet blocking, and ultimately degrade service performance).
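
A quick back-of-the-envelope check of the "lower propagation delay" point: the sketch below computes the one-way delay for a satellite directly overhead, using the 500-1500 km range from the notes. The GEO altitude of about 35,786 km is a well-known figure that is not in the notes, and the straight-up geometry is a simplifying assumption (real slant paths are longer).

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum, km/s


def one_way_delay_ms(altitude_km: float) -> float:
    """One-way propagation delay to a satellite directly overhead, in ms."""
    return altitude_km / SPEED_OF_LIGHT_KM_S * 1000.0


leo_low = one_way_delay_ms(500)      # lower edge of the LEO range in the notes
leo_high = one_way_delay_ms(1500)    # upper edge of the LEO range in the notes
geo = one_way_delay_ms(35_786)       # assumed GEO altitude, not from the notes
```

Under these assumptions LEO gives roughly 1.7 to 5.0 ms one way, against roughly 119 ms for GEO.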
  4. Satellite Communications with MPTCP
    Applying MPTCP to LEO satellite networks not only increases bandwidth but also, during a satellite handover, smoothly shifts the traffic on the disconnected sub-flow to the other flows.
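
One simple way to picture the traffic shift at handover is to renormalise the traffic shares over the sub-flows that are still connected. The function below is a hypothetical sketch of that idea only; the name `shift_traffic`, the share dictionary, and the proportional rule are assumptions, not the mechanism used in the paper.

```python
from typing import Dict


def shift_traffic(shares: Dict[str, float], lost: str) -> Dict[str, float]:
    """Give the share of a disconnected sub-flow to the rest, proportionally."""
    remaining = {name: s for name, s in shares.items() if name != lost}
    total = sum(remaining.values())
    if total == 0:
        raise ValueError("no active sub-flow left to carry the traffic")
    return {name: s / total for name, s in remaining.items()}


before = {"sat_a": 0.5, "sat_b": 0.3, "sat_c": 0.2}
after = shift_traffic(before, "sat_a")  # sat_a is lost during the handover
```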


  5. MDP (Markov Decision Process)
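
For reference, the textbook form of an MDP is written out below. The symbols are the standard ones and are assumed here rather than taken from the paper. In the setting of these notes, the state s_t would hold (cwnd, rtt, ack, rta) of each sub-flow and the action a_t would be the vector [cwnd_i].

```latex
% Standard MDP notation (textbook form; symbols are assumptions, not the paper's)
\[
  \mathcal{M} = (\mathcal{S}, \mathcal{A}, P, R, \gamma), \qquad
  s_{t+1} \sim P(\cdot \mid s_t, a_t), \qquad
  r_t = R(s_t, a_t)
\]
% The agent looks for a policy \pi that maximises the expected discounted return
\[
  J(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} r_t \right],
  \qquad 0 \le \gamma < 1
\]
```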

  6. Deep Deterministic Policy Gradient (DDPG)
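
Two building blocks of DDPG are easy to show without any deep learning library: the critic's bootstrapped target and the slow ("soft") update of the target networks. The sketch below uses plain Python lists in place of network weights; the function names and the default values of `gamma` and `tau` are assumptions for illustration. A full DDPG agent also needs the actor and critic networks, a replay buffer, and exploration noise, none of which are shown here.

```python
from typing import List


def td_target(reward: float, next_q: float,
              gamma: float = 0.99, done: bool = False) -> float:
    """Critic target y = r + gamma * Q'(s', mu'(s')); no bootstrap when done."""
    return reward if done else reward + gamma * next_q


def soft_update(target: List[float], online: List[float],
                tau: float = 0.005) -> List[float]:
    """Polyak averaging: theta_target <- tau * theta + (1 - tau) * theta_target."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]


y = td_target(reward=1.0, next_q=2.0, gamma=0.9)
new_target = soft_update(target=[0.0, 1.0], online=[1.0, 1.0], tau=0.1)
```

Here `next_q` stands for the target critic evaluated at the next state and the target actor's action; a small `tau` keeps the target networks changing slowly, which is what stabilises training.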

  7. Traditional MPTCP algorithms (used as baselines)

  • round-robin (RR): sub-flows have no priority and are served in turn, in a cyclic order.
  • lowest RTT first RR (LRTT): schedules data onto the sub-flow with the lowest RTT.
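
The two baselines can be sketched as tiny schedulers. This is an illustrative reading of the one-line descriptions above, not the ns-3 code used in the paper: the function names are hypothetical, and the `has_room` flag (whether a sub-flow can accept more data right now) is an assumption added so that LRTT can fall back to a slower sub-flow.

```python
from itertools import cycle
from typing import Dict, Iterator, List


def round_robin(subflows: List[str]) -> Iterator[str]:
    """RR: hand out sub-flows in a fixed cyclic order, with no priority."""
    return cycle(subflows)


def lowest_rtt_first(rtts: Dict[str, float], has_room: Dict[str, bool]) -> str:
    """LRTT: pick the usable sub-flow with the smallest measured RTT."""
    candidates = [name for name in rtts if has_room.get(name, False)]
    if not candidates:
        raise RuntimeError("no sub-flow can accept data right now")
    return min(candidates, key=rtts.__getitem__)


rr = round_robin(["sf0", "sf1"])
rr_order = [next(rr) for _ in range(4)]

rtts = {"sf0": 0.08, "sf1": 0.03}
pick = lowest_rtt_first(rtts, {"sf0": True, "sf1": True})
fallback = lowest_rtt_first(rtts, {"sf0": True, "sf1": False})
```

Both rules are fixed in advance and ignore how the network evolves, which is the limitation the learned policy is meant to address.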

4 Writing

  • Introduction

In [9], Cao et al. proposed an approximate iterative algorithm based on the "Congestion Equality Principle" to solve the multipath congestion control problem.

In [1], Mai et al. proposed a deep reinforcement learning based congestion control mechanism for the Multipath TCP transport protocol to improve the performance of low earth orbit satellite communication and networking.

However, these algorithms rely largely on a manual process, which has poor scalability and robustness in complex system control. Therefore, there is a need for more powerful methods to deal with the challenges faced in networking.

However, none of these algorithms is designed for electrocommunication and networking environments. Therefore, there is a need for more powerful methods to deal with the challenges faced in networking.

Inspired by the recent success of machine learning in other challenging domains, such as video games and autonomous vehicles, in this paper we apply a deep reinforcement learning approach to optimize the congestion control strategy, maximizing throughput while guaranteeing fairness. In addition, simulation results are presented to evaluate the correctness of our architecture and algorithm.

The rest of this paper is organized as follows. In Section II, we present a new architecture that combines MPTCP with satellite communications and formulate the problem of multipath congestion control in MPTCP. In Section III, we apply the deep deterministic policy gradient algorithm to search for the optimal congestion control strategy. In Section IV, we present simulation results to demonstrate the performance of our architecture and algorithm, and we summarize the work in Section V.

  • Simulation

These algorithms adopt pre-defined deterministic strategies, which struggle to cope with complex network environments [18]. As shown in Fig. ??, thanks to the strong fitting ability of deep neural networks, our algorithm achieves higher throughput than the RR and LRTT algorithms.

  • Reference
    [1] Mai, T., Yao, H., Jin, Y., Xu, X., & Ji, Z. (2019). Self-learning Congestion Control of MPTCP in Satellites Communications. IWCMC 2019.