Autodidactic Neurosurgeon: Collaborative Deep Inference for Mobile Edge Intelligence via Online Learning

Paper: https://dl.acm.org/doi/abs/10.1145/3442381.3450051

Authors: Letian Zhang, Lixing Chen, Jie Xu (University of Miami)

Abstract:

  • Partition a deep neural network (DNN) into a front-end part running on the mobile device and a back-end part running on the edge server; the key challenge is locating the optimal partition point to minimize end-to-end inference delay.
  • Automatically learn the optimal partition point on the fly, closely following changes in the system environment by generating new knowledge for adaptive decision making.

1. Introduction:

  • Mobile device: limited computing resources
  • MEC: performance is sensitive to bandwidth; collaborative inference (rather than 0-1 offloading) balances the transmission and computation workloads

1.1 Numerical Insights

The computing capability of the edge server and the network condition critically affect collaborative deep inference performance.

        ① VGG16 partitioned at different layers: delay histogram

        ② Delay at different partition points under different edge capabilities

        ③ Delay at different partition points under different network conditions

1.2 Why Online Learning?

 Existing offline profiling approaches have several drawbacks.

        ① Adaptation to New Environments

        ② Limited Feedback

        ③ Layer Dependency: layer-by-layer profiling of very deep networks is laborious and neglects the interdependency between layers.

1.3 Contribution

The proposed Autodidactic Neurosurgeon (ANS) selects, for each frame (or a small batch of video frames), a partition point to perform collaborative deep inference for object detection with the edge server.

        ① Avoids the large overhead incurred in the laborious offline profiling stage.

        ② Provides differentiated service to key frames.

        ③ A novel online learning algorithm under the contextual bandit framework.

 2. System Architecture

        2.1 DNN Partition

                Marking Partition Points

                Total Inference Delay
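
A minimal sketch of how the total inference delay at a candidate partition point can be computed, assuming hypothetical per-layer on-device compute times, edge-server compute times, intermediate output sizes, and uplink bandwidth (all values below are illustrative, not measurements from the paper):

```python
# Minimal sketch: total delay = on-device compute + uplink transmission + edge compute.
# All per-layer numbers below are hypothetical placeholders.

def total_inference_delay(cut, device_ms, edge_ms, out_bytes, bandwidth_bps):
    """Delay (ms) when the first `cut` layers run on the device and the rest on the edge.

    cut == 0            -> pure edge offloading (the raw input is transmitted)
    cut == num_layers   -> pure on-device processing (nothing is transmitted)
    """
    num_layers = len(device_ms)
    front = sum(device_ms[:cut])      # on-device computation of the front-end
    back = sum(edge_ms[cut:])         # edge-server computation of the back-end
    if cut == num_layers:
        tx = 0.0                      # nothing to transmit
    else:
        tx = out_bytes[cut] * 8 / bandwidth_bps * 1000.0   # uplink transmission in ms
    return front + tx + back

# Hypothetical 4-layer network.
device_ms = [30.0, 25.0, 40.0, 10.0]           # per-layer time on the mobile device
edge_ms   = [3.0, 2.5, 4.0, 1.0]               # per-layer time on the edge server
out_bytes = [602112, 802816, 401408, 100352]   # out_bytes[k]: bytes sent when the first k layers stay on the device
bandwidth = 10e6                               # 10 Mbps uplink

best = min(range(len(device_ms) + 1),
           key=lambda k: total_inference_delay(k, device_ms, edge_ms, out_bytes, bandwidth))
print("best partition point:", best)
```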

        2.2 Edge Offloading Delay Prediction

                Constructed Contextual Features of Partitions

                Linear Prediction Model (corresponds to the LinUCB algorithm)
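
A sketch of the linear prediction model in LinUCB form, assuming each candidate partition point is summarized by a hand-constructed feature vector (e.g., transmitted data size and estimated edge workload). This is a generic LinUCB implementation, not the paper's exact code; because the objective is to minimize delay, the optimistic choice takes the lowest lower-confidence-bound estimate:

```python
import numpy as np

class LinUCB:
    """Generic LinUCB learner for a linear delay-prediction model (sketch)."""

    def __init__(self, dim, alpha=1.0, lam=1.0):
        self.A = lam * np.eye(dim)   # regularized Gram matrix of observed features
        self.b = np.zeros(dim)       # accumulated delay-weighted features
        self.alpha = alpha           # width of the confidence bound (exploration strength)

    def select(self, features):
        """features: list of 1-D numpy arrays, one per candidate partition point."""
        theta = np.linalg.solve(self.A, self.b)   # ridge-regression estimate
        A_inv = np.linalg.inv(self.A)
        scores = [x @ theta - self.alpha * np.sqrt(x @ A_inv @ x) for x in features]
        return int(np.argmin(scores))             # lowest optimistic delay estimate

    def update(self, x, observed_delay):
        self.A += np.outer(x, x)
        self.b += observed_delay * x
```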

        2.3 Object Detection in Video Stream: Key Frames (SSIM)

        For consecutive video frames, frames with high similarity to the previous frame are used for exploration, while low-similarity frames (key frames) are used for exploitation.
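
A minimal sketch of SSIM-based key-frame detection, assuming grayscale uint8 frames, scikit-image's structural_similarity, and a hypothetical similarity threshold (the paper's actual threshold and pre-processing may differ):

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

SIM_THRESHOLD = 0.9   # hypothetical threshold, not taken from the paper

def is_key_frame(prev_frame: np.ndarray, frame: np.ndarray) -> bool:
    """Low similarity to the previous frame means new scene content, i.e., a key frame."""
    similarity = ssim(prev_frame, frame, data_range=255)
    return similarity < SIM_THRESHOLD
```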

 

3. Autodidactic Neurosurgeon (Algorithm)

        3.1 LinUCB and its Limitation

                ① LinUCB treats every frame equally for learning purposes, without giving special consideration to key frames.

                ② For cut points 0 and P, the reward does not satisfy the linear prediction model; the contextual feature is 0 for pure on-device processing, so the learner can become trapped in pure on-device processing.

        3.2 μLinUCB

                ① Each frame is assigned a weight depending on whether it is a key frame (or on its likelihood of being a key frame).

                ② Adds randomness to partition point selection via forced sampling (see the sketch after this list).

                Theoretical Performance Guarantee

                Handling Unknown: μLinUCB starts with a high frequency of forced sampling and gradually reduces it as more video frames are analyzed.

                Complexity Analysis
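
A sketch of the two modifications on top of the LinUCB class above, using a hypothetical forced-sampling schedule and hypothetical key-frame weights; it illustrates the mechanism (weighted updates plus decaying forced exploration so the learner is not trapped at uninformative cut points such as pure on-device processing) rather than reproducing the paper's exact μLinUCB algorithm:

```python
import numpy as np

def forced_sampling_due(t: int) -> bool:
    # Hypothetical schedule: force roughly O(sqrt(t)) of the first t frames,
    # so the forcing frequency starts high and decays as more frames are analyzed.
    # t is the 1-based frame index.
    return int(np.sqrt(t)) != int(np.sqrt(t - 1))

def select_partition(bandit, features, t, rng):
    """bandit: the LinUCB instance sketched above; rng: e.g. np.random.default_rng()."""
    if forced_sampling_due(t):
        return int(rng.integers(len(features)))   # forced random cut point (exploration)
    return bandit.select(features)                # LinUCB's optimistic choice (exploitation)

def weighted_update(bandit, x, observed_delay, weight):
    # Scaling both the feature and the delay by sqrt(weight) weights this frame's
    # squared loss by `weight`, so key frames (larger weight) influence the model more.
    x = np.asarray(x, dtype=float)
    bandit.update(np.sqrt(weight) * x, np.sqrt(weight) * observed_delay)
```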

4. Experiment Results

        4.1 Implementation and Setup

                ① Testbed ② DL Models and Platforms ③ Video Input and Detection Output ④ Benchmarks (Oracle / Pure Edge Offloading / Pure On-device Processing / Neurosurgeon)

        4.2 Results and Discussions
