Deep Learning With Edge Computing: A Review
INTRODUCTION
the challenge of moving data from source to cloud
- latency
- scalability
- privacy
- accommodating the high resource requirements of deep learning on less powerful edge compute resources
- how the edge devices should coordinate with other edge devices and with the cloud
- privacy of the data exchanged between edge devices and possibly the cloud
BACKGROUND, MEASUREMENTS, AND FRAMEWORKS
A. Background on Deep Learning
the tradeoff between accuracy and computational resources and energy
APPLICATIONS OF DEEP LEARNING AT THE EDGE
Computer Vision
- cameras located at the network edge
- Uploading camera data to the cloud also has privacy concerns
- Scalability (uplink bandwidth to a cloud server may become a bottleneck)
Examples: Vigil, VideoEdge
Natural Language Processing
voice assistants: wakeword detection
Network Functions
- intrusion detection
- wireless scheduling
- In-network caching
Internet of Things
- human activity recognition from wearable sensors
- pedestrian traffic in a smart city
- electrical load prediction in a smart grid
motivation:
- compressing the deep learning models to fit onto computationally weak end devices
- privacy concerns
Virtual Reality and Augmented Reality
METHODS FOR FAST INFERENCE
On-Device Computation
- Model Design
reducing memory and execution latency, while aiming to preserve high accuracy
- Model Compression
compress the existing DNN models with minimal accuracy loss compared with the original model
- Hardware
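One common form of model compression is weight quantization. The sketch below is only an illustration under assumptions not taken from the notes: it uses uniform 8-bit quantization over a plain list of floats, and the function names `quantize_weights` and `dequantize_weights` are hypothetical.

```python
def quantize_weights(weights, num_bits=8):
    """Uniformly map float weights to integer codes in [0, 2**num_bits - 1]."""
    levels = 2 ** num_bits - 1
    w_min, w_max = min(weights), max(weights)
    # Guard against a constant weight vector, which has zero range.
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min


def dequantize_weights(codes, scale, w_min):
    """Recover approximate float weights from the integer codes."""
    return [c * scale + w_min for c in codes]


# Example weights are made up for illustration.
weights = [-0.42, 0.0, 0.13, 0.9]
codes, scale, w_min = quantize_weights(weights)
restored = dequantize_weights(codes, scale, w_min)

# Rounding to the nearest level bounds the per-weight error by scale / 2.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored in 8 bits instead of a 32-bit float, at the cost of an error of at most half a quantization step per weight.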
Edge Server Computation
- Data Preprocessing
reduce data redundancy and thus decrease communication time
- Edge Resource Management
Transfer learning enables multiple applications to share the common lower layers of the DNN model and compute only the higher layers unique to each specific application
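The sharing of lower layers described above can be sketched as follows. This is a toy illustration, not the method of any particular system: the "lower layers" are a single ReLU, the per-application heads are dot products, and all names (`shared_lower_layers`, `make_head`, `serve`) are hypothetical.

```python
def shared_lower_layers(x):
    """Stand-in for the common lower DNN layers (here, just a ReLU)."""
    return [max(0.0, v) for v in x]


def make_head(weights):
    """Build an application-specific top layer as a simple dot product."""
    def head(features):
        return sum(w * f for w, f in zip(weights, features))
    return head


# Two applications with different heads over the same features (made-up weights).
heads = {
    "app_a": make_head([1.0, 0.0, 2.0]),
    "app_b": make_head([0.5, 0.5, 0.5]),
}


def serve(x, app_heads):
    # The shared layers run once per input, not once per application.
    features = shared_lower_layers(x)
    return {name: head(features) for name, head in app_heads.items()}


outputs = serve([1.0, -2.0, 3.0], heads)
```

The saving grows with the number of applications, since only the small heads are computed per application.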
Computing Across Edge Devices
- Offloading
which DNN model or which portion of the model to run
(the size of the data, the hardware capabilities, the DNN model to be executed, network quality)
- DNN Model Partitioning
1. layer-wise partitioning
some layers are computed on the device, and some layers are computed by the edge server or the cloud (60, 13)
2. input-wise partitioning (107, 60)
- Edge Devices Plus the Cloud
the edge server computes the initial layers of the DNN model, and the cloud computes the higher layers of the DNN
- Distributed Computation
The DNN partition decision is made based on the computation capabilities and/or memory of the end devices
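As a rough illustration of layer-wise partitioning, the sketch below picks the split point that minimizes estimated end-to-end latency. The cost model is an assumption made for this example (per-layer device and server times, intermediate output sizes, and a fixed uplink rate), and the function name `best_split` and all numbers are hypothetical.

```python
def best_split(device_ms, server_ms, out_kb, input_kb, kb_per_ms):
    """Return (k, latency_ms): layers [0, k) run on the device, the rest on the server.

    Assumes the final result is small enough that its transfer time is ignored.
    """
    n = len(device_ms)
    best_k, best_latency = None, float("inf")
    for k in range(n + 1):
        if k == n:
            transfer_kb = 0.0          # everything runs on the device
        elif k == 0:
            transfer_kb = input_kb     # raw input is uploaded
        else:
            transfer_kb = out_kb[k - 1]  # output of the last on-device layer
        latency = (sum(device_ms[:k])
                   + transfer_kb / kb_per_ms
                   + sum(server_ms[k:]))
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency


# Made-up profile of a three-layer model.
split, latency = best_split(
    device_ms=[10.0, 20.0, 40.0],
    server_ms=[1.0, 2.0, 4.0],
    out_kb=[200.0, 10.0, 1.0],
    input_kb=600.0,
    kb_per_ms=10.0,
)
```

With these numbers the best choice is to run the first two layers on the device, because the second layer's output is much smaller than the raw input and so is cheaper to upload.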
Private Inference
- adding noise to obfuscate the data uploaded by end devices to edge servers
- secure computation using cryptographic techniques
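The noise-based approach can be sketched as below. The notes do not say which noise distribution is used; zero-mean Laplace noise is assumed here as one common choice, and the function name `add_laplace_noise` is hypothetical. Calibrating the scale to obtain a formal privacy guarantee is outside this sketch.

```python
import random


def add_laplace_noise(values, scale, rng):
    """Perturb each value with zero-mean Laplace noise of the given scale.

    The difference of two independent exponential samples with mean `scale`
    follows a Laplace(0, scale) distribution.
    """
    rate = 1.0 / scale
    return [v + rng.expovariate(rate) - rng.expovariate(rate) for v in values]


rng = random.Random(0)          # fixed seed only to make the example repeatable
features = [0.2, 0.7, 0.1]      # made-up data an end device would upload
noisy_features = add_laplace_noise(features, scale=0.1, rng=rng)
```

A larger `scale` hides more about the original data but also degrades the accuracy of inference on the edge server.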
TRAINING IN PLACE ON EDGE DEVICES
- data parallelism
- model parallelism
Frequency of Training Updates
Reducing the frequency of communications and the size of each communication
- synchronous stochastic gradient descent
- asynchronous stochastic gradient descent
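A minimal sketch of data-parallel synchronous SGD follows, assuming a one-parameter model y = w * x with squared-error loss. The model, the data shards, and the function names are all hypothetical and chosen only to keep the example short.

```python
def local_gradient(w, shard):
    """Mean squared-error gradient for y = w * x on one device's data."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def synchronous_sgd(shards, w=0.0, lr=0.05, rounds=100):
    for _ in range(rounds):
        # Every device computes a gradient on its own shard with the same w.
        grads = [local_gradient(w, shard) for shard in shards]
        # The server waits for all devices, then applies the averaged update.
        w -= lr * sum(grads) / len(grads)
    return w


# Three devices holding made-up samples of y = 2 * x.
shards = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0)],
]
w = synchronous_sgd(shards)
```

In the asynchronous variant the server would instead apply each device's update as soon as it arrives, without waiting for the slowest device, at the cost of some gradients being computed from stale parameters.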
Size of Training Updates
review gradient compression techniques, which can reduce the size of the updates communicated to a central server.
- gradient quantization
approximates the floating-point gradients using low-bit-width numbers
- gradient sparsification
discards unimportant gradient updates and only communicates updates that exceed a certain threshold
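Both compression techniques above can be sketched in a few lines. The details are assumptions made for illustration: sparsification uses a fixed magnitude threshold, quantization uses signed 8-bit integers with one shared scale, and the function names `sparsify` and `quantize` are hypothetical.

```python
def sparsify(grad, threshold):
    """Keep only entries whose magnitude exceeds the threshold, as {index: value}."""
    return {i: g for i, g in enumerate(grad) if abs(g) > threshold}


def quantize(grad, num_bits=8):
    """Approximate each gradient by a signed num_bits integer times a shared scale."""
    max_abs = max((abs(g) for g in grad), default=0.0)
    levels = 2 ** (num_bits - 1) - 1
    # Guard against an all-zero gradient.
    scale = max_abs / levels if max_abs > 0 else 1.0
    return [round(g / scale) for g in grad], scale


grad = [0.001, -0.8, 0.02, 0.5, -0.003]   # made-up gradient vector
sparse = sparsify(grad, threshold=0.01)   # only large entries are sent
codes, scale = quantize(grad, num_bits=8) # every entry is sent, in 8 bits
```

The receiver reconstructs a quantized gradient as `code * scale`, and treats indices missing from a sparsified update as zero.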
Decentralized Communication Protocols
each device computes its own gradient updates based on its training data and then communicates its updates to some of the other devices (gossip-type algorithm)
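A gossip-type exchange can be sketched as pairwise averaging between neighbors, with no central server. The ring topology, the pairwise averaging rule, and the name `gossip_round` are assumptions for this example; the notes only state that each device communicates with some of the other devices.

```python
import random


def gossip_round(values, neighbors, rng):
    """One gossip step: a random device averages its value with a random neighbor."""
    i = rng.randrange(len(values))
    j = rng.choice(neighbors[i])
    avg = (values[i] + values[j]) / 2
    values[i] = values[j] = avg


# Four devices on a ring, each holding a made-up scalar update.
values = [1.0, 5.0, 9.0, 13.0]
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

rng = random.Random(0)  # fixed seed only to make the example repeatable
for _ in range(200):
    gossip_round(values, neighbors, rng)
```

Each step preserves the sum of the values, so on a connected topology all devices drift toward the global average (7.0 here) using only local communication.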
Private Training
OPEN CHALLENGES
Systems Challenges
- Latency
Keeping up with new deep learning designs will continue to be a major systems challenge.
- Energy
- Migration
Migrating edge computing applications between different edge servers
(VM migration techniques, Docker containers, multipath TCP)
Relationship to SDN and NFV Technologies
Management and Scheduling of Edge Compute Resources
Deep Learning Benchmarks on Edge Devices
benchmarks containing apples-to-apples comparisons between models on different hardware