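Overview
CDC: Coded Distributed Computing
Reduces communication load, mitigates straggler effects, and provides fault tolerance, privacy, and security
- Basic principles
- Basic schemes
- Selected approaches
- Applications
- Challenges and outlook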
1. Basic Principles
Distributed models:
- Cluster computing
- Grid computing
- Cloud computing
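Recent research has therefore focused on coding techniques to overcome these implementation challenges of distributed computing systems, with the aim of minimizing communication load and mitigating straggler effects.
Two main problems are addressed:
- To reduce communication load
- To mitigate the straggler effects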
Distributed computation frameworks:
- MapReduce: Map, Shuffle, Reduce
- Spark
- Dryad
- CIEL
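The three MapReduce phases listed above can be illustrated with a minimal single-process sketch. The word-count job and the function names `map_phase`, `shuffle_phase`, and `reduce_phase` are assumptions chosen for illustration; they are not the API of Hadoop or any other framework.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit an intermediate (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in document.split()]

def shuffle_phase(mapped_outputs):
    """Shuffle: group intermediate values by key across all mapper outputs."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the grouped values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}

# Two hypothetical input splits, each handled by one mapper.
splits = ["a b a", "b c"]
counts = reduce_phase(shuffle_phase([map_phase(s) for s in splits]))
print(counts)
```

In a real cluster the Shuffle step moves data between machines, which is where the communication load discussed in these notes arises.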
Communication load (all of the following are non-coding methods):
Various shuffling strategies have been proposed
Task scheduling algorithms: Quincy scheduler, Hadoop Fair Scheduler, delay scheduling algorithms
Repeating the computation tasks, using the naive replication method
Straggler Effects:
Stragglers: processing nodes that run slower than the average speed
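A small sketch of this definition, assuming made-up finishing times for four hypothetical workers: nodes whose time exceeds the average are flagged as stragglers, and a job that waits for every node finishes only when the slowest one does.

```python
from statistics import mean

# Hypothetical per-worker finishing times in seconds (illustrative values only).
finish_times = {"w1": 1.0, "w2": 1.5, "w3": 0.5, "w4": 5.0}

avg = mean(finish_times.values())

# A worker counts as a straggler here if it is slower than the average.
stragglers = [w for w, t in finish_times.items() if t > avg]

# Without any mitigation, the job must wait for the slowest worker.
job_time = max(finish_times.values())

print(avg, stragglers, job_time)
```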
2. Basic CDC schemes
2.1 CDC to minimize communication load
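A toy sketch of the coded-shuffle idea, under assumed parameters: 3 nodes, 6 files, each file stored on 2 nodes, and node k reducing function k. The placement, the stand-in `intermediate` function, and the broadcast table are hypothetical choices for illustration. Each sender broadcasts the XOR of two intermediate values, and each receiver cancels the one it can compute locally. Three coded broadcasts then replace six uncoded transmissions.

```python
# Hypothetical file placement: node id -> set of locally stored file ids.
storage = {1: {1, 2, 3, 4}, 2: {3, 4, 5, 6}, 3: {5, 6, 1, 2}}

def intermediate(q, n):
    """Stand-in Map output v[q, n] of reduce function q on file n (an assumption)."""
    return (17 * q + 31 * n) % 256

# Every node maps its local files for all three reduce functions.
local = {k: {(q, n): intermediate(q, n) for q in (1, 2, 3) for n in files}
         for k, files in storage.items()}

# Node k reduces function k, so it still needs v[k, n] for the files it lacks.
missing = {k: [(k, n) for n in range(1, 7) if n not in storage[k]]
           for k in storage}
uncoded_transmissions = sum(len(v) for v in missing.values())

# Each sender XORs two values: each is wanted by one receiver and known to the other.
broadcasts = {
    1: ((2, 1), (3, 3)),
    2: ((3, 4), (1, 5)),
    3: ((1, 6), (2, 2)),
}

recovered = {k: {} for k in storage}
for sender, (a, b) in broadcasts.items():
    coded = local[sender][a] ^ local[sender][b]
    for receiver in storage:
        if receiver == sender:
            continue
        # The receiver wants the value whose function index equals its own id.
        wanted, known = (a, b) if a[0] == receiver else (b, a)
        recovered[receiver][wanted] = coded ^ local[receiver][known]

print(uncoded_transmissions, len(broadcasts))
```

The saving depends on the redundant placement: a receiver can decode only because it already holds the file behind the other value in the XOR.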
2.2 CDC to mitigate the straggler effects
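A minimal sketch of straggler mitigation through coding, assuming a simple (3, 2) erasure code applied to matrix-vector multiplication. The matrix, the worker names, and the `decode` helper are hypothetical. The matrix is split into two row blocks A1 and A2, a third worker receives the parity block A1 + A2, and the results of any two workers are enough to recover the full product.

```python
def matvec(rows, x):
    """Plain matrix-vector product on lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]

A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, -1]

A1, A2 = A[:2], A[2:]
# Parity block A1 + A2, given to a third worker.
A3 = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
tasks = {"w1": A1, "w2": A2, "w3": A3}

def decode(results):
    """Recover A x from the results of any two of the three workers."""
    if "w1" in results and "w2" in results:
        y1, y2 = results["w1"], results["w2"]
    elif "w1" in results:
        y1 = results["w1"]
        y2 = [s - a for s, a in zip(results["w3"], y1)]  # (A1 + A2)x - A1 x
    else:
        y2 = results["w2"]
        y1 = [s - b for s, b in zip(results["w3"], y2)]  # (A1 + A2)x - A2 x
    return y1 + y2

# Suppose w2 is the straggler: decode from w1 and w3 only.
finished = {w: matvec(tasks[w], x) for w in ("w1", "w3")}
y = decode(finished)
print(y)
```

Compared with naive replication, which would need four workers to tolerate any single straggler for two blocks, this sketch uses three.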
2.3 Unified CDC Scheme
3. CDC approaches that have been proposed to reduce communication costs
- File Allocation
  - Considering Heterogeneous Systems
  - Maximizing Data Locality
  - Reducing Subpacketization Level
- Coded Shuffling Design
  - Compression and randomization
  - Coding across multiple iterations
  - Problem-specific coding approaches
- Consideration of Underlying Network Architecture
- Function Allocation
4. CDC approaches that have been proposed to mitigate straggler effects
- Computation Load Allocation
- Approximate Coding
- Exploitation of Stragglers
5. Secure coding for distributed computing
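One simple way to picture secure coding is additive masking over a finite field. It is shown here as an illustrative assumption and not necessarily the scheme these notes refer to. The master hides the data matrix A behind a uniformly random mask R, sends A + R to one worker and R to another, and subtracts the two results. Assuming the two workers do not collude, neither learns anything about A on its own. The modulus `P` and all variable names are hypothetical.

```python
import random

P = 2_147_483_647  # a prime modulus (assumed for this sketch)

def matvec_mod(rows, x):
    """Matrix-vector product with entries reduced modulo P."""
    return [sum(a * b for a, b in zip(row, x)) % P for row in rows]

A = [[1, 2], [3, 4]]   # private data matrix
x = [5, 6]             # input vector

rng = random.Random(0)
# Uniformly random mask with the same shape as A.
R = [[rng.randrange(P) for _ in row] for row in A]
# Share for worker 1: A + R (mod P). Share for worker 2: R.
share1 = [[(a + r) % P for a, r in zip(ra, rr)] for ra, rr in zip(A, R)]
share2 = R

y1 = matvec_mod(share1, x)  # worker 1 sees only A + R
y2 = matvec_mod(share2, x)  # worker 2 sees only R

# The master removes the mask: (A + R)x - Rx = Ax (mod P).
y = [(u - v) % P for u, v in zip(y1, y2)]
print(y)
```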
6. CDC Applications
- NFV
- Edge Computing
7. Challenges
- Heterogeneous nodes
- Encoding and decoding complexities
- Non-static computing nodes
- Security concerns
- Network architectures
- Different computation frameworks
- Coding for both communication reduction and straggler mitigation
- CDC applications