论文阅读 - Thinking Like a Vertex: A Survey of Vertex-

最新推荐文章于 2024-08-01 16:11:59 发布

choutiaolin8504

最新推荐文章于 2024-08-01 16:11:59 发布

阅读量381

点赞数

文章标签：大数据 runtime c/c++

原文链接：https://my.oschina.net/u/178169/blog/675198

版权

**Thinking Like a Vertex: A Survey of Vertex-Centric Frameworks for Large-Scale Distributed Graph Processing **ROBERT RYAN MCCUNE, TIM WENINGER, and GREG MADEY, University of Notre Dame

Highlight, page
implements user- defined programs from the perspective of a vertex rather than a graph
Highlight, page
interdependencies
Highlight, page
However, for sequential graph algorithms, which require random access to all graph data, poor locality and the indivisibility of the graph structure cause time- and resource-intensive pointer chasing between storage mediums in order to access each datum.
Highlight, page
MapReduce does not natively support iterative algo- rithms
Highlight, page
identifying the tradeoffs in component implementations and providing data-driven discussion
Highlight, page
subgraph-centric, or hybrid, frameworks
Highlight, page
Think-like-a-vertex frameworks are platforms that iteratively execute a user-defined program over vertices of a graph.
Highlight, page
Many graph problems can be solved by both a sequential, shared-memory algorithm and a distributed, vertex-centric algorithm.
Highlight, page
TLAV frameworks are highly scalable and inherently parallel, with manage- able intermachine communication.
Highlight, page
a given vertex to all other vertices in a graph
Highlight, page
This algorithm, considered a distributed version of Bellman-Ford [Lynch 1996], is shown in Algorithm 1.
Highlight, page
respective edge
Highlight, page
detailing
Highlight, page
proceeds
Highlight, page
Communication
Highlight, page
vertex program execution
Highlight, page
interdependent
Highlight, page
thorough understanding
Highlight, page
interrelation
Highlight, page
a sequential presentation
Highlight, page
discussed earlier.
Highlight, page
conceptually simple
Highlight, page
the overhead becomes largely amortized for large graphs.
Underline, page
synchronous systems are often implemented along with message-passing communication
Highlight, page
batch messaging
Highlight, page
low computation-to-communication ratio
Underline, page
synchronization
Underline, page
accounted for over 80% of the total running time [Chen et al. 2014a],
Highlight, page
underutilized
Highlight, page
the curse of the last reducer
Highlight, page
straggler
Highlight, page
lightweight
Highlight, page
straggler
Highlight, page
outperform
Highlight, page
Theoretical and empirical research
Highlight, page
One disadvantage, however, is that asynchronous execution cannot take advantage of batch messaging optimizations
Highlight, page 1
at the expense of added complexity, not only from scheduling logic, but also from maintaining data consistency. Asynchronous systems typically implement shared memory
Highlight, page 1
pseudo-supersteps [Chen et al. 2014a]
Highlight, page 1
dynamic scheduling within a sin- gle superstep [Wang et al. 2013].
Highlight, page 1
reduces the number of su- persteps by decoupling intraprocessor computation from the interprocessor commu- nication and synchronization [Chen et al. 2014a].
Highlight, page 1
P++ framework [Zhou et al. 2014]
Underline, page 1
boundary nodes
Highlight, page 1
local nodes
Highlight, page 1
pseudo-supersteps
Highlight, page 1
GraphHP and P++, is the KLA paradigm [Harshvardhan et al. 2014]
Highlight, page 1
KLA
Highlight, page 1
is allowed for a certain number of levels before a synchronous round
Highlight, page 1
KLA has multiple traversals of asyn- chronous execution before coordinating a round of synchronous execution
Highlight, page 1
GRACE exposes a programming interface that, from within a given superstep, allows for prioritized exe- cution of vertices and selective receiving of messages outside of the previous superstep.
Highlight, page 1
Motivated by the necessity for execution mode dynamism
Highlight, page 1
PowerSwitch’s heuristics can accurately predict throughput
Highlight, page 1
data for one process is directly and immediately accessible by another process
Underline, page 1
In the case of the former
Highlight, page 1
Otherwise
Highlight, page 1
is flushed when it reaches a certain capacity, sending messages over the network in batches
Highlight, page 1
Shared memory avoids the additional memory overhead constituted by messages and doesn’t require intermedi- ate processing by workers.
Highlight, page 1
Chandy-Misra locking
Highlight, page 1
derivative with
Highlight, page 1
prioritized execution and low communication overhead
Highlight, page 1
Cyclops is a synchronous shared-memory framework [Chen et al. 2014b]
Highlight, page 1
a third method called active messages is implemented in the GRE framework [Yan et al. 2014b].
Highlight, page 1
Within the GRE architecture, active messages combine the process of sending and receiving messages, removing the need to store an intermediate state, like message queues or edge data
Highlight, page 1
The GRE framework modifies the data graph into an Agent-Graph.
Highlight, page 1
The Agent-Graph adds combiner and scatter vertices to the original graph in order to reduce intermachine messaging.
Highlight, page 1
costly
Highlight, page 1
IBM’s X-Pregel [Bao and Suzumura 2013],
Highlight, page 1
plateaus
Highlight, page 1
Computation for the combiner must be commutative and associative because order cannot be guaranteed
Underline, page 1
Conversely
Highlight, page 1
For algorithms that combine vertices into a supervertex, like Boruvka’s Minimum Spanning Tree [Chung and Condon 1996]
Highlight, page 1
counterintuitively
Highlight, page 1
Pregel, including ACM Computing Surveys, Vol. 48, No. 2, Article 25, Publication date: October 2015. Thinking Like a Vertex: A Survey of Vertex-Centric Frameworks 25:17 its open-source implementations [Avery 2011; Seo et al. 2010] and several related variants [Salihoglu and Widom 2013; Bao and Suzumura 2013; Redekopp et al. 2013].
Highlight, page 1
The user provides two functions, one function that executes across each vertex in the active subset and another function that exe- cutes all outgoing edges in the subset.
Highlight, page 1
Edge Centric. The X-Stream framework provides an edge-centric two-phase Scatter- Gather programming model [Roy et al. 2013],
Highlight, page 1
Push Versus Pull. The flow of information for vertex programs can be character- ized as data being pushed or pulled [Nguyen et al. 2013; Hant et al. 2014; Cheng et al. 2012].
Highlight, page 1
Ligra is a single-machine graph-processing framework that dynamically switches between push- and pull-based operators based on a threshold.
Highlight, page 1
interpartition edges
Highlight, page 1
However, for graphs of even medium size, the high computational cost and necessary random access of the entire graph render METIS and related heuristics impractical.
Highlight, page 1
The densification problem is addressed in Vaquero et al. [2014], wherein
Highlight, page 1
More advanced label propagation schemes for partitioning are presented in Wang et al. [2014] and Slota et al. [2014].
Highlight, page 1
GraphBuilder [Jain et al. 2013] is a similar library ACM Computing Surveys, Vol. 48, No. 2, Article 25, Publication date: October 2015. 25:20 R. R. McCune et al. that, in addition to partitioning, supports an extensive variety of graph-loading-related processing tasks.
Highlight, page 2
the partitioner may temporarily store a vertex and decide the partitioning later [Stanton and Kliot 2012];
Highlight, page 2
however, experiments demonstrate that performance remains relatively consistent for breadth-first, depth- first, and random orderings of a graph [Stanton and Kliot 2012; Tsourakakis et al. 2014].
Highlight, page 2
linear deterministic greedy (LDG), a heuristic that assigns a vertex to the parti- tion with which it shares the most edges while weighted by a penalty function linearly associated with a partition’s remaining capacity.
Highlight, page 2
restreaming graph partitioning model, where a streaming partitioner is provided ac- cess to previous stream results [Nishimura and Ugander 2013].
Highlight, page 2
eightfold
Highlight, page 2
thorough analysis
Highlight, page 2
too computationally expensive
Highlight, page 2
Good workload balance for skewed degree distributions can also be achieved with degree-based hashing [Xie et al. 2014].
Highlight, page 2
vary drastically over the course of computation, which creates processing imbalances and increases runtime.
Highlight, page 2
such as graph coarsening [Wang et al. 2014].
Highlight, page 2
According to Salihoglu and Widom [2013], a dynamic repartitioning strategy must directly address (1) how to select vertices to reassign, (2) how and when to move the assigned vertices, and (3) how to locate the reassigned vertices.
Highlight, page 2
the loading and partitioning can be performed in parallel [Salihoglu and Widom 2013].
Highlight, page 2
XPregel [Bao and Suzumura 2013] supports multithreading by dividing a partition into a user-defined number of sub- partitions
Highlight, page 2
When a failure occurs, the system rolls back to the most recently saved point, all partitions are reloaded, and the entire system resumes process- ing from the checkpoint.
Highlight, page 2
GraphLab [Low et al. 2012] implements asynchronous vertex checkpointing, based on Chandy-Lamport [Chandy and Lamport 1985] snapshots,
Highlight, page 2
they are highly scal- able while providing a simple programming interface,
Highlight, page 2
dictated
Highlight, page 2
GraphChi. The seminal single-machine TLAV framework is GraphChi [Kyrola et al. 2012],
Highlight, page 2
PathGraph also implements a path-centric compact storage system that improves compactness and locality [Yuan et al. 2014]. Because most iterative graph algorithms involve path traversal,
Highlight, page 2
retaining
Highlight, page 2
in varying degrees
Highlight, page 2
shared either through vertex programs on boundary nodes or, in the case of Blogel, directly between subgraphs
Highlight, page 2
Collectively, subgraph-centric frameworks dra- matically outperform TLAV frameworks, often by orders of magnitude in terms of computing time, number of messages, and total supersteps [Tian et al. 2013; Yan et al. 2014a].
Highlight, page 2
A vertex subset interface is implemented in Ligra [Shun and Blelloch 2013].
Highlight, page 2
Similarly, the Single Pivot (SP) optimization [Salihoglu and Widom 2014], first pre- sented in Quick et al. [2012], also temporarily adopts a global view.
Highlight, page 2
the Parallel Boost Graph Library [Gregor and Lumsdaine 2005]
Highlight, page 2
first-class citizens
Highlight, page 2
Databases offer local or online queries, such as one-hop neighbors, whereas TLAV systems iteratively process the entire graph offline in batch
Highlight, page 3
Distributed algorithms are a mature field of study [Lynch 1996],
Highlight, page 3
Not all graphs are large enough to necessitate distributed processing, and not all graph problems need the whole graph to be computed iteratively.

转载于:https://my.oschina.net/u/178169/blog/675198

choutiaolin8504

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
论文阅读 - Thinking Like a Vertex: A Survey of Vertex-

**Thinking Like a Vertex: A Survey of Vertex-Centric Frameworks for Large-Scale Distributed Graph Processing**ROBERT RYAN MCCUNE, TIM WENINGER, an...
复制链接

扫一扫