Based on:
spark-1.4.1
hadoop-2.5.2
Proceeding from simple to complex and following the working-flow principle, we take these steps:
1.[spark-src] spark overview
2.[spark-src] core
From basic demos we dive into Spark internals. This section involves many components, so it is more detailed than the others.
0.relationships between the various Spark shells
a.job submitting flow in local mode [client side]
b.task running details in local mode [server side]
c.core concepts of RDDs
d.spark on standalone
e.spark on yarn
f.shuffle in spark
g.communication between certain kernel components
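The "basic demo" entry point mentioned above can be as small as a word count. The sketch below is an illustration only, assuming a local[2] master and a placeholder input file name ("input.txt"); it uses the Spark 1.4.1 Scala API.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal word-count demo: the kind of job whose submitting flow
// (client side) and task running details (server side) are traced
// in the sections above. "input.txt" is a placeholder path.
object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val counts = sc.textFile("input.txt")
      .flatMap(_.split("\\s+"))   // narrow transformation
      .map(word => (word, 1))
      .reduceByKey(_ + _)          // wide transformation: triggers a shuffle
    counts.collect().foreach(println)  // action: submits the job
    sc.stop()
  }
}
```

Tracing this job from `collect()` through DAGScheduler, TaskScheduler, and the executors covers most of the topics in the core section.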
3.[spark-src] sql
4.[spark-src] streaming
5.[spark-src] graphx
6.[spark-src] machine learning
7.[spark-src] tachyon
This reading focuses on:
1.critical flows and code blocks; design thoughts
2.mining the intent behind the designers' ideas
PS: we often find that even if you know the principle or architecture, it does not mean you correctly understand the technical details implemented in the sources, so we are glad to investigate the specifics.