Cloud Computing Tech
Firehotest
Spark: Introduction of Spark
It has been three months since I took the cloud computing course. Here is my brief summary of Spark. Why Spark? And what is Spark? If we run iterative MapReduce jobs, the middle resul…
Original · 2016-03-23 10:04:21 · 484 views · 0 comments
Kafka and Samza: Real-time stream processing
As we know, for big data analysis we have already learned two models[1]: batch processing, which is MapReduce, and iterative processing, which is Spark. These two have one thing in common, which is w…
Original · 2016-03-23 11:00:52 · 1249 views · 0 comments
New Technique: Tachyon (uses memory instead of disk for HDFS)
Reference: http://www.csdn.net/article/2013-04-18/2814947-welcome-to-berkeley-where-hadoop-isnt-nearly-fast-enough Copied from http://www.oschina.net/p/tachyon Alluxio was formerly named Tachyon. All…
Repost · 2016-03-24 22:15:12 · 568 views · 0 comments
OLAP: Hive, Impala and Redshift
What are OLAP and OLTP? OLTP stands for Online Transaction Processing, while OLAP stands for Online Analytical Processing. The main difference is their target. Both of them are two type…
Original · 2016-03-27 10:45:36 · 1929 views · 0 comments
Study Note: Optimization in MapReduce
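Cloud computing is essentially a scalable form of distributed computing. For the many-core and multi-core systems mentioned earlier, the biggest limitation is that memory is finite. Cloud computing solves this problem by partitioning the data. There are two approaches to distributed computing: 1) like OpenMP and CUDA, allow a shared memory space and distribute the workload. Implementing this approach in a distributed system requires distinguishing between process nodes and storage n…
Original · 2016-05-12 01:13:42 · 543 views · 0 comments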