Some useful links covering various aspects of big data applications


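The links below cover various aspects of big data applications. They are loosely organized but all fairly useful, and the list will be updated from time to time: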

1. AMPLab has published benchmark results comparing the performance of SQL engines such as Hive, Impala, Tez, Shark, and Redshift on workloads covering Scan, Aggregation, Join, and External script queries.

https://amplab.cs.berkeley.edu/benchmark/

2. The blog above led me to an Intel Hadoop benchmark tool, HiBench: This benchmark suite contains 9 typical Hadoop workloads (including micro benchmarks, HDFS benchmarks, web search benchmarks, machine learning benchmarks, and data analytics benchmarks).

https://github.com/intel-hadoop/HiBench

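3. (From hashjoin's Weibo) Tresata today released a real-time big data mining solution for the finance and insurance industries. The company, backed by a Rackspace founder, has built relationship-mining and risk-analysis applications on Spark and is a new favorite on Wall Street. Hadoop brought the industry cheap big data storage; the next generation of big data companies should focus on how to mine value from that stored data: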

http://tresata.com/news/tresata-delivers-big-data-industrys-first-real-time-network-discovery-application-powered-by-spark/

4. Facebook made some performance improvements to HDFS for its HBase usage, apparently by adding a flash cache; this needs a closer read:

http://research.cs.wisc.edu/wind/Publications/fbmessages-fast14.pdf


5. Apache Hadoop 2.3.0 released:

With this release, there are two significant enhancements to HDFS:

• Support for Heterogeneous Storage Hierarchy in HDFS (HDFS-2832)

• In-memory Cache for data resident in HDFS via Datanodes (HDFS-4949)
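As a rough illustration of how the in-memory cache from HDFS-4949 is driven in practice, the sketch below builds the `hdfs cacheadmin` command lines that create a cache pool and pin a path into it. The pool name `hot_tables` and the path `/warehouse/dim_users` are made-up examples, and the flags follow the centralized cache management guide as I understand it, so check them against the documentation for your Hadoop version before running anything.

```python
import shlex


def cache_commands(pool, path, replication=1):
    """Build the `hdfs cacheadmin` invocations that cache `path` in `pool`.

    Returns argument lists (suitable for subprocess.run); nothing is executed.
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    add_pool = ["hdfs", "cacheadmin", "-addPool", pool]
    add_directive = [
        "hdfs", "cacheadmin", "-addDirective",
        "-path", path,
        "-pool", pool,
        "-replication", str(replication),
    ]
    return [add_pool, add_directive]


if __name__ == "__main__":
    # Hypothetical pool and path, used only for illustration.
    for cmd in cache_commands("hot_tables", "/warehouse/dim_users"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Returning argument lists instead of running the commands keeps the sketch safe to try on a machine without a Hadoop client installed.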

In YARN, we are very excited to see that ResourceManager Automatic Failover (YARN-149) is nearly complete, even if it isn't ready for primetime yet. We expect it to land by the next release, i.e. hadoop-2.4. Furthermore, a number of key operational enhancements have been driven into YARN such as better logging, error-handling, diagnostics etc.

On the MapReduce side of the house, a key enhancement is MAPREDUCE-4421; with this we now no longer need to install MapReduce binaries on every machine and can just use a MapReduce tarball via the YARN DistributedCache by copying it into HDFS.
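To make the tarball deployment a bit more concrete, here is a minimal sketch that generates the `mapred-site.xml` fragment pointing jobs at a MapReduce tarball stored in HDFS. It assumes the `mapreduce.application.framework.path` property is the one that controls this; the tarball location and the `mr-framework` alias are hypothetical values chosen for the example.

```python
import xml.etree.ElementTree as ET


def framework_config(tarball_uri, alias="mr-framework"):
    """Return a mapred-site.xml fragment that points at a MapReduce tarball.

    The "#alias" suffix names the directory the archive is unpacked under
    when it is localized through the distributed cache.
    """
    config = ET.Element("configuration")
    prop = ET.SubElement(config, "property")
    ET.SubElement(prop, "name").text = "mapreduce.application.framework.path"
    ET.SubElement(prop, "value").text = f"{tarball_uri}#{alias}"
    return ET.tostring(config, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical HDFS location of the uploaded tarball.
    print(framework_config("hdfs:///mapred/framework/hadoop-mapreduce-2.3.0.tar.gz"))
```

In a real deployment `mapreduce.application.classpath` would also need to reference the jars under the chosen alias; that part is left out here because the exact value depends on the layout of the tarball.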


http://hortonworks.com/blog/apache-hadoop-2-3-0-released/


6. Apache Hadoop 2.4.0 released:


Hadoop 2.4.0 continues that momentum, with additional enhancements to both HDFS & YARN:

  • Support for Access Control Lists in HDFS (HDFS-4685)
  • Native support for Rolling Upgrades in HDFS (HDFS-5535)
  • Smooth operational upgrades with protocol buffers for HDFS FSImage (HDFS-5698)
  • Support for Automatic Failover of the YARN ResourceManager (YARN-149) (a.k.a Phase 1 of YARN ResourceManager High Availability)
  • Enhanced support for new applications on YARN with Application History Server (YARN-321) and Application Timeline Server (YARN-1530)
  • Support for strong SLAs in YARN CapacityScheduler via Preemption (YARN-185)
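
For the ACL support in HDFS-4685, a small sketch of how ACL entries are written may help. The helper below formats named `user`/`group` entries and builds an `hdfs dfs -setfacl -m` command line; the user, group, and path are invented for the example, and the helper functions themselves are illustrative, not part of Hadoop.

```python
import re

# Three permission characters: read, write, execute, each either set or "-".
_PERMS = re.compile(r"[r-][w-][x-]")


def acl_entry(kind, name, perms):
    """Format one named ACL entry such as 'user:alice:r-x'."""
    if kind not in ("user", "group"):
        raise ValueError("kind must be 'user' or 'group'")
    if not _PERMS.fullmatch(perms):
        raise ValueError("perms must look like 'rwx', 'r-x' or '---'")
    return f"{kind}:{name}:{perms}"


def setfacl_command(path, entries):
    """Build an `hdfs dfs -setfacl -m` invocation; nothing is executed."""
    if not entries:
        raise ValueError("at least one ACL entry is required")
    return ["hdfs", "dfs", "-setfacl", "-m", ",".join(entries), path]


if __name__ == "__main__":
    # Hypothetical principals and path, used only for illustration.
    entries = [acl_entry("user", "alice", "r-x"), acl_entry("group", "etl", "rwx")]
    print(" ".join(setfacl_command("/data/reports", entries)))
```

As far as I know, ACLs are disabled by default in this release and have to be switched on in the NameNode configuration (`dfs.namenode.acls.enabled`) before `-setfacl` is accepted.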

http://hortonworks.com/blog/apache-hadoop-2-4-0-released/
