Hadoop and the EDW

Rob Klopp summarizes a whitepaper published by Cloudera and Teradata:

Simply put, Hadoop becomes the staging area for “raw data streams” while the EDW stores data from “operational systems”. Hadoop then analyzes the raw data and shares the results with the EDW. […] The paper then positions Hadoop as an active archive. I like this idea very much. Hadoop can store archived data that is only accessed once a month or once a quarter or less often.. and that data can be processed directly by Hadoop programs or shared with the EDW data using facilities such as Teradata’s SQL-H, or Greenplum’s External Hadoop tables (not by HAWQ, though… see here), or by other federation engines connected to HANA, SQL Server, Oracle, etc.

It’s an interesting positioning of Hadoop, and it is very similar to the approach Linux took to get inside the walls of enterprises. Then Linux slowly replaced pretty much everything.

In the early days (and we are still in those days), the EDW vendors could still believe this story: Hadoop is complicated, it is meant for batch processing, and it lacks the tools and refinements built over the years in the EDW.

But the story is starting to change. Fast. Hadoop is becoming more of a platform (YARN), it is gaining support for (almost) real-time querying (Impala, Project Stinger, HAWQ, just to name a few), and Hadoop leaders are signing partnerships with challengers and incumbents of the big data market at a rate that I don’t think I’ve seen before.

In the end, guess which will become the pillar of big data platforms: the solution that stores all the data, or the tools that can process only limited amounts of that data, however fast and with however much control?

✚ The Cloudera-Teradata paper titled “Hadoop and the Data Warehouse: When to Use Which” can be found here.

Ref: http://nosql.mypopescu.com/post/51644311726/hadoop-and-the-edw
