Hive BI analytics: Visual Reporting

Hive BI analytics: Visual Reporting

Filed under:  HadoopHiveHPCJava world — Tags:   — indoos @ 5:23 pm

I had earlier written about using Hive as a data source for BI tools using industry proven BI reporting tools and here is a list of the various official announcements from Pentaho, Talend. Microstrategy and Intellicus -

The topic is close to my heart since I firmly believe that while Hadoop and Hive are true large data analytics tool, their power is currently limited to use by software programmers. The advent of BI tools in Hadoop/Hive world would certainly bring it closer to the real end users – business users.

I am currently not too sure how these BI reporting tools are deciding how much part of  the analytics be left in Map reduce and how much in the reporting tool itself- guess it will take time to find the right balance. Chances are that  I will find it a bit earlier than others as I am working closely  (read here) with Intellicus team to get the changes in Hive JDBC driver for Intellicus’ interoperability with Hive.

February 8, 2010

BI with MapReduce

Filed under:  Advanced computingHadoop — Tags:   — indoos @ 2:12 pm

Have any of you used map reduce in the context of business intelligence?

While collating my thoughts on this Linked-in Hadoop discussion, found out that I needed more visuals to explain it first to myself :) .

So, here are the many ways in which Hadoop MapReduce does offer an alternative in the big-big BI world-

Scenario 1: Use Hadoop and Hive as interface to BI tools. Pentaho reporting is already supported as of Hive 0.4.0.

Scenario 2: Use Hadoop for intial data polishing, and then dump to a SQL supported column based database near-real BI reporting. Aster data/Vertica /Greenplum sell themselves by advertising  MapReduce connectors (or similar) heavily. The cost of SQL supported column based database is the only pain point here (+ the risk on how these actually scale vs what these promise)

Scenario 3: Use Hadoop for intial data polishing, and then dump to a SQL supported column based database near-real BI reporting. In case of Real time reporting, data can further be BI polished from column based databases to a fast regular RDBMS with BI support.

 

Scenario 4: The free way:)- Use Hadoop for intial data polishing, and then dump to a regular SQL database with BI support. The export from HDFS can be the Un-sqoop way. The onus would more be on the developer to dump only ready-for-report data (lesser) with most of the BI already completed as part of More MR step.

The important fact to note is that there might be additional costs on moving the major chunk of  BI data analysis part to programmatic interfaces (SQL or MR).  

I am not too much of a database-fallen-in-love type, so do like the way Hive can emerge as a potential BI reporting tool.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值