Processing MySQL data with Hadoop: Hadoop and MySQL integration

We would like to implement Hadoop on our system to improve its performance.

The process works like this:

Hadoop will gather data from the MySQL database and then process it.

The output will then be exported back to the MySQL database.

Is this a good implementation? Will this improve our system's overall performance?

What are the requirements and has this been done before? A good tutorial would really help.

Thanks

Solution

Although this is not a typical Hadoop use case, it might make sense in the following scenario:

a) You have a good way to partition your data into inputs (for example, an existing partitioning scheme).

b) The processing of each partition is relatively heavy. As a rough figure, I would say at least 10 seconds of CPU time per partition.

If both conditions are met, you will be able to apply any desired amount of CPU power to your data processing.
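To make condition (a) concrete, here is a minimal sketch of splitting a numeric primary-key range into contiguous partitions, one per parallel task. The table name `records` and key column `id` are hypothetical, and the helper names are mine, not part of Hadoop or MySQL; treat this as an illustration of the idea rather than how any particular tool does it.

```python
def split_key_range(min_id, max_id, num_splits):
    """Split the inclusive range [min_id, max_id] into contiguous sub-ranges.

    Returns at most num_splits (lo, hi) tuples whose sizes differ by at most 1.
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    if max_id < min_id:
        return []
    total = max_id - min_id + 1
    # Never create more partitions than there are keys.
    num_splits = min(num_splits, total)
    base, extra = divmod(total, num_splits)
    ranges = []
    start = min_id
    for i in range(num_splits):
        # The first `extra` partitions take one additional key each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def partition_queries(table, key, ranges):
    """Build one SELECT per partition (hypothetical table and column names)."""
    return [
        f"SELECT * FROM {table} WHERE {key} BETWEEN {lo} AND {hi}"
        for lo, hi in ranges
    ]


if __name__ == "__main__":
    for query in partition_queries("records", "id", split_key_range(1, 10, 3)):
        print(query)
```

Each generated query could then feed one map task. This only balances well when the keys are spread fairly evenly across the range; a skewed key distribution would need a different split strategy.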

If you are doing a simple scan or aggregation, I think you will not gain anything. On the other hand, if you are going to run some CPU-intensive algorithms on each partition, then your gain can indeed be significant.
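The trade-off can be sketched with a back-of-envelope model. It assumes every partition costs the same CPU time and that each task pays a fixed startup and data-transfer overhead; the overhead value used below is an illustrative assumption, not a measurement of any real cluster.

```python
import math


def estimated_speedup(cpu_seconds, overhead_seconds, partitions, workers):
    """Rough speedup of parallel processing over a single serial pass.

    Serial time:   partitions * cpu_seconds
    Parallel time: number of waves * (cpu_seconds + overhead_seconds),
                   where a wave is up to `workers` partitions run at once.
    """
    serial = partitions * cpu_seconds
    waves = math.ceil(partitions / workers)
    parallel = waves * (cpu_seconds + overhead_seconds)
    return serial / parallel


if __name__ == "__main__":
    # Heavy partitions (10 s CPU each): parallelism pays off.
    print(estimated_speedup(10.0, 5.0, partitions=100, workers=10))
    # Light partitions (0.1 s CPU each): overhead dominates, result is below 1.
    print(estimated_speedup(0.1, 5.0, partitions=100, workers=10))
```

Under this model a value below 1 means the distributed run is slower than the serial one, which matches the point above about simple scans and aggregations.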

I would also mention a separate case: when your processing requires massive data sorting. I do not think that MySQL will be good at sorting billions of records. Hadoop will do it.
