Sqoop import of a MySQL BLOB field: re-parsing BLOB data stored in HDFS after a Sqoop import from Oracle

Using Sqoop, I've successfully imported a few rows from a table that has a BLOB column. The part-m-00000 file now contains all the records, including the BLOB field, as CSV.

Questions:

1) According to the docs, knowledge of the Sqoop-specific format can help in reading those BLOB records.

So, what does "Sqoop-specific format" mean?

2) Each BLOB is a .gz file of a text file containing float data. These .gz files are stored in the Oracle DB as BLOBs and imported into HDFS using Sqoop. How can I get the float data back from the HDFS file?

Any sample code would be of great use.

Solution

I see these options.

Sqoop import from Oracle directly into a Hive table with a binary data type. This option may limit processing outside Hive (MR, Pig, etc.), because you would need to know how the BLOB is stored in Hive as binary. That is the same limitation you described in your question 1.

Sqoop import from Oracle into the Avro, SequenceFile, or ORC file formats, which can hold binary data. You should be able to read the result by creating a Hive external table on top of it, and you can write a Hive UDF to decompress the binary data. This option is more flexible because the data can also be processed easily with MR, especially in the Avro and SequenceFile formats.
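Whichever option you pick, the decompression step itself is small. Here is a minimal Python sketch of it. It assumes the text-mode import wrote each inline BLOB as hex byte pairs separated by spaces (please check your part-m-00000 file; this is an assumption, not something confirmed in this thread) and that the decompressed text holds whitespace-separated floats. The function name `blob_field_to_floats` is made up for illustration.

```python
import gzip


def blob_field_to_floats(hex_field: str) -> list[float]:
    """Turn one hex-encoded, gzip-compressed BLOB field into a list of floats.

    Assumption: the field looks like "1f 8b 08 ..." (hex byte pairs,
    optionally separated by spaces). Adjust the decoding if your file differs.
    """
    # bytes.fromhex ignores the spaces between byte pairs.
    raw = bytes.fromhex(hex_field.strip())
    # The BLOB is a .gz file, so the raw bytes are a complete gzip stream.
    text = gzip.decompress(raw).decode("utf-8")
    # Assumption: floats are separated by spaces and/or newlines.
    return [float(token) for token in text.split()]


# Build a stand-in for one BLOB field so the sketch can run without HDFS.
sample_payload = gzip.compress(b"1.5 2.25\n-3.0\n")
sample_field = " ".join(f"{byte:02x}" for byte in sample_payload)

floats = blob_field_to_floats(sample_field)
print(floats)
```

If you import to Avro or SequenceFile instead, the BLOB should arrive as raw bytes, so the hex decoding line would be dropped and only the `gzip.decompress` and parsing steps remain.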

Hope this helps. How did you resolve it?
