Greenplum与mysql集群_从MySQL向Greenplum集群中导入数据

我们要从MySQL当中导出数据到Greenplum当中,按照以下步骤就可以

1:将MySQL当中的表导出外部文件

以schema_name.table_name为例

selectproduct_id,number, name, english_name, purchase_name, system_name, bar_code, category_one, category_two, category_three,

parent_id, parent_number, brand_id, supplier_id, price, ad_word, give_integral, shelf_life, FROM_UNIXTIME(shelve_date), product_area, country,

sale_unit, specification, weight, length, width, height, storage_conditions, storage, model, refuse_notes, status, is_promote,

is_gift, is_book, is_outgoing, is_presale, is_fragile, is_have, is_cod, is_return, is_oos, is_seasonal, is_multicity, is_package, is_show, click,

favorite, min_purchase_unit, in_price, refer_in_price, mwaverage_price, is_unique_number, is_batch_number, qs_proportion, shelf_life_proportion, box_specification,

max_unsalable, advent_shelves, pro_warning, FROM_UNIXTIME(add_time), operator_id,FROM_UNIXTIME( audit_time), remark, price_type, new_tag, product_type, business_model, is_sell, return_policy,

package, inventory, merchant_number, modified_time ,now()from schema_name.table_name INTO OUTFILE ‘/tmp/table_name.txt‘;

导的时候需要注意,一些字符的转换,对于这张表来说,主要就是在MySQL当中一些时间格式存储的为INT类型,我们需要进行转化后然后导出,而且在Greenplum当中建表的时候会多一个时间字段,我们这里默认导出现在时间。按照以上格式进行导出。

2:将文件拷贝到Greenplum服务器上,并且创建外部表

先将文件拷贝到外部表的目录下,这个比较简单,什么方法都可以,然后创建外部表:

create external TABLE schema_name.table_name_ext( product_id int,number varchar(10),

namevarchar(100),

english_namevarchar(100),

purchase_namevarchar(100),

system_namevarchar(100),

bar_codevarchar(255),

category_oneint,

category_twoint,

category_threeint,

parent_idint,

parent_numberint,

brand_idint,

supplier_idint,

priceint,

ad_wordvarchar(100),

give_integralint,

shelf_lifeint,

shelve_datetimestampwithout time zone,

product_areaint,

countryint,

sale_unitvarchar(20),

specificationvarchar(255),

weightdecimal(10,2) ,

lengthint,

widthint,

heightint,

storage_conditionsvarchar(255),

storagesmallint,

modelvarchar(20),

refuse_notesvarchar(255),

statussmallint,

is_promotesmallint,

is_giftsmallint,

is_booksmallint,

is_outgoingsmallint,

is_presaleint,

is_fragilesmallint,

is_havesmallint,

is_codsmallint,

is_returnsmallint,

is_oossmallint,

is_seasonalsmallint,

is_multicitysmallint,

is_packagesmallint,

is_showsmallint,

clickint,

favoriteint,

min_purchase_unitint,

in_priceint,

refer_in_priceint,

mwaverage_priceint,

is_unique_numberint,

is_batch_numberint,

qs_proportionint,

shelf_life_proportionDOUBLE PRECISION,

box_specificationvarchar(50),

max_unsalableint,

advent_shelvesint,

pro_warningint,

add_timetimestampwithout time zone,

operator_idint,

audit_timetimestampwithout time zone,

remarkvarchar(255),

price_typesmallint,

new_tagint,

product_typeint,

business_modelsmallint,

is_sellsmallint,

return_policysmallint,

packagevarchar(200),

inventoryvarchar(200),

merchant_numberint,

modified_timetimestampwithout time zone,

dw_modified_timetimestampwithout time zone

) location(‘gpfdist://172.16.16.34:9888/table_name.txt‘)

FORMAT‘TEXT‘SEGMENT REJECT LIMIT1000000rows ;

这里我们要指定‘gpfdist://10.102.35.192:9888/table_name.txt‘,这个IP地址加上外部表就可以了,后面要把这个文件拷贝到 gpfdist 的目录当中,我们看下启动方式gpfdist -d /tmp -p 9888,也就是要把外部文件拷贝到/tmp目录下才可以。其他的注意列名对应就好

然后查询一下,一般情况列对上就不会有问题。

3:导入到Greenplum当中正式表

先创建一张正式表:

create table schema_name.table_name ( product_id int,number varchar(10),

namevarchar(100),

english_namevarchar(100),

purchase_namevarchar(100),

system_namevarchar(100),

bar_codevarchar(255),

category_oneint,

category_twoint,

category_threeint,

parent_idint,

parent_numberint,

brand_idint,

supplier_idint,

priceint,

ad_wordvarchar(100),

give_integralint,

shelf_lifeint,

shelve_datetimestampwithout time zone,

product_areaint,

countryint,

sale_unitvarchar(20),

specificationvarchar(255),

weightdecimal(10,2) ,

lengthint,

widthint,

heightint,

storage_conditionsvarchar(255),

storagesmallint,

modelvarchar(20),

refuse_notesvarchar(255),

statussmallint,

is_promotesmallint,

is_giftsmallint,

is_booksmallint,

is_outgoingsmallint,

is_presaleint,

is_fragilesmallint,

is_havesmallint,

is_codsmallint,

is_returnsmallint,

is_oossmallint,

is_seasonalsmallint,

is_multicitysmallint,

is_packagesmallint,

is_showsmallint,

clickint,

favoriteint,

min_purchase_unitint,

in_priceint,

refer_in_priceint,

mwaverage_priceint,

is_unique_numberint,

is_batch_numberint,

qs_proportionint,

shelf_life_proportionDOUBLE PRECISION,

box_specificationvarchar(50),

max_unsalableint,

advent_shelvesint,

pro_warningint,

add_timetimestampwithout time zone,

operator_idint,

audit_timetimestampwithout time zone,

remarkvarchar(255),

price_typesmallint,

new_tagint,

product_typeint,

business_modelsmallint,

is_sellsmallint,

return_policysmallint,

packagevarchar(200),

inventoryvarchar(200),

merchant_numberint,

modified_timetimestampwithout time zone,

dw_modified_timetimestampwithout time zone

)distributed by(product_id);

然后导入数据:

insert intoschema_name.table_name

select * fromschema_name.table_name_ext

这样就把外部表数据导出到了内部表,均匀分布在每个segment上。注意schema_name.table_name的结构要和schema_name.table_name_ext是一致的。

原文:https://www.cnblogs.com/shengdimaya/p/8762059.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值