因为工作的原因,接触hive有一小段时间了,之前经常是查询或者创建新表,这是第一次导出数据然后导入到另一个服务器里并且建立新表,倒腾了有一上午的时间,记录一下。。。
hive导出表
查询表结构语句:show create table tm_research.collection_offline_user;
CREATE TABLE `dm_research.collection_offline_user`(
`customer_name` string,
`status` string,
`contract_no` string,
`customer_id` string,
`orphan_flag` string,
`assign_time` bigint,
`import_flag` string,
`balance_fee` double,
`area_id` string,
`area_name` string,
`balance_interest` double,
`balance_principal` double,
`bucket` string,
`cd` int,
`cd_r` int,
`contract_amount` double,
`contract_date` bigint,
`cycle` int,
`deduct_bank` string,
`due_date` bigint,
`last_paydate` bigint,
`loan_date` bigint,
`loan_phases` int,
`monthly_payment` double,
`product` string,
`receipt_phases` int,
`receiv_phases` int,
`remain_phases` int,
`source_channels` string,
`store_id` string,
`store_name` string,
`cd24` string,
`dpd` int,
`first_delq_flag` string,
`mas_flag` string,
`real_dpd` int,
`apply_id` string,
`loan_amount` double,
`bank_sub_type` tinyint,
`id_num` string,
`work_type` tinyint,
`use` tinyint,
`use_other` string,
`account_place_p` string,
`account_place_c` string,
`account_place_other` string,
`account_id` string,
`now_place_p` string,
`now_place_c` string,
`now_place_other` string,
`now_place_id` string,
`marriage` tinyint,
`is_child` tinyint,
`house_type` tinyint,
`house_other` string,
`phone_log` tinyint,
`product_area` double,
`product_place_p` string,
`product_place_c` string,
`uint_price` double,
`month_pay` double,
`unit_type` tinyint,
`unit_type_other` string,
`profession` string,
`unit_place_other` string,
`unit_place_p` string,
`unit_place_c` string,
`work_level` string,
`department` string,
`month_income` double,
`true_wage` double,
`income_day` tinyint,
`enterprise_type` tinyint,
`use_credit` tinyint,
`car_brand` string,
`car_number` string,
`car_count` int,
`child_sum` tinyint)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://cluster/NS2/dm_research/collection_offline_user'
根据你hive配置地址找到表在hadoop集群中的文件位置。
我的collection_offline_user表的位置在:NS2/dm_research/collection_offline_user里面
在集群上输入:hadoop fs -get /NS2/dm_research/collection_offline_user,便可将文件下载到本地,
我的是下载到/home/yjy_research/collection_offline_user
因为我的不具有使用xftp5下载数据的权限,所有我使用tar cvf collection_offline_user.tar collection_offline_user/* 打包,然后使用 sz collection_offline_user.tar下载到本地(比较麻烦,但是没办法)
换到另一个服务器上,使用rz将collection_offline_user.tar上传,然后tar xvf collection_offline_user.tar解压缩到collection_offline_user
接下来就要建立新表了,到hive,根据表结构建立新表,其实就是show create table tm_research.collection_offline_user这句话的查询结果哦,我的如下:
DROP TABLE IF EXISTS `dm_research.collection_offline_user`;
create table dm_research.collection_offline_user(
`customer_name` string , `status` string, `contract_no` string,`customer_id` string,`orphan_flag ` string,
`assign_time`bigint , `import_flag`string,`balance_fee`double ,`area_id `string , `area_name `string, `balance_interest`double ,`balance_principal ` double,`bucket`string ,`cd `int,`cd_r ` int,`contract_amount `double ,`contract_date ` bigint ,`cycle `int,`deduct_bank`string ,`due_date` bigint ,`last_paydate`bigint , `loan_date ` bigint, `loan_phases `int, `monthly_payment`double,`product `string ,`receipt_phases `int , `receiv_phases`int, `remain_phases`int ,`source_channels`string ,`store_id `string,`store_name `string , `cd24`string,`dpd `int,`first_delq_flag`string,`mas_flag`string ,`real_dpd`int,`apply_id`string ,`loan_amount`double ,`bank_sub_type`tinyint,`id_num`string ,`work_type`tinyint,`use`tinyint,`use_other`string ,`account_place_p`string ,`account_place_c`string ,`account_place_other` string ,`account_id` string,`now_place_p`string ,`now_place_c`string ,`now_place_other`string ,`now_place_id`string,`marriage`tinyint,`is_child `tinyint,`house_type` tinyint,`house_other`string ,`phone_log`tinyint,`product_area`double , `product_place_p`string ,`product_place_c`string ,`uint_price` double ,`month_pay`double ,`unit_type`tinyint,`unit_type_other`string ,`profession` string ,`unit_place_other`string ,`unit_place_p`string ,`unit_place_c `string ,`work_level` string ,`department` string ,`month_income`double ,`true_wage ` double ,`income_day `tinyint,`enterprise_type`tinyint,`use_credit` tinyint,`car_brand`string ,`car_number` string ,`car_count ` int,`child_sum`tinyint);
然后就可以导入数据了load data local inpath '/home/callcenter/collection_offline_user/' into table dm_research.collection_offline_user;
会显示Loading data to table dm_research.collection_offline_user
OK Time taken: 495.05 seconds
导入OK。