Linux sed -i: the stream editor

Problem:

The steps I took:
Create the table using the following HQL statement:
    CREATE TABLE 10projects(......)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ',';
The table is created and the raw data is loaded into it, but the data reads like this:

select * from 10projects: 

""" e565fb42185c6e9f22806ad9d5ac8a 77""" """ 2e17c8c91cb58132d8103a9aa8797e 80""" """ 45e7ddbdd7023f1eb65a6cc028d741 4f""" 360009001332 40.841691 -73.875457 Bronx NY 10460 urban New York City Dept Of Ed Bronx f f f f f f Mr. f f Literacy Literacy & Language Books highest poverty Grades 9-12 NULL NULL NULL NULL 280.02 341.49 0 308.0 1 f f completed 2007-03-08 2007-03-08 2007-03-08 2003-12-31

The triple double quotes, as in """e565fb42185c6e9f22806ad9d5ac8a77""", are not expected to appear.
I expected the result to be:

e565fb42185c6e9f22806ad9d5ac8a 77 2e17c8c91cb58132d8103a9aa8797e 80 45e7ddbdd7023f1eb65a6cc028d741 4f  360009001332 40.841691 -73.875457 Bronx NY 10460 urban New York City Dept Of Ed Bronx f f f f f f Mr. f f Literacy Literacy & Language Books highest poverty Grades 9-12 NULL NULL NULL NULL 280.02 341.49 0 308.0 1 f f completed 2007-03-08 2007-03-08 2007-03-08 2003-12-31

SOLUTION: 

1. Using CSV SerDe:

Step 1: Download the CSV SerDe JAR.

Step 2: Add the JAR path to the Hive classpath when launching Hive.

$  hive --auxpath /path/to/hive-examples.jar

Step 3: Create the table using the SerDe as the row format.

CREATE TABLE 10projects2 (projectid STRING, teacher_acctid STRING)
  row format serde 'com.bizo.hive.serde.csv.CSVSerde'
  stored as textfile;

Result: two of the three double quotes in each """ are removed, but one double quote (") still remains.

SOURCE: Hadoop Guide, p. 451.


2. Using sed -i, the Linux stream editor:

SOURCE: Linux sed

The s Command:

Syntax:  s/regexp/replacement/flags

The s command can be followed by zero or more flags, including:
g
Apply the replacement to all matches of the regexp on each line, not just the first.
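
To make the effect of the g flag concrete, here is a small sketch that runs the same substitution with and without it. The sample line is invented for illustration and is not taken from the real data file.

```shell
# Hypothetical sample line with two triple-quoted fields.
line='"""a""","""b"""'

# Without g: only the first """ on the line is removed.
first_only="$(printf '%s\n' "$line" | sed 's/"""//')"

# With g: every """ on the line is removed.
all="$(printf '%s\n' "$line" | sed 's/"""//g')"

printf '%s\n' "$first_only" "$all"
```

The first printed line still contains triple quotes after the leading one is stripped, while the second is reduced to a,b.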

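For this case, the -i option makes sed edit the file in place, and the g flag removes every triple quote on each line:
sed -i 's/"""//g' opendata_essays.log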
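
Because -i overwrites the file, it may be worth previewing the change and keeping a backup first. The following is a minimal sketch of that workflow, not the exact procedure used above. It assumes a sed that accepts a backup suffix attached to -i (GNU sed does), and it works on a temporary sample file with an invented line rather than the real opendata_essays.log.

```shell
# Work on a throwaway sample so the real data file is not touched.
sample="$(mktemp)"
printf '%s\n' '"""e565fb42185c6e9f22806ad9d5ac8a77""","""2e17c8c91cb58132d8103a9aa8797e80""",360009001332' > "$sample"

# Preview the substitution on stdout without modifying the file (no -i).
sed 's/"""//g' "$sample"

# Edit in place, keeping the original as a .bak copy next to the file.
sed -i.bak 's/"""//g' "$sample"

# Count lines that still contain a triple quote; 0 means the cleanup worked.
# grep exits non-zero when nothing matches, hence the "|| true".
remaining="$(grep -c '"""' "$sample" || true)"
echo "lines still containing triple quotes: $remaining"
```

If the result looks wrong, the untouched original is still available at "$sample.bak".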
