Reading and writing Parquet in Java: is it possible to read and write Parquet using Java without a dependency on Hadoop and HDFS?

I've been hunting around for a solution to this question.

It appears to me that there is no way to embed reading and writing Parquet format in a Java program without pulling in dependencies on HDFS and Hadoop. Is this correct?

I want to read and write on a client machine, outside of a Hadoop cluster.

I started to get excited about Apache Drill, but it appears that it must run as a separate process. What I need is an in-process ability to read and write a file using the Parquet format.

Solution

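You can write Parquet files outside a Hadoop cluster using the Java Parquet client API. The Hadoop libraries still have to be on the classpath (the writer takes an org.apache.hadoop.fs.Path), but no running cluster or HDFS is required.

Here is sample Java code that writes a Parquet file to the local disk.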

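// Assumes parquet-avro 1.8+ (org.apache.parquet packages) with its Avro and
// hadoop-common dependencies on the classpath; no cluster or HDFS is needed.
import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroSchemaConverter;
import org.apache.parquet.avro.AvroWriteSupport;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.api.WriteSupport;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;
import org.apache.parquet.schema.MessageType;

{
    final String schemaLocation = "/tmp/avro_format.json";
    final Schema avroSchema = new Schema.Parser().parse(new File(schemaLocation));
    final MessageType parquetSchema = new AvroSchemaConverter().convert(avroSchema);
    final WriteSupport<GenericRecord> writeSupport = new AvroWriteSupport<>(parquetSchema, avroSchema);
    // Row-group and page sizes; the library defaults are used here.
    final int BLOCK_SIZE = ParquetWriter.DEFAULT_BLOCK_SIZE;
    final int PAGE_SIZE = ParquetWriter.DEFAULT_PAGE_SIZE;
    final String parquetFile = "/tmp/parquet/data.parquet";
    final Path path = new Path(parquetFile); // a plain local path, not an hdfs:// URI
    ParquetWriter<GenericRecord> parquetWriter = new ParquetWriter<>(path, writeSupport, CompressionCodecName.SNAPPY, BLOCK_SIZE, PAGE_SIZE);
    final GenericRecord record = new GenericData.Record(avroSchema);
    record.put("id", 1);
    record.put("age", 10);
    record.put("name", "ABC");
    record.put("place", "BCD");
    parquetWriter.write(record);
    parquetWriter.close(); // writes the footer; the file is not valid until this runs
}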

The Avro schema file, avro_format.json:

{
  "type": "record",
  "name": "Pojo",
  "namespace": "com.xx.test",
  "fields": [
    { "name": "id",    "type": ["int", "null"] },
    { "name": "age",   "type": ["int", "null"] },
    { "name": "name",  "type": ["string", "null"] },
    { "name": "place", "type": ["string", "null"] }
  ]
}
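
If you want a quick sanity check on the client machine that the writer really produced a Parquet file, here is a minimal sketch that uses only the JDK. It relies on the Parquet file layout, where an ordinary (unencrypted) file starts and ends with the 4-byte ASCII magic "PAR1". The class name ParquetMagicCheck and the method looksLikeParquet are made up for this illustration and are not part of any Parquet library. It does not read any records; it only inspects the first and last four bytes.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the Parquet API.
public class ParquetMagicCheck {
    // Ordinary Parquet files start and end with the 4-byte ASCII magic "PAR1".
    private static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the file begins and ends with the Parquet magic bytes. */
    public static boolean looksLikeParquet(String fileName) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(fileName, "r")) {
            long length = file.length();
            // Header magic (4) + footer length field (4) + footer magic (4).
            if (length < 12) {
                return false;
            }
            byte[] head = new byte[4];
            byte[] tail = new byte[4];
            file.readFully(head);      // first four bytes
            file.seek(length - 4);
            file.readFully(tail);      // last four bytes
            return Arrays.equals(head, MAGIC) && Arrays.equals(tail, MAGIC);
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the output path used in the sample above.
        String fileName = args.length > 0 ? args[0] : "/tmp/parquet/data.parquet";
        System.out.println(fileName + " looks like Parquet: " + looksLikeParquet(fileName));
    }
}
```

A file that fails this check was most likely never closed, since the footer and trailing magic are only written by close(). Passing the check does not prove the contents are valid; for that you would read the file back with the Parquet library.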

Hope this helps.
