Java RCFile.Reader Method Code Examples

This article collects typical usage examples of the Java method org.apache.hadoop.hive.ql.io.RCFile.Reader. If you have been wondering what RCFile.Reader is for, how to use it, or where to find examples of it, the curated snippets below may help. You can also explore further usage examples of the enclosing class, org.apache.hadoop.hive.ql.io.RCFile.

The following shows 5 code examples of the RCFile.Reader method, sorted by popularity by default. You can upvote the examples you like or find useful; your feedback helps the system recommend better Java code examples.
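Before the individual examples, here is a minimal, self-contained sketch of the basic row-reading loop that all five examples build on. The command-line path argument and the tab-separated printing are illustrative assumptions, not taken from any example below:

import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.RCFile;
import org.apache.hadoop.hive.serde2.columnar.BytesRefArrayWritable;
import org.apache.hadoop.hive.serde2.columnar.BytesRefWritable;
import org.apache.hadoop.io.LongWritable;

public class RCFileReaderSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path(args[0]); // path to an existing RCFile
        RCFile.Reader reader = new RCFile.Reader(fs, path, conf);
        try {
            LongWritable rowID = new LongWritable();
            BytesRefArrayWritable row = new BytesRefArrayWritable();
            while (reader.next(rowID)) {   // advance to the next row
                reader.getCurrentRow(row); // fill the column buffers for this row
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < row.size(); i++) {
                    BytesRefWritable col = row.get(i);
                    if (i > 0) sb.append('\t');
                    // copy this cell's bytes out of the shared column buffer
                    sb.append(new String(col.getData(), col.getStart(), col.getLength(),
                            StandardCharsets.UTF_8));
                }
                System.out.println(sb);
            }
        } finally {
            reader.close();
        }
    }
}

The loop advances with reader.next(rowID) and then fills the column buffers via getCurrentRow; each cell must be copied out using its start offset and length, because the BytesRefWritable cells reference shared underlying buffers.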

Example 1: getSampleData

Upvotes: 3

import org.apache.hadoop.hive.ql.io.RCFile; // import the package/class this method depends on

@Override
public SampleDataRecord getSampleData(Path path) throws IOException {
    SampleDataRecord sampleDataRecord = null;
    List sampleData = null;
    if (!fs.exists(path)) {
        LOG.error("File path: " + path.toUri().getPath() + " does not exist in HDFS");
    } else {
        RCFile.Reader reader = null;
        try {
            reader = new RCFile.Reader(fs, path, fs.getConf());
            sampleData = getSampleData(reader);
            sampleDataRecord = new SampleDataRecord(path.toUri().getPath(), sampleData);
        } catch (Exception e) {
            LOG.error("path: {} content is not in RCFile format", path.toUri().getPath(), e);
        } finally {
            if (reader != null) {
                reader.close();
            }
        }
    }
    return sampleDataRecord;
}

Developer ID: thomas-young-2013, Project: wherehowsX, Lines of code: 19
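The overloaded getSampleData(RCFile.Reader) helper that this example delegates to is not shown in the snippet. A plausible sketch, assuming it gathers the first few rows as tab-joined strings; the row limit, UTF-8 decoding, and formatting are all assumptions, and it additionally needs java.util.ArrayList/List, java.nio.charset.StandardCharsets, org.apache.hadoop.io.LongWritable, and the org.apache.hadoop.hive.serde2.columnar classes:

// Hypothetical helper, not from the original project: collects up to
// `limit` rows, each rendered as one tab-joined string.
private List<Object> getSampleData(RCFile.Reader reader) throws IOException {
    List<Object> rows = new ArrayList<Object>();
    int limit = 10; // assumed sample size
    LongWritable rowID = new LongWritable();
    BytesRefArrayWritable row = new BytesRefArrayWritable();
    while (rows.size() < limit && reader.next(rowID)) {
        reader.getCurrentRow(row);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < row.size(); i++) {
            BytesRefWritable col = row.get(i);
            if (i > 0) sb.append('\t');
            sb.append(new String(col.getData(), col.getStart(), col.getLength(),
                    StandardCharsets.UTF_8));
        }
        rows.add(sb.toString());
    }
    return rows;
}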

Example 2: getSchema

Upvotes: 2

import org.apache.hadoop.hive.ql.io.RCFile; // import the package/class this method depends on

@Override
public DatasetJsonRecord getSchema(Path path) throws IOException {
    DatasetJsonRecord record = null;
    if (!fs.exists(path)) {
        LOG.error("file path: {} does not exist in HDFS", path);
    } else {
        RCFile.Reader reader = null;
        try {
            reader = new RCFile.Reader(fs, path, fs.getConf());
            Map<Text, Text> meta = reader.getMetadata().getMetadata();
            // RCFile column number, stored in the file's metadata
            int columnNumber = Integer.parseInt(meta.get(new Text(COLUMN_NUMBER_KEY)).toString());
            FileStatus status = fs.getFileStatus(path);
            String schemaString = getRCFileSchema(columnNumber);
            String storage = STORAGE_TYPE;
            String abstractPath = path.toUri().getPath();
            String codec = "rc.codec";
            record = new DatasetJsonRecord(schemaString, abstractPath, status.getModificationTime(),
                    status.getOwner(), status.getGroup(), status.getPermission().toString(),
                    codec, storage, "");
            LOG.info("rc file: {} schema is {}", path.toUri().getPath(), schemaString);
        } catch (Exception e) {
            LOG.error("path: {} content is not in RCFile format", path.toUri().getPath(), e);
        } finally {
            if (reader != null) {
                reader.close();
            }
        }
    }
    return record;
}

Developer ID: thomas-young-2013, Project: wherehowsX, Lines of code: 28
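For context: the column count read from the file metadata here is the value RCFile's writer records under the metadata key "hive.io.rcfile.column.number", exposed as the constant RCFile.COLUMN_NUMBER_METADATA_STR, so COLUMN_NUMBER_KEY in the snippet presumably resolves to that string. A minimal probe, reusing fs and path as in the example above:

RCFile.Reader reader = new RCFile.Reader(fs, path, fs.getConf());
try {
    // SequenceFile.Metadata also offers a direct get(Text) lookup
    Text columnCount = reader.getMetadata().get(new Text(RCFile.COLUMN_NUMBER_METADATA_STR));
    System.out.println("columns: " + columnCount);
} finally {
    reader.close();
}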

Example 3: doProcess

Upvotes: 2

import org.apache.hadoop.hive.ql.io.RCFile; // import the package/class this method depends on

@Override
protected boolean doProcess(Record record, InputStream in) throws IOException {
    Path attachmentPath = getAttachmentPath(record);
    SingleStreamFileSystem fs = new SingleStreamFileSystem(in, attachmentPath);
    RCFile.Reader reader = null;
    try {
        reader = new RCFile.Reader(fs, attachmentPath, conf);
        Record template = record.copy();
        removeAttachments(template);
        template.put(Fields.ATTACHMENT_MIME_TYPE, OUTPUT_MEDIA_TYPE);
        if (includeMetaData) {
            SequenceFile.Metadata metadata = reader.getMetadata();
            if (metadata != null) {
                template.put(RC_FILE_META_DATA, metadata);
            }
        }
        switch (readMode) {
        case row:
            return readRowWise(reader, template);
        case column:
            return readColumnWise(reader, template);
        default:
            throw new IllegalStateException();
        }
    } catch (IOException e) {
        throw new MorphlineRuntimeException("IOException while processing attachment "
            + attachmentPath.getName(), e);
    } finally {
        if (reader != null) {
            reader.close();
        }
    }
}

Developer ID: cloudera, Project: cdk, Lines of code: 35
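To exercise a reader command like this locally, you first need an RCFile to feed it. A minimal writer sketch, assuming Hive's RCFileOutputFormat is available on the classpath; the output path, row count, and cell contents are illustrative:

import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.RCFile;
import org.apache.hadoop.hive.ql.io.RCFileOutputFormat;
import org.apache.hadoop.hive.serde2.columnar.BytesRefArrayWritable;
import org.apache.hadoop.hive.serde2.columnar.BytesRefWritable;

public class RCFileWriterSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/tmp/demo.rc"); // illustrative output path
        int numColumns = 3;                   // illustrative column count

        // The writer picks the column count up from the Configuration.
        RCFileOutputFormat.setColumnNumber(conf, numColumns);

        RCFile.Writer writer = new RCFile.Writer(fs, conf, path);
        try {
            BytesRefArrayWritable row = new BytesRefArrayWritable(numColumns);
            for (int r = 0; r < 100; r++) {
                for (int c = 0; c < numColumns; c++) {
                    byte[] cell = ("r" + r + "c" + c).getBytes(StandardCharsets.UTF_8);
                    row.set(c, new BytesRefWritable(cell, 0, cell.length));
                }
                writer.append(row);
            }
        } finally {
            writer.close();
        }
    }
}

RCFileOutputFormat.setColumnNumber stores the column count in the Configuration; the writer records it in the file metadata, which is what example 2 reads back.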

Example 4: readRowWise

Upvotes: 2

import org.apache.hadoop.hive.ql.io.RCFile; // import the package/class this method depends on

private boolean readRowWise(final RCFile.Reader reader, final Record record)
        throws IOException {
    LongWritable rowID = new LongWritable();
    while (true) {
        boolean next;
        try {
            next = reader.next(rowID);
        } catch (EOFException ex) {
            // We have hit EOF of the stream
            break;
        }
        if (!next) {
            break;
        }
        incrementNumRecords();
        Record outputRecord = record.copy();
        BytesRefArrayWritable rowBatchBytes = new BytesRefArrayWritable();
        rowBatchBytes.resetValid(columns.size());
        reader.getCurrentRow(rowBatchBytes);

        // Read all the configured columns and set them in the output record
        for (RCFileColumn rcColumn : columns) {
            BytesRefWritable columnBytes = rowBatchBytes.get(rcColumn.getInputField());
            outputRecord.put(rcColumn.getOutputField(), updateColumnValue(rcColumn, columnBytes));
        }

        // pass record to next command in chain:
        if (!getChild().process(outputRecord)) {
            return false;
        }
    }
    return true;
}

Developer ID: cloudera, Project: cdk, Lines of code: 38

Example 5: readColumnWise

Upvotes: 2

import org.apache.hadoop.hive.ql.io.RCFile; // import the package/class this method depends on

private boolean readColumnWise(RCFile.Reader reader, Record record) throws IOException {
    for (RCFileColumn rcColumn : columns) {
        // rewind to the start of the file for each column's pass
        reader.sync(0);
        reader.resetBuffer();
        while (true) {
            boolean next;
            try {
                next = reader.nextBlock();
            } catch (EOFException ex) {
                // We have hit EOF of the stream
                break;
            }
            if (!next) {
                break;
            }
            BytesRefArrayWritable rowBatchBytes = reader.getColumn(rcColumn.getInputField(), null);
            for (int rowIndex = 0; rowIndex < rowBatchBytes.size(); rowIndex++) {
                incrementNumRecords();
                Record outputRecord = record.copy();
                BytesRefWritable rowBytes = rowBatchBytes.get(rowIndex);
                outputRecord.put(rcColumn.getOutputField(), updateColumnValue(rcColumn, rowBytes));

                // pass record to next command in chain:
                if (!getChild().process(outputRecord)) {
                    return false;
                }
            }
        }
    }
    return true;
}

Developer ID: cloudera, Project: cdk, Lines of code: 35
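Taken together, examples 4 and 5 show the two access patterns RCFile supports. readRowWise materializes one complete row per next()/getCurrentRow() pair, which fits record-at-a-time pipelines; readColumnWise makes one pass over the file per configured column (hence the reader.sync(0) rewind at the top of the outer loop) and pulls a whole row group of values per nextBlock() call, so columns that are never configured are never deserialized.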

Note: the org.apache.hadoop.hive.ql.io.RCFile.Reader examples in this article were collected from GitHub, MSDocs, and other source-code and documentation platforms. The snippets were selected from open-source projects contributed by various developers; copyright in the source code remains with the original authors. Please consult the corresponding project's license before distributing or using the code, and do not reproduce it without permission.
