This article collects typical usage examples of org.apache.hadoop.hive.ql.io.RCFile.Reader in Java. If you have been wondering what RCFile.Reader is for, how to use it, or want to see it in context, the curated code examples below may help. You can also explore other usage examples of its enclosing class, org.apache.hadoop.hive.ql.io.RCFile.
The following shows 5 code examples of RCFile.Reader, sorted by popularity by default. You can upvote the examples you like or find useful; your feedback helps our system recommend better Java code examples.
Example 1: getSampleData
Votes: 3
import org.apache.hadoop.hive.ql.io.RCFile; // import the package/class this method depends on
@Override
public SampleDataRecord getSampleData(Path path) throws IOException {
    SampleDataRecord sampleDataRecord = null;
    List sampleData = null;
    if (!fs.exists(path))
        LOG.error("File path {} does not exist in HDFS", path.toUri().getPath());
    else {
        RCFile.Reader reader = null;
        try {
            reader = new RCFile.Reader(fs, path, fs.getConf());
            sampleData = getSampleData(reader);
            sampleDataRecord = new SampleDataRecord(path.toUri().getPath(), sampleData);
        } catch (Exception e) {
            // the file exists but could not be read as an RCFile
            LOG.error("Path {} is not in RCFile format", path.toUri().getPath(), e);
        } finally {
            if (reader != null) {
                reader.close();
            }
        }
    }
    return sampleDataRecord;
}
Author: thomas-young-2013, Project: wherehowsX, Lines: 19
Example 2: getSchema
Votes: 2
import org.apache.hadoop.hive.ql.io.RCFile; // import the package/class this method depends on
@Override
public DatasetJsonRecord getSchema(Path path) throws IOException {
    DatasetJsonRecord record = null;
    if (!fs.exists(path))
        LOG.error("File path {} does not exist in HDFS", path);
    else {
        RCFile.Reader reader = null;
        try {
            reader = new RCFile.Reader(fs, path, fs.getConf());
            Map<Text, Text> meta = reader.getMetadata().getMetadata();
            // number of columns recorded in the RCFile metadata
            int columnNumber = Integer.parseInt(meta.get(new Text(COLUMN_NUMBER_KEY)).toString());
            FileStatus status = fs.getFileStatus(path);
            String schemaString = getRCFileSchema(columnNumber);
            String storage = STORAGE_TYPE;
            String abstractPath = path.toUri().getPath();
            String codec = "rc.codec";
            record = new DatasetJsonRecord(schemaString, abstractPath, status.getModificationTime(), status.getOwner(), status.getGroup(),
                    status.getPermission().toString(), codec, storage, "");
            LOG.info("RC file {} schema is {}", path.toUri().getPath(), schemaString);
        } catch (Exception e) {
            // the file exists but could not be read as an RCFile
            LOG.error("Path {} is not in RCFile format", path.toUri().getPath(), e);
        } finally {
            if (reader != null) {
                reader.close();
            }
        }
    }
    return record;
}
Author: thomas-young-2013, Project: wherehowsX, Lines: 28
Example 3: doProcess
Votes: 2
import org.apache.hadoop.hive.ql.io.RCFile; // import the package/class this method depends on
@Override
protected boolean doProcess(Record record, InputStream in) throws IOException {
    Path attachmentPath = getAttachmentPath(record);
    SingleStreamFileSystem fs = new SingleStreamFileSystem(in, attachmentPath);
    RCFile.Reader reader = null;
    try {
        reader = new RCFile.Reader(fs, attachmentPath, conf);
        Record template = record.copy();
        removeAttachments(template);
        template.put(Fields.ATTACHMENT_MIME_TYPE, OUTPUT_MEDIA_TYPE);
        if (includeMetaData) {
            SequenceFile.Metadata metadata = reader.getMetadata();
            if (metadata != null) {
                template.put(RC_FILE_META_DATA, metadata);
            }
        }
        switch (readMode) {
            case row:
                return readRowWise(reader, template);
            case column:
                return readColumnWise(reader, template);
            default:
                throw new IllegalStateException();
        }
    } catch (IOException e) {
        throw new MorphlineRuntimeException("IOException while processing attachment "
            + attachmentPath.getName(), e);
    } finally {
        if (reader != null) {
            reader.close();
        }
    }
}
Author: cloudera, Project: cdk, Lines: 35
Example 4: readRowWise
Votes: 2
import org.apache.hadoop.hive.ql.io.RCFile; // import the package/class this method depends on
private boolean readRowWise(final RCFile.Reader reader, final Record record)
        throws IOException {
    LongWritable rowID = new LongWritable();
    while (true) {
        boolean next;
        try {
            next = reader.next(rowID);
        } catch (EOFException ex) {
            // we have hit EOF of the stream
            break;
        }
        if (!next) {
            break;
        }
        incrementNumRecords();
        Record outputRecord = record.copy();
        BytesRefArrayWritable rowBatchBytes = new BytesRefArrayWritable();
        rowBatchBytes.resetValid(columns.size());
        reader.getCurrentRow(rowBatchBytes);
        // read all configured columns and set them on the output record
        for (RCFileColumn rcColumn : columns) {
            BytesRefWritable columnBytes = rowBatchBytes.get(rcColumn.getInputField());
            outputRecord.put(rcColumn.getOutputField(), updateColumnValue(rcColumn, columnBytes));
        }
        // pass record to next command in chain:
        if (!getChild().process(outputRecord)) {
            return false;
        }
    }
    return true;
}
Author: cloudera, Project: cdk, Lines: 38
Example 5: readColumnWise
Votes: 2
import org.apache.hadoop.hive.ql.io.RCFile; // import the package/class this method depends on
private boolean readColumnWise(RCFile.Reader reader, Record record) throws IOException {
    for (RCFileColumn rcColumn : columns) {
        // rewind to the start of the file for each column pass
        reader.sync(0);
        reader.resetBuffer();
        while (true) {
            boolean next;
            try {
                next = reader.nextBlock();
            } catch (EOFException ex) {
                // we have hit EOF of the stream
                break;
            }
            if (!next) {
                break;
            }
            BytesRefArrayWritable rowBatchBytes = reader.getColumn(rcColumn.getInputField(), null);
            for (int rowIndex = 0; rowIndex < rowBatchBytes.size(); rowIndex++) {
                incrementNumRecords();
                Record outputRecord = record.copy();
                BytesRefWritable rowBytes = rowBatchBytes.get(rowIndex);
                outputRecord.put(rcColumn.getOutputField(), updateColumnValue(rcColumn, rowBytes));
                // pass record to next command in chain:
                if (!getChild().process(outputRecord)) {
                    return false;
                }
            }
        }
    }
    return true;
}
Author: cloudera, Project: cdk, Lines: 35
Note: the org.apache.hadoop.hive.ql.io.RCFile.Reader examples in this article were collected from source-code and documentation platforms such as GitHub and MSDocs. The code fragments were selected from open-source projects contributed by their respective developers; copyright remains with the original authors, and distribution and use should follow each project's License. Do not reproduce without permission.