java – How to specify/define a coder for a class that wraps TableRow

I have defined a class that wraps com.google.api.services.bigquery.model.TableRow, holding it as an internal member:

public class TableRowWrapper implements Serializable {

    private TableRow tableRow;

    public TableRowWrapper() {
    }

    ...
}

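I also have some DoFns that take and emit instances of this TableRowWrapper class, producing a PCollection<TableRowWrapper>. I have tried annotating the class with @DefaultCoder(SerializableCoder.class) and with @DefaultCoder(AvroCoder.class), but encoding always fails because no coder can be found for the TableRow member.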

Here is an example of the failure when using AvroCoder:

java.lang.IllegalArgumentException: Unable to encode element 'com.test.bigquery.api.TableRowWrapper@5129e8a6' with coder 'AvroCoder'.

at com.google.cloud.dataflow.sdk.coders.StandardCoder.getEncodedElementByteSize(StandardCoder.java:177)

at com.google.cloud.dataflow.sdk.coders.StandardCoder.registerByteSizeObserver(StandardCoder.java:191)

at com.google.cloud.dataflow.sdk.util.WindowedValue$FullWindowedValueCoder.registerByteSizeObserver(WindowedValue.java:633)

at com.google.cloud.dataflow.sdk.util.WindowedValue$FullWindowedValueCoder.registerByteSizeObserver(WindowedValue.java:542)

at com.google.cloud.dataflow.sdk.runners.worker.MapTaskExecutorFactory$ElementByteSizeObservableCoder.registerByteSizeObserver(MapTaskExecutorFactory.java:429)

at com.google.cloud.dataflow.sdk.util.common.worker.OutputObjectAndByteCounter.update(OutputObjectAndByteCounter.java:115)

at com.google.cloud.dataflow.sdk.runners.worker.DataflowOutputCounter.update(DataflowOutputCounter.java:61)

at com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:46)

at com.google.cloud.dataflow.sdk.runners.worker.ParDoFnBase$1.output(ParDoFnBase.java:157)

at com.google.cloud.dataflow.sdk.util.DoFnRunner$DoFnContext.outputWindowedValue(DoFnRunner.java:329)

at com.google.cloud.dataflow.sdk.util.DoFnRunner$DoFnProcessContext.output(DoFnRunner.java:483)

at com.test.cdf.wrapper.pipeline.DataflowPipeline$TableRowToWrapperDoFn.processElement(DataflowPipeline.java:203)

Caused by: java.lang.NullPointerException: in com.test.bigquery.api.TableRowWrapper in com.google.api.services.bigquery.model.TableRow in array null of array in field f of com.google.api.services.bigquery.model.TableRow in field tableRow of com.test.bigquery.api.TableRowWrapper

at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:145)

at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)

at com.google.cloud.dataflow.sdk.coders.AvroCoder.encode(AvroCoder.java:227)

at com.google.cloud.dataflow.sdk.coders.StandardCoder.getEncodedElementByteSize(StandardCoder.java:174)

at com.google.cloud.dataflow.sdk.coders.StandardCoder.registerByteSizeObserver(StandardCoder.java:191)

at com.google.cloud.dataflow.sdk.util.WindowedValue$FullWindowedValueCoder.registerByteSizeObserver(WindowedValue.java:633)

at com.google.cloud.dataflow.sdk.util.WindowedValue$FullWindowedValueCoder.registerByteSizeObserver(WindowedValue.java:542)

at com.google.cloud.dataflow.sdk.runners.worker.MapTaskExecutorFactory$ElementByteSizeObservableCoder.registerByteSizeObserver(MapTaskExecutorFactory.java:429)

at com.google.cloud.dataflow.sdk.util.common.worker.OutputObjectAndByteCounter.update(OutputObjectAndByteCounter.java:115)

at com.google.cloud.dataflow.sdk.runners.worker.DataflowOutputCounter.update(DataflowOutputCounter.java:61)

at com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:46)

at com.google.cloud.dataflow.sdk.runners.worker.ParDoFnBase$1.output(ParDoFnBase.java:157)

at com.google.cloud.dataflow.sdk.util.DoFnRunner$DoFnContext.outputWindowedValue(DoFnRunner.java:329)

at com.google.cloud.dataflow.sdk.util.DoFnRunner$DoFnProcessContext.output(DoFnRunner.java:483)

at com.test.cdf.wrapper.pipeline.DataflowPipeline$TableRowToWrapperDoFn.processElement(DataflowPipeline.java:203)

at com.google.cloud.dataflow.sdk.util.DoFnRunner.invokeProcessElement(DoFnRunner.java:189)

at com.google.cloud.dataflow.sdk.util.DoFnRunner.processElement(DoFnRunner.java:171)

at com.google.cloud.dataflow.sdk.runners.worker.ParDoFnBase.processElement(ParDoFnBase.java:193)

at com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.process(ParDoOperation.java:52)

at com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)

at com.google.cloud.dataflow.sdk.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:171)

at com.google.cloud.dataflow.sdk.util.common.worker.ReadOperation.start(ReadOperation.java:117)

at com.google.cloud.dataflow.sdk.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:66)

at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.executeWork(DataflowWorker.java:234)

at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.doWork(DataflowWorker.java:171)

at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:137)

at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:147)

at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:132)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

Caused by: java.lang.NullPointerException

at org.apache.avro.reflect.ReflectDatumWriter.writeArray(ReflectDatumWriter.java:67)

at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:68)

at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:143)

at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:114)

at org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:175)

at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)

at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)

at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:143)

at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:114)

at org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:175)

at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)

at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)

at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:143)

... 31 more

How can I define a coder for this class?

Perhaps the simplest way to provide a coder for your TableRowWrapper is a thin wrapper that delegates directly to TableRowJsonCoder:

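import com.google.api.services.bigquery.model.TableRow;
import com.google.cloud.dataflow.sdk.coders.Coder;
import com.google.cloud.dataflow.sdk.coders.CustomCoder;
import com.google.cloud.dataflow.sdk.coders.TableRowJsonCoder;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class TableRowWrapperCoder extends CustomCoder<TableRowWrapper> {

    // Delegate all byte-level work to the SDK's existing TableRow coder.
    private static final Coder<TableRow> tableRowCoder = TableRowJsonCoder.of();

    // Assumes TableRowWrapper exposes getRow() and a constructor taking a TableRow.
    @Override
    public void encode(TableRowWrapper value, OutputStream outStream, Context context)
            throws IOException {
        tableRowCoder.encode(value.getRow(), outStream, context);
    }

    @Override
    public TableRowWrapper decode(InputStream inStream, Context context)
            throws IOException {
        return new TableRowWrapper(tableRowCoder.decode(inStream, context));
    }

    ...
}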

You can then register this coder for the whole pipeline with:

pipeline.getCoderRegistry()
    .registerCoder(TableRowWrapper.class, new TableRowWrapperCoder());
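The delegation idea behind the answer can be tried without the Dataflow SDK on the classpath. The following is only a minimal sketch of the pattern, not the SDK's API: SimpleCoder, Row, RowWrapper, RowCoder and RowWrapperCoder are all hypothetical stand-ins (for Coder, TableRow, TableRowWrapper, TableRowJsonCoder and TableRowWrapperCoder respectively), and the byte format used here is an arbitrary choice for illustration rather than the JSON encoding that TableRowJsonCoder produces.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class DelegatingCoderSketch {

    /** Hypothetical stand-in for the SDK's Coder<T>; not a real Dataflow type. */
    interface SimpleCoder<T> {
        void encode(T value, OutputStream out) throws IOException;
        T decode(InputStream in) throws IOException;
    }

    /** Hypothetical stand-in for TableRow: just an ordered string-to-string map. */
    static final class Row {
        final Map<String, String> fields = new LinkedHashMap<>();
    }

    /** Stand-in for TableRowWrapper, with the accessor and constructor the coder relies on. */
    static final class RowWrapper {
        private final Row row;
        RowWrapper(Row row) { this.row = row; }
        Row getRow() { return row; }
    }

    /** Plays the role of TableRowJsonCoder: it already knows how to write and read a Row. */
    static final class RowCoder implements SimpleCoder<Row> {
        @Override
        public void encode(Row value, OutputStream out) throws IOException {
            DataOutputStream data = new DataOutputStream(out);
            data.writeInt(value.fields.size());
            for (Map.Entry<String, String> entry : value.fields.entrySet()) {
                data.writeUTF(entry.getKey());
                data.writeUTF(entry.getValue());
            }
            data.flush();
        }

        @Override
        public Row decode(InputStream in) throws IOException {
            DataInputStream data = new DataInputStream(in);
            Row row = new Row();
            int count = data.readInt();
            for (int i = 0; i < count; i++) {
                String key = data.readUTF();
                String value = data.readUTF();
                row.fields.put(key, value);
            }
            return row;
        }
    }

    /** The thin wrapper: unwrap before encoding, re-wrap after decoding. */
    static final class RowWrapperCoder implements SimpleCoder<RowWrapper> {
        private final SimpleCoder<Row> rowCoder = new RowCoder();

        @Override
        public void encode(RowWrapper value, OutputStream out) throws IOException {
            rowCoder.encode(value.getRow(), out);
        }

        @Override
        public RowWrapper decode(InputStream in) throws IOException {
            return new RowWrapper(rowCoder.decode(in));
        }
    }

    /** Encodes a wrapper holding the given fields, decodes it again, and returns the decoded fields. */
    static Map<String, String> roundTrip(Map<String, String> fields) {
        try {
            Row row = new Row();
            row.fields.putAll(fields);
            RowWrapperCoder coder = new RowWrapperCoder();
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            coder.encode(new RowWrapper(row), bytes);
            RowWrapper decoded = coder.decode(new ByteArrayInputStream(bytes.toByteArray()));
            return decoded.getRow().fields;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "alice");
        System.out.println(roundTrip(fields).equals(fields));
    }
}
```

The wrapper coder holds no serialization logic of its own, so anything the inner coder handles correctly (including values that tripped up reflection-based encoding in the question) is handled the same way for the wrapper.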
