Trafodion: Basic Usage of parquet_tools

Trafodion ships with a parquet_tools executable that can be used to check whether a Parquet file is intact.
parquet_tools is located in the directory $TRAF_HOME/sql/scripts:

cd $TRAF_HOME/sql/scripts/
ls -l parquet_tools

The parquet_tools executable depends on parquet-tools-${PARQUET_VERSION}.jar, as you can see from the contents of the parquet_tools script:

${lv_cmd} jar ${TRAF_HOME}/export/lib/parquet-tools-${PARQUET_VERSION}.jar $*
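To confirm which jar is present on your own installation, a small helper along these lines may be useful. It is only a sketch: the function name find_parquet_tools_jar is hypothetical, and it assumes TRAF_HOME is set and that the jar sits under export/lib as the line above suggests.

```shell
# Hypothetical helper (not part of Trafodion): list the parquet-tools jar(s)
# found under $TRAF_HOME/export/lib.
# Assumes TRAF_HOME is set in the current environment.
find_parquet_tools_jar() {
    # The glob stands in for ${PARQUET_VERSION}, whose value is not known here.
    # If nothing matches, ls fails quietly and the function returns non-zero.
    ls "${TRAF_HOME}"/export/lib/parquet-tools-*.jar 2>/dev/null
}
```

Calling find_parquet_tools_jar prints the matching jar path(s); a non-zero exit status means no such jar was found at that location.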

The usage of parquet_tools is as follows:

parquet_tools -h
usage: parquet-tools cat [option...] <input>
where option is one of:
       --debug     Enable debug output
    -h,--help      Show this help string
    -j,--json      Show records in JSON format.
       --no-color  Disable color output even if supported
where <input> is the parquet file to print to stdout

usage: parquet-tools head [option...] <input>
where option is one of:
       --debug          Enable debug output
    -h,--help           Show this help string
    -n,--records <arg>  The number of records to show (default: 5)
       --no-color       Disable color output even if supported
where <input> is the parquet file to print to stdout

usage: parquet-tools schema [option...] <input>
where option is one of:
    -d,--detailed  Show detailed information about the schema.
       --debug     Enable debug output
    -h,--help      Show this help string
       --no-color  Disable color output even if supported
where <input> is the parquet file containing the schema to show

usage: parquet-tools meta [option...] <input>
where option is one of:
       --debug     Enable debug output
    -h,--help      Show this help string
       --no-color  Disable color output even if supported
where <input> is the parquet file to print to stdout

usage: parquet-tools dump [option...] <input>
where option is one of:
    -c,--column <arg>  Dump only the given column, can be specified more than
                       once
    -d,--disable-data  Do not dump column data
       --debug         Enable debug output
    -h,--help          Show this help string
    -m,--disable-meta  Do not dump row group and page metadata
    -n,--disable-crop  Do not crop the output based on console width
       --no-color      Disable color output even if supported
where <input> is the parquet file to print to stdout

usage: parquet-tools merge [option...] <input> [<input> ...] <output>
where option is one of:
       --debug     Enable debug output
    -h,--help      Show this help string
       --no-color  Disable color output even if supported
where <input> is the source parquet files/directory to be merged
   <output> is the destination parquet file

Here are some basic examples:

# Dump the row group and page metadata for the column DEVICE_NUMBER (-d suppresses the column data)
parquet_tools dump -c DEVICE_NUMBER -d /opt/trafodion/bss_userinfo_20180812_0
# Dump the row group and page metadata for the whole file (-d suppresses the column data)
parquet_tools dump -d /opt/trafodion/bss_userinfo_20180812_0
# Show the first 10 records of the Parquet file
parquet_tools head -n 10 /opt/trafodion/bss_userinfo_20180812_0
# Show the metadata of the Parquet file
parquet_tools meta /opt/trafodion/bss_userinfo_20180812_0
# Show the schema of the Parquet file
parquet_tools schema /opt/trafodion/bss_userinfo_20180812_0
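Since the tool is meant for checking whether Parquet files are intact, the commands above can be wrapped in a loop to check several files at once. The following is a minimal sketch, not something shipped with Trafodion: the function name check_parquet_files and the PARQUET_TOOLS variable are hypothetical, and it assumes that parquet_tools exits with a non-zero status when it cannot read a file.

```shell
# Hypothetical helper: run "parquet_tools meta" on each file and report failures.
# Assumption: parquet_tools exits non-zero when a file cannot be read.
# PARQUET_TOOLS can be overridden to point at the script's full path.
PARQUET_TOOLS="${PARQUET_TOOLS:-parquet_tools}"

check_parquet_files() {
    local failed=0
    local f
    for f in "$@"; do
        # Only the exit status matters here, so discard the metadata output.
        if "$PARQUET_TOOLS" meta "$f" >/dev/null 2>&1; then
            echo "OK   $f"
        else
            echo "FAIL $f"
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every file passed the check.
    [ "$failed" -eq 0 ]
}
```

For example, check_parquet_files /opt/trafodion/bss_userinfo_20180812_0 prints one OK or FAIL line per file and returns a non-zero status if any file failed.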