One is INPUT__FILE__NAME, which is the name of the input file for a mapper task.
The other is BLOCK__OFFSET__INSIDE__FILE, which is the current global file position.
For a block-compressed file, it is the file offset of the current block, that is, the file offset of the current block's first byte.
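As a minimal sketch of how these two virtual columns can be used like ordinary columns, the queries below assume the same test_data table and col column as the example that follows. The row_count alias and the offset threshold are illustrative only.

```sql
-- Count how many rows each input file contributes to the table.
SELECT INPUT__FILE__NAME, count(*) AS row_count
FROM test_data
GROUP BY INPUT__FILE__NAME;

-- Keep only rows that start past the first byte of their file
-- (for block-compressed files the offset is that of the enclosing block).
SELECT col, INPUT__FILE__NAME, BLOCK__OFFSET__INSIDE__FILE
FROM test_data
WHERE BLOCK__OFFSET__INSIDE__FILE > 0;
```

Grouping by INPUT__FILE__NAME is a quick way to trace unexpected rows back to the file they came from.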
set hive.exec.rowoffset=true;
select col, INPUT__FILE__NAME, BLOCK__OFFSET__INSIDE__FILE, ROW__OFFSET__INSIDE__BLOCK from test_data;
1 hdfs://bg001:8020/user/hive/warehouse/temp/test_data/1.csv 0 0
2 hdfs://bg001:8020/user/hive/warehouse/temp/test_data/1.csv 2 0
3 hdfs: