Phoenix Issue Summary

1. Bulk loading data with Phoenix Bulk Data Loading fails: null values in a DATE column cannot be parsed

18/05/14 13:10:52 INFO mapreduce.Job: Task Id : attempt_1525509822813_0015_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException: org.apache.phoenix.schema.IllegalDataException: java.lang.IllegalArgumentException: Invalid format: "null"
at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:205)
at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:77)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.phoenix.schema.IllegalDataException: java.lang.IllegalArgumentException: Invalid format: "null"
at org.apache.phoenix.util.DateUtil$ISODateFormatParser.parseDateTime(DateUtil.java:333)
at org.apache.phoenix.util.csv.CsvUpsertExecutor$SimpleDatatypeConversionFunction.apply(CsvUpsertExecutor.java:169)
at org.apache.phoenix.util.csv.CsvUpsertExecutor$SimpleDatatypeConversionFunction.apply(CsvUpsertExecutor.java:120)
at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:85)
at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:52)
at org.apache.phoenix.util.UpsertExecutor.execute(UpsertExecutor.java:133)
at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:174)
... 9 more
Caused by: java.lang.IllegalArgumentException: Invalid format: "null"
at org.apache.phoenix.shaded.org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:673)
at org.apache.phoenix.util.DateUtil$ISODateFormatParser.parseDateTime(DateUtil.java:331)
... 15 more


Solution: when importing with Sqoop, map both null strings and null non-string values to the empty string by adding --null-string '' --null-non-string ''; Phoenix then treats the empty fields as NULL automatically. A sketch of such an import command follows below.
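A minimal sketch of the Sqoop import, assuming a hypothetical Oracle source; the connection string, credentials, table name, and target directory are placeholders, and '\001' matches the field delimiter used elsewhere in this post:

sqoop import \
  --connect jdbc:oracle:thin:@//db-host:1521/ORCL \
  --username scott \
  --password-file /user/hdfs/.sqoop.pw \
  --table MD3U_SICK_TYPE \
  --target-dir /tmp/md3u_sick_type \
  --fields-terminated-by '\001' \
  --null-string '' \
  --null-non-string ''
  # with both null options set to '', empty fields reach Phoenix and are stored as NULL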


2. HBase errors during the bulk load

18/05/14 13:38:55 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://nameservice1/tmp/1cf5afb5-9f13-4c13-b7a9-b22125cbe22c/MD3U_SICK_TYPE/0/a6dc29bce6df4a7b9d3e83a4fae58c76 first=000000 last=\xE6\xB3\x8C\xE5\xB0\xBF\xE5\xA4\x96\xE7\xA7\x91


18/05/14 13:50:09 INFO client.RpcRetryingCaller: Call exception, tries=10, retries=35, started=674743 ms ago, cancelled=false, msg=row '' on table 'MD3U_SICK_TYPE' at region=MD3U_SICK_TYPE,,1526264019025.2be9fc9e013fafc7086ac5a97234d771., hostname=dw91.localdomain,60020,1525654829835, seqNum=2
18/05/14 13:51:25 INFO client.RpcRetryingCaller: Call exception, tries=11, retries=35, started=749911 ms ago, cancelled=false, msg=row '' on table 'MD3U_SICK_TYPE' at region=MD3U_SICK_TYPE,,1526264019025.2be9fc9e013fafc7086ac5a97234d771., hostname=dw91.localdomain,60020,1525654829835, seqNum=2
18/05/14 13:52:40 INFO client.RpcRetryingCaller: Call exception, tries=12, retries=35, started=825171 ms ago, cancelled=false, msg=row '' on table 'MD3U_SICK_TYPE' at region=MD3U_SICK_TYPE,,1526264019025.2be9fc9e013fafc7086ac5a97234d771., hostname=dw91.localdomain,60020,1525654829835, seqNum=2
18/05/14 13:53:55 INFO client.RpcRetryingCaller: Call exception, tries=13, retries=35, started=900443 ms ago, cancelled=false, msg=row '' on table 'MD3U_SICK_TYPE' at region=MD3U_SICK_TYPE,,1526264019025.2be9fc9e013fafc7086ac5a97234d771., hostname=dw91.localdomain,60020,1525654829835, seqNum=2
18/05/14 13:55:10 INFO client.RpcRetryingCaller: Call exception, tries=14, retries=35, started=975626 ms ago, cancelled=false, msg=row '' on table 'MD3U_SICK_TYPE' at region=MD3U_SICK_TYPE,,1526264019025.2be9fc9e013fafc7086ac5a97234d771., hostname=dw91.localdomain,60020,1525654829835, seqNum=2

Solution: the Cloudera Manager logs showed that HBase reads the HFiles as user=hbase, while the data written by the Sqoop import is owned by hdfs with mode 755, so the hbase user has no permission to rename the files and the load fails. Changing the HBase system user in CM from hbase to hdfs and restarting HBase fixed the problem. A sketch of an alternative fix follows below.
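If changing the CM system user is not desirable, an alternative (a sketch, not what was done here) is to hand the bulk-load staging directory over to the hbase user before the load runs; the path below is the staging directory from the log above, and running the commands as the hdfs superuser is an assumption about the cluster setup:

# check who owns the generated HFiles
sudo -u hdfs hdfs dfs -ls /tmp/1cf5afb5-9f13-4c13-b7a9-b22125cbe22c/MD3U_SICK_TYPE/0
# give ownership to the hbase user so LoadIncrementalHFiles can rename/move the files
sudo -u hdfs hdfs dfs -chown -R hbase:hbase /tmp/1cf5afb5-9f13-4c13-b7a9-b22125cbe22c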


3. Sqoop-imported data does not match the Phoenix table columns

18/05/15 14:41:58 INFO mapreduce.Job: Task Id : attempt_1525509822813_0027_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: java.lang.IllegalArgumentException: CSV record does not have enough values (has 6, but needs 41)
at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:205)
at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:77)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.IllegalArgumentException: CSV record does not have enough values (has 6, but needs 41)
at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:82)
at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:52)
at org.apache.phoenix.util.UpsertExecutor.execute(UpsertExecutor.java:133)
at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:174)
... 9 more

Solution: copy the file produced by the Sqoop import to the local machine and open it in Notepad++. My field delimiter is '\001', so you can see a control-character marker between fields; check whether records are broken across lines. The data likely contains \n or \r characters, which wrap a CSV record onto a new line. If that is the case, add --hive-drop-import-delims to the Sqoop command to strip the line-break characters; a sketch follows below.
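A sketch of the change, reusing the placeholder import from issue 1; --hive-drop-import-delims is the only addition:

sqoop import \
  --connect jdbc:oracle:thin:@//db-host:1521/ORCL \
  --username scott \
  --password-file /user/hdfs/.sqoop.pw \
  --table MD3U_SICK_TYPE \
  --target-dir /tmp/md3u_sick_type \
  --fields-terminated-by '\001' \
  --null-string '' \
  --null-non-string '' \
  --hive-drop-import-delims
  # strips \n, \r and \01 from string fields so a record never spans multiple lines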

4. Phoenix escape character problem

18/05/16 15:54:06 INFO mapreduce.Job: Task Id : attempt_1525509822813_0044_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: java.lang.IllegalArgumentException: CSV record does not have enough values (has 40, but needs 41)
at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:207)
at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:77)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.IllegalArgumentException: CSV record does not have enough values (has 40, but needs 41)
at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:82)
at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:52)
at org.apache.phoenix.util.UpsertExecutor.execute(UpsertExecutor.java:133)
at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:176)
... 9 more

When Phoenix reads files through Bulk Data Loading, you can pass -e: "Supply a custom escape character, default is a backslash".

In other words, an escape character is supported and defaults to \, so when a column value contains \ it is interpreted as an escape and the record breaks. The fix is to use -e to switch to a different escape character, or to modify CsvToKeyValueMapper.CsvLineParser in the source to remove the escape-character setting; a sketch of the first option follows below.
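A sketch of a CsvBulkLoadTool invocation that overrides the escape character; the client jar name, ZooKeeper quorum, and input path are placeholders, passing the Ctrl-A delimiter via the bash $'\001' syntax is an assumption about the shell, and the escape character chosen must be one that never appears in the data:

hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
  --table MD3U_SICK_TYPE \
  --input /tmp/md3u_sick_type \
  --zookeeper dw91.localdomain:2181 \
  -d $'\001' \
  -e '|'
  # -d sets the field delimiter, -e replaces the default backslash escape character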
