Operating on a Snappy-compressed table throws:
java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
Cause: the directory containing the Snappy native library is not on java.library.path.
Fix: edit the spark-defaults.conf configuration file and add:
spark.executor.extraLibraryPath /ldata/Install/hadoop/lib/native
or:
spark.executor.extraJavaOptions -Djava.library.path=/data/Install/hadoop/lib/native
If the Spark job runs in cluster mode, the following must also be set in spark-defaults.conf:
spark.driver.extraLibraryPath /opt/cloudera/parcels/CDH/lib/hadoop/lib/native
spark.executor.extraLibraryPath /opt/cloudera/parcels/CDH/lib/hadoop/lib/native
If the job is instead submitted with spark-submit in client mode, the equivalent command-line flag is:
--driver-library-path /path
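Putting the pieces together, a client-mode submission might look like the sketch below; the YARN master, deploy mode, and application jar name are assumptions, and the native-library path follows the CDH layout used above.

```shell
# Hypothetical client-mode submit; app.jar and the yarn master are assumptions.
NATIVE_DIR=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native

spark-submit \
  --master yarn \
  --deploy-mode client \
  --driver-library-path "$NATIVE_DIR" \
  --conf spark.executor.extraLibraryPath="$NATIVE_DIR" \
  app.jar
```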
An additional way to troubleshoot:
First run hadoop checknative
to see whether the system has the required native support libraries.
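The check itself is a single command; on a healthy install each codec line reports true together with the resolved library path.

```shell
# Report which native libraries (hadoop, zlib, snappy, lz4, bzip2, openssl)
# this Hadoop build can load; -a makes the command fail if any check fails.
hadoop checknative -a
```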
Looking under the Hadoop install, /opt/cloudera/parcels/CDH/lib/hadoop/lib/native does contain a libhadoop.so file,
but the directories actually on java.library.path contain no libhadoop.so.
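To see which side is missing the file, one can scan every entry of a colon-separated library path for libhadoop.so. This is a diagnostic sketch; the example path in the last line is only a stand-in for the JVM's actual java.library.path.

```shell
# check_native_dirs PATH: for each directory on the colon-separated PATH,
# report whether it contains libhadoop.so.
check_native_dirs() {
  local path="$1" dir
  IFS=':' read -ra dirs <<< "$path"
  for dir in "${dirs[@]}"; do
    if [ -e "$dir/libhadoop.so" ]; then
      echo "found: $dir/libhadoop.so"
    else
      echo "missing: $dir"
    fi
  done
}

# Example: inspect a stand-in for java.library.path (directories are assumptions).
check_native_dirs "/usr/java/packages/lib:/usr/lib64:/lib64"
```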
When the program executes System.loadLibrary("hadoop"), the error actually means the file cannot be found, so the load fails with:
Unable to load native-hadoop library for your platform... using builtin-java classes where applicable.
Copying the library into a directory that is on java.library.path resolves the problem.
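The copy step can be wrapped as below. Which destination directory to use depends on the JVM's actual java.library.path, so the paths in the commented usage example are assumptions.

```shell
# copy_native_libs SRC DEST: copy libhadoop.so and any versioned variants
# from the Hadoop native directory into a directory on java.library.path.
copy_native_libs() {
  cp "$1"/libhadoop.so* "$2"/
}

# Example usage (both paths are assumptions for illustration):
#   copy_native_libs /opt/cloudera/parcels/CDH/lib/hadoop/lib/native /usr/java/packages/lib
```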