Scala problem: Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(II[BI[BIILjava/lang/String;JZ)V
Error details:
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(II[BI[BIILjava/lang/String;JZ)V
at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(Native Method)
at org.apache.hadoop.util.NativeCrc32.calculateChunkedSumsByteArray(NativeCrc32.java:86)
at org.apache.hadoop.util.DataChecksum.calculateChunkedSums(DataChecksum.java:430)
at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:202)
at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:163)
at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:144)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:400)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at java.util.zip.DeflaterOutputStream.close(DeflaterOutputStream.java:241)
at org.apache.hadoop.io.compress.GzipCodec$GzipOutputStream.close(GzipCodec.java:74)
at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:108)
at org.apache.spark.SparkHadoopWriter.close(SparkHadoopWriter.scala:103)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply$mcV$sp(PairRDDFunctions.scala:1206)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1259)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1205)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1185)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
The error came up while running Scala code that processes local files; searching online suggests it is caused by the remote HDFS connection going wrong. Note what this UnsatisfiedLinkError actually means: the JVM did find and load a hadoop.dll, but that DLL does not export the native method signature that the Hadoop jars on the classpath expect, i.e. the hadoop.dll version does not match the Hadoop libraries. A minimal repro sketch follows, and then the roughly three fixes recommended online:
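For context, here is a minimal sketch of the kind of write that hits this code path (the object name and output path are made up; Spark 1.x API as in the trace above). Even a write to a local path goes through Hadoop's ChecksumFileSystem, which on Windows calls into hadoop.dll for the native CRC32 routine seen at the top of the trace:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.hadoop.io.compress.GzipCodec

// Minimal repro sketch. Saving an RDD as gzipped text triggers the
// checksum path (FSOutputSummer -> NativeCrc32) from the stack trace.
object ReproSave {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("repro").setMaster("local[*]"))
    sc.parallelize(Seq("a", "b", "c"))
      .saveAsTextFile("file:///C:/tmp/out", classOf[GzipCodec])
    sc.stop()
  }
}
```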
Solution 1
Most of the other answers I read say it is enough to copy hadoop.dll and winutils.exe into C:/Windows/System32 (the DLL must match the Hadoop version your project actually uses).
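Whichever fix you try, it helps to first confirm what the JVM actually loaded. A small diagnostic sketch (the object name is made up; NativeCodeLoader is Hadoop's real native-library loader class):

```scala
import org.apache.hadoop.util.NativeCodeLoader

// Diagnostic: did Hadoop load a native library, and where does the JVM
// search for DLLs? Beware: a mismatched hadoop.dll can load "successfully"
// here and still throw UnsatisfiedLinkError later on a single native method.
object NativeCheck {
  def main(args: Array[String]): Unit = {
    println(s"native hadoop loaded: ${NativeCodeLoader.isNativeCodeLoaded}")
    println(s"java.library.path = ${System.getProperty("java.library.path")}")
  }
}
```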
Solution 2
Remove the HADOOP_HOME environment variable configured on your machine (and don't forget to remove the corresponding entry from PATH). The reason is that the later save step needs to call Windows I/O, and the program must not pick up the local Hadoop installation as its runtime environment.
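A quick way to confirm the variable is really gone from the JVM's point of view (a hypothetical sanity check, not part of the fix itself):

```scala
// Sanity check: print what the JVM sees after you remove the variable.
// A stale HADOOP_HOME (or a Hadoop bin directory left on PATH) is exactly
// what this workaround is meant to eliminate.
object EnvCheck {
  def main(args: Array[String]): Unit = {
    println(sys.env.getOrElse("HADOOP_HOME", "HADOOP_HOME is not set"))
    println(sys.env.getOrElse("PATH", ""))
  }
}
```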
Solution 3
Before running the program, add the JVM option -Djava.library.path=$HADOOP_HOME/lib/native. In IDEA this goes in the run configuration's VM options field (the original post marks the spot with a screenshot); see the sketch below.
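As a sketch of what that looks like (the path below is an assumption; substitute the native-library directory of the Hadoop version your project actually uses):

```scala
// In IDEA: Run > Edit Configurations > VM options, e.g. (example path):
//
//   -Djava.library.path=C:\hadoop-2.7.3\lib\native
//
// Then verify at startup that the option was picked up:
object LibPathCheck {
  def main(args: Array[String]): Unit = {
    println(System.getProperty("java.library.path"))
  }
}
```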
The three methods above are reproduced from: https://blog.csdn.net/Vector97/article/details/99695838
My own fix was to delete hadoop.dll from /windows/system32. This should amount to the same idea as that blogger's Solution 3: either way, the JVM stops resolving a mismatched hadoop.dll first. Still, I would recommend the blogger's third method above; try it yourselves and see what works.