scala> val rdd1 = sc.textFile("hdfs://master:9001/spark/spark02/directory/")
14/07/19 17:09:36 INFO MemoryStore: ensureFreeSpace(138763) called with curMem=0, maxMem=309225062
14/07/19 17:09:36 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 135.5 KB, free 294.8 MB)
rdd1: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at textFile at <console>:12
scala> 14/07/19 17:09:45 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@slave01:42733/user/Executor#-2006581551] with ID 1
14/07/19 17:09:48 INFO BlockManagerInfo: Registering block manager slave01:60074 with 593.9 MB RAM
scala> rdd1.toDebugString
java.lang.VerifyError: class org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$SetOwnerRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.La
Error reading a file from HDFS in Spark

Attempting to read a file from HDFS in the Spark shell (Scala) throws a java.lang.VerifyError on classes such as ClientNamenodeProtocolProtos. The error stems from a protobuf version conflict on the classpath: Hadoop 2.x is built against protobuf 2.5, while an older dependency (pre-0.18 Mesos) bundles an incompatible protobuf 2.4. The fix is to update Spark's Mesos dependency to 0.18.0 or later (which ships protobuf shaded), or to recompile Spark against the matching Hadoop and protobuf versions to resolve the underlying Maven dependency conflict.
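The recompile option above can be sketched as follows. This is a hedged example, not a verified recipe: it assumes a Spark 1.x source tree whose Maven build exposes the `hadoop.version` and `protobuf.version` properties, and that the cluster runs Hadoop 2.2.0 — substitute your actual versions.

```shell
# Rebuild Spark against the cluster's Hadoop, pinning protobuf to the
# version Hadoop 2.x expects (2.5.0), so that only one protobuf copy
# ends up on the classpath. Versions here are assumptions; adjust them.
cd spark-src    # hypothetical path to the Spark source checkout
mvn -Dhadoop.version=2.2.0 -Dprotobuf.version=2.5.0 -DskipTests clean package
```

After rebuilding, redeploy the resulting assembly jar to all workers and restart the shell; a leftover old assembly on any node will reintroduce the conflicting protobuf 2.4 classes and the same VerifyError.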