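Experimental environment
Cloudera Manager 6.3
The error
I manage the cluster with Cloudera Manager 6.3. When I opened the spark-shell interactive terminal and tried to read data from a MySQL database, the following error appeared: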
scala> val jdbcDF = spark.read.format("jdbc").option("url", "jdbc:mysql://hadoop210:3306/rdd").option("driver", "com.mysql.jdbc.Driver").option("dbtable", "t").option("user", "root").option("password", "000000").load()
java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:45)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$5.apply(JDBCOptions.scala:99)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$5.apply(JDBCOptions.scala:99)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:99)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:317)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
... 49 elided
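Error analysis
The error message makes it clear that the MySQL driver is missing. The fix is to download mysql-connector-java-5.1.26-bin.jar and place it on Spark's classpath (in standalone or local mode, the corresponding directory is …/spark/jars).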
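But what if the cluster is managed by Cloudera Manager?
The approach is the same; only the directory changes. Locate /opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/spark/jars and copy mysql-connector-java-5.1.26-bin.jar into it from the command line: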
sudo cp ./mysql-connector-java-5.1.26-bin.jar /opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/spark/jars/
Then reopen the terminal and start spark-shell again; the data in MySQL can now be read.
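If the ClassNotFoundException persists after the restart, one thing worth checking is whether the jar actually ended up in the directory Spark loads from. The following is a minimal sketch, not part of the original setup: the function name find_mysql_driver and the SPARK_JARS variable are hypothetical names introduced here, and the default path is simply the one used above, which will differ for other parcel versions.

```shell
#!/bin/sh
# Hypothetical helper: print every MySQL connector jar found in the given
# directory. Returns 0 if at least one was found, 1 otherwise.
find_mysql_driver() {
    dir="$1"
    rc=1
    for jar in "$dir"/mysql-connector-java*.jar; do
        # If nothing matches, the pattern is left unexpanded,
        # so confirm the entry is a real file before reporting it.
        if [ -f "$jar" ]; then
            echo "$jar"
            rc=0
        fi
    done
    return "$rc"
}

# SPARK_JARS is an assumed override; the default is the path from this article.
SPARK_JARS="${SPARK_JARS:-/opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/spark/jars}"

if find_mysql_driver "$SPARK_JARS"; then
    echo "MySQL driver jar is present"
else
    echo "MySQL driver jar is missing from $SPARK_JARS"
fi
```

Because the function returns a non-zero status when nothing matches, it can also be called from a script before launching spark-shell.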