I have installed Apache Spark (spark-2.4.5-bin-hadoop2.7) on my Mac under:
/Users/xxxx/Software/
I also downloaded the ojdbc6.jar file to the following path:
/Users/xxxx/Software/spark/jars
These are the updates I made to my environment variables:
export SPARK_HOME=/Users/xxxx/Software/spark
export SPARK_CLASSPATH=/Users/xxxx/spark_env/ojdbc6.jar
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
alias python='python3'
export PYSPARK_PYTHON=python3
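Since these exports live in the shell profile, a process started outside that shell (for example, a run configuration in an IDE) may not see them. Below is a minimal sketch for checking what a given Python process actually sees. The helper name `visible_spark_env` is made up for illustration; the variable names are the ones exported above.

```python
import os

# Variable names taken from the exports above.
EXPECTED_VARS = ("SPARK_HOME", "SPARK_CLASSPATH", "PYSPARK_PYTHON")


def visible_spark_env(environ=None):
    """Return {name: value or None} for each expected variable.

    `environ` defaults to the current process environment; passing a plain
    dict makes the helper easy to try out without touching os.environ.
    """
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in EXPECTED_VARS}


if __name__ == "__main__":
    # Run once from the terminal and once from the IDE and compare the output.
    for name, value in visible_spark_env().items():
        print(f"{name} = {value!r}")
```

If a variable prints as `None` in one environment but not the other, the two runs are probably not using the same Spark setup.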
In a terminal, I launched pyspark and ran the commands below, which worked fine.
conn_url = "jdbc:oracle:thin:@//xxx.xxx.xxx.xx:1521/USER"
df = spark.read.format("jdbc").option("url",conn_url).option("drive","oracle.jdbc.driver.OracleDriver").option("dbtable","table_name").option("user","xxxx").option("password","xxxx").load()
I can query the database successfully.
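For reference, the same read can be written with the options collected in a dict. This is only a sketch: `oracle_jdbc_options` and `read_table` are hypothetical helper names, and the values are the placeholders from the snippet above. Note that Spark's JDBC data source documents the option for the driver class as `driver`; the call above spells it `drive`, which Spark would presumably not treat as the driver class name.

```python
def oracle_jdbc_options(url, table, user, password,
                        driver="oracle.jdbc.driver.OracleDriver"):
    """Collect the JDBC reader options in one place.

    The key for the driver class is "driver" (not "drive").
    """
    return {
        "url": url,
        "driver": driver,
        "dbtable": table,
        "user": user,
        "password": password,
    }


def read_table(spark, options):
    """Load a table through Spark's JDBC source using the given options.

    `spark` is expected to be an existing SparkSession.
    """
    return spark.read.format("jdbc").options(**options).load()
```

In the pyspark shell this would be used as `df = read_table(spark, oracle_jdbc_options(conn_url, "table_name", "xxxx", "xxxx"))`.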
Now I am trying to do the same thing programmatically in PyCharm.
PyCharm configuration:
/Users/xxxx/Software/spark/jars/ojdbc6.jar
/Users/xxxx/Software/spark-2.4.5-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip
/Users/xxxx/Software/spark-2.4.5-bin-hadoop2.7/python/lib/pyspark.zip
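The two zip entries above put PySpark and py4j on the Python path. Adding the jar to PyCharm's project structure probably does not put it on the JVM classpath that Spark's driver uses. Below is a sketch of passing the jar to Spark explicitly when the script builds its own session. It assumes no SparkContext is running yet in the process, since these settings only matter when the JVM is launched. `spark_jar_conf` and `build_session` are hypothetical helpers; `spark.jars` and `spark.driver.extraClassPath` are standard Spark configuration keys.

```python
def spark_jar_conf(jar_path):
    """Spark settings that make an extra jar visible to the driver JVM."""
    return {
        "spark.jars": jar_path,
        "spark.driver.extraClassPath": jar_path,
    }


def build_session(jar_path, app_name="oracle-read"):
    """Create a SparkSession with the JDBC driver jar on the classpath.

    pyspark is imported lazily so this module can be imported even where
    PySpark is not on the Python path. `app_name` is an arbitrary label.
    """
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in spark_jar_conf(jar_path).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Usage would look like `spark = build_session("/Users/xxxx/Software/spark/jars/ojdbc6.jar")` at the top of main.py, before any other code creates a session.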
When I run main.py (which has the code to join and query the database), I get the following error:
Status: FailureError: An error occurred while calling o71.load.
: java.sql.SQLException: No suitable driver
at java.sql/java.sql.DriverManager.getDriver(DriverManager.java:298)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$6.apply(JDBCOptions.scala:105)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$6.apply(JDBCOptions.scala:105)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:104)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.base/java.lang.Thread.run(Thread.java:835)
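`java.sql.SQLException: No suitable driver` is what `DriverManager.getDriver` raises when no registered JDBC driver accepts the URL, which usually points to the driver jar not being on the JVM's classpath. Note that the jar was copied into /Users/xxxx/Software/spark/jars, while the PyCharm entries point at /Users/xxxx/Software/spark-2.4.5-bin-hadoop2.7. If these are not the same directory, the JVM started from PyCharm may be using a jars folder without ojdbc6.jar. Below is a small sketch for checking this from inside the script; `find_jars` is a hypothetical helper and the `ojdbc*.jar` pattern is an assumption.

```python
import glob
import os


def find_jars(spark_home, pattern="ojdbc*.jar"):
    """List jars matching `pattern` under <spark_home>/jars.

    Returns an empty list if the directory or the jar is missing.
    """
    return sorted(glob.glob(os.path.join(spark_home, "jars", pattern)))


if __name__ == "__main__":
    # SPARK_HOME may be unset when the script is started from an IDE.
    spark_home = os.environ.get("SPARK_HOME")
    if spark_home is None:
        print("SPARK_HOME is not set in this process")
    else:
        print(spark_home, "->", find_jars(spark_home) or "no matching jar")
```

An empty result for the Spark installation that the script actually uses would be consistent with the error above.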