I have some third-party database client libraries in Java, and I want to access them through java_gateway.py.
For example, to make the client class (not a JDBC driver!) available to the Python client via the Java gateway:
java_import(gateway.jvm, "org.mydatabase.MyDBClient")
It is not clear where to add the third-party libraries to the JVM classpath. I tried adding them to compute-classpath.sh, but that did not seem to work. I get:
Py4jError: Trying to call a package
Also, compared with Hive: the Hive jar files are NOT loaded via compute-classpath.sh, which makes me suspicious. There seems to be some other mechanism that sets up the JVM-side classpath.
Solution
You can pass external jars as arguments to pyspark:
pyspark --jars file1.jar,file2.jar
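If you start the context from a plain Python script rather than the pyspark launcher, something like the following may work. It is only a sketch under a few assumptions: that your Spark version reads the PYSPARK_SUBMIT_ARGS environment variable and expects it to end with pyspark-shell, that file1.jar and file2.jar are the placeholder names from above, and that the private attribute sc._jvm is available. The helper names submit_args, import_client and demo are made up for illustration.

```python
import os


def submit_args(jar_paths):
    """Build a PYSPARK_SUBMIT_ARGS value equivalent to `pyspark --jars a.jar,b.jar`."""
    # --jars takes a single comma-separated list with no spaces.
    return "--jars " + ",".join(jar_paths) + " pyspark-shell"


def import_client(sc, class_name="org.mydatabase.MyDBClient"):
    """Import a JVM class into the gateway's view and return a handle to it."""
    from py4j.java_gateway import java_import  # py4j is bundled with PySpark

    java_import(sc._jvm, class_name)  # sc._jvm is a private attribute
    simple_name = class_name.rsplit(".", 1)[-1]
    # If the jar is missing from the classpath, this is a JavaPackage, and
    # calling it raises the "Trying to call a package" error.
    return getattr(sc._jvm, simple_name)


def demo():
    """Hypothetical end-to-end usage; requires pyspark and the jars to exist."""
    # Must be set before the SparkContext (and its JVM) is created.
    os.environ["PYSPARK_SUBMIT_ARGS"] = submit_args(["file1.jar", "file2.jar"])
    from pyspark import SparkContext

    sc = SparkContext(appName="jar-import-sketch")
    client_class = import_client(sc)
    print(client_class)
    sc.stop()
```

Calling demo() in an environment with PySpark installed should print a handle to the Java class rather than a package. If the class is still not found, it may be worth also passing the same jars with --driver-class-path, since on some versions --jars alone does not appear to reach the driver's classpath.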