The Spark cluster configuration is as follows:
from pyspark import SparkConf

conf = SparkConf() \
    .setMaster('yarn-client') \
    .setAppName("test") \
    .set("spark.executor.memory", "20g") \
    .set("spark.driver.maxResultSize", "20g") \
    .set("spark.executor.instances", "20") \
    .set("spark.executor.cores", "3") \
    .set("spark.memory.fraction", "0.2") \
    .set("user", "test_user") \
    .set("spark.executor.extraClassPath", "/usr/share/java/postgresql-jdbc3.jar")
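One thing worth noting about the configuration above: spark.executor.extraClassPath only puts the JDBC jar on the executors. For a JDBC write, the driver also needs the jar on its classpath, and spark.driver.extraClassPath must be set before the SparkContext starts. A minimal sketch, reusing the jar path from the configuration above:

```python
from pyspark import SparkConf

# Sketch: make the Postgres JDBC jar visible to both driver and executors.
# The jar path mirrors the configuration above; spark.driver.extraClassPath
# is an addition here, not part of the original setup.
jar_path = "/usr/share/java/postgresql-jdbc3.jar"
conf = SparkConf() \
    .set("spark.executor.extraClassPath", jar_path) \
    .set("spark.driver.extraClassPath", jar_path)
```

Alternatively, passing the jar with `--jars` on spark-submit distributes it to both sides in one step.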
When I try to write the DataFrame to the Postgres DB using the following code:
from pyspark.sql import DataFrameWriter
my_writer = DataFrameWriter(df)
url_connect = "jdbc:postgresql://198.123.43.24:1234"
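For reference, the idiomatic way to perform this write is through df.write rather than constructing a DataFrameWriter directly. A sketch of the full call, assuming the same host and port; the database name ("mydb"), table name ("my_table"), and credentials are placeholders, not taken from the question:

```python
# Sketch of writing a DataFrame to Postgres over JDBC.
# "mydb", "my_table", and the credentials below are assumptions.
url_connect = "jdbc:postgresql://198.123.43.24:1234/mydb"
properties = {
    "user": "test_user",
    "password": "secret",              # placeholder
    "driver": "org.postgresql.Driver", # must be on driver and executor classpaths
}

# mode="append" adds rows to an existing table; "overwrite" replaces it
df.write.jdbc(url=url_connect, table="my_table", mode="append", properties=properties)
```

Note that the JDBC URL must include the database name after the port; without it the Postgres driver cannot open a connection.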