Exception: Java gateway process exited before sending its port number

Problem: whenever I run my code in the Spark environment, it always fails with: Exception: Java gateway process exited before sending its port number!
It took me a long time to solve. Eventually I found someone on Google who had hit the same problem; he said the error was caused by Anaconda and that it was better to uninstall it. I didn't take his advice at first, but since I couldn't find any other solution, I ended up doing the following:
1. Uninstall Anaconda.
2. Remove the path I had added in PyCharm (Run → Edit Configurations → Environment variables).
3. Rename the oddly named variables in my .py file (names like 1, 2, 3) to normal names.
Finally, my code ran successfully in the Spark environment. Here is my code:

import time
from urllib import parse

from pyspark.sql import SparkSession

start_time = time.time()
password = parse.quote_plus('Goe@spider')  # percent-encode special characters in the password
uri = "mongodb://gouuse:{}@192.168.5.113:27017/Flight.test".format(password)

# MongoDB connection parameters
input_uri = "mongodb://username:password@192.168.5.112:27017/Flight"
database = "Flight"
collection = "info"
spark = SparkSession.builder.master('local[*]').appName("read").getOrCreate()
mongdbDf = spark.read.format('com.mongodb.spark.sql').options(fetchsize=1000, uri=input_uri, database=database, collection=collection).load()
mongdbDf.printSchema()
mongdbDf.show()  # shows 20 rows by default
print('before deleting the duplicates:')
print(mongdbDf.count())
mongdbDf.registerTempTable("temp_table")

# mngDF = mongdbDf.dropDuplicates(['customer_name', 'website'])  # drop duplicates
mngDF = mongdbDf.dropDuplicates(['customer_name'])  # drop duplicates
mngDF.write.format("com.mongodb.spark.sql.DefaultSource").mode("append").option("spark.mongodb.output.uri", uri).save()
time_inv = time.time() - start_time
print("************************************")
print(time_inv)
print('after deleting the duplicates:')
print(mngDF.count())
mngDF.show()
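The `parse.quote_plus` call in the code above matters because a character like `@` in a raw password would be misread as the host separator in the MongoDB URI. A minimal stdlib-only sketch of the encoding (the user/host names here are placeholders, not the real ones from the post):

```python
from urllib import parse

# '@' must be percent-encoded, otherwise the URI parser treats
# everything after it as the host portion of the connection string.
password = parse.quote_plus('Goe@spider')
uri = "mongodb://user:{}@host:27017/db".format(password)

print(password)  # Goe%40spider
print(uri)       # mongodb://user:Goe%40spider@host:27017/db
```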

Today is 2019.2.14, and the same problem with the Java port has come up again! I googled and tried many things, and finally solved it as follows:
1. Change from py4j.java_gateway import JavaGateway
   to: from pyspark.java_gateway import JavaGateway
2. Add the following code (raw strings keep the Windows backslashes intact):
import os
os.environ['SPARK_HOME'] = r"E:\dev\spark-2.3.0-bin-hadoop2.7"
os.environ['JAVA_HOME'] = r"D:\Java\jdk1.8.0_131"

Today is 2019.7.26, and the same Java port problem has appeared yet again!!! I tried hard with the previous fixes, but this time they do not work. Maybe I will have to reinstall my Python environment.
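Before reinstalling anything, it can help to check the environment that PySpark's launcher depends on, since a missing `JAVA_HOME` or a `java` binary that is not on `PATH` are common causes of this error. This is a hedged sketch, not from the original post; `diagnose_gateway_env` is a hypothetical helper name:

```python
import os
import shutil

def diagnose_gateway_env():
    """Return a list of likely causes of the 'Java gateway process
    exited before sending its port number' error."""
    problems = []
    if not os.environ.get('JAVA_HOME'):
        problems.append("JAVA_HOME is not set")
    if shutil.which('java') is None:
        problems.append("no 'java' executable found on PATH")
    if not os.environ.get('SPARK_HOME'):
        problems.append("SPARK_HOME is not set")
    return problems

for p in diagnose_gateway_env():
    print("possible cause:", p)
```

An empty list means the basics are in place and the problem lies elsewhere (for example, a JDK/Spark version mismatch).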
