hduser@master:~$ spark-submit --driver-memory 512m --master local[4] pythonwork/mllib/k_means_example.py
Py4JJavaError Traceback (most recent call last)
/home/hduser/pythonwork/mllib/k_means_example.py in <module>()
37
38 # Build the model (cluster the data)
---> 39 clusters = KMeans.train(parsedData, 2, maxIterations=10, initializationMode="random")
40
41 # Evaluate clustering by computing Within Set Sum of Squared Errors
/usr/local/spark/python/lib/pyspark.zip/pyspark/mllib/clustering.py in train(cls, rdd, k, maxIterations, runs, initializationMode, seed, initializationSteps, epsilon, initialModel)
354 model = callMLlibFunc("trainKMeansModel", rdd.map(_convert_to_vector), k, maxIterations,
355 runs, initializationMode, seed, initializationSteps, epsilon,
--> 356 clusterInitialModel)
357 centers = callJavaFunc(rdd.context, model.clusterCenters)
358 return KMeansModel([c.toArray() for c in centers])
/usr/local/spark/python/lib/pyspark.zip/pyspark/mllib/common.py in callMLlibFunc(name, *args)
128 sc = SparkContext.getOrCreate()
129 api = getattr(sc._jvm.PythonMLLibAPI(), name)
--> 130 return callJavaFunc(sc, api, *args)
131
132
/usr/local/spark/python/lib/pyspark.zip/pyspark/mllib/common.py in callJavaFunc(sc, func, *args)
121 """ Call Java Function """
122 args = [_py2java(sc, a) for a in args]
--> 123 return _java2py(sc, func(*args))
124
125
/usr/local/spark/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py in __call__(self, *args)
1158 answer = self.gateway_client.send_command(command)
1159 retur
Py4JJavaError: An error occurred while calling o22.trainKMeansModel.
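Running KMeans clustering with PySpark fails with a Py4JJavaError whose message is 'An error occurred while calling o22.trainKMeansModel'. The problem may be caused by data preprocessing, memory limits, or improper Spark configuration. Possible fixes include checking data quality, increasing memory resources, or adjusting Spark configuration parameters.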
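The traceback above is cut off before the underlying Java exception, so the actual cause cannot be read from it. As one way to rule out the data-quality possibility, the sketch below validates each input row before it would be turned into a vector. It is a minimal illustration in plain Python, not the code from `k_means_example.py`: the helper name `parse_point`, the `expected_dim` parameter, the whitespace-separated input format, and the sample rows are all assumptions made for this example.

```python
import math


def parse_point(line, expected_dim=None):
    """Return a list of floats for a usable row, or None if the row is malformed.

    Hypothetical helper: assumes one whitespace-separated point per line.
    """
    fields = line.split()
    if not fields:
        return None  # blank line
    try:
        values = [float(f) for f in fields]
    except ValueError:
        return None  # at least one token is not numeric
    if any(math.isnan(v) or math.isinf(v) for v in values):
        return None  # NaN or infinity would distort distance computations
    if expected_dim is not None and len(values) != expected_dim:
        return None  # row length differs from the expected dimension
    return values


# Made-up sample rows: two valid 3-D points and four malformed ones.
sample = ["0.0 0.0 0.0", "0.1 0.1", "", "9.0 abc 9.0", "9.1 9.1 9.1", "nan 1.0 2.0"]
clean = [p for p in (parse_point(s, expected_dim=3) for s in sample) if p is not None]
print(len(clean), "of", len(sample), "rows are usable")
```

In a PySpark script, the same function could be mapped over the input RDD and the `None` results filtered out before the data is passed to `KMeans.train`. If the rows turn out to be clean, the next things to check are the other two suspects named above, for example by raising the `--driver-memory 512m` value used in the `spark-submit` command.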