Exact error message:
An error occurred while calling o386.trainWord2VecModel.: java.lang.OutOfMemoryError: Java heap space
Code that triggers the error:
from pyspark import SparkConf, SparkContext
from pyspark.mllib.feature import Word2Vec, Word2VecModel
conf = SparkConf().set("spark.driver.memory", "10g")
sc = SparkContext.getOrCreate(conf=conf)
data = sc.textFile(“…”).map(lambda row: row.split(" “))
word2Vec = Word2Vec()
model = word2Vec.fit(data)
synonyms = model.findSynonyms('…', 5)
for word, cosine_distance in synonyms:
    print("{}: {}".format(word, cosine_distance))