Calling Spark MLlib from Python: what is the correct way to save/load models in Spark/PySpark?

Latest recommended article published on 2023-08-04 19:42:15

weixin_39760689


I am using Spark 1.3.0 with PySpark and MLlib, and I need to save and load my models. I use code like this (taken from the official documentation):

from pyspark.mllib.recommendation import ALS, MatrixFactorizationModel, Rating

# sc is an existing SparkContext (for example, the one the pyspark shell provides)
data = sc.textFile("data/mllib/als/test.data")
# Each input line is "user,product,rating"
ratings = data.map(lambda l: l.split(',')).map(lambda l: Rating(int(l[0]), int(l[1]), float(l[2])))

# Train the ALS model
rank = 10
numIterations = 20
model = ALS.train(ratings, rank, numIterations)

# Predict on the (user, product) pairs from the training data
testdata = ratings.map(lambda p: (p[0], p[1]))
predictions = model.predictAll(testdata).map(lambda r: ((r[0], r[1]), r[2]))
predictions.collect() # shows me some predictions

model.save(sc, "model0")

# Trying to load saved model and work with it
model0 = MatrixFactorizationModel.load(sc, "model0")
predictions0 = model0.predictAll(testdata).map(lambda r: ((r[0], r[1]), r[2]))

After trying to use model0, I get a long traceback that ends with:

Py4JError: An error occurred while calling o70.predict. Trace:
py4j.Py4JException: Method predict([class org.apache.spark.api.java.JavaRDD]) does not exist
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:333)
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
    at py4j.Gateway.invoke(Gateway.java:252)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)

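So my question is: am I doing something wrong? As far as I can tell from debugging, the models are stored (both locally and on HDFS), and each saved model contains many files holding some data. My feeling is that the models are saved correctly but perhaps not loaded correctly. I also searched around but found nothing related.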
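One way to back up the impression that the save step worked is to list what was actually written. The following is only a minimal sketch for a model saved to the local filesystem: the helper name `list_saved_files` is hypothetical, and the sketch deliberately assumes nothing about which files Spark writes, since the layout may differ between versions. For a model saved on HDFS, something like `hdfs dfs -ls -R model0` would be the equivalent check.

```python
import os


def list_saved_files(model_dir):
    """Return sorted (relative_path, size_in_bytes) pairs for every file under model_dir."""
    entries = []
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, model_dir)
            entries.append((rel, os.path.getsize(full)))
    return sorted(entries)


if __name__ == "__main__":
    # "model0" is the path passed to model.save(sc, "model0") in the question.
    # If the directory does not exist, nothing is printed.
    for rel, size in list_saved_files("model0"):
        print(f"{size:>10}  {rel}")
```

Empty or missing files would point at the save step; if everything is present and non-empty, the load side is the more likely suspect.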

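It looks like this save/load feature was only added recently, in Spark 1.3.0, which leads to a second question: what was the recommended way to save/load models before 1.3.0? I have not found any good way to do this, at least for Python. I also tried Pickle, but ran into the same problem described here: Save Apache Spark mllib model in python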
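A possible workaround, sketched here under assumptions rather than as a recommended or official method: a matrix factorization model predicts a rating as the dot product of a user's and a product's latent factor vectors, so the factors themselves can be stored as plain Python data instead of pickling the model object. The sketch assumes the installed PySpark exposes `model.userFeatures()` and `model.productFeatures()` and that both factor sets fit in driver memory. The helper names `save_factors`, `load_factors` and `predict` are hypothetical.

```python
import os
import pickle
import tempfile


def save_factors(user_factors, product_factors, path):
    """Pickle two iterables of (id, factor_vector) pairs as plain Python data."""
    payload = {
        "user": {i: list(v) for i, v in user_factors},
        "product": {i: list(v) for i, v in product_factors},
    }
    with open(path, "wb") as f:
        pickle.dump(payload, f)


def load_factors(path):
    """Return (user_factors, product_factors) as dicts mapping id to vector."""
    with open(path, "rb") as f:
        payload = pickle.load(f)
    return payload["user"], payload["product"]


def predict(user_factors, product_factors, user, product):
    """Predicted rating = dot product of the two latent factor vectors."""
    u = user_factors[user]
    p = product_factors[product]
    return sum(a * b for a, b in zip(u, p))


if __name__ == "__main__":
    # With a trained model, the factors would come from (assumption):
    #   user_factors = model.userFeatures().collect()
    #   product_factors = model.productFeatures().collect()
    # Toy values are used here so the sketch runs without Spark.
    user_factors = [(1, [1.0, 2.0])]
    product_factors = [(10, [3.0, 4.0])]
    path = os.path.join(tempfile.mkdtemp(), "factors.pkl")
    save_factors(user_factors, product_factors, path)
    users, products = load_factors(path)
    print(predict(users, products, 1, 10))
```

This avoids pickling any object that wraps a JVM reference, at the cost of computing predictions on the driver instead of through `predictAll`.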
