How to get classification probabilities in Spark

seu_yang

Article link: https://blog.csdn.net/seu_yang/article/details/52118683

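When classifying, we usually want more than the predicted label (0 or 1) for a sample: we also want the probability that the sample belongs to class 0 or class 1.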

For logistic regression (LR):

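import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS

val model = new LogisticRegressionWithLBFGS().setNumClasses(2).run(trainingData)

model.clearThreshold()

// The default threshold is 0.5. Calling model.clearThreshold() removes it, so model.predict returns the probability of class 1 instead of a 0/1 label.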
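To see why clearing the threshold exposes the probability, here is a minimal plain-Scala sketch of the idea. It is a simplified model of the behaviour, not MLlib's source: `ThresholdSketch`, `sigmoid`, and this `predict` are hypothetical names, and the input `margin` stands for the model's raw linear score.

```scala
object ThresholdSketch {
  // Logistic (sigmoid) function: maps a raw margin to a probability in (0, 1).
  def sigmoid(margin: Double): Double = 1.0 / (1.0 + math.exp(-margin))

  // Hypothetical stand-in for a binary LR model's scoring step.
  // threshold = Some(t): return a hard 0/1 label.
  // threshold = None (the "cleared" state): return the probability itself.
  def predict(margin: Double, threshold: Option[Double]): Double = {
    val probability = sigmoid(margin)
    threshold match {
      case Some(t) => if (probability > t) 1.0 else 0.0
      case None    => probability
    }
  }
}
```

With `Some(0.5)` the caller only ever sees 0.0 or 1.0; with `None` the same score comes back as a value between 0 and 1, which is the effect described for `clearThreshold()`.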

For GBDT:

The original predict function can only output 0 or 1. To get a probability, modify that function in the source code as follows:

def predict(features: Vector): Double = {
  (algo, combiningStrategy) match {
    case (Regression, Sum) =>
      predictBySumming(features)
    case (Regression, Average) =>
      predictBySumming(features) / sumWeights
    case (Classification, Sum) => // binary classification
      val prediction = predictBySumming(features)
      // TODO: predicted labels are +1 or -1 for GBT. Need a better way to store this info.
      // Original: if (prediction > 0.0) 1.0 else 0.0
      // Modified: apply the sigmoid function to the summed score to get a probability.
      1.0 / (1.0 + math.exp(-prediction))
    case (Classification, Vote) =>
      predictByVoting(features)
    case _ =>
      throw new IllegalArgumentException(
        "TreeEnsembleModel1 given unsupported (algo, combiningStrategy) combination: " +
          s"($algo, $combiningStrategy).")
  }
}

Then simply import the redefined class.
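The same computation can be sketched outside the model class: sum the weighted tree outputs into a margin, then pass the margin through a sigmoid. The block below is a plain-Scala illustration under stated assumptions. `GbtProbabilitySketch`, `Tree`, `margin`, and `probability` are hypothetical names, and each tree is modelled as a simple function from a feature array to a score rather than as an MLlib type.

```scala
object GbtProbabilitySketch {
  // Hypothetical stand-in for one regression tree's predict function.
  type Tree = Array[Double] => Double

  // Weighted sum of the tree outputs, i.e. the ensemble's raw margin.
  def margin(trees: Seq[Tree], weights: Seq[Double], features: Array[Double]): Double =
    trees.zip(weights).map { case (tree, w) => w * tree(features) }.sum

  // Sigmoid of the margin, giving a value in (0, 1) instead of a hard 0/1 label.
  def probability(trees: Seq[Tree], weights: Seq[Double], features: Array[Double]): Double =
    1.0 / (1.0 + math.exp(-margin(trees, weights, features)))
}
```

In MLlib, the public `trees` and `treeWeights` fields of a trained `GradientBoostedTreesModel` could plausibly play the roles of `trees` and `weights`, which would avoid patching the source. Note also that MLlib's log loss is defined with a factor of 2 on the margin, so a probability consistent with that loss may call for `math.exp(-2.0 * margin)`; it is worth checking against the loss the model was trained with.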
