Python Logistic Regression Parameter Optimization: [Machine Learning Algorithms] Tuning Logistic Regression

weixin_39635657

Article link: https://blog.csdn.net/weixin_39635657/article/details/111795372

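This article covers parameter tuning for logistic regression with Python in a Spark environment: fitting with or without an intercept, handling linearly inseparable data, adjusting the classification threshold, making the model more robust, and normalizing the data. Worked examples show how to train models with LogisticRegressionWithLBFGS and LogisticRegressionWithSGD, and how to adjust the regularization parameter and the classification threshold to improve model performance.

Summary generated automatically by CSDN.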

Environment

spark-1.6

python3.5

1. With or without an intercept

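In logistic regression classification, the goal is to find the line z that separates the classes. A line with an intercept, which does not have to pass through the origin, can separate the data more cleanly than a line forced through the origin.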
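The effect of the intercept can be illustrated without Spark. The following is a minimal sketch in plain Scala, not the article's own code: the object name `InterceptDemo`, the toy data, and the hand-picked weights are all hypothetical. Every feature value is positive, so a boundary through the origin (z = w * x) gives every point the same sign of z and therefore the same predicted class, while a boundary with an intercept (z = w * x + b) can be placed between the two groups.

```scala
// Hypothetical toy example: 1-D data whose classes cannot be separated
// by a decision boundary that passes through the origin.
object InterceptDemo {
  // Sigmoid maps the linear score z to a probability in (0, 1).
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // Predict class 1 when the probability exceeds 0.5, i.e. when z > 0.
  def predict(w: Double, b: Double, x: Double): Double =
    if (sigmoid(w * x + b) > 0.5) 1.0 else 0.0

  // Accuracy computed as 1 - mean(|label - prediction|).
  def accuracy(w: Double, b: Double, data: Seq[(Double, Double)]): Double =
    1.0 - data.map { case (x, y) => math.abs(y - predict(w, b, x)) }.sum / data.size

  // (feature, label) pairs; small x belongs to class 0, large x to class 1.
  val data = Seq((1.0, 0.0), (2.0, 0.0), (3.0, 1.0), (4.0, 1.0))

  def main(args: Array[String]): Unit = {
    // b = 0: z = x is positive for every point, so everything is class 1.
    println("no intercept:   " + accuracy(1.0, 0.0, data))  // 0.5
    // b = -2.5 moves the boundary to x = 2.5, between the two groups.
    println("with intercept: " + accuracy(1.0, -2.5, data)) // 1.0
  }
}
```

The accuracy here is computed the same way as in the Spark examples in this section: one minus the mean absolute difference between label and prediction. The weights are fixed by hand for illustration only; in the Spark code they are learned by `run`.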

package com.bjsxt.lr

import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.{SparkConf, SparkContext}

/**
 * Logistic regression on the health-status training set
 */
object LogisticRegression {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("spark").setMaster("local[3]")
    val sc = new SparkContext(conf)
    // Load data in LIBSVM format; feature indices in this format must start from 1
    val inputData = MLUtils.loadLibSVMFile(sc, "健康状况训练集.txt")
    // Split the data 70/30 into training and test sets
    val splits = inputData.randomSplit(Array(0.7, 0.3), seed = 1L)
    val (trainingData, testData) = (splits(0), splits(1))
    val lr = new LogisticRegressionWithLBFGS()
    // lr.setIntercept(true)
    val model = lr.run(trainingData)
    // Per-sample error: 0 if the prediction matches the label, 1 otherwise
    val result = testData
      .map { point => Math.abs(point.label - model.predict(point.features)) }
    println("accuracy=" + (1.0 - result.mean()))
    /**
     * Number of parameters in the trained logistic regression model (w0....w6)
     * = number of features in the training set (6) + 1
     */
    println(model.weights.toArray.mkString(" "))
    println(model.intercept)
    sc.stop()
  }
}

package com.bjsxt.lr

import org.apache.spark.mllib.classification.{LogisticRegressionWithLBFGS, LogisticRegressionWithSGD}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

/**
 * With or without an intercept
 */
object LogisticRegression2 {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("spark").setMaster("local[3]")
    val sc = new SparkContext(conf)
    val inputData: RDD[LabeledPoint] = MLUtils.loadLibSVMFile(sc, "w0测试数据.txt")
    /**
     * randomSplit(Array(0.7, 0.3)) splits one RDD into N RDDs, N = Array.length.
     * The share of data in the first RDD is governed by the first element of the array.
     */
    val splits = inputData.randomSplit(Array(0.7, 0.3), 11L)
    val (trainingData, testData) = (splits(0), splits(1))
    val lr = new LogisticRegressionWithSGD()
    // Enable w0, i.e. fit an intercept
    lr.setIntercept(true)
    val model = lr.run(trainingData)
    // Per-sample error: 0 if the prediction matches the label, 1 otherwise
    val result = testData.map { labeledpoint => Math.abs(labeledpoint.label - model.predict(labeledpoint.features)) }
    println("accuracy=" + (1.0 - result.mean()))
    println(model.weights.toArray.mkString(" "))
    println(model.intercept)
  }
}
