特征选择 java代码,逻辑回归特征选择

特征选择很重要,除了人工选择,还可以用

其他机器学习方法,如逻辑回归、随机森林、PCA、

LDA等。

分享一下逻辑回归做特征选择

特征选择包括:

特征升维

特征降维

特征升维

如一个样本有少量特征,可以升维,更好的拟合曲线

特征X

升维X/X**2/

效果验证,做回归

20180110232825697116.png

加特征x**2之后的效果

20180110232825699069.png

20180110232825700046.png

特征X1、X2

升维X1/X2/X1X2/X1**2/X2**2/

特征降维

利用L1正则化做特征选择

20180110232825701999.png

sparkmllib代码实现

import java.io.PrintWriter

import java.util

import org.apache.spark.ml.attribute.{Attribute, AttributeGroup, NumericAttribute}

import org.apache.spark.ml.classification.{BinaryLogisticRegressionTrainingSummary, LogisticRegressionModel, LogisticRegression}

import org.apache.spark.mllib.classification.LogisticRegressionWithSGD

import org.apache.spark.mllib.linalg.Vectors

import org.apache.spark.rdd.RDD

import org.apache.spark.sql.{SQLContext, DataFrame, Row}

import org.apache.spark.sql.types.{DataTypes, StructField}

import org.apache.spark.{SparkContext, SparkConf}

object LogisticRegression {

def main(args: Array[String]) {

val conf = new SparkConf().setAppName("test").setMaster("local")

val sc = new SparkContext(conf)

val sql = new SQLContext(sc);

val df: DataFrame = sql.read.format("libsvm").load("rl.txt")

// val training = sc.read.format("libsvm").load("data/mllib/sample_libsvm_data.txt")

val Array(train, test) = df.randomSplit(Array(0.7, 0.3),seed = 12L)

val lr = new LogisticRegression()

.setMaxIter(10)

.setRegParam(0.3)

.setElasticNetParam(1)//默认0 L2 1---》L1

// Fit the model

val lrModel: LogisticRegressionModel = lr.fit(train)

lrModel.transform(test).show(false)

// Print the coefficients and intercept for logistic regression

// coefficients 系数 intercept 截距

println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")

lrModel.write.overwrite().save("F:\\mode")

val weights: Array[Double] = lrModel.weights.toArray

val pw = new PrintWriter("F:\\weights");

//遍历

for(i

//通过map得到每个下标相应的特征名

//特征名对应相应的权重

val str = weights(i)

pw.write(str.toString)

pw.println()

}

pw.flush()

pw.close()

}

}

样本lr.txt

0 1:5.1 2:3.5 3:1.4 4:0.2

0 1:4.9 2:3.0 3:1.4 4:0.2

0 1:4.7 2:3.2 3:1.3 4:0.2

0 1:4.6 2:3.1 3:1.5 4:0.2

0 1:5.0 2:3.6 3:1.4 4:0.2

0 1:5.4 2:3.9 3:1.7 4:0.4

0 1:4.6 2:3.4 3:1.4 4:0.3

0 1:5.0 2:3.4 3:1.5 4:0.2

0 1:4.4 2:2.9 3:1.4 4:0.2

0 1:4.9 2:3.1 3:1.5 4:0.1

0 1:5.4 2:3.7 3:1.5 4:0.2

0 1:4.8 2:3.4 3:1.6 4:0.2

0 1:4.8 2:3.0 3:1.4 4:0.1

0 1:4.3 2:3.0 3:1.1 4:0.1

0 1:5.8 2:4.0 3:1.2 4:0.2

0 1:5.7 2:4.4 3:1.5 4:0.4

0 1:5.4 2:3.9 3:1.3 4:0.4

0 1:5.1 2:3.5 3:1.4 4:0.3

0 1:5.7 2:3.8 3:1.7 4:0.3

0 1:5.1 2:3.8 3:1.5 4:0.3

0 1:5.4 2:3.4 3:1.7 4:0.2

0 1:5.1 2:3.7 3:1.5 4:0.4

0 1:4.6 2:3.6 3:1.0 4:0.2

0 1:5.1 2:3.3 3:1.7 4:0.5

0 1:4.8 2:3.4 3:1.9 4:0.2

0 1:5.0 2:3.0 3:1.6 4:0.2

0 1:5.0 2:3.4 3:1.6 4:0.4

0 1:5.2 2:3.5 3:1.5 4:0.2

0 1:5.2 2:3.4 3:1.4 4:0.2

0 1:4.7 2:3.2 3:1.6 4:0.2

0 1:4.8 2:3.1 3:1.6 4:0.2

0 1:5.4 2:3.4 3:1.5 4:0.4

0 1:5.2 2:4.1 3:1.5 4:0.1

0 1:5.5 2:4.2 3:1.4 4:0.2

0 1:4.9 2:3.1 3:1.5 4:0.1

0 1:5.0 2:3.2 3:1.2 4:0.2

0 1:5.5 2:3.5 3:1.3 4:0.2

0 1:4.9 2:3.1 3:1.5 4:0.1

0 1:4.4 2:3.0 3:1.3 4:0.2

0 1:5.1 2:3.4 3:1.5 4:0.2

0 1:5.0 2:3.5 3:1.3 4:0.3

0 1:4.5 2:2.3 3:1.3 4:0.3

0 1:4.4 2:3.2 3:1.3 4:0.2

0 1:5.0 2:3.5 3:1.6 4:0.6

0 1:5.1 2:3.8 3:1.9 4:0.4

0 1:4.8 2:3.0 3:1.4 4:0.3

0 1:5.1 2:3.8 3:1.6 4:0.2

0 1:4.6 2:3.2 3:1.4 4:0.2

0 1:5.3 2:3.7 3:1.5 4:0.2

0 1:5.0 2:3.3 3:1.4 4:0.2

1 1:7.0 2:3.2 3:4.7 4:1.4

1 1:6.4 2:3.2 3:4.5 4:1.5

1 1:6.9 2:3.1 3:4.9 4:1.5

1 1:5.5 2:2.3 3:4.0 4:1.3

1 1:6.5 2:2.8 3:4.6 4:1.5

1 1:5.7 2:2.8 3:4.5 4:1.3

1 1:6.3 2:3.3 3:4.7 4:1.6

1 1:4.9 2:2.4 3:3.3 4:1.0

1 1:6.6 2:2.9 3:4.6 4:1.3

1 1:5.2 2:2.7 3:3.9 4:1.4

1 1:5.0 2:2.0 3:3.5 4:1.0

1 1:5.9 2:3.0 3:4.2 4:1.5

1 1:6.0 2:2.2 3:4.0 4:1.0

1 1:6.1 2:2.9 3:4.7 4:1.4

1 1:5.6 2:2.9 3:3.6 4:1.3

1 1:6.7 2:3.1 3:4.4 4:1.4

1 1:5.6 2:3.0 3:4.5 4:1.5

1 1:5.8 2:2.7 3:4.1 4:1.0

1 1:6.2 2:2.2 3:4.5 4:1.5

1 1:5.6 2:2.5 3:3.9 4:1.1

1 1:5.9 2:3.2 3:4.8 4:1.8

1 1:6.1 2:2.8 3:4.0 4:1.3

1 1:6.3 2:2.5 3:4.9 4:1.5

1 1:6.1 2:2.8 3:4.7 4:1.2

1 1:6.4 2:2.9 3:4.3 4:1.3

1 1:6.6 2:3.0 3:4.4 4:1.4

1 1:6.8 2:2.8 3:4.8 4:1.4

1 1:6.7 2:3.0 3:5.0 4:1.7

1 1:6.0 2:2.9 3:4.5 4:1.5

1 1:5.7 2:2.6 3:3.5 4:1.0

1 1:5.5 2:2.4 3:3.8 4:1.1

1 1:5.5 2:2.4 3:3.7 4:1.0

1 1:5.8 2:2.7 3:3.9 4:1.2

1 1:6.0 2:2.7 3:5.1 4:1.6

1 1:5.4 2:3.0 3:4.5 4:1.5

1 1:6.0 2:3.4 3:4.5 4:1.6

1 1:6.7 2:3.1 3:4.7 4:1.5

1 1:6.3 2:2.3 3:4.4 4:1.3

1 1:5.6 2:3.0 3:4.1 4:1.3

1 1:5.5 2:2.5 3:4.0 4:1.3

1 1:5.5 2:2.6 3:4.4 4:1.2

1 1:6.1 2:3.0 3:4.6 4:1.4

1 1:5.8 2:2.6 3:4.0 4:1.2

1 1:5.0 2:2.3 3:3.3 4:1.0

1 1:5.6 2:2.7 3:4.2 4:1.3

1 1:5.7 2:3.0 3:4.2 4:1.2

1 1:5.7 2:2.9 3:4.2 4:1.3

1 1:6.2 2:2.9 3:4.3 4:1.3

1 1:5.1 2:2.5 3:3.0 4:1.1

1 1:5.7 2:2.8 3:4.1 4:1.3

特征选择

20180110232825704929.png

第一个特征权重为0,可以忽略,选择2,3,4个特征

原文:http://www.cnblogs.com/xiaoma0529/p/6929051.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值