Code directory used:
1. Getting-started demo (don't worry about what the data actually is for now):
Python code: lr_iris.py
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

def load_data():
    inputdata = datasets.load_iris()
    x_train, x_test, y_train, y_test = \
        train_test_split(inputdata.data, inputdata.target, test_size=0.2, random_state=0)
    return x_train, x_test, y_train, y_test

def main():
    x_train, x_test, y_train, y_test = load_data()
    # penalty='l1' requires a solver that supports it, e.g. 'liblinear' or 'saga'
    model = LogisticRegression(penalty='l1', solver='liblinear')
    model.fit(x_train, y_train)
    print("w:", model.coef_)
    print("b:", model.intercept_)
    print("precision:", model.score(x_test, y_test))
    print("MSE:", np.mean((model.predict(x_test) - y_test) ** 2))

if __name__ == '__main__':
    main()
1. Split the data into a 20% test set and an 80% training set
2. Call the logistic regression method to get a model
3. Feed the training data into the model for learning/training
4. Obtain the results after training
5. In LogisticRegression(penalty='l1'), the l1 penalty performs sparsification (it drives some weights exactly to zero); l2 is the more common choice
Model: w, b
Accuracy: printed as "precision" above, but model.score actually returns classification accuracy
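The sparsifying effect of the l1 penalty mentioned in note 5 can be checked directly: under l1 some coefficients become exactly zero, while l2 only shrinks them toward zero. A minimal sketch (the 'liblinear' solver is an assumption; recent scikit-learn requires a solver that supports l1):

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

# Compare the number of exactly-zero weights under l1 vs l2.
x, y = datasets.load_iris(return_X_y=True)

model_l1 = LogisticRegression(penalty='l1', solver='liblinear').fit(x, y)
model_l2 = LogisticRegression(penalty='l2', solver='liblinear').fit(x, y)

zeros_l1 = int(np.sum(model_l1.coef_ == 0))
zeros_l2 = int(np.sum(model_l2.coef_ == 0))
print("zero weights with l1:", zeros_l1)
print("zero weights with l2:", zeros_l2)
```

With l2, weights are shrunk but essentially never land exactly on zero, so the l1 count should be at least as large.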
Execution result:
Step two: get w and b separately
Code lr_out_model.py:
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

def load_data():
    inputdata = datasets.load_iris()
    x_train, x_test, y_train, y_test = \
        train_test_split(inputdata.data, inputdata.target, test_size=0.2, random_state=0)
    return x_train, x_test, y_train, y_test

def main():
    x_train, x_test, y_train, y_test = load_data()
    model = LogisticRegression(penalty='l2')
    model.fit(x_train, y_train)
    # write each weight and each intercept to its own file
    with open('model.w', 'w') as ff_w, open('model.b', 'w') as ff_b:
        for w_list in model.coef_:
            for w in w_list:
                print("w:", w, file=ff_w)
        for b in model.intercept_:
            print("b:", b, file=ff_b)
    # print("w:", model.coef_)
    # print("b:", model.intercept_)
    print("precision:", model.score(x_test, y_test))
    print("MSE:", np.mean((model.predict(x_test) - y_test) ** 2))

if __name__ == '__main__':
    main()
w: the weights come in groups of four (one per iris feature), 12 values in total (3 classes x 4 features)
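These exported w and b values are all that is needed to reproduce the model's predictions by hand: for each of the three classes, compute the score w·x + b and take the argmax. A minimal sketch using coef_ and intercept_ directly (rather than re-parsing model.w/model.b, whose exact text format depends on the script above):

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

x_train, x_test, y_train, y_test = train_test_split(
    *datasets.load_iris(return_X_y=True), test_size=0.2, random_state=0)

model = LogisticRegression(penalty='l2', max_iter=1000)
model.fit(x_train, y_train)

# Manual decision function: one score per class per sample, then argmax.
scores = x_test @ model.coef_.T + model.intercept_   # shape (n_samples, 3)
manual_pred = np.argmax(scores, axis=1)

# Should match the library's own predictions.
assert (manual_pred == model.predict(x_test)).all()
print("manual predictions match model.predict")
```

This is why saving only w and b is enough to ship the model to another system: prediction is one matrix multiply plus an argmax.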
2. Get a model from data we collect ourselves
Data for validating the model:
Column 1: label (for binary classification)
Column 2 onward: the number to the left of each colon identifies a feature
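The layout described (a label first, then colon-separated feature entries) matches the common libsvm/svmlight sparse format, where each entry is index:value. A minimal parser sketch, assuming whitespace-separated fields and 1-based feature indices (both assumptions, since the exact file is not shown here):

```python
import numpy as np

def parse_libsvm_line(line, n_features):
    """Parse one 'label idx:val idx:val ...' line into (label, dense vector)."""
    parts = line.split()
    label = int(parts[0])
    x = np.zeros(n_features)
    for item in parts[1:]:
        idx, val = item.split(':')
        x[int(idx) - 1] = float(val)   # assume 1-based feature indices
    return label, x

label, x = parse_libsvm_line("1 1:0.5 3:2.0", n_features=4)
print(label, x)
```

For real data files, scikit-learn's load_svmlight_file can read this format directly instead of a hand-rolled parser.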