Logistic Regression: the general form of a binary classification function is the sigmoid:

$\sigma(z) = \frac{1}{1 + e^{-z}}$  (1-1)

Let:

$z = w \cdot x + b$  (1-2)

Substituting gives the following function:

$h(x) = \frac{1}{1 + e^{-(w \cdot x + b)}}$  (1-3)

In binary logistic regression, the two class probabilities are therefore:

$P(Y=1 \mid x) = \frac{\exp(w \cdot x + b)}{1 + \exp(w \cdot x + b)}, \qquad P(Y=0 \mid x) = \frac{1}{1 + \exp(w \cdot x + b)}$  (1-4)
One thing to know first: what are the odds and log-odds of an event? The odds of an event are the ratio of the probability that it occurs to the probability that it does not, and the log-odds (logit) is the logarithm of the odds. For example, if an event occurs with probability 0.8, its odds are 0.8/0.2 = 4 and its log-odds are ln 4 ≈ 1.39. For logistic regression:

$\log \frac{P(Y=1 \mid x)}{1 - P(Y=1 \mid x)} = w \cdot x + b$  (1-5)
The most important point:
In this model, the log-odds of the output Y = 1 is a linear function of x. When classifying x, the score $w \cdot x$ ranges over all real numbers: as $w \cdot x \to +\infty$, $P(Y=1 \mid x) \to 1$ and x is assigned to class 1; as $w \cdot x \to -\infty$, $P(Y=1 \mid x) \to 0$ and x is assigned to class 0.
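A minimal NumPy sketch of equations (1-1) through (1-5) and this decision rule; the toy weights and input values here are illustrative only, not taken from the experiment below:

import numpy as np

def sigmoid(z):
    # equation (1-1): maps any real z into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# toy parameters and inputs (illustrative values)
w = np.array([2.0, -1.0])
b = 0.5
x = np.array([[1.0, 0.0],
              [-3.0, 2.0]])

z = x.dot(w) + b                    # linear score, equation (1-2)
p_1 = sigmoid(z)                    # P(Y=1|x), equations (1-3)/(1-4)
log_odds = np.log(p_1 / (1 - p_1))  # recovers z exactly, equation (1-5)
prediction = p_1 > 0.5              # class 1 iff w.x + b > 0
print(p_1, log_odds, prediction)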
How to estimate the parameters:
Given a training set

$T = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}, \quad x_i \in \mathbb{R}^n, \; y_i \in \{0, 1\}$  (1-6)

write

$P(Y=1 \mid x) = \pi(x), \quad P(Y=0 \mid x) = 1 - \pi(x)$  (1-7)

This gives the likelihood function:

$\prod_{i=1}^{N} [\pi(x_i)]^{y_i} [1 - \pi(x_i)]^{1 - y_i}$  (1-8)

and the log-likelihood function (absorbing the bias b into w by appending a constant feature 1 to each x):

$L(w) = \sum_{i=1}^{N} \big[ y_i \log \pi(x_i) + (1 - y_i) \log(1 - \pi(x_i)) \big]$  (1-9)

$= \sum_{i=1}^{N} \big[ y_i (w \cdot x_i) - \log(1 + \exp(w \cdot x_i)) \big]$  (1-10)
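A quick numerical check, in illustrative NumPy (the variable names are mine), that the two forms (1-9) and (1-10) of the log-likelihood agree:

import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(5, 3)                          # 5 samples, 3 features (bias absorbed into w)
y = rng.randint(0, 2, size=5).astype(float)
w = rng.randn(3)

z = X.dot(w)
pi = 1.0 / (1.0 + np.exp(-z))                # pi(x_i) = P(Y=1|x_i)

ll_9 = np.sum(y * np.log(pi) + (1 - y) * np.log(1 - pi))  # equation (1-9)
ll_10 = np.sum(y * z - np.log(1 + np.exp(z)))             # equation (1-10)
print(ll_9, ll_10)  # identical up to floating-point error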
Estimate w by maximizing L(w) in (1-9); Newton's method (or gradient descent) finds the maximizing w. Equivalently, minimize -L(w), which is how the optimization below is set up. In practice one usually minimizes the averaged negative log-likelihood, i.e. the cross-entropy loss below; of course, L(w) can also be maximized directly:

$J(w) = -\frac{1}{N} \sum_{i=1}^{N} \big[ y_i \log \pi(x_i) + (1 - y_i) \log(1 - \pi(x_i)) \big]$  (1-11)
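Since Newton's method is mentioned above, here is a minimal NumPy sketch of Newton (IRLS) steps minimizing -L(w). The helper name newton_step, the toy data, and the small L2 term (added to keep the Hessian well conditioned, mirroring the penalty in the Theano cost below) are my own choices, not part of the original experiment:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def newton_step(w, X, y, lam=0.01):
    # gradient of -L(w) plus L2 term: X^T (pi - y) + 2*lam*w
    # Hessian: X^T diag(pi * (1 - pi)) X + 2*lam*I
    pi = sigmoid(X.dot(w))
    grad = X.T.dot(pi - y) + 2 * lam * w
    H = X.T.dot(X * (pi * (1 - pi))[:, None]) + 2 * lam * np.eye(len(w))
    return w - np.linalg.solve(H, grad)  # Newton update: w <- w - H^{-1} g

rng = np.random.RandomState(0)
X = np.hstack([rng.randn(100, 3), np.ones((100, 1))])  # last column acts as the bias
y = (X[:, 0] + rng.randn(100) > 0).astype(float)       # noisy toy labels
w = np.zeros(4)
for _ in range(10):
    w = newton_step(w, X, y)
print(w)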
Code experiment:
# coding: utf-8

# In[4]:

# A simple Theano example: our first use of Theano,
# to see what it can do on logistic regression
import numpy as np
import theano
import theano.tensor as T  # tensor module for symbolic inputs
rng = np.random
N = 400      # training sample size
feats = 784  # number of input features (dimension)
# In[21]:

# Generate a dataset: D = (input_values, target_class)
D = (rng.randn(N, feats), rng.randint(size=N, low=0, high=2))
# rng.randn(N, feats) generates N feature vectors of length feats;
# rng.randint(size=N, low=0, high=2) draws N random labels from {0, 1}
training_steps = 10000
# Declare Theano symbolic variables
x = T.dmatrix("x")
y = T.dvector("y")
# Initialize the weight vector w randomly.
# w and the following bias b are shared variables,
# so they keep their values between training iterations (updates)
w = theano.shared(rng.randn(feats), name="w")
b = theano.shared(0., name="b")  # b is a scalar
#print("Initia model:")
#print(w.get_value())
#print(b.get_value())
# Construct the Theano expression graph
# (note: this is the cross-entropy loss of equation (1-11))
p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b))            # probability that target = 1
prediction = p_1 > 0.5                             # prediction thresholded at 0.5
xent = -y * T.log(p_1) - (1 - y) * T.log(1 - p_1)  # cross-entropy loss
cost = xent.mean() + 0.01 * (w ** 2).sum()         # cost to minimize, with an L2 penalty
gw, gb = T.grad(cost, [w, b])                      # gradient of the cost w.r.t.
                                                   # the weight vector w and bias term b
# In[22]:

# Compile
train = theano.function(
    inputs=[x, y],                # training inputs
    outputs=[prediction, xent],   # values to compute
    updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)))  # gradient-descent updates
predict = theano.function(inputs=[x], outputs=prediction)
# In[38]:

# Train
for i in range(training_steps):
    pred, err = train(D[0], D[1])

print("Theano final model:")
print(w.get_value())
print(b.get_value())
print("target values for D:")
print(D[1])
print("prediction on D:")
print(predict(D[0]))  # predictions on all training data
# In[40]:

# The same task with scikit-learn
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
lr.fit(D[0], D[1])
print("sklearn final model:")
print(lr.coef_[0])
print(lr.intercept_[0])
print("target values for D:")
print(D[1])
print("prediction on D:")
print(lr.predict(D[0]))  # predictions on all training data
# In[44]:

# Compare whether the two algorithms produce the same predictions
k = 0
theano_prediction = predict(D[0])
sklearn_prediction = lr.predict(D[0])
for i in range(len(theano_prediction)):
    if theano_prediction[i] != sklearn_prediction[i]:
        k += 1
print("number of disagreements:", k)