Handling Multiclass Problems with LR

Doing Classification with a Regression Model

Despite its name, LR (logistic regression) is used for classification. Its model function is:

$h_{\Theta }\left ( x \right ) = \frac{1}{1+e^{-\Theta ^{T}x}}$

$h\left ( z \right ) = \frac{1}{1+e^{-z}}$

• As $z \to +\infty$, $h\left ( z \right ) \to \frac{1}{1+0} = 1$; as $z \to -\infty$, $h\left ( z \right ) \to \frac{1}{1+\infty } = 0$. The output therefore always lies in $(0, 1)$ and can be interpreted as a probability.
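These limits are easy to check numerically. A quick sketch in plain Python (the helper name `sigmoid` is my own, not from the original):

```python
import math

def sigmoid(z):
    """Logistic function h(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))     # 0.5 -- the midpoint
print(sigmoid(100))   # approaches 1 as z grows large
print(sigmoid(-100))  # approaches 0 as z grows very negative
```

Large positive inputs saturate toward 1 and large negative inputs toward 0, matching the limits above.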

The Objective Function of Logistic Regression

$P\left ( y=1|x \right ) = h\left ( x \right ); P\left ( y=0|x \right ) =1- h\left ( x \right );$

$P\left ( y|x \right ) = h\left ( x \right )^{y}\left ( 1-h\left ( x \right ) \right )^{\left ( 1-y \right )}$

$L\left ( \Theta \right ) = \prod_{i=1}^{m}P\left (y^{\left ( i \right )}|x^{\left ( i \right )};\Theta \right ) = \prod_{i=1}^{m}\left ( h_{\Theta }\left ( x^{\left ( i \right )} \right )\right )^{y^{\left ( i \right )}}\left ( 1-h_{\Theta}\left ( x^{\left ( i \right )} \right ) \right )^{\left ( 1-y^{\left ( i \right )} \right )}$

L(θ) is the likelihood function of LR. We want to make it as large as possible, which is exactly maximum likelihood estimation. Solving for the LR objective function is therefore the process of performing maximum likelihood estimation on the LR model function: we look for the θ that maximizes the log-likelihood l(θ):

$l\left ( \Theta \right ) = log\left ( L\left ( \Theta \right ) \right )$
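Taking the log turns the product into a sum, a standard step that makes the expression easier to optimize:

$l\left ( \Theta \right ) = \sum_{i=1}^{m}\left [ y^{\left ( i \right )}\log h_{\Theta }\left ( x^{\left ( i \right )} \right ) + \left ( 1-y^{\left ( i \right )} \right )\log\left ( 1-h_{\Theta }\left ( x^{\left ( i \right )} \right ) \right ) \right ]$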

In fact, l(θ) could serve directly as LR's objective function. But as discussed earlier, we want the objective to be a convex function that has a minimum, so we set J(θ) = −l(θ):

$J\left ( \Theta \right ) = -log\left ( L\left ( \Theta \right ) \right )$
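J(θ) can be computed directly from the model's predicted probabilities. A minimal sketch in plain Python (the function and variable names are my own):

```python
import math

def neg_log_likelihood(y_true, y_prob):
    """J = -sum over samples of [y*log(h) + (1-y)*log(1-h)]."""
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(y_true, y_prob))

# Confident, correct predictions give a small J;
# confident, wrong predictions are penalized heavily.
print(neg_log_likelihood([1, 0], [0.9, 0.1]))  # small loss
print(neg_log_likelihood([1, 0], [0.1, 0.9]))  # large loss
```

Minimizing J over θ is equivalent to maximizing the likelihood L(θ).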

Example and Code Implementation

from sklearn.linear_model import LogisticRegression
import pandas as pd

# Importing dataset (the file name here is assumed; substitute your own)
data = pd.read_csv("quiz.csv")

used_features = ["Last Score", "Hours Spent"]
X = data[used_features].values
scores = data["Score"].values

X_train = X[:11]
X_test = X[11:]

# Logistic Regression - Binary Classification
# Label each sample as pass (1) or fail (0) using a 60-point threshold
passed = []

for i in range(len(scores)):
    if scores[i] >= 60:
        passed.append(1)
    else:
        passed.append(0)

y_train = passed[:11]
y_test = passed[11:]

classifier = LogisticRegression(C=1e5)
classifier.fit(X_train, y_train)

y_predict = classifier.predict(X_test)
print(y_predict)


[1 0 1]


Handling Multiclass Problems with LR

from sklearn.linear_model import LogisticRegression
import pandas as pd

# Importing dataset (the file name here is assumed; substitute your own)
data = pd.read_csv("quiz.csv")

used_features = ["Last Score", "Hours Spent"]
X = data[used_features].values
scores = data["Score"].values

X_train = X[:11]
X_test = X[11:]

# Logistic Regression - Multiclass Classification
# Map scores to three levels: 2 (>= 85), 1 (>= 60), 0 (otherwise)
level = []

for i in range(len(scores)):
    if scores[i] >= 85:
        level.append(2)
    elif scores[i] >= 60:
        level.append(1)
    else:
        level.append(0)

y_train = level[:11]
y_test = level[11:]

classifier = LogisticRegression(C=1e5)
classifier.fit(X_train, y_train)

y_predict = classifier.predict(X_test)
print(y_predict)
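Note that the fitting code is identical to the binary case: sklearn's LogisticRegression handles multiclass targets automatically (via a one-vs-rest or multinomial scheme, depending on its settings). To show the idea behind one-vs-rest, here is a minimal pure-Python sketch of its decision rule; the per-class weights below are hypothetical values for illustration, not fitted parameters:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_ovr(x, class_weights):
    """One-vs-rest decision: score x against each class's binary
    classifier (w, b) and return the class with the highest score."""
    scores = {c: sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
              for c, (w, b) in class_weights.items()}
    return max(scores, key=scores.get)

# Hypothetical weights for levels 0/1/2 over ["Last Score", "Hours Spent"]
weights = {
    0: ([-0.08, -0.5], 6.0),
    1: ([0.02, 0.1], -1.5),
    2: ([0.09, 0.6], -9.0),
}
print(predict_ovr([90, 20], weights))  # a strong student lands in level 2
print(predict_ovr([40, 2], weights))   # a weak student lands in level 0
```

Each class gets its own binary "this class vs. everything else" classifier, and prediction picks the class whose classifier is most confident.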