Andrew Ng Machine Learning: ex22 exercise

NATSU1573

Last modified 2024-06-02 18:56:22


Tags: machine learning, logistic regression

First published 2024-06-02 18:43:36

Article link: https://blog.csdn.net/weixin_47598128/article/details/139394307



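Dataset overview

The ex22 dataset
The file has three columns. The first two are the results of two tests on a manufactured component; the third says whether the component passed (1 = accepted, 0 = rejected).
Data download link: https://pan.baidu.com/s/1-a6UJMkVkq-1NRZleuCrsg
Extraction code: 77v2
The output takes only the values 0 and 1, so this is clearly a classification problem. How to choose the form and parameters of the classifier is the key question, and the answer depends on the characteristics of the data.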

Loading the data

The data is a txt file with three columns separated by ",". It can be loaded with np.loadtxt or with pandas:

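# importing the dataset
import copy
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.markers import MarkerStyle
path = "data_sets/ex2data2.txt"
data = pd.read_csv(path, header=None, sep=',', names=['Test 1', 'Test 2', 'Accepted'])
data2 = copy.deepcopy(data)  # untouched copy of the raw columns
plotData(data)  # plotData is defined in the next block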
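The np.loadtxt route mentioned above is not shown in the post, so here is a minimal sketch of it. To keep the snippet self-contained it reads three sample rows from an in-memory buffer (values as printed in the preview table later in this post); the names `sample`, `raw`, `X_raw` and `labels` are illustrative and not part of the original code.

```python
import io
import numpy as np

# Three rows in the same comma-separated layout as ex2data2.txt.
# This buffer is a stand-in for the real file.
sample = io.StringIO(
    "0.051267,0.69956,1\n"
    "-0.092742,0.68494,1\n"
    "-0.21371,0.69225,1\n"
)

# For the real data, pass the path instead, e.g. "data_sets/ex2data2.txt".
raw = np.loadtxt(sample, delimiter=",")

X_raw = raw[:, :2]              # the two test scores
labels = raw[:, 2].astype(int)  # 1 = accepted, 0 = rejected

print(X_raw.shape, labels)
```

np.loadtxt returns a plain float array, so the column names have to be tracked by position; pandas is more convenient when the columns are referenced by name, as in the rest of this post.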

The 2-D scatter plot is drawn as follows:

def plotData(data):
    # split the samples by label
    pos = data[data["Accepted"].isin([1])]
    neg = data[data["Accepted"].isin([0])]
    plt.figure(1)
    plt.scatter(pos["Test 1"], pos["Test 2"], s=30, c="b", marker=MarkerStyle("o"), label="Accepted")
    plt.scatter(neg["Test 1"], neg["Test 2"], s=30, c="r", marker=MarkerStyle("x"), label="Rejected")
    plt.legend()  # show the labels passed to scatter
    plt.show()

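[Figure: scatter plot of Test 1 against Test 2; accepted samples as blue circles, rejected samples as red crosses]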
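The plot shows that the two classes cannot be separated by a straight line, so higher-order polynomial features are needed. Here we use terms up to degree 6, that is, $x_1^m x_2^n$ with $(m+n)\leq6$.
Before that, the data needs to be preprocessed.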

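Data preprocessing

Preprocessing has two parts. The first expands the two test scores into higher-order polynomial features. The second converts data types: to make the later matrix operations easier, it is best to convert the pandas data to np.array or np.matrix.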

# data preprocessing
data.insert(0, 'Ones', 1)  # bias column
ss = 6  # highest total degree
for i in range(1, ss+1):
    for j in range(0, i+1):
        # i is the total degree m+n (at most 6); column Fab holds Test 1^a * Test 2^b
        data["F"+str(i-j)+str(j)] = pow(data["Test 1"],i-j)*pow(data["Test 2"],j)
data.drop(['Test 1', 'Test 2'], axis=1, inplace=True)

Printing the first five rows:

 Ones  Accepted       F10      F01       F20       F11       F02  ...           F60           F51       F42       F33       F24       F15       F06
0     1         1  0.051267  0.69956  0.002628  0.035864  0.489384  ...  1.815630e-08  2.477505e-07  0.000003  0.000046  0.000629  0.008589  0.117206        
1     1         1 -0.092742  0.68494  0.008601 -0.063523  0.469143  ...  6.362953e-07 -4.699318e-06  0.000035 -0.000256  0.001893 -0.013981  0.103256        
2     1         1 -0.213710  0.69225  0.045672 -0.147941  0.479210  ...  9.526844e-05 -3.085938e-04  0.001000 -0.003238  0.010488 -0.033973  0.110047        
3     1         1 -0.375000  0.50219  0.140625 -0.188321  0.252195  ...  2.780914e-03 -3.724126e-03  0.004987 -0.006679  0.008944 -0.011978  0.016040        
4     1         1 -0.513250  0.46564  0.263426 -0.238990  0.216821  ...  1.827990e-02 -1.658422e-02  0.015046 -0.013650  0.012384 -0.011235  0.010193
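As a cross-check on the expansion above, the same features can be built directly in NumPy. This is only a sketch: `map_feature` is a hypothetical helper name, not something from the original post, and it assumes the same column order as the pandas loop (Ones, F10, F01, F20, ...).

```python
import numpy as np

def map_feature(t1, t2, degree=6):
    """Return columns [1, t1, t2, t1^2, t1*t2, t2^2, ...] up to the given total degree."""
    t1 = np.asarray(t1, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    cols = [np.ones_like(t1)]  # bias column
    for i in range(1, degree + 1):
        for j in range(i + 1):
            cols.append(t1 ** (i - j) * t2 ** j)
    return np.column_stack(cols)

# The first two samples from the table above.
features = map_feature([0.051267, -0.092742], [0.69956, 0.68494])
print(features.shape)  # (2, 28)
```

Degree i contributes i+1 terms, so degrees 1 through 6 give 2+3+4+5+6+7 = 27 columns, plus the bias column for 28 in total. That matches the 28 coefficients reported by the optimizer later on.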

ex22 requires regularization. The initial parameters are set up as follows:

y = data["Accepted"]
x = data.drop(['Accepted'], axis=1)
x = np.array(x.values)  # shape (m, 28)
y = np.array(y.values)  # shape (m,)
theta = np.zeros(x.shape[1])
lam = 1  # regularization parameter lambda

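Computing the gradient and the cost function

Cost function

With a regularization term added, the cost function for linear regression is:
$J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}{[h_\theta(x^i)-y^i]^2}+\frac{\lambda}{2m}\sum_{j=1}^{n}{\theta_j^2}$
where $\lambda$ is the regularization parameter, m is the number of samples, and n is the number of features, so $\theta$ has n+1 components and the bias term $\theta_0$ is not penalized.
For a classification problem, the cost function is rewritten as:
$J(\theta)=\frac{1}{m}\sum_{i=1}^{m}{[-y^i\log{h_\theta(x^i)}-(1-y^i)\log{(1-h_\theta(x^i))}]}+\frac{\lambda}{2m}\sum_{j=1}^{n}{\theta_j^2}$
where $h_\theta(x)$ is computed as
$h_\theta(x)=g(\theta^T x)$
$g(z)=\frac{1}{1+e^{-z}}$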

# sigmoid function
def sigmod(z):
    return 1 / (1 + np.exp(-z))

# cost function
def costFunction(theta, x, y, lam):
    hx = sigmod(x.dot(theta))  # shape (m,), one probability per sample
    left = -np.dot(y.T, np.log(hx))  # mind the array shapes
    right = -np.dot((1-y).T, np.log(1-hx))
    reg = lam * np.sum(theta[1:] * theta[1:])  # the bias theta[0] is not penalized
    return (left + right) / len(x) + reg / (2 * len(x))
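A quick sanity check for a cost function like the one above: with all coefficients at zero, every prediction is 0.5 and the penalty vanishes, so the cost must equal $\log 2 \approx 0.693$ whatever the data. The sketch below re-implements the formula under the hypothetical names `sigmoid` and `regularized_cost` and runs it on small random data, which is an assumption made only to keep the snippet self-contained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_cost(theta, x, y, lam):
    """Regularized logistic cost; theta[0] (the bias) is not penalized."""
    m = len(y)
    hx = sigmoid(x @ theta)
    data_term = -(y @ np.log(hx) + (1 - y) @ np.log(1 - hx)) / m
    reg_term = lam * np.sum(theta[1:] ** 2) / (2 * m)
    return data_term + reg_term

# Synthetic stand-in data: a bias column plus three random features.
rng = np.random.default_rng(0)
x_demo = np.column_stack([np.ones(10), rng.normal(size=(10, 3))])
y_demo = rng.integers(0, 2, size=10)

cost_at_zero = regularized_cost(np.zeros(4), x_demo, y_demo, lam=1.0)
print(cost_at_zero, np.log(2))
```

If the two printed numbers differ, the usual suspects are a wrong sign in one of the log terms or a division by the wrong m.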

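Gradient function

Differentiating the cost function gives the gradient. For $j\geq1$:
$\frac{\partial J(\theta)}{\partial\theta_j}=\frac{1}{m}\left(\sum_{i=1}^{m}{(h_\theta(x^i)-y^i)x_j^i}+\lambda\theta_j\right)$
For $j=0$ the $\lambda\theta_j$ term is dropped, because the bias is not regularized.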

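# gradient function
def gradient(theta, x, y, lam):
    hx = sigmod(x.dot(theta))
    errors = (hx - y.T)  # prediction error per sample, shape (m,)
    grad = np.zeros(theta.shape)
    # bias term: no regularization
    grad[0] = np.sum(errors * x[:,0]) / len(x)
    # remaining terms: data gradient plus the lambda * theta penalty
    grad[1:] = (np.dot(errors, x[:,1:]) + lam * theta[1:]) / len(x)
    return grad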
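An analytic gradient is easy to get subtly wrong, so it is worth comparing it against a central finite difference of the cost before handing both to an optimizer. The following is a sketch on synthetic data; `cost`, `grad` and `numerical_grad` are illustrative names that mirror, but are not, the post's own functions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, x, y, lam):
    m = len(y)
    hx = sigmoid(x @ theta)
    data_term = -(y @ np.log(hx) + (1 - y) @ np.log(1 - hx)) / m
    return data_term + lam * np.sum(theta[1:] ** 2) / (2 * m)

def grad(theta, x, y, lam):
    m = len(y)
    g = x.T @ (sigmoid(x @ theta) - y) / m
    g[1:] += lam * theta[1:] / m  # the bias is not regularized
    return g

def numerical_grad(f, theta, eps=1e-6):
    """Central finite-difference approximation of the gradient of f at theta."""
    out = np.zeros_like(theta)
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = eps
        out[k] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return out

# Synthetic stand-in data and a random evaluation point.
rng = np.random.default_rng(0)
x_demo = np.column_stack([np.ones(20), rng.normal(size=(20, 3))])
y_demo = rng.integers(0, 2, size=20)
theta_demo = rng.normal(size=4)

analytic = grad(theta_demo, x_demo, y_demo, 1.0)
numeric = numerical_grad(lambda t: cost(t, x_demo, y_demo, 1.0), theta_demo)
max_diff = np.max(np.abs(analytic - numeric))
print(max_diff)
```

A difference many orders of magnitude below the gradient's own size suggests the two agree; a large difference in only the first component usually means the bias was regularized by mistake.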

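Gradient descent

With the cost and gradient functions in place, we do not implement gradient descent by hand. Instead we call an optimizer that ships with SciPy, scipy.optimize.fmin_tnc (a truncated Newton method), to minimize the cost.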

# minimize the cost with SciPy's truncated Newton optimizer
import scipy.optimize as opt
res = opt.fmin_tnc(func=costFunction, x0=theta, fprime=gradient, args=(x, y, lam))
print(res)

The result is:

(array([ 1.27271026,  0.62529965,  1.18111686, -2.01987399, -0.91743189,
       -1.43166928,  0.12393228, -0.36553118, -0.35725404, -0.17516291,
       -1.45817009, -0.05098418, -0.61558554, -0.27469165, -1.19271298,
       -0.2421784 , -0.20603299, -0.04466178, -0.2777895 , -0.29539513,
       -0.45645982, -1.04319155,  0.02779373, -0.29244869,  0.0155576 ,
       -0.32742405, -0.1438915 , -0.92467487]), 32, 1)

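res[0] holds the fitted $\theta$ coefficients. The other two entries are the number of function evaluations (32) and the optimizer's return code (1).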
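The same solver is also reachable through `scipy.optimize.minimize` with `method="TNC"`, which returns a result object with named fields instead of a tuple. The sketch below is not the post's code: it uses synthetic data (points labelled 1 inside a circle) with a few quadratic features, and the cost clips the probabilities to avoid `log(0)`, a safeguard the original function does not have.

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, x, y, lam):
    m = len(y)
    # clip to keep the logarithms finite if a prediction saturates
    hx = np.clip(sigmoid(x @ theta), 1e-12, 1 - 1e-12)
    data_term = -(y @ np.log(hx) + (1 - y) @ np.log(1 - hx)) / m
    return data_term + lam * np.sum(theta[1:] ** 2) / (2 * m)

def grad(theta, x, y, lam):
    m = len(y)
    g = x.T @ (sigmoid(x @ theta) - y) / m
    g[1:] += lam * theta[1:] / m
    return g

# Synthetic stand-in data: label 1 inside a circle, 0 outside.
rng = np.random.default_rng(0)
pts = rng.uniform(-1, 1, size=(200, 2))
y_demo = (pts[:, 0] ** 2 + pts[:, 1] ** 2 < 0.5).astype(float)
x_demo = np.column_stack([np.ones(200), pts, pts ** 2])  # bias, t1, t2, t1^2, t2^2

result = minimize(cost, x0=np.zeros(5), args=(x_demo, y_demo, 1.0),
                  jac=grad, method="TNC")
theta_hat = result.x
print(result.success, result.fun, result.nfev)
```

Here `result.x`, `result.nfev` and `result.status` play the roles of the three tuple entries returned by fmin_tnc.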

Evaluating the optimization result

First, compute the model's accuracy on the training set:

# predict function
def predict(theta, x):
    hx = sigmod(x.dot(theta))
    return [1 if (i>0.5) else 0 for i in hx]

pre = predict(res[0], x)
correct = [1 if (i==j) else 0 for i,j in zip(pre,y)]
correct = np.array(correct)
acc = np.sum(correct)/len(correct)
print(acc)
# acc=0.8305084745762712

The accuracy on the training set is about 83%.
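The list comprehensions used for the prediction and accuracy can also be written as vectorized NumPy expressions. A small sketch with made-up probabilities (the `probs` and `truth` arrays are hypothetical, chosen only to make the arithmetic easy to follow):

```python
import numpy as np

probs = np.array([0.91, 0.40, 0.62, 0.08, 0.55])  # hypothetical sigmoid outputs
truth = np.array([1, 0, 0, 0, 1])                  # hypothetical true labels

pred = (probs > 0.5).astype(int)   # same 0.5 threshold as predict()
accuracy = np.mean(pred == truth)  # fraction of matching labels
print(pred, accuracy)
```

Four of the five predictions match, so the accuracy is 0.8. Note that this is accuracy on the data used for fitting; the post does not hold out a separate test set.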

Plotting the decision boundary

The decision boundary of the model is:
$h_\theta(x)=0.5$
that is,
$\theta^T x=0$
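The step from the first condition to the second follows directly from the definition of $g$. Writing $z=\theta^T x$:
$g(z)=\frac{1}{1+e^{-z}}=\frac{1}{2}\iff 1+e^{-z}=2\iff e^{-z}=1\iff z=0$
Since $g$ is strictly increasing, $h_\theta(x)>0.5$ exactly when $\theta^T x>0$, so the sign of $\theta^T x$ decides the predicted class.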
Now that $\theta$ has been obtained from the optimizer, we need to plot the curve of this implicit function:

def computeZ(theta, t1, t2):
    # evaluate theta^T x on a grid; terms follow the preprocessing column order
    z = np.full(t1.shape,theta[0])
    degree = 6
    place = 0
    for i in range(1, degree+1):
        for j in range(0, i+1):
            z += np.power(t1, i-j) * np.power(t2, j) * theta[place+1]
            place+=1
    return z

def plotPlot(theta, data):
    pos = data[data["Accepted"].isin([1])]
    neg = data[data["Accepted"].isin([0])]
    plt.figure(1)
    plt.scatter(pos["Test 1"], pos["Test 2"], s=30, c="b", marker=MarkerStyle("o"), label="Accepted")
    plt.scatter(neg["Test 1"], neg["Test 2"], s=30, c="r", marker=MarkerStyle("x"), label="Rejected")

    t1 = np.linspace(-1, 1.5, 1000)
    t2 = np.linspace(-1, 1.5, 1000)
    t1, t2 = np.meshgrid(t1, t2)
    z = computeZ(theta, t1, t2)
    plt.contour(t1, t2, z, levels=[0])  # draw only the theta^T x = 0 curve
    plt.legend()
    plt.show()

plotPlot(res[0], data2)  # data2 still holds the raw Test 1 / Test 2 columns
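The contour only shows the trained model's boundary if the grid evaluation enumerates the polynomial terms in the same order as the preprocessing columns. The sketch below checks that on a coarse grid with random coefficients; `map_feature` and `compute_z` are illustrative re-implementations under hypothetical names, not the post's own functions.

```python
import numpy as np

def map_feature(t1, t2, degree=6):
    """Feature matrix in preprocessing order: 1, t1, t2, t1^2, t1*t2, t2^2, ..."""
    cols = [np.ones_like(t1)]
    for i in range(1, degree + 1):
        for j in range(i + 1):
            cols.append(t1 ** (i - j) * t2 ** j)
    return np.column_stack(cols)

def compute_z(theta, t1, t2, degree=6):
    """Grid evaluation of theta^T x, term by term."""
    z = np.full(t1.shape, theta[0], dtype=float)
    place = 0
    for i in range(1, degree + 1):
        for j in range(i + 1):
            z += t1 ** (i - j) * t2 ** j * theta[place + 1]
            place += 1
    return z

rng = np.random.default_rng(0)
theta_demo = rng.normal(size=28)  # random coefficients, for the check only
g1, g2 = np.meshgrid(np.linspace(-1, 1.5, 5), np.linspace(-1, 1.5, 5))

z_grid = compute_z(theta_demo, g1, g2)
z_matrix = (map_feature(g1.ravel(), g2.ravel()) @ theta_demo).reshape(g1.shape)
orders_match = np.allclose(z_grid, z_matrix)
print(orders_match)
```

If the two evaluations disagree, the plotted curve would belong to a different polynomial than the one that was fitted, even though the plot itself would still render without errors.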