TensorFlow and Python Learning II
I. Usage of Python-related functions
1. range: half-open interval (closed on the left, open on the right)
range(start, stop[, step])
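For example (the stop value is never included):
list(range(1, 10, 2))   # -> [1, 3, 5, 7, 9]; 10 is excluded
list(range(5))          # -> [0, 1, 2, 3, 4]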
II. Training deep neural networks
1. Choose a non-linear activation function
2. Forward propagation of the signal: matrix multiplication
$$y=f(wx+b)$$
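A minimal NumPy sketch of this forward step (the layer sizes and the ReLU activation are illustrative assumptions, not from the note itself):

import numpy as np

def relu(z):
    return np.maximum(0, z)          # a non-linear activation

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))          # weights: 3 inputs -> 4 units (assumed sizes)
b = np.zeros((4, 1))                 # biases
x = rng.normal(size=(3, 1))          # one input sample

y = relu(W @ x + b)                  # y = f(wx + b)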
3. Compute the current error
$$\delta(z_j)=\frac{e^{z_j}}{\sum_{k=1}^{K}e^{z_k}}, \qquad H(y,\hat{y})=-\sum_y y\log(\hat{y})$$
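A small NumPy sketch of this error computation, i.e. softmax over the output logits followed by cross-entropy against a one-hot label (the values are made up for illustration):

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))        # shift for numerical stability
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])        # output-layer logits (made-up values)
y = np.array([1.0, 0.0, 0.0])        # one-hot ground truth

y_hat = softmax(z)                   # delta(z_j) = e^{z_j} / sum_k e^{z_k}
loss = -np.sum(y * np.log(y_hat))    # H(y, y_hat) = -sum_y y * log(y_hat)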
4. Reduce the error
Gradient descent (taking partial derivatives) and setting the learning rate.
$$\frac{\partial loss}{\partial w^{(o)}} = \frac{\partial loss}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial O} \cdot \frac{\partial O}{\partial w^{(o)}}$$
$$\eta=\eta_s\cdot decay\_rate^{\frac{step\_count}{decay\_count}}$$
$$w_{new}=w_{old}-\eta \frac{\partial loss}{\partial w}$$
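A plain-NumPy sketch of the decayed learning rate and the weight update (eta_s, decay_rate, decay_count and the gradient values are made-up numbers for illustration):

import numpy as np

eta_s = 0.1                           # initial learning rate (assumed value)
decay_rate = 0.96                     # assumed
decay_count = 100                     # steps per decay interval (assumed)
step_count = 250                      # current training step

w_old = np.array([0.5, -0.3])         # current weights
grad = np.array([0.2, 0.1])           # d loss / d w, from backpropagation

eta = eta_s * decay_rate ** (step_count / decay_count)   # eta = eta_s * decay_rate^(step_count/decay_count)
w_new = w_old - eta * grad                                # w_new = w_old - eta * d loss / d w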
5. Iterate repeatedly
III. Existing problems
1. The loss function is not necessarily convex, so gradient descent may only find a local optimum rather than the global one.
[Solution] Train several models at the same time and combine them to pick the best solution.
2. Overfitting and underfitting
- Overfitting can be addressed by introducing regularization (a common form is shown below).
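A common form of the regularized loss, the same idea that appears later in the regularized logistic-regression cost ($\lambda$ is the regularization strength):

$$loss_{reg} = loss + \frac{\lambda}{2m}\sum_j w_j^2$$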
IV. How TensorFlow runs
1. Computation graph
2. Session
During a computation, the run function can evaluate several tensors in a single call, whereas eval can only evaluate one tensor at a time.
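A minimal sketch of the difference, written against the TensorFlow 1.x session API (on TF 2.x this would need tf.compat.v1 with eager execution disabled):

import tensorflow as tf

a = tf.constant(2.0)
b = tf.constant(3.0)
c = a + b

with tf.Session() as sess:
    print(sess.run([a, b, c]))       # run: evaluates several tensors in one call
    print(c.eval())                  # eval: evaluates exactly one tensor (uses the default session)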
3. Model persistence: saving the model and restoring the model
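A sketch of saving and restoring with tf.train.Saver (1.x session API; the checkpoint path is just an example):

import tensorflow as tf

w = tf.Variable(tf.zeros([2, 2]), name='w')
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, './model/my_model.ckpt')     # save the model (illustrative path)

with tf.Session() as sess:
    saver.restore(sess, './model/my_model.ckpt')  # restore the saved variables
    print(sess.run(w))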
20210701
1. Replacing the pip source: the Tsinghua PyPI mirror can be used
pip install pip -U
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
That said, the Douban mirror seems to be the fastest?
pip --default-timeout=100 install <package-name> -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
2. Upgrading pandas
If the install fails with "Could not install packages due to an EnvironmentError", it can be worked around with the --user option.
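For example, for the pandas upgrade mentioned above:
pip install --user --upgrade pandas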
3. Simple data visualization for analysis
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
path='ex2data2.txt'
# path, no header row, column names
data2=pd.read_csv(path,header=None,names=['Test 1','Test 2','Accepted'])
print(data2.head())
# data visualization
positive=data2[data2['Accepted'].isin([1])] # isin filters rows; [1] is a list
negative=data2[data2['Accepted'].isin([0])]
fig,ax=plt.subplots(figsize=(12,8))
# c = color; the first two arguments are the (x, y) data
ax.scatter(positive['Test 1'],positive['Test 2'],s=50,c='b',marker='o',label='Accepted')
ax.scatter(negative['Test 1'],negative['Test 2'],s=50,c='r',marker='x',label='Rejected')
ax.legend() # add the legend
ax.set_xlabel('Test 1 Score')
ax.set_ylabel('Test 2 Score')
plt.show()
20210703
I. Logistic regression
1. The mathematical basis of the new cost function: maximizing the likelihood function is equivalent to minimizing the loss function.
$$L(w)=\prod_i \left(p(x_i)\right)^{y_i} \cdot \left[1-p(x_i)\right]^{1-y_i}$$
Taking the logarithm, negating, and averaging over the training samples gives the current cost function.
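Written out, that step is (with $p(x_i)=h_\theta(x_i)$):

$$\log L(w)=\sum_{i}\left[y_i\log p(x_i)+(1-y_i)\log\left(1-p(x_i)\right)\right],\qquad J(\theta)=-\frac{1}{m}\log L(w)$$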
2. Derivation of the gradient descent update
$$J(\theta)=-\frac{1}{m}\sum_{i=1}^m \left[y^{(i)}\log\left(h_{\theta}(x^{(i)})\right)+(1-y^{(i)})\log\left(1-h_{\theta}(x^{(i)})\right)\right]$$
Hence

$$\frac{\partial}{\partial \theta_j}J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\frac{x_j^{(i)}}{1+e^{\theta^T x^{(i)}}}-(1-y^{(i)})\frac{x_j^{(i)}e^{\theta^T x^{(i)}}}{1+e^{\theta^T x^{(i)}}}\right] =\frac{1}{m}\sum_{i=1}^{m}\left[h_{\theta}(x^{(i)})-y^{(i)}\right]x_j^{(i)}$$
II. Complete binary classification code
# -*- coding: utf-8 -*-
"""
Created on Thu Jul 1 16:15:33 2021
@author: LaiAng80586
"""
# logistic regression for classification
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt  # for finding the best parameters

def sigmoid(z):
    return 1/(1+np.exp(-z))

nums=np.arange(-10.0,10.0,step=0.01)
print(sigmoid(nums))
fig,a=plt.subplots()
a.plot(nums,sigmoid(nums),'r')  # note: no assignment on this line
plt.show()

def cost(theta,X,y,learningRate):  # regularized cost, to avoid overfitting
    theta=np.matrix(theta)
    X=np.matrix(X)
    y=np.matrix(y)
    first=np.multiply(-y,np.log(sigmoid(X*theta.T)))
    second=np.multiply(1-y,np.log(1-sigmoid(X*theta.T)))
    reg=(learningRate/(2*len(X)))*np.sum(np.power(theta[:,1:theta.shape[1]],2))
    return np.sum(first-second)/(len(X))+reg  # len(X) is the size of the training set

def gradient(theta,X,y,learningRate):  # theta_0 (index 0) is not regularized
    theta=np.matrix(theta)
    X=np.matrix(X)
    y=np.matrix(y)
    parameters=int(theta.ravel().shape[1])  # ravel flattens the array to 1D
    grad=np.zeros(parameters)
    error=sigmoid(X*theta.T)-y
    for i in range(parameters):
        term=np.multiply(error,X[:,i])
        if i==0:
            grad[i]=np.sum(term)/len(X)
        else:
            grad[i]=(np.sum(term)/len(X))+((learningRate/len(X))*theta[0,i])
    return grad

def predict(theta,X):
    probability=sigmoid(X*theta.T)
    return [1 if x>=0.5 else 0 for x in probability]

path='ex2data2.txt'
# path, no header row, column names
data2=pd.read_csv(path,header=None,names=['Test 1','Test 2','Accepted'])
print(data2.head())

# data visualization
positive=data2[data2['Accepted'].isin([1])]  # isin filters rows; [1] is a list
negative=data2[data2['Accepted'].isin([0])]
fig,ax=plt.subplots(figsize=(12,8))
# c = color; the first two arguments are the (x, y) data
ax.scatter(positive['Test 1'],positive['Test 2'],s=50,c='b',marker='o',label='Accepted')
ax.scatter(negative['Test 1'],negative['Test 2'],s=50,c='r',marker='x',label='Rejected')
ax.legend()  # add the legend
ax.set_xlabel('Test 1 Score')
ax.set_ylabel('Test 2 Score')
plt.show()

# feature mapping: polynomial terms of the two scores up to the given degree
degree=5
x1=data2['Test 1']
x2=data2['Test 2']
data2.insert(3,'Ones',1)
for i in range(1,degree):
    for j in range(0,i):
        data2['F'+str(i)+str(j)]=np.power(x1,i-j)*np.power(x2,j)
data2.drop('Test 1',axis=1,inplace=True)  # modify in place rather than creating a new object
data2.drop('Test 2',axis=1,inplace=True)
print(data2.head())

# setting up the data
cols=data2.shape[1]
X2=data2.iloc[:,1:cols]  # all rows, only the feature columns
y2=data2.iloc[:,0:1]
X2=np.array(X2.values)
y2=np.array(y2.values)
theta2=np.zeros(11)
learningRate=1

# examine the cost and gradient functions
print(cost(theta2,X2,y2,learningRate))
print(gradient(theta2,X2,y2,learningRate))

# prediction
result2=opt.fmin_tnc(func=cost,x0=theta2,fprime=gradient,args=(X2,y2,learningRate))
print(result2)
theta_min=np.matrix(result2[0])
predictions=predict(theta_min,X2)
correct=[1 if ((a==1 and b==1) or (a==0 and b==0)) else 0 for (a,b) in zip(predictions,y2)]
accuracy=100*sum(map(int,correct))/len(correct)  # percentage of correct predictions
print('accuracy={0}%'.format(accuracy))
20210712
I. Vectorized gradient descent computation
This is not the normal-equation method from linear regression; it simply replaces the per-parameter assignment loop of gradient descent with matrix multiplication, as sketched below.
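A minimal NumPy sketch of the vectorized form, where one matrix product replaces the per-parameter loop in gradient() above (names and shapes are illustrative):

import numpy as np

def sigmoid(z):
    return 1/(1+np.exp(-z))

def gradient_vec(theta, X, y):
    # X: (m, n) design matrix, y: (m, 1) labels, theta: (n, 1) parameters
    m = len(X)
    error = sigmoid(X @ theta) - y        # (m, 1)
    return (X.T @ error) / m              # (n, 1), computed in a single matrix product

def gradient_descent_step(theta, X, y, eta):
    return theta - eta * gradient_vec(theta, X, y)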
II. Understanding a few concepts
Logistic regression
: Build a model over multiple features to perform binary classification.
One-vs-all
: Several binary classifiers; each one only decides whether a sample belongs to its class or not. After the sigmoid, each classifier outputs a different probability, and the class with the largest probability is chosen (a rough sketch follows).
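A rough sketch of one-vs-all prediction, assuming K already-trained parameter vectors stacked in all_theta (names and shapes are illustrative):

import numpy as np

def sigmoid(z):
    return 1/(1+np.exp(-z))

def predict_one_vs_all(all_theta, X):
    # all_theta: (K, n) -- one row of parameters per binary classifier
    # X: (m, n) design matrix
    probs = sigmoid(X @ all_theta.T)      # (m, K): each classifier's probability
    return np.argmax(probs, axis=1)       # pick the class with the largest probability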
20210713
I. Neural networks
For K-class classification, the label and the final output are K-dimensional vectors, and the cost function sums over all K output components; this is where it differs from logistic regression.
Forward propagation: compute the hypothesis $h_{\theta}(x)$.
Backpropagation: compute the partial derivatives of the cost function, $\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta)$.
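A minimal NumPy sketch of forward propagation for a network with one hidden layer and a K-dimensional output (the layer sizes and the use of sigmoid activations throughout are assumptions for illustration):

import numpy as np

def sigmoid(z):
    return 1/(1+np.exp(-z))

def forward(x, Theta1, Theta2):
    # x: (n,) input; Theta1: (hidden, n+1); Theta2: (K, hidden+1)
    a1 = np.concatenate(([1.0], x))      # add the bias unit
    a2 = sigmoid(Theta1 @ a1)            # hidden-layer activations
    a2 = np.concatenate(([1.0], a2))     # bias unit for the output layer
    return sigmoid(Theta2 @ a2)          # h_theta(x): a K-dimensional vector

rng = np.random.default_rng(0)
Theta1 = rng.normal(size=(5, 4))         # 3 inputs (+ bias) -> 5 hidden units (assumed sizes)
Theta2 = rng.normal(size=(3, 6))         # 5 hidden (+ bias) -> K = 3 outputs
print(forward(rng.normal(size=3), Theta1, Theta2))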