Python and Machine Learning: Notes and Debugging Insights

TensorFlow and Python Learning II

I. Usage of relevant Python functions

1. range: a half-open interval [start, stop)

range(start, stop[, step])
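A quick check of the half-open behaviour (the values are just an illustration):

list(range(1, 10, 2))   # -> [1, 3, 5, 7, 9]; the stop value 10 itself is never produced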

II. Training a deep neural network

1. Choose a non-linear activation function
2. Forward propagation of the signal: matrix multiplication
$$y=f(wx+b)$$
3. Compute the current error (here a softmax output with a cross-entropy loss)
$$\delta(z_j)=\frac{e^{z_j}}{\sum_{k=1}^{K}e^{z_k}}$$
$$H(y,\hat{y})=-\sum_{y} y\log(\hat{y})$$
4. Reduce the error
Gradient descent (taking partial derivatives) and setting/decaying the learning rate
$$\frac{\partial loss}{\partial w^{(o)}}=\frac{\partial loss}{\partial \hat{y}}\cdot\frac{\partial \hat{y}}{\partial O}\cdot\frac{\partial O}{\partial w^{(o)}}$$
$$\eta=\eta_s\cdot decay\_rate^{\frac{step\_count}{decay\_count}}$$
$$w_{new}=w_{old}-\eta\,\frac{\partial loss}{\partial w}$$
5. Iterate repeatedly (see the sketch below)
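A minimal NumPy sketch tying steps 1 to 5 together for a single softmax layer on toy data; the names W, b, eta_s, decay_rate and decay_count are illustrative, not from the original note.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))            # 100 samples, 4 features
labels = rng.integers(0, 3, size=100)    # 3 classes
Y = np.eye(3)[labels]                    # one-hot targets

W = np.zeros((4, 3))
b = np.zeros(3)
eta_s, decay_rate, decay_count = 0.5, 0.96, 50   # initial rate and decay schedule

for step in range(200):
    Z = X @ W + b                                              # forward pass: wx + b
    Z -= Z.max(axis=1, keepdims=True)                          # for numerical stability
    Y_hat = np.exp(Z) / np.exp(Z).sum(axis=1, keepdims=True)   # softmax
    loss = -np.mean(np.sum(Y * np.log(Y_hat), axis=1))         # cross-entropy H(y, y_hat)
    eta = eta_s * decay_rate ** (step / decay_count)           # exponential learning-rate decay
    dZ = (Y_hat - Y) / len(X)                                  # dloss/dZ for softmax + cross-entropy
    W -= eta * (X.T @ dZ)                                      # w_new = w_old - eta * dloss/dw
    b -= eta * dZ.sum(axis=0)
    if step % 50 == 0:
        print(step, loss)                                      # loss should decrease over the iterations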

III. Open issues

1. The loss function is not necessarily convex, so gradient descent is not guaranteed to find the global optimum; it may only reach a local optimum.
[Workaround] Train several models in parallel and combine or pick the best solution.
2. Overfitting and underfitting

  • Overfitting can be addressed by introducing regularization (a worked form follows below)
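For L2 regularization this amounts to adding a weight penalty to the cost, with λ as the regularization strength; the same term reappears as `reg` in the regularized cost function in the code further down:

$$J_{reg}(\theta)=J(\theta)+\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$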

IV. How TensorFlow runs

1. Computation graph
2. Session
During computation, run() can evaluate several tensors in a single call, whereas eval() evaluates only one tensor at a time.
3. Model persistence: saving and restoring a model (see the sketch below)
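A minimal sketch of these three points, assuming TensorFlow 1.x-style graph execution (in TensorFlow 2.x the same calls live under tf.compat.v1 with eager execution disabled); the variable names and the checkpoint path are illustrative.

import tensorflow as tf

w = tf.Variable(1.0, name='w')     # a node (variable) in the default computation graph
y = w * 3.0                        # another node, defined but not yet computed
saver = tf.train.Saver()           # handles saving / restoring the graph's variables

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run([w, y]))        # run() can fetch several tensors at once
    print(y.eval())                # eval() evaluates a single tensor
    saver.save(sess, './model.ckpt')            # save the model
    # later: saver.restore(sess, './model.ckpt')  # restore the model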

20210701

1. Switching the pip index: the Tsinghua PyPI mirror can be used

pip install pip -U
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

That said, the Douban mirror seems to be the fastest?

pip --default-timeout=100 install 库名称 -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com 

2. Upgrading pandas

Could not install packages due to an EnvironmentError

This can be worked around by installing with the --user option, for example:
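The exact command is not in the original note; a typical form (using pandas to match the heading) would be:

pip install --user --upgrade pandas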

3. Simple data visualization for analysis

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
path='ex2data2.txt'

# path, no header row in the file, column names
data2=pd.read_csv(path,header=None,names=['Test 1','Test 2','Accepted'])
print(data2.head())

# 数据可视化
positive=data2[data2['Accepted'].isin([1])] # isin filters rows; [1] is a list
negative=data2[data2['Accepted'].isin([0])]

fig,ax=plt.subplots(figsize=(12,8))
# c=color; the first two arguments are the (x, y) data
ax.scatter(positive['Test 1'],positive['Test 2'],s=50,c='b',marker='o',label='Accepted')
ax.scatter(negative['Test 1'],negative['Test 2'],s=50,c='r',marker='x',label='Rejected')
ax.legend()  # add the legend
ax.set_xlabel('Test 1 Score')
ax.set_ylabel('Test 2 Score')
plt.show()

20210703

I. Logistic regression

1. The mathematical basis of the new cost function: maximizing the likelihood function is equivalent to minimizing the loss function.
$$L(w)=\prod_{i}\big(p(x_i)\big)^{y_i}\cdot\big[1-p(x_i)\big]^{1-y_i}$$
Taking the logarithm (negated and averaged) yields the current cost function.
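Spelling out that step (this intermediate line is not in the original note, but follows directly by taking logs):

$$-\frac{1}{m}\log L(w)=-\frac{1}{m}\sum_{i=1}^{m}\Big[y_i\log p(x_i)+(1-y_i)\log\big(1-p(x_i)\big)\Big]$$

which is exactly the $J(\theta)$ below once $p(x_i)$ is written as $h_\theta(x^{(i)})$.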
2. Derivation of the gradient descent update
$$J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\log\big(h_{\theta}(x^{(i)})\big)+(1-y^{(i)})\log\big(1-h_{\theta}(x^{(i)})\big)\Big]$$
$$\frac{\partial}{\partial\theta_j}J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\frac{x_j^{(i)}}{1+e^{\theta^Tx^{(i)}}}-(1-y^{(i)})\frac{x_j^{(i)}e^{\theta^Tx^{(i)}}}{1+e^{\theta^Tx^{(i)}}}\Big]=\frac{1}{m}\sum_{i=1}^{m}\big[h_{\theta}(x^{(i)})-y^{(i)}\big]x_j^{(i)}$$
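The simplification in the last step uses the derivative of the sigmoid, a standard identity that the original derivation does not write out:

$$h_\theta(x)=\frac{1}{1+e^{-\theta^Tx}},\qquad\frac{\partial}{\partial\theta_j}h_\theta(x)=h_\theta(x)\big(1-h_\theta(x)\big)x_j$$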

II. Complete binary-classification code
# -*- coding: utf-8 -*-
"""
Created on Thu Jul  1 16:15:33 2021

@author: LaiAng80586
"""

# logistic regression for classification
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt # used to find the optimal parameters

def sigmoid(z):
    return 1/(1+np.exp(-z))

nums=np.arange(-10.0,10.0,step=0.01)
print(sigmoid(nums))
fig,a=plt.subplots()
a.plot(nums,sigmoid(nums),'r') # note: no assignment ('=') is needed on this line
plt.show()

def cost(theta,X,y,learningRate): # regularized cost, to avoid overfitting; learningRate here is really the regularization strength lambda
    theta=np.matrix(theta)
    X=np.matrix(X)
    y=np.matrix(y)
    first=np.multiply(-y,np.log(sigmoid(X*theta.T)))
    second=np.multiply(1-y,np.log(1-sigmoid(X*theta.T)))
    reg=(learningRate/(2*len(X)))*np.sum(np.power(theta[:,1:theta.shape[1]],2))
    return np.sum(first-second)/(len(X))+reg # len(X) is the size of the training set

def gradient(theta,X,y,learningRate):  # regularized gradient; the bias term theta_0 is not regularized
    theta=np.matrix(theta)
    X=np.matrix(X)
    y=np.matrix(y)
    
    parameters=int(theta.ravel().shape[1]) # ravel flattens theta into a 1-D array
    grad=np.zeros(parameters)
    
    error=sigmoid(X*theta.T)-y
    
    for i in range(parameters):
        term=np.multiply(error,X[:,i])
        if(i==0):
            grad[i]=np.sum(term)/len(X)
        else :
            grad[i]=(np.sum(term)/len(X))+((learningRate/len(X))*theta[0,i]) # theta[0,i] extracts the scalar parameter
        
    return grad

def predict(theta,X):
    probability=sigmoid(X*theta.T)
    return [1 if x>=0.5 else 0 for x in probability]



path='ex2data2.txt'
# path, no header row in the file, column names
data2=pd.read_csv(path,header=None,names=['Test 1','Test 2','Accepted'])
print(data2.head())
# 数据可视化

positive=data2[data2['Accepted'].isin([1])] # isin filters rows; [1] is a list
negative=data2[data2['Accepted'].isin([0])]

fig,ax=plt.subplots(figsize=(12,8))
# c=color; the first two arguments are the (x, y) data
ax.scatter(positive['Test 1'],positive['Test 2'],s=50,c='b',marker='o',label='Accepted')
ax.scatter(negative['Test 1'],negative['Test 2'],s=50,c='r',marker='x',label='Rejected')
ax.legend()  # add the legend
ax.set_xlabel('Test 1 Score')
ax.set_ylabel('Test 2 Score')
plt.show()


degree=5
x1=data2['Test 1']
x2=data2['Test 2']
data2.insert(3,'Ones',1)
for i in range(1,degree):
    for j in range(0,i):
        data2['F'+str(i)+str(j)]=np.power(x1,i-j)*np.power(x2,j)
data2.drop('Test 1',axis=1,inplace=True) # inplace=True: modify data2 itself instead of creating a new object
data2.drop('Test 2',axis=1,inplace=True)
print(data2.head())


# setting the data
cols=data2.shape[1]
X2=data2.iloc[:,1:cols] # all rows, feature columns only
y2=data2.iloc[:,0:1]    # the label column ('Accepted')

X2=np.array(X2.values)
y2=np.array(y2.values)
theta2=np.zeros(11)
learningRate=1
# sanity-check the cost and gradient functions
print(cost(theta2,X2,y2,learningRate))
print(gradient(theta2,X2,y2,learningRate))

# prediction: fmin_tnc searches for the theta that minimizes the regularized cost
result2=opt.fmin_tnc(func=cost,x0=theta2,fprime=gradient,args=(X2,y2,learningRate))
print(result2)

theta_min=np.matrix(result2[0])
predictions=predict(theta_min,X2)
correct=[1 if ((a==1 and b==1) or (a==0 and b==0)) else 0 for (a,b) in zip(predictions,y2)]
accuracy=sum(map(int,correct))/len(correct)*100 # fixed: '%' computed a remainder here, not a percentage
print('accuracy={0}%'.format(accuracy))

20210712

I. Vectorized gradient descent

This is not the normal-equation method for linear regression; it merely replaces the per-parameter assignment loop of gradient descent with a matrix multiplication (see the sketch below).
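A minimal sketch of the vectorized update, assuming a design matrix X of shape (m, n), targets y of shape (m,), and a learning rate alpha; all names here are illustrative, not from the original note.

import numpy as np

def gradient_descent(X, y, alpha=0.01, iters=1000):
    # every parameter is updated in one matrix expression per iteration
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(iters):
        theta -= (alpha / m) * (X.T @ (X @ theta - y))
    return theta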

II. Understanding a few concepts

Logistic regression: build a model over multiple features to perform binary classification.
One-vs-all: several binary classifiers, each of which only decides "is this class" or "is not this class"; after the sigmoid each classifier gives a different probability, and the class with the largest probability is chosen (see the sketch below).
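A minimal sketch of the one-vs-all decision rule, assuming all_theta stacks one row of fitted parameters per class; the names all_theta and predict_one_vs_all are illustrative.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict_one_vs_all(all_theta, X):
    # all_theta: (K, n), one binary classifier per class; X: (m, n)
    probs = sigmoid(X @ all_theta.T)   # (m, K): each column is 'probability of being this class'
    return np.argmax(probs, axis=1)    # pick the class with the largest probability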

20210713

I. Neural networks

For K-class classification, the labels in the cost function are K-dimensional vectors and the final output is also a K-dimensional vector; this is where it differs from (binary) logistic regression.
Forward propagation: compute the hypothesis $h_{\theta}(x)$ (see the sketch below).
Backpropagation: compute the partial derivatives of the cost function, $\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta)$.
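A minimal forward-propagation sketch for a network with one hidden layer, assuming Theta1 and Theta2 are already-trained weight matrices whose first column corresponds to the bias unit; all names are illustrative.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward_propagate(X, Theta1, Theta2):
    m = X.shape[0]
    a1 = np.hstack([np.ones((m, 1)), X])     # input layer plus bias unit
    a2 = sigmoid(a1 @ Theta1.T)              # hidden-layer activations
    a2 = np.hstack([np.ones((m, 1)), a2])    # add bias unit to the hidden layer
    h = sigmoid(a2 @ Theta2.T)               # h_theta(x): one column per class (K outputs)
    return h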
