Machine Learning in Action: Logistic Regression (Part 1), Debugging Errors

Latest recommended article published 2021-12-21 17:09:39

王20133

Category: Machine Learning. Tags: Machine Learning, Logistic Regression

Summary of errors encountered while implementing logistic regression in Python 3.6

The sigmoid(intX) function

Error:

return 1.0/(1+math.exp(-intX)) 
TypeError: only length-1 arrays can be converted to Python scalars

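Replace math.exp with the NumPy equivalent, numpy.exp. math.exp accepts only a single number, whereas numpy.exp is applied elementwise to an array.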

    import numpy as np
    def sigmoid(self, intX):
        return 1.0/(1+np.exp(-intX))

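Alternatively, add from numpy import * at the top of the file, after which exp(-intX) can be called directly.
For details, see "The difference between import and from … import … in Python".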
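
The difference can be reproduced in isolation. The sketch below is a standalone version of the function (without `self`, and with a made-up sample input) showing that math.exp rejects an array while numpy.exp handles it elementwise:

```python
import math
import numpy as np

def sigmoid(intX):
    # np.exp works elementwise, so intX may be a scalar or an array
    return 1.0 / (1 + np.exp(-intX))

x = np.array([-1.0, 0.0, 1.0])  # made-up sample input, for illustration only

try:
    math.exp(-x)  # math.exp accepts a single number only
    math_exp_failed = False
except TypeError:
    math_exp_failed = True

result = sigmoid(x)  # array with one sigmoid value per input element
```

Importing NumPy as `np` rather than with `from numpy import *` keeps it visible which `exp` is being called.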

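The plotBstFit(wei) function

For n = shape(dataArr)[0], the IDE shows this warning:

Class 'Iterable' does not define '__getitem__', so the '[]' operator cannot be used on its instances

There are two ways to get a shape in Python: the shape function in numpy and the shape attribute of an array. They are essentially the same.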

import numpy as np
y = [1, 2, 3, 1, 2, 2, 1, 2, 1, 2]  # list
y_arr = np.array(y)                 # ndarray
y_mat = np.asmatrix(y_arr)          # matrix (same as the original np.mat call)
y_arr.shape[0]   # output: 10
y_mat.shape[0]   # output: 1
y_mat.shape[1]   # output: 10
np.shape(y_arr)  # output: (10,)
np.shape(y_mat)  # output: (1, 10)

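Oddly, neither form of `shape` appeared to support the `shape(dataArr)[0]` access pattern, yet the online code I was following uses it. I have not found the reason yet. For now I use `dataArr.shape[0]`.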
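
A quick check suggests that both access forms return the same value at runtime, which would make the message above an IDE inspection warning rather than a runtime error. The sketch below uses a made-up array in place of `dataArr`:

```python
import numpy as np

data_arr = np.ones((100, 3))  # made-up stand-in for dataArr

n_func = np.shape(data_arr)[0]  # function form returns a tuple, so [0] works
n_attr = data_arr.shape[0]      # attribute form

# np.shape also accepts a plain nested list, which has no .shape attribute
n_list = np.shape([[1, 2, 3], [4, 5, 6]])[0]
```

Both `n_func` and `n_attr` are 100 here, so choosing `dataArr.shape[0]` is a matter of style once `dataArr` is known to be an array.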

The plotBstFit(wei) function
Error:

weights = wei.getA()
AttributeError: 'numpy.ndarray' object has no attribute 'getA'

weights is returned by gradAscent(dataMatIn, classLabels) and is already an ndarray when it is returned, so the line weights = wei.getA() is not needed in plotBstFit(wei).
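
getA is a method of numpy.matrix and is not defined on numpy.ndarray, which is why the call fails once the weights are already an ndarray. As a sketch, a small conversion helper would let the plotting code accept either type. The name `to_ndarray` is hypothetical and not part of the book's code, and the sample weights are made up:

```python
import numpy as np

def to_ndarray(wei):
    # hypothetical helper: np.asarray accepts both np.matrix and np.ndarray
    # and always returns a plain ndarray
    return np.asarray(wei)

w_arr = np.ones((3, 1))      # made-up weights as an ndarray
w_mat = np.asmatrix(w_arr)   # the same weights as a matrix

has_getA_on_array = hasattr(w_arr, "getA")  # ndarray has no getA
from_matrix = to_ndarray(w_mat)             # same values as w_mat.getA()
from_array = to_ndarray(w_arr)              # passes through unchanged
```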

The stocGradAscent0(dataMatrix, classLables) function

    # assumes `from numpy import *` at the top of the file
    def stocGradAscent0(self, dataMatrix, classLables):
        m,n = shape(dataMatrix)  # shape(dataMatrix) = (100, 3)
        alpha = 0.01
        weights = ones(n)       # shape(weights) = (3,)
        for i in range(m):
            h = self.sigmoid(sum(dataMatrix[i]*weights))  # scalar
            error = classLables[i] - h                    # scalar
            # dataMatrix[i] has shape (3,), so the update term is expected to have shape (3,) too
            weights += alpha * error * dataMatrix[i]
        return weights

Error:

weights += alpha * error * dataMatrix[i]
ValueError: operands could not be broadcast together with shapes (3,) (0,) (3,)

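I spent a long time reading about NumPy broadcasting and still could not fix it. I eventually found that the problem was in the code that calls the function: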

datamet, labelmet = loadDataSet()
wei1 = stocGradAscent0(datamet, labelmet) 
plotBstFit(wei1)

Both datamet and labelmet are plain Python lists, so NumPy broadcasting may not apply to them as expected. Changing the call to:

wei1 = stocGradAscent0(array(datamet), labelmet)

makes it run correctly.
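
The (0,) in the error message indicates that the middle operand, alpha * error * dataMatrix[i], evaluated to an empty array when dataMatrix[i] was a plain list. One way to guard against this, sketched below as a standalone function (without `self`), is to convert the inputs inside the function so that callers may pass either lists or arrays. The np.asarray conversions and the toy data are assumptions added for illustration and are not part of the original code:

```python
import numpy as np

def sigmoid(intX):
    return 1.0 / (1 + np.exp(-intX))

def stocGradAscent0(dataMatrix, classLables):
    # assumption: convert up front so plain lists are accepted too
    dataMatrix = np.asarray(dataMatrix, dtype=float)
    classLables = np.asarray(classLables, dtype=float)
    m, n = dataMatrix.shape
    alpha = 0.01
    weights = np.ones(n)
    for i in range(m):
        h = sigmoid(np.sum(dataMatrix[i] * weights))  # scalar
        error = classLables[i] - h                    # scalar
        weights += alpha * error * dataMatrix[i]      # update has shape (n,)
    return weights

# made-up toy data standing in for the output of loadDataSet()
data = [[1.0, 0.5, 1.5], [1.0, -1.0, 2.0], [1.0, 2.0, -0.5], [1.0, 0.0, 0.3]]
labels = [1, 0, 1, 0]

w_from_list = stocGradAscent0(data, labels)             # plain lists
w_from_array = stocGradAscent0(np.array(data), labels)  # ndarray input
```

With the conversion in place, both calls return the same weights, so the caller no longer needs to remember to wrap the data in array(...).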

I am still quite unfamiliar with NumPy and Python syntax. I hope that before long these problems will no longer trip me up.

Python question: NameError: name 'weights' is not defined

>>> logRegres.plotBestFit(weights.getA())

Traceback (most recent call last):
File "<pyshell#26>", line 1, in <module>
logRegres.plotBestFit(weights.getA())
NameError: name 'weights' is not defined
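What is going on here?

Clearly, this is code from the book Machine Learning in Action.

The author left out this line:

weights = logRegres.gradAscent(dataArr, labelMat)
The complete session looks like this (assuming dataArr and labelMat have already been loaded):
>>> import logRegres
>>> from numpy import *
>>> weights = logRegres.gradAscent(dataArr, labelMat)
>>> logRegres.plotBestFit(weights.getA())

That fixes it.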