softmax 代码实现

最新推荐文章于 2024-07-15 21:55:14 发布

TaoTaoFu

最新推荐文章于 2024-07-15 21:55:14 发布

阅读量3.7k

点赞数 1

分类专栏：机器学习【算法】

机器学习【算法】专栏收录该内容

33 篇文章 0 订阅

订阅专栏

最近一直在外面，李航那本书没带在身上，所以那本书的算法实现估计要拖后了。
这几天在看Andrew Ng 机器学习的课程视频，正好看到了Softmax分类器那块，发现自己之前理解perceptron与logistic regression是有问题的。这两个算法真正核心的不同在于其分类函数的不同，perceptron采用一个分段函数作为分类器，logistic regression采用sigmod函数作为分类器，这才是这两个函数真正的不同。

废话不多说了，今天打算实现softmax分类器。

算法

算法参考的是Andrew 的课件与这篇文章。
具体实现的时候发现加入权重衰减效果会更好。

这里为了防止大家看不懂我的程序，我在这里做一些定义

数据集

数据集和KNN那个博文用的是同样的数据集。
数据地址：https://github.com/WenDesi/lihang_book_algorithm/blob/master/data/train.csv

特征

将整个图作为特征

代码

代码已上传GitHub

这次的代码是python3的，有可能需要稍微改一改，不好意思了，我要背叛python2了。

# encoding=utf8

import math
import pandas as pd
import numpy as np
import random
import time

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score


class Softmax(object):

    def __init__(self):
        self.learning_step = 0.000001           # 学习速率
        self.max_iteration = 100000             # 最大迭代次数
        self.weight_lambda = 0.01               # 衰退权重

    def cal_e(self,x,l):
        '''
        计算博客中的公式3
        '''

        theta_l = self.w[l]
        product = np.dot(theta_l,x)

        return math.exp(product)

    def cal_probability(self,x,j):
        '''
        计算博客中的公式2
        '''

        molecule = self.cal_e(x,j)
        denominator = sum([self.cal_e(x,i) for i in range(self.k)])

        return molecule/denominator


    def cal_partial_derivative(self,x,y,j):
        '''
        计算博客中的公式1
        '''

        first = int(y==j)                           # 计算示性函数
        second = self.cal_probability(x,j)          # 计算后面那个概率

        return -x*(first-second) + self.weight_lambda*self.w[j]

    def predict_(self, x):
        result = np.dot(self.w,x)
        row, column = result.shape

        # 找最大值所在的列
        _positon = np.argmax(result)
        m, n = divmod(_positon, column)

        return m

    def train(self, features, labels):
        self.k = len(set(labels))

        self.w = np.zeros((self.k,len(features[0])+1))
        time = 0

        while time < self.max_iteration:
            print('loop %d' % time)
            time += 1
            index = random.randint(0, len(labels) - 1)

            x = features[index]
            y = labels[index]

            x = list(x)
            x.append(1.0)
            x = np.array(x)

            derivatives = [self.cal_partial_derivative(x,y,j) for j in range(self.k)]

            for j in range(self.k):
                self.w[j] -= self.learning_step * derivatives[j]

    def predict(self,features):
        labels = []
        for feature in features:
            x = list(feature)
            x.append(1)

            x = np.matrix(x)
            x = np.transpose(x)

            labels.append(self.predict_(x))
        return labels


if __name__ == '__main__':

    print('Start read data')

    time_1 = time.time()

    raw_data = pd.read_csv('../data/train.csv', header=0)
    data = raw_data.values

    imgs = data[0::, 1::]
    labels = data[::, 0]

    # 选取 2/3 数据作为训练集， 1/3 数据作为测试集
    train_features, test_features, train_labels, test_labels = train_test_split(
        imgs, labels, test_size=0.33, random_state=23323)
    # print train_features.shape
    # print train_features.shape

    time_2 = time.time()
    print('read data cost '+ str(time_2 - time_1)+' second')

    print('Start training')
    p = Softmax()
    p.train(train_features, train_labels)

    time_3 = time.time()
    print('training cost '+ str(time_3 - time_2)+' second')

    print('Start predicting')
    test_predict = p.predict(test_features)
    time_4 = time.time()
    print('predicting cost ' + str(time_4 - time_3) +' second')

    score = accuracy_score(test_labels, test_predict)
    print("The accruacy socre is " + str(score))

 
 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
 
 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132

运行结果

这里写图片描述

速度挺快，正确率一般吧，比决策树之类的要高。

TaoTaoFu

关注

1
点赞
踩
6

收藏

觉得还不错? 一键收藏
0
评论
softmax 代码实现

最近一直在外面，李航那本书没带在身上，所以那本书的算法实现估计要拖后了。这几天在看Andrew Ng 机器学习的课程视频，正好看到了Softmax分类器那块，发现自己之前理解perceptron与logistic regression是有问题的。这两个算法真正核心的不同在于其分类函数的不同，perceptron采用一个分段函数作为分类器，logistic regression采用sigm
复制链接

扫一扫

专栏目录