TensorFlow (2): Softmax Regression

Softmax Regression

Generating random tensors

   Two commonly used random distributions:

  • uniform
  • normal

In TensorFlow, these are provided by the tf.random_normal() and tf.random_uniform() functions.


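import tensorflow as tf

# Normal distribution
# tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)
b = tf.random_normal([5, 5], seed=1234)
# Uniform distribution (maxval=None means 1 for floating-point dtypes)
# tf.random_uniform(shape, minval=0, maxval=None, dtype=tf.float32, seed=None, name=None)
c = tf.random_uniform([5, 5], seed=12)

with tf.Session() as sess:
    mat1, mat2 = sess.run([b, c])
    print(mat1)   # 5x5 sample from the normal distribution
    print(mat2)   # 5x5 sample from the uniform distribution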


[[ 0.51340485 -0.25581399  0.65199131  1.39236379  0.37256798]
 [ 0.20336303  1.24701834 -0.98333126  0.50439858  0.98600131]
 [-1.70049322  1.51739979  0.06326418  1.07656693  0.03294745]
 [-2.32479048 -1.44697022  0.56703895  0.10577387 -0.90796399]
 [-1.05107033 -1.63305104  1.22501576  0.83072805  1.28771544]]
[[ 0.63615251  0.92409146  0.67627728  0.50212514  0.96114957]
 [ 0.88567221  0.04360652  0.29879451  0.46695721  0.05440903]
 [ 0.82479727  0.28480017  0.98824406  0.67999697  0.66409671]
 [ 0.75018144  0.31693625  0.51552784  0.75187266  0.44445455]
 [ 0.07454526  0.04856801  0.35572827  0.2670846   0.4779768 ]]
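
For readers without TensorFlow at hand, the same two distributions can be sampled with NumPy. This is only an analogue, not the TensorFlow API: the generator `rng` and the variable names are illustrative, and the values drawn will differ from the TensorFlow output above even with the same seed.

```python
import numpy as np

rng = np.random.default_rng(1234)            # seeded generator for repeatable draws
normal_mat = rng.normal(loc=0.0, scale=1.0, size=(5, 5))    # mean 0, stddev 1
uniform_mat = rng.uniform(low=0.0, high=1.0, size=(5, 5))   # values in [0, 1)

# with many samples the empirical statistics approach the parameters
big = rng.normal(size=100_000)
print(normal_mat.shape, uniform_mat.min() >= 0.0, uniform_mat.max() < 1.0)
print(round(big.mean(), 2), round(big.std(), 2))
```
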
Three common activation functions in neural networks
  • logistic function (sigmoid function)

$$f(x)=\frac{1}{1+e^{-x}}$$

![logistic function](http://images.cnblogs.com/cnblogs_com/jclian91/1145646/o_logistic.png)
  • tanh function

$$\tanh(x)=\frac{\sinh(x)}{\cosh(x)}=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$$

![tanh function](http://images.cnblogs.com/cnblogs_com/jclian91/1145646/o_tanh.png)
  • ReLU (Rectified Linear Unit) function

$$f(x) = \begin{cases} 0, & \text{if } x \le 0 \\ x, & \text{if } x > 0 \end{cases}$$

![ReLU function](http://images.cnblogs.com/cnblogs_com/jclian91/1145646/o_ReLU.png)

TensorFlow Code:

# Three common neural-network activation functions (sigmoid, tanh, ReLU)
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

g = tf.Graph()
with g.as_default():
    x = tf.placeholder(dtype=tf.float64)
    y = tf.nn.sigmoid(x)     # sigmoid
    # y = tf.nn.tanh(x)      # tanh
    # y = tf.nn.relu(x)      # ReLU

with tf.Session(graph=g) as sess:
    # 1000 evenly spaced points in [-10, 10]
    x_value = np.linspace(-10, 10, 1000)
    # use new names for the results so the graph tensors x and y are not shadowed
    x_out, y_out = sess.run([x, y], feed_dict={x: x_value})

# Plot the function
plt.plot(x_out, y_out)
plt.title('Logistic Function')
plt.show()
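
As a cross-check on the three formulas, here is a minimal NumPy sketch that evaluates each definition directly. It is not part of the TensorFlow example, and the function names `sigmoid`, `tanh_from_exp`, and `relu` are illustrative.

```python
import numpy as np

def sigmoid(x):
    # logistic function: 1 / (1 + e^{-x})
    return 1.0 / (1.0 + np.exp(-x))

def tanh_from_exp(x):
    # tanh written out from its exponential definition
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

def relu(x):
    # 0 for x <= 0, x otherwise
    return np.maximum(0.0, x)

x = np.linspace(-10, 10, 1000)
y_sigmoid, y_tanh, y_relu = sigmoid(x), tanh_from_exp(x), relu(x)
print(sigmoid(0.0), tanh_from_exp(0.0), relu(-3.0), relu(3.0))
```

Comparing `y_tanh` with `np.tanh(x)` is a quick way to confirm that the exponential form matches the built-in function.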

Softmax Function

  The softmax function is a generalization of the logistic function that “squashes” a K-dimensional vector of arbitrary real values to a K-dimensional vector of real values, where each entry is in the range (0, 1], and all the entries add up to 1.

  The function is:

![](https://wikimedia.org/api/rest_v1/media/math/render/svg/86ead0d067436010ffe21c29fa4bf956eb023ff6) ![](https://wikimedia.org/api/rest_v1/media/math/render/svg/e348290cf48ddbb6e9a6ef4e39363568b67c09d3)
  In TensorFlow, we can use the **tf.nn.softmax()** function.

Python Code:

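# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np

A = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
exp_A = np.exp(A)
softmax_A = exp_A / np.sum(exp_A)   # softmax computed by hand with NumPy
print(softmax_A)

tf_a = tf.constant(A)

with tf.Session() as sess:
    # the same values from TensorFlow's built-in softmax
    print(sess.run(tf.nn.softmax(tf_a)))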

[ 0.00426978  0.01160646  0.03154963  0.08576079  0.23312201  0.63369132]
[ 0.00426978  0.01160646  0.03154963  0.08576079  0.23312201  0.63369132]
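
One practical caveat with a hand-written softmax: `np.exp` overflows for large inputs. A common remedy is to subtract the maximum entry first, which leaves the result unchanged because the same factor cancels in numerator and denominator. The sketch below illustrates this; the helper name `stable_softmax` is illustrative.

```python
import numpy as np

def stable_softmax(v):
    # subtracting the max does not change the result but keeps exp() in range
    shifted = v - np.max(v)
    exp_v = np.exp(shifted)
    return exp_v / np.sum(exp_v)

A = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
naive = np.exp(A) / np.sum(np.exp(A))
stable = stable_softmax(A)
big = stable_softmax(A + 1000.0)   # the naive formula would overflow here
print(stable)
```

Shifting every input by 1000 gives the same probabilities, which is why `big` and `stable` agree.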

Logistic Regression

  Think of logistic regression as a very simple neural network.

logistic regression

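  Suppose there are m samples, each n-dimensional with a label in {0,1}. With $x_0 = 1$, so that $w_0$ acts as the bias, the model is:

$$\phi(x) = \sum\limits_{j=0}^{n}w_{j}x_{j}\\ h_{\theta}(x) =\frac{1}{1+e^{-\phi(x)}}$$

  and the loss to minimize is the standard cross-entropy (log loss):

$$J(\theta) = -\frac{1}{m}\sum\limits_{i=1}^{m}\Big[y^{(i)}\log h_{\theta}(x^{(i)}) + (1-y^{(i)})\log\big(1-h_{\theta}(x^{(i)})\big)\Big]$$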


  We use gradient descent to find the optimal parameters $\theta$ (n+1 parameters: $w_0, w_1, \ldots, w_n$) that minimize the loss.
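
To make the gradient-descent step concrete, here is a minimal NumPy sketch of logistic regression. It assumes the usual cross-entropy loss, a fixed learning rate, and a tiny made-up dataset; the names `fit_logistic`, `lr`, and `steps` are illustrative and do not come from the original code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, steps=5000):
    # prepend x_0 = 1 so that w_0 plays the role of the bias (n + 1 parameters)
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        h = sigmoid(Xb @ w)                 # h_theta(x) for every sample
        grad = Xb.T @ (h - y) / len(y)      # gradient of the mean cross-entropy
        w -= lr * grad                      # gradient-descent update
    return w

# toy data (made up for illustration): label is 1 when the single feature is large
X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

w = fit_logistic(X, y)
probs = sigmoid(np.hstack([np.ones((len(X), 1)), X]) @ w)
print(w, (probs > 0.5).astype(int))
```

Thresholding the predicted probabilities at 0.5 turns them back into 0/1 labels.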

Softmax Regression

  Softmax regression is a generalization of logistic regression that is useful for multi-class classification.


  Suppose there are m samples, each n-dimensional with a label in {1,2,…,k}. The loss function is:

$$J(\theta) = -\frac{1}{m}\Big[\sum\limits_{i=1}^{m}\sum\limits_{j=1}^{k}1\{y^{(i)}=j\}\times \log{\big(p(y^{(i)}=j \mid x^{(i)};\theta)\big)}\Big]$$

  We use gradient descent to find the optimal parameters $\theta$ ($k\times (n+1)$ parameters) that minimize the loss.
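
The softmax-regression loss can be evaluated directly once labels are one-hot encoded, because the indicator 1{y^(i)=j} then simply selects one log-probability per sample. Below is a minimal NumPy sketch with made-up scores and labels; the function names are illustrative.

```python
import numpy as np

def softmax_rows(Z):
    # row-wise softmax, shifted by each row's max for numerical stability
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def cross_entropy(P, Y):
    # J = -(1/m) * sum_i sum_j 1{y_i = j} * log p_ij, with Y one-hot
    return -np.mean(np.sum(Y * np.log(P), axis=1))

# made-up scores for m = 3 samples and k = 3 classes
Z = np.array([[2.0, 1.0, 0.1],
              [0.5, 2.5, 0.3],
              [0.2, 0.1, 3.0]])
Y = np.eye(3)                      # sample i belongs to class i
P = softmax_rows(Z)
loss = cross_entropy(P, Y)
uniform_loss = cross_entropy(np.full((3, 3), 1.0 / 3.0), Y)   # equals log(3)
print(loss, uniform_loss)
```

A model that predicts every class with probability 1/k has loss log(k), which is a handy baseline when reading training logs.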

TensorFlow Code:

import tensorflow as tf
import numpy as np
import logging

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s: %(message)s')
logger = logging.getLogger(__name__)

class Softmax:

    def __init__(self, input_size, class_num, epoch=1000, learning_rate=0.1, save_model_path='./model.ckpt'):

        self.save_model_path = save_model_path   # where the model checkpoint is saved
        self.epoch = epoch                       # number of training iterations
        self.learning_rate = learning_rate       # learning rate

        x = tf.placeholder(dtype=tf.float32, shape=[None, input_size])         # features
        y_true = tf.placeholder(dtype=tf.float32, shape=[None, class_num])     # one-hot labels

        # Network structure of softmax regression
        W = tf.Variable(tf.random_normal([input_size, class_num]), name='weight')
        b = tf.Variable(tf.random_normal([class_num]), name='bias')

        # tf.nn.softmax computes softmax activations
        # softmax = exp(logits) / reduce_sum(exp(logits), dim)
        self.hypothesis = tf.nn.softmax(tf.matmul(x, W) + b)

        self.x = x
        self.y_true = y_true

        # Cross entropy loss
        self.loss = tf.reduce_mean(-tf.reduce_sum(self.y_true * tf.log(self.hypothesis), axis=1))

        # Minimize the loss with gradient descent at rate self.learning_rate
        self.train_model = tf.train.GradientDescentOptimizer(self.learning_rate).minimize(self.loss)

        self.saver = tf.train.Saver()

        logger.info('Initialize the model...')

    # Train and save the model
    def train(self, x_data, y_data):

        logger.info('Training the model...')
        with tf.Session() as sess:

            # Initialize all variables
            sess.run(tf.global_variables_initializer())

            feed_dict = {self.x: x_data, self.y_true: y_data}
            # Iterative training
            for i in range(self.epoch+1):
                sess.run(self.train_model, feed_dict=feed_dict)
                if i % int(self.epoch/10) == 0:
                    # to see the step improvement
                    print('Trained %d steps, loss: %s.' % (i, sess.run(self.loss, feed_dict=feed_dict)))

            # Save the model
            logger.info('Saving the model...')
            self.saver.save(sess, self.save_model_path)

    # Predict on new data
    def predict(self, data):

        with tf.Session() as sess:
            logger.info('Restoring the model...')
            self.saver.restore(sess, self.save_model_path)
            # class probabilities for each row of data
            predict = sess.run(self.hypothesis, feed_dict={self.x: data})
            # index of the most probable class
            predict_class = sess.run(tf.argmax(predict, 1))

        print("Predicted class: %s." % predict_class)

        return predict

# Using the Softmax class
# Sample data

x_data = np.array([[1, 2, 1, 1],
                   [2, 1, 3, 2],
                   [3, 1, 3, 4],
                   [4, 1, 5, 5],
                   [1, 7, 5, 5],
                   [1, 2, 5, 6],
                   [1, 6, 6, 6],
                   [1, 7, 7, 7]])

y_data = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [0, 1, 0],
                   [0, 1, 0],
                   [1, 0, 0],
                   [1, 0, 0]])

input_size = x_data.shape[1]
class_num = y_data.shape[1]

save_path = 'E://logs/model.ckpt'
st = Softmax(input_size, class_num, 1000, 0.1, save_path)
st.train(x_data, y_data)
st.predict(np.array([[1, 3, 4, 3]]))


2018-08-17 17:01:37,854 - INFO: Initialize the model...
2018-08-17 17:01:37,854 - INFO: Training the model...
Trained 0 steps, loss: 5.31073.
Trained 100 steps, loss: 0.624064.
Trained 200 steps, loss: 0.559649.
Trained 300 steps, loss: 0.511522.
Trained 400 steps, loss: 0.470231.
Trained 500 steps, loss: 0.432752.
Trained 600 steps, loss: 0.397538.
Trained 700 steps, loss: 0.363444.
Trained 800 steps, loss: 0.329436.
Trained 900 steps, loss: 0.294607.
Trained 1000 steps, loss: 0.259079.
2018-08-17 17:01:38,289 - INFO: Saving the model...
2018-08-17 17:01:39,062 - INFO: Restoring the model...
INFO:tensorflow:Restoring parameters from E://logs/model.ckpt
2018-08-17 17:01:39,062 - INFO: Restoring parameters from E://logs/model.ckpt
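
One caveat about computing the loss as the log of a softmax output: when a probability underflows to 0 (or an exponential overflows), the result can become `nan`. A common workaround is to compute the log-softmax directly from the logits with the log-sum-exp trick; TensorFlow 1.x also provides `tf.nn.softmax_cross_entropy_with_logits` for this purpose. The NumPy sketch below illustrates the idea with made-up logits. It is an illustration rather than a change to the class, and `log_softmax` is a hypothetical helper name.

```python
import numpy as np

def log_softmax(Z):
    # log p_ij = z_ij - logsumexp(z_i), computed without forming p_ij first
    Z = Z - Z.max(axis=1, keepdims=True)
    return Z - np.log(np.exp(Z).sum(axis=1, keepdims=True))

# made-up logits with a very large gap between classes
Z = np.array([[1000.0, 0.0, 0.0],
              [1.0, 2.0, 3.0]])
Y = np.array([[0.0, 1.0, 0.0],     # the true class has a tiny probability
              [0.0, 0.0, 1.0]])

with np.errstate(divide='ignore', over='ignore', invalid='ignore'):
    # naive route: softmax first, then log; exp(1000) overflows to inf
    E = np.exp(Z)
    naive_loss = -np.mean(np.sum(Y * np.log(E / E.sum(axis=1, keepdims=True)), axis=1))

stable_loss = -np.mean(np.sum(Y * log_softmax(Z), axis=1))
print(naive_loss, stable_loss)
```

The naive route yields `nan` here, while the log-sum-exp route returns a finite (large) loss that gradient descent can still act on.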

iris.csv: https://github.com/percent4/Storing_pictures/blob/master/iris.csv

TensorFlow Code:

Softmax class: TF_Softmax.py

import tensorflow as tf
import logging

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s: %(message)s')
logger = logging.getLogger(__name__)

class Softmax:

    def __init__(self, input_size, class_num, epoch=1000, learning_rate=0.1, save_model_path='./model.ckpt'):

        self.save_model_path = save_model_path   # where the model checkpoint is saved
        self.epoch = epoch                       # number of training iterations
        self.learning_rate = learning_rate       # learning rate

        x = tf.placeholder(dtype=tf.float32, shape=[None, input_size])         # features
        y_true = tf.placeholder(dtype=tf.float32, shape=[None, class_num])     # one-hot labels

        # Network structure of softmax regression
        W = tf.Variable(tf.random_normal([input_size, class_num]), name='weight')
        b = tf.Variable(tf.random_normal([class_num]), name='bias')

        # tf.nn.softmax computes softmax activations
        # softmax = exp(logits) / reduce_sum(exp(logits), dim)
        self.hypothesis = tf.nn.softmax(tf.matmul(x, W) + b)

        self.x = x
        self.y_true = y_true

        # Cross entropy loss
        self.loss = tf.reduce_mean(-tf.reduce_sum(self.y_true * tf.log(self.hypothesis), axis=1))

        # Minimize the loss with gradient descent at rate self.learning_rate
        self.train_model = tf.train.GradientDescentOptimizer(self.learning_rate).minimize(self.loss)

        self.saver = tf.train.Saver()

        logger.info('Initialize the model...')

    # Train and save the model
    def train(self, x_data, y_data):

        logger.info('Training the model...')
        with tf.Session() as sess:

            # Initialize all variables
            sess.run(tf.global_variables_initializer())

            feed_dict = {self.x: x_data, self.y_true: y_data}
            # Iterative training
            for i in range(self.epoch+1):
                sess.run(self.train_model, feed_dict=feed_dict)
                if i % int(self.epoch/10) == 0:
                    # to see the step improvement
                    print('Trained %d steps, loss: %s.' % (i, sess.run(self.loss, feed_dict=feed_dict)))

            # Save the model
            logger.info('Saving the model...')
            self.saver.save(sess, self.save_model_path)

    # Predict on new data
    def predict(self, data):

        with tf.Session() as sess:
            logger.info('Restoring the model...')
            self.saver.restore(sess, self.save_model_path)
            # class probabilities for each row of data
            predict = sess.run(self.hypothesis, feed_dict={self.x: data})

        # print("Predicted probabilities: %s." % predict)

        return predict



# Main script: train and evaluate the Softmax class on the iris dataset
from TF_Softmax import Softmax
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import LabelBinarizer

CSV_FILE_PATH = 'E://iris.csv'          # path to the CSV file
IRIS = pd.read_csv(CSV_FILE_PATH)       # read the CSV file

target_variable = 'class'               # target variable

# Feature columns: every column except the target
features = list(IRIS.columns)
features.remove(target_variable)

# Distinct classes of the target variable
Class = IRIS[target_variable].unique()

# Re-encode the target variable

# Map each class name to an integer 1, 2, 3, ...
Class_dict = {}
for i, clf in enumerate(Class):
    Class_dict[clf] = i+1
# Add a 'target' column holding the integer-encoded class
IRIS['target'] = IRIS[target_variable].apply(lambda x: Class_dict[x])

# One-hot (0-1) encode the target
lb = LabelBinarizer()
lb.fit(IRIS['target'])
transformed_labels = lb.transform(IRIS['target'])
y_bin_labels = []   # names of the one-hot label columns
for i in range(transformed_labels.shape[1]):
    y_bin_labels.append('y'+str(i))
    IRIS['y'+str(i)] = transformed_labels[:, i]
# print(IRIS.head(100))

# Whether to standardize the features: x_bar = (x-mean)/std
# (IS_STANDARD is not defined in the original listing; 'no' is assumed here
#  because the printed test rows show unscaled values)
IS_STANDARD = 'no'
if IS_STANDARD == 'yes':
    for feature in features:
        mean = IRIS[feature].mean()
        std = IRIS[feature].std()
        IRIS[feature] = (IRIS[feature]-mean)/std

# print(IRIS.head())

# Split the data: 80% training set, 20% test set
x_train, x_test, y_train, y_test = train_test_split(IRIS[features], IRIS[y_bin_labels],
                                                    train_size=0.8, test_size=0.2, random_state=123)

# Predict with Softmax
# Build the Softmax network
input_size = x_train.shape[1]
class_num = y_train.shape[1]

# Where the model is saved
MODEL_SAVE_PATH = 'E://logs/softmax.ckpt'
# Initialize Softmax
ann = Softmax(input_size, class_num, 10000, 0.5, MODEL_SAVE_PATH)
ann.train(x_train, y_train)      # train the model
y_pred = ann.predict(x_test)     # predict on the test set

# Predicted class: index of the largest probability, +1 to match the 1-based 'target' coding
# (the loop body is missing from the original listing; this is the assumed reconstruction)
prediction = []
for pred in y_pred:
    prediction.append(pred.argmax() + 1)

# Compute the prediction accuracy
x_test = x_test.copy()           # work on a copy before adding columns
x_test['prediction'] = prediction
x_test['label'] = IRIS['target'][y_test.index]
accuracy = accuracy_score(x_test['prediction'], x_test['label'])
print(x_test.head())


2018-08-17 21:36:33,373 - INFO: Initialize the model...
2018-08-17 21:36:33,373 - INFO: Training the model...
Trained 0 steps, loss: 8.691852.
Trained 1000 steps, loss: 0.060822684.
Trained 2000 steps, loss: 0.053384975.
Trained 3000 steps, loss: 0.04848685.
Trained 4000 steps, loss: 0.044897027.
Trained 5000 steps, loss: 0.042198572.
Trained 6000 steps, loss: 0.040111676.
Trained 7000 steps, loss: 0.038444195.
Trained 8000 steps, loss: 0.037071057.
Trained 9000 steps, loss: 0.035911743.
Trained 10000 steps, loss: 0.034913346.
2018-08-17 21:36:37,012 - INFO: Saving the model...
2018-08-17 21:36:37,215 - INFO: Restoring the model...
INFO:tensorflow:Restoring parameters from E://logs/softmax.ckpt
2018-08-17 21:36:37,230 - INFO: Restoring parameters from E://logs/softmax.ckpt
     sepal_length  sepal_width  petal_length  petal_width  prediction  label
72            6.3          2.5           4.9          1.5           2      2
112           6.8          3.0           5.5          2.1           3      3
132           6.4          2.8           5.6          2.2           3      3
88            5.6          3.0           4.1          1.3           2      2
37            4.9          3.1           1.5          0.1           1      1
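
The preprocessing and evaluation steps in the iris script (one-hot encoding the labels, optional z-score standardization, and turning probability rows back into class labels) can be sketched in plain NumPy. The arrays below are made up for illustration and the helper names `one_hot` and `standardize` are hypothetical; the script itself relies on pandas and scikit-learn for these steps.

```python
import numpy as np

def one_hot(labels, classes):
    # one column per class, 1 where the label matches that class
    return (labels[:, None] == classes[None, :]).astype(int)

def standardize(X):
    # z-score per column; ddof=1 matches the sample std that pandas' .std() uses
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

labels = np.array([1, 3, 2, 3, 1])            # classes coded 1..3 as in the script
classes = np.unique(labels)                   # sorted distinct class codes
Y = one_hot(labels, classes)

X = np.array([[5.1, 3.5], [6.8, 3.0], [6.3, 2.5], [6.4, 2.8], [4.9, 3.1]])
X_std = standardize(X)

# made-up probability rows, as a softmax model would return
probs = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.2, 0.7],
                  [0.2, 0.5, 0.3],
                  [0.3, 0.4, 0.3],
                  [0.9, 0.05, 0.05]])
pred = classes[probs.argmax(axis=1)]          # column index back to class label
accuracy = np.mean(pred == labels)            # fraction of correct predictions
print(Y, pred, accuracy)
```

Mapping the argmax index through `classes` avoids hard-coding the `+1` offset between column positions and the 1-based class codes.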

