Learning CNTK (Part 1)

sunflower_Yolanda

Article link: https://blog.csdn.net/sunflower_Yolanda/article/details/51647238

Reference: 用 CNTK 搞深度学习（一）入门 ("Deep Learning with CNTK (1): Getting Started")

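Building the logistic regression (LR) model:

The LR problem in brief
- Our model can be seen as y ~ p(y = 1 | x; w, b) = sigma(wx+b) = 1 / (1 + exp(-wx-b)). We then predict y = 1 if p(y = 1 | x; w, b) > 0.5 and y = 0 otherwise.
- Learn the parameters w and b
  - Minimize the cross-entropy loss function
  - Approach the optimum by gradient descent
  - Use the stochastic gradient descent (SGD) algorithm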
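
For reference, the loss and the gradient step behind these bullets can be written out explicitly. This is the textbook logistic regression derivation rather than anything CNTK-specific; the symbols $N$ (number of samples), $x_i$, $y_i$, $p_i$ and the learning rate $\eta$ are notation introduced here, with $w$ treated as a row vector and $x_i$ as a column vector.

```latex
\begin{aligned}
p_i &= \sigma(w x_i + b) = \frac{1}{1 + \exp(-w x_i - b)} \\
L(w, b) &= -\sum_{i=1}^{N} \bigl[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \bigr] \\
\frac{\partial L}{\partial w} &= \sum_{i=1}^{N} (p_i - y_i)\, x_i^{\top}, \qquad
\frac{\partial L}{\partial b} = \sum_{i=1}^{N} (p_i - y_i) \\
w &\leftarrow w - \eta \frac{\partial L}{\partial w}, \qquad
b \leftarrow b - \eta \frac{\partial L}{\partial b}
\end{aligned}
```

SGD applies the last line repeatedly, with the sums taken over one minibatch at a time instead of over all $N$ samples.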

Creating the data


# -*- coding: utf-8 -*-

import numpy as np
from sklearn.utils import shuffle

# number of dimensions
Dim = 2

# number of samples
N_train = 1000
N_test = 500

def generate(N, mean, cov, diff):
    # class 0 is centered on `mean`; every entry of `diff` adds one class centered on `mean + d`
    samples_per_class = int(N/2)

    X0 = np.random.multivariate_normal(mean, cov, samples_per_class)
    Y0 = np.zeros(samples_per_class)

    for ci, d in enumerate(diff):
        X1 = np.random.multivariate_normal(mean+d, cov, samples_per_class)
        Y1 = (ci+1)*np.ones(samples_per_class)

        X0 = np.concatenate((X0,X1))
        Y0 = np.concatenate((Y0,Y1))

    X, Y = shuffle(X0, Y0)

    return X,Y

def create_data_files(dim, diff, train_filename, test_filename, regression):
    # `dim` is the feature dimension (the callers pass Dim); it sizes the class-0 mean and the covariance
    print("Outputting %s and %s"%(train_filename, test_filename))
    mean = np.random.randn(dim)
    cov = np.eye(dim)

    for filename, N in [(train_filename, N_train), (test_filename, N_test)]:
        X, Y = generate(N, mean, cov, diff)

        # output in CNTK Text format
        with open(filename, "w") as dataset:
            num_labels = int((1 + np.amax(Y)))
            for i in range(N):
                dataset.write("|features ")
                for d in range(Dim):
                    dataset.write("%f " % X[i,d])
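                if regression:
                    # regression-style output: a single numeric label
                    dataset.write("|labels %f\n" % Y[i])
                else:
                    # classification-style output: a one-hot label vector
                    labels = ['0'] * num_labels
                    labels[int(Y[i])] = '1'
                    dataset.write("|labels %s\n" % " ".join(labels))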

def main():
    # fix the random seed so that the same data is generated on every run
    np.random.seed(10)

    # generate the Train and Test files for the binary (0/1) classifier
    create_data_files(Dim, [3.0], "Train_cntk_text.txt", "Test_cntk_text.txt", True)

    # sample lines from the Train file:
    #   |features 1.709340 2.329687 |labels 0.000000
    #   |features 0.959171 -0.252047 |labels 0.000000
    # sample lines from the Test file:
    #   |features 3.854499 4.163941 |labels 1.000000
    #   |features 1.058121 1.204858 |labels 0.000000
    #   |features 1.870621 1.284107 |labels 0.000000

    # generate the Train and Test files for the 3-class classifier
    create_data_files(Dim, [[3.0], [3.0, 0.0]], "Train-3Classes_cntk_text.txt", "Test-3Classes_cntk_text.txt", False)

if __name__ == '__main__':
    main()
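
To check the generated files, it can help to read them back into arrays. The following is a minimal sketch that assumes only the dense `|name v1 v2 ...` layout shown in the sample lines; `parse_ctf_line` and `load_ctf` are hypothetical helper names, not part of CNTK, and sparse entries or comments in the format are not handled.

```python
import numpy as np

def parse_ctf_line(line):
    """Parse one dense CNTK text format line into {stream_name: values}."""
    fields = {}
    # each stream starts with '|', followed by its name and then its values
    for chunk in line.strip().split("|"):
        parts = chunk.split()
        if not parts:
            continue  # the text before the first '|' is empty
        fields[parts[0]] = np.array([float(v) for v in parts[1:]])
    return fields

def load_ctf(path):
    """Load a whole file into a feature matrix and a label matrix (one row per sample)."""
    rows = []
    with open(path) as f:
        for line in f:
            if line.strip():
                rows.append(parse_ctf_line(line))
    X = np.stack([r["features"] for r in rows])
    Y = np.stack([r["labels"] for r in rows])
    return X, Y
```

For example, `load_ctf("Train_cntk_text.txt")` should return `X` with shape `(1000, 2)` and `Y` with shape `(1000, 1)` for the binary file written by the script.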

Network description

BrainScriptNetworkBuilder = [

        # sample and label dimensions
        SDim = $dimension$ # sample dimension
        LDim = 1 # label dimension

        features = Input (SDim)
        labels   = Input (LDim)

        # parameters to learn
        b = Parameter (LDim, 1)     # bias [1 x 1] matrix
        w = Parameter (LDim, SDim)  # weights [1 x 2] matrix

        # operations
        p = Sigmoid (w * features + b) # logistic function

        lr = Logistic (labels, p) # loss function
        #lr = -(TransposeTimes (labels, Log (p)) + TransposeTimes (Constant(1) - labels, Log (Constant(1) - p)))
        err = SquareError (labels, p) # evaluation metric for the model

        # root nodes
        featureNodes    = (features)
        labelNodes      = (labels)
        criterionNodes  = (lr) # nodes that specify the objective CNTK will try to achieve
        evaluationNodes = (err) # nodes used for performing evaluation as the model is trained
        outputNodes     = (p) # nodes whose values will be output
    ]
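
The same computation can be followed step by step in NumPy. This is only a sketch of what the `p` and `lr` nodes express, using the matrix shapes from the comments above; the parameter values and the two-sample minibatch are made up for illustration, and summing the loss over the minibatch is an assumption about aggregation, not something this post confirms about CNTK.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(w, b, features):
    """p = Sigmoid(w * features + b); features holds one sample per column."""
    return sigmoid(w @ features + b)   # [1 x 2] @ [2 x N] + [1 x 1] -> [1 x N]

def logistic_loss(labels, p):
    """Cross-entropy in the form of the commented-out formula, summed over the minibatch."""
    return float(-(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p)).sum())

# hypothetical parameter values and a two-sample minibatch, for illustration only
w = np.array([[1.0, -1.0]])            # weights, [1 x 2]
b = np.array([[0.0]])                  # bias, [1 x 1]
features = np.array([[2.0, 0.0],
                     [0.0, 2.0]])      # [2 x N], here N = 2
labels = np.array([[1.0, 0.0]])        # [1 x N]

p = forward(w, b, features)            # probabilities for class 1, [1 x N]
loss = logistic_loss(labels, p)
```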

Defining the commands for the network

Execution flow
command=Train:Output:DumpNodeInfo:Test

Train command


# training config
Train = [ # command=Train --> CNTK will look for a parameter named Train
    action = "train"  # execute CNTK's 'train' routine

    # network description
    BrainScriptNetworkBuilder = [
        ...
    ]

    # configuration parameters of the SGD procedure
    SGD = [
        ...
    ]

    # configuration of data reading
    reader = [
        ...
    ]
]

Output command


# output the results

Output = [
    action = "write"
    reader = [
        readerType = "CNTKTextFormatReader"
        file = "Test_cntk_text.txt"
        input = [
            features = [
                dim = $dimension$  # $$ means variable substitution
                format = "dense"
            ]
            labels = [
                dim = 1
                format = "dense"
            ]
        ]
    ]
    outputPath = "LR.txt"  # dump the output to this text file
]
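
The `$dimension$` placeholder is filled in from the top-level `dimension = 2` setting before the command runs; the RAW CONFIG and PROCESSED CONFIG sections of the program output show both forms. The idea can be sketched in a few lines of Python. `resolve_variables` is a hypothetical helper written for this illustration, and CNTK's own resolver is certainly more involved.

```python
import re

def resolve_variables(text, variables):
    """Replace every $name$ in text with variables[name]; unknown names are left untouched."""
    def lookup(match):
        name = match.group(1)
        return str(variables.get(name, match.group(0)))
    return re.sub(r"\$(\w+)\$", lookup, text)

raw = "dim = $dimension$"
resolved = resolve_variables(raw, {"dimension": 2})
```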

Test command
Unlike Output, which simply writes the prediction probabilities for each test sample to a file, Test computes the specified error metric on the test set. In our example, that metric is the SquareError() function specified in the configuration.

Test = [
    action = "test"
    reader = [
        readerType = "CNTKTextFormatReader"
        file = "Test_cntk_text.txt"
        input = [
            features = [
                dim = $dimension$
                format = "dense"
            ]
            labels = [
                dim = 1
                format = "dense"
            ]
        ]
    ]
]

It takes the matrix p, which is the prediction probability for class 1 of our model, and compares it to the correct labels. The final output will give the error per sample of our network on the test/validation data.
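
As a rough illustration of such a per-sample error, the sketch below averages the squared difference between labels and predicted probabilities and prints it in the `value * sample count` style of the log. The predictions are made up, and the exact scaling CNTK applies inside SquareError is an assumption here.

```python
import numpy as np

def square_error_per_sample(labels, p):
    """Return the average squared difference between labels and predictions, and the sample count."""
    labels = np.asarray(labels, dtype=float).ravel()
    p = np.asarray(p, dtype=float).ravel()
    return float(np.mean((labels - p) ** 2)), labels.size

# hypothetical predictions for four test samples
avg_err, n = square_error_per_sample([0, 0, 1, 1], [0.1, 0.0, 0.9, 1.0])
print("err = %.8f * %d" % (avg_err, n))
```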

DumpNodeInfo command
Outputs all model parameters in the network, which is useful for debugging and for further processing of what your network learned.
```
DumpNodeInfo = [
    action = "dumpNode"
    printValues = true
]
```

This writes a file called “LR.dnn.__AllNodes__.txt” into the Models directory, with contents like this:

b=LearnableParameter [1,1]   learningRateMultiplier=1.000000  NeedsGradient=true
 -12.3975668
w=LearnableParameter [1,2]   learningRateMultiplier=1.000000  NeedsGradient=true
 2.40208364 2.66412568
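
As a sanity check, the dumped parameters can be applied by hand to the three Test-file rows quoted in the data-generation script. This is a sketch in NumPy; it assumes the dump and those sample rows come from the same data set, which the post does not state explicitly.

```python
import numpy as np

# parameters copied from the DumpNodeInfo output
b = np.array([[-12.3975668]])                  # [1 x 1]
w = np.array([[2.40208364, 2.66412568]])       # [1 x 2]

# the three Test-file rows quoted in the data-generation script
features = np.array([[3.854499, 4.163941],
                     [1.058121, 1.204858],
                     [1.870621, 1.284107]]).T  # one sample per column, [2 x 3]
labels = np.array([1, 0, 0])

p = 1.0 / (1.0 + np.exp(-(w @ features + b)))  # probability of class 1, [1 x 3]
predictions = (p > 0.5).astype(int).ravel()    # threshold at 0.5
```

With these numbers, the thresholded predictions agree with the labels of all three rows.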

Learning algorithm
Stochastic gradient descent: SGD iteratively looks at a fixed-size subset of the training examples (called a minibatch) and, after every such step, updates the parameters in the direction of the negative gradient of the cost function, so that if it were to see the same data again, the network output would be a little closer to the desired result.

SGD = [ 
    epochSize = 0 # number of examples examined per epoch; 0 means the whole training set is examined in every epoch (an epoch can also be thought of as one iteration)
    minibatchSize = 25 # within each epoch, samples are examined in minibatches (here 25 at a time) and their gradients are computed together
    learningRatesPerSample = 0.04 # a schedule is also possible, e.g. learningRatesPerSample = 0.05:0.02*5:0.01
    maxEpochs = 50
]
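
The loop that these settings control can be sketched in plain NumPy. This is not CNTK's implementation: `sgd_logistic` is a hypothetical function, the zero initialization and per-epoch shuffling are choices made here, and the per-sample learning rate is interpreted as scaling the gradient summed over the minibatch, which is an assumption.

```python
import numpy as np

def sgd_logistic(X, Y, minibatch_size=25, lr_per_sample=0.04, max_epochs=50, seed=0):
    """Minibatch SGD for logistic regression; X is [N x d], Y is [N] with 0/1 entries."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for epoch in range(max_epochs):
        order = rng.permutation(n)                 # like epochSize = 0: one pass over all data
        for start in range(0, n, minibatch_size):
            idx = order[start:start + minibatch_size]
            p = 1.0 / (1.0 + np.exp(-(X[idx] @ w + b)))
            residual = p - Y[idx]                  # gradient of the loss w.r.t. w.x + b, per sample
            # per-sample rate: gradients are summed over the minibatch, not averaged
            w -= lr_per_sample * (residual @ X[idx])
            b -= lr_per_sample * residual.sum()
    return w, b
```

Calling `sgd_logistic(X, Y)` with the defaults mirrors the values in the SGD block above; the arrays could come from the generated training file, with the label column flattened to shape `[N]`.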

Reading and writing data

reader = [
    readerType = "CNTKTextFormatReader"
    file = "Train_cntk_text.txt"
    input = [
        features = [
            dim = 2
            format = "dense"
        ]
        labels = [
            dim = 1
            format = "dense"
        ]
    ]
]

Run: cntk configFile=lr_bs.cntk

Result

Final Results: Minibatch[1-1]: err = 0.00688694 * 500
COMPLETED!

Program output

-------------------------------------------------------------------

Build info: 

        Built time: Jun  6 2016 20:06:59
        Last modified date: Sun Jun  5 04:28:32 2016
        Build type: release
        Build target: CPU-only
        With 1bit-SGD: no
        Math lib: acml
        Build Branch: HEAD
        Build SHA1: b7ed8dc9e5cd8ab35f4badae86dd42e93e9f2564
        Built by philly on 3177047f4067

        Build Path: /home/philly/jenkins/workspace/CNTK-Build-Linux
-------------------------------------------------------------------


Running on localhost at 2016/06/13 14:04:07
Command line: 
cntk  configFile=lr_bs.cntk



>>>>>>>>>>>>>>>>>>>> RAW CONFIG (VARIABLES NOT RESOLVED) >>>>>>>>>>>>>>>>>>>>
command=Train:Output:DumpNodeInfo:Test
modelPath = "Models/LR_reg.dnn"  
deviceId = -1                    
dimension = 2                    
Train = [             
action = "train"  
    BrainScriptNetworkBuilder = [
        SDim = $dimension$
        LDim = 1
        features = Input (SDim)
        labels   = Input (LDim)
        b = Parameter (LDim, 1)     
        w = Parameter (LDim, SDim)  
        p = Sigmoid (w * features + b)    
        lr = Logistic (labels, p)
        err = SquareError (labels, p)
        featureNodes    = (features)
        labelNodes      = (labels)
        criterionNodes  = (lr)
        evaluationNodes = (err)
        outputNodes     = (p)
    ]   
    SGD = [
        epochSize = 0                  
        minibatchSize = 25
        learningRatesPerSample = 0.04  
        maxEpochs = 50
    ]
    reader = [
        readerType = "CNTKTextFormatReader"
        file = "Train_cntk_text.txt"
        input = [
            features = [
                dim = $dimension$
                format = "dense"
            ]
            labels = [
                dim = 1
                format = "dense"
            ]
        ]
    ]
]
Test = [
    action = "test"
    reader = [
        readerType = "CNTKTextFormatReader"
        file = "Test_cntk_text.txt"
        input = [
            features = [
                dim = $dimension$
                format = "dense"
            ]
            labels = [
                dim = 1
                format = "dense"
            ]
        ]
    ]
]
Output = [
    action = "write"
    reader = [
        readerType = "CNTKTextFormatReader"
        file = "Test_cntk_text.txt"
        input = [
            features = [
                dim = $dimension$  
                format = "dense"
            ]
            labels = [
                dim = 1            
                format = "dense"
            ]
        ]
    ]
    outputPath = "LR.txt"  
]
DumpNodeInfo = [
    action = "dumpNode"
    printValues = true
]       
>>>>>>>>>>>>>>>>>>>> PROCESSED CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
configparameters: lr_bs.cntk:command=Train:Output:DumpNodeInfo:Test
configparameters: lr_bs.cntk:deviceId=-1
configparameters: lr_bs.cntk:dimension=2
configparameters: lr_bs.cntk:DumpNodeInfo=[
    action = "dumpNode"
    printValues = true
]

configparameters: lr_bs.cntk:modelPath=Models/LR_reg.dnn
configparameters: lr_bs.cntk:Output=[
    action = "write"
    reader = [
        readerType = "CNTKTextFormatReader"
        file = "Test_cntk_text.txt"
        input = [
            features = [
            dim = 2  
                format = "dense"
            ]
            labels = [
            dim = 1            
                format = "dense"
            ]
        ]
    ]
    outputPath = "LR.txt"  
]

configparameters: lr_bs.cntk:Test=[
    action = "test"
    reader = [
        readerType = "CNTKTextFormatReader"
        file = "Test_cntk_text.txt"
        input = [
            features = [
                dim = 2
                format = "dense"
            ]
            labels = [
                dim = 1
                format = "dense"
            ]
        ]
    ]
]

configparameters: lr_bs.cntk:Train=[             
action = "train"  
    BrainScriptNetworkBuilder = [
        SDim = 2
        LDim = 1
        features = Input (SDim)
        labels   = Input (LDim)
        b = Parameter (LDim, 1)     
        w = Parameter (LDim, SDim)  
        p = Sigmoid (w * features + b)    
        lr = Logistic (labels, p)
        err = SquareError (labels, p)
        featureNodes    = (features)
        labelNodes      = (labels)
        criterionNodes  = (lr)
        evaluationNodes = (err)
        outputNodes     = (p)
    ]   
    SGD = [
        epochSize = 0                  
        minibatchSize = 25
        learningRatesPerSample = 0.04  
        maxEpochs = 50
    ]
    reader = [
        readerType = "CNTKTextFormatReader"
        file = "Train_cntk_text.txt"
        input = [
            features = [
                dim = 2
                format = "dense"
            ]
            labels = [
                dim = 1
                format = "dense"
            ]
        ]
    ]
]

<<<<<<<<<<<<<<<<<<<< PROCESSED CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
Commands: Train Output DumpNodeInfo Test
Precision = "float"
CNTKModelPath: Models/LR_reg.dnn
CNTKCommandTrainInfo: Train : 50
CNTKCommandTrainInfo: CNTKNoMoreCommands_Total : 50


##############################################################################
#                                                                            #
# Action "train"                                                             #
#                                                                            #
##############################################################################


CNTKCommandTrainBegin: Train
Final model exists: Models/LR_reg.dnn
No further training is necessary.
CNTKCommandTrainEnd: Train

Action "train" complete.



##############################################################################
#                                                                            #
# Action "write"                                                             #
#                                                                            #
##############################################################################



Post-processing network...

3 roots:
    err = SquareError()
    lr = Logistic()
    p = Sigmoid()

Validating network. 9 nodes to process in pass 1.

Validating --> labels = InputValue() :  -> [1 x *]
Validating --> w = LearnableParameter() :  -> [1 x 2]
Validating --> features = InputValue() :  -> [2 x *]
Validating --> p.z.PlusArgs[0] = Times (w, features) : [1 x 2], [2 x *] -> [1 x *]
Validating --> b = LearnableParameter() :  -> [1 x 1]
Validating --> p.z = Plus (p.z.PlusArgs[0], b) : [1 x *], [1 x 1] -> [1 x 1 x *]
Validating --> p = Sigmoid (p.z) : [1 x 1 x *] -> [1 x 1 x *]
Validating --> err = SquareError (labels, p) : [1 x *], [1 x 1 x *] -> [1]
Validating --> lr = Logistic (labels, p) : [1 x *], [1 x 1 x *] -> [1]

Validating network. 5 nodes to process in pass 2.


Validating network, final pass.



4 out of 9 nodes do not share the minibatch layout with the input data.

Post-processing network complete.



Allocating matrices for forward and/or backward propagation.

Memory Sharing Structure:

(nil): {[b Gradient[1 x 1]] [err Gradient[1]] [err Value[1]] [features Gradient[2 x *]] [labels Gradient[1 x *]] [lr Gradient[1]] [lr Value[1]] [p Gradient[1 x 1 x *]] [p.z Gradient[1 x 1 x *]] [p.z.PlusArgs[0] Gradient[1 x *]] [w Gradient[1 x 2]] }
0x1d52248: {[b Value[1 x 1]] }
0x1d529c8: {[features Value[2 x *]] }
0x1d52e48: {[labels Value[1 x *]] }
0x1d53ee8: {[w Value[1 x 2]] }
0x1dc03e8: {[p Value[1 x 1 x *]] }
0x1dc0548: {[p.z Value[1 x 1 x *]] }
0x1dc0e68: {[p.z.PlusArgs[0] Value[1 x *]] }

Minibatch[0]: ActualMBSize = 500
Written to LR.txt*
Total Samples Evaluated = 500

Action "write" complete.



##############################################################################
#                                                                            #
# Action "dumpNode"                                                          #
#                                                                            #
##############################################################################



Post-processing network...

3 roots:
    err = SquareError()
    lr = Logistic()
    p = Sigmoid()

Validating network. 9 nodes to process in pass 1.

Validating --> labels = InputValue() :  -> [1 x *1]
Validating --> w = LearnableParameter() :  -> [1 x 2]
Validating --> features = InputValue() :  -> [2 x *1]
Validating --> p.z.PlusArgs[0] = Times (w, features) : [1 x 2], [2 x *1] -> [1 x *1]
Validating --> b = LearnableParameter() :  -> [1 x 1]
Validating --> p.z = Plus (p.z.PlusArgs[0], b) : [1 x *1], [1 x 1] -> [1 x 1 x *1]
Validating --> p = Sigmoid (p.z) : [1 x 1 x *1] -> [1 x 1 x *1]
Validating --> err = SquareError (labels, p) : [1 x *1], [1 x 1 x *1] -> [1]
Validating --> lr = Logistic (labels, p) : [1 x *1], [1 x 1 x *1] -> [1]

Validating network. 5 nodes to process in pass 2.


Validating network, final pass.



4 out of 9 nodes do not share the minibatch layout with the input data.

Post-processing network complete.

Warning: node name '__AllNodes__' does not exist in the network. dumping all nodes instead.

Action "dumpNode" complete.



##############################################################################
#                                                                            #
# Action "test"                                                              #
#                                                                            #
##############################################################################



Post-processing network...

3 roots:
    err = SquareError()
    lr = Logistic()
    p = Sigmoid()

Validating network. 9 nodes to process in pass 1.

Validating --> labels = InputValue() :  -> [1 x *2]
Validating --> w = LearnableParameter() :  -> [1 x 2]
Validating --> features = InputValue() :  -> [2 x *2]
Validating --> p.z.PlusArgs[0] = Times (w, features) : [1 x 2], [2 x *2] -> [1 x *2]
Validating --> b = LearnableParameter() :  -> [1 x 1]
Validating --> p.z = Plus (p.z.PlusArgs[0], b) : [1 x *2], [1 x 1] -> [1 x 1 x *2]
Validating --> p = Sigmoid (p.z) : [1 x 1 x *2] -> [1 x 1 x *2]
Validating --> err = SquareError (labels, p) : [1 x *2], [1 x 1 x *2] -> [1]
Validating --> lr = Logistic (labels, p) : [1 x *2], [1 x 1 x *2] -> [1]

Validating network. 5 nodes to process in pass 2.


Validating network, final pass.



4 out of 9 nodes do not share the minibatch layout with the input data.

Post-processing network complete.

evalNodeNames are not specified, using all the default evalnodes and training criterion nodes.


Allocating matrices for forward and/or backward propagation.

Memory Sharing Structure:

(nil): {[b Gradient[1 x 1]] [err Gradient[1]] [features Gradient[2 x *2]] [labels Gradient[1 x *2]] [lr Gradient[1]] [p Gradient[1 x 1 x *2]] [p.z Gradient[1 x 1 x *2]] [p.z.PlusArgs[0] Gradient[1 x *2]] [w Gradient[1 x 2]] }
0x1d92be8: {[err Value[1]] }
0x1d92da8: {[lr Value[1]] }
0x1d93108: {[p.z Value[1 x 1 x *2]] }
0x1d93e18: {[p.z.PlusArgs[0] Value[1 x *2]] }
0x1d93eb8: {[p Value[1 x 1 x *2]] }
0x1e19598: {[b Value[1 x 1]] }
0x1e19cd8: {[features Value[2 x *2]] }
0x1e1a1a8: {[labels Value[1 x *2]] }
0x1e1b248: {[w Value[1 x 2]] }

BlockRandomizer::StartEpoch: epoch 0: frames [0..500] (first sequence at sample 0), data subset 0 of 1
Final Results: Minibatch[1-1]: err = 0.00718580 * 500; lr = 0.03153573 * 500

Action "test" complete.

__COMPLETED__