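- Reference: "Deep Learning with CNTK (1): Getting Started" (用 CNTK 搞深度学习 (一) 入门)
Building the LR model:
- The LR problem in brief
- The model can be written as p(y = 1 | x; w, b) = sigma(w·x + b) = 1 / (1 + exp(-(w·x + b))). We then predict y = 1 if p(y = 1 | x; w, b) > 0.5 and y = 0 otherwise.
- Learning the parameters w and b
- Minimize the cross-entropy loss function
- Approach the optimum with gradient descent
- The stochastic gradient descent (SGD) algorithm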
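The steps in the list above can be sketched in plain NumPy. This is only an illustration of logistic regression trained with minibatch SGD on a cross-entropy loss, not what CNTK does internally; the function names (`sigmoid`, `cross_entropy`, `train_lr`), the toy data, and the learning rate of 0.1 (applied to the minibatch-averaged gradient, which is not the same thing as CNTK's per-sample learning rate) are all assumptions made for this example.

```python
import numpy as np


def sigmoid(z):
    # sigma(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))


def cross_entropy(y, p, eps=1e-12):
    # mean binary cross-entropy; clipping avoids log(0)
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


def train_lr(X, y, lr=0.1, epochs=50, batch_size=25, seed=0):
    # hypothetical helper: minibatch SGD on the cross-entropy loss
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the samples every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            p = sigmoid(X[idx] @ w + b)
            grad_z = p - y[idx]  # d(loss)/d(w.x + b) for sigmoid + cross-entropy
            w -= lr * (X[idx].T @ grad_z) / len(idx)
            b -= lr * grad_z.mean()
    return w, b


# toy data: two Gaussian blobs, the second shifted by 3.0 in every dimension
rng = np.random.default_rng(10)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(3.0, 1.0, (200, 2))])
y = np.concatenate([np.zeros(200), np.ones(200)])

w, b = train_lr(X, y)
p = sigmoid(X @ w + b)
accuracy = ((p > 0.5).astype(float) == y).mean()
loss = cross_entropy(y, p)
print("accuracy:", accuracy, "loss:", loss)
```

The update uses the fact that, for a sigmoid output with cross-entropy loss, the gradient with respect to w·x + b is simply p - y.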
Creating the data
    # -*- coding: utf-8 -*-
    import numpy as np
    from sklearn.utils import shuffle

    # number of dimensions
    Dim = 2

    # number of samples
    N_train = 1000
    N_test = 500


    def generate(N, mean, cov, diff):
        # class 0 is drawn around `mean`; class ci+1 is drawn around `mean + diff[ci]`
        samples_per_class = int(N / 2)

        X0 = np.random.multivariate_normal(mean, cov, samples_per_class)
        Y0 = np.zeros(samples_per_class)

        for ci, d in enumerate(diff):
            X1 = np.random.multivariate_normal(mean + d, cov, samples_per_class)
            Y1 = (ci + 1) * np.ones(samples_per_class)

            X0 = np.concatenate((X0, X1))
            Y0 = np.concatenate((Y0, Y1))

        X, Y = shuffle(X0, Y0)
        return X, Y


    def create_data_files(num_classes, diff, train_filename, test_filename, regression):
        print("Outputting %s and %s" % (train_filename, test_filename))
        mean = np.random.randn(num_classes)
        cov = np.eye(num_classes)

        for filename, N in [(train_filename, N_train), (test_filename, N_test)]:
            X, Y = generate(N, mean, cov, diff)

            # output in CNTK Text format
            with open(filename, "w") as dataset:
                num_labels = int(1 + np.amax(Y))
                for i in range(N):
                    dataset.write("|features ")
                    for d in range(Dim):
                        dataset.write("%f " % X[i, d])
                    if regression:
                        # a single numeric label per line
                        dataset.write("|labels %f\n" % Y[i])
                    else:
                        # one-hot encoded label
                        labels = ['0'] * num_labels
                        labels[int(Y[i])] = '1'
                        dataset.write("|labels %s\n" % " ".join(labels))


    def main():
        # fix the random seed so that the same data is generated on every run
        np.random.seed(10)

        # generate the Train and Test files for the 0/1 (binary) classifier
        create_data_files(Dim, [3.0], "Train_cntk_text.txt", "Test_cntk_text.txt", True)
        # sample lines from the Train file:
        # |features 1.709340 2.329687 |labels 0.000000
        # |features 0.959171 -0.252047 |labels 0.000000
        # sample lines from the Test file:
        # |features 3.854499 4.163941 |labels 1.000000
        # |features 1.058121 1.204858 |labels 0.000000
        # |features 1.870621 1.284107 |labels 0.000000

        # generate the Train and Test files for the 3-class classifier
        create_data_files(Dim, [[3.0], [3.0, 0.0]], "Train-3Classes_cntk_text.txt", "Test-3Classes_cntk_text.txt", False)


    if __name__ == '__main__':
        main()
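To check the generated files from Python, the dense lines written above can be read back with a few lines of NumPy. The sketch below is a hypothetical helper (`parse_ctf_line` and `load_ctf` are names invented here, not part of CNTK), and it only handles the simple dense "|name v1 v2 ..." layout produced by this script, not the full CNTK text format.

```python
import numpy as np


def parse_ctf_line(line):
    """Parse one dense line such as '|features 1.7 2.3 |labels 0.0'."""
    fields = {}
    for chunk in line.strip().split("|"):
        tokens = chunk.split()
        if not tokens:
            continue  # skip the empty piece before the first '|'
        # first token is the stream name, the rest are its values
        fields[tokens[0]] = np.array([float(t) for t in tokens[1:]])
    return fields


def load_ctf(lines):
    """Stack the 'features' and 'labels' streams of all non-empty lines."""
    rows = [parse_ctf_line(line) for line in lines if line.strip()]
    X = np.stack([row["features"] for row in rows])
    Y = np.stack([row["labels"] for row in rows])
    return X, Y


# the two sample Train lines quoted in the script's comments
sample = [
    "|features 1.709340 2.329687 |labels 0.000000",
    "|features 0.959171 -0.252047 |labels 0.000000",
]
X, Y = load_ctf(sample)
print(X.shape, Y.shape)
```

For a real file, `load_ctf(open("Train_cntk_text.txt"))` would work the same way, since the function only iterates over lines.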
Network description
    BrainScriptNetworkBuilder = [

        # sample and label dimensions
        SDim = $dimension$    # sample dimension
        LDim = 1              # label dimension

        features = Input (SDim)
        labels   = Input (LDim)

        # parameters to learn
        b = Parameter (LDim, 1)       # bias, [1 x 1] matrix
        w = Parameter (LDim, SDim)    # weights, [1 x 2] matrix

        # operations
        p = Sigmoid (w * features + b)    # logistic function

        lr = Logistic (labels, p)         # loss function
        # equivalent to:
        # lr = -(TransposeTimes (labels, Log (p)) + TransposeTimes (Constant(1) - labels, Log (Constant(1) - p)))

        err = SquareError (labels, p)     # model evaluation metric

        # root nodes
        featureNodes    = (features)
        labelNodes      = (labels)
        criterionNodes  = (lr)     # nodes that define the objective CNTK will try to optimize
        evaluationNodes = (err)    # nodes used for evaluation while the model is trained
        outputNodes     = (p)      # nodes whose values will be output
    ]
Defining the commands in the configuration
- Execution flow
command=Train:Output:DumpNodeInfo:Test
The Train command
    # training config
    Train = [    # command=Train --> CNTK will look for a parameter named Train
        action = "train"    # execute CNTK's 'train' routine

        # network description
        BrainScriptNetworkBuilder = [
            ...
        ]

        # configuration parameters of the SGD procedure
        SGD = [
            ...
        ]

        # configuration of data reading
        reader = [
            ...
        ]
    ]
The Output command
    # output the results
    Output = [
        action = "write"
        reader = [
            readerType = "CNTKTextFormatReader"
            file = "Test_cntk_text.txt"
            input = [
                features = [
                    dim = $dimension$    # $...$ means variable substitution
                    format = "dense"
                ]
                labels = [
                    dim = 1
                    format = "dense"
                ]
            ]
        ]
        outputPath = "LR.txt"    # dump the output to this text file
    ]
The Test command
Unlike Output, which simply writes the prediction probability for each test sample to a file, Test computes the specified error metric on the test set. In our example that is the SquareError() function specified in the configuration.
    Test = [
        action = "test"
        reader = [
            readerType = "CNTKTextFormatReader"
            file = "Test_cntk_text.txt"
            input = [
                features = [
                    dim = $dimension$
                    format = "dense"
                ]
                labels = [
                    dim = 1
                    format = "dense"
                ]
            ]
        ]
    ]
It takes the matrix p, which is the prediction probability for class 1 of our model, and compares it to the correct labels. The final output will give the error per sample of our network on the test/validation data.
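As a rough illustration of that metric, the following NumPy sketch computes a per-sample squared error between labels and predicted probabilities. It assumes the reported value is the sum of squared differences divided by the number of samples, which is how the "err = ... * 500" line in the results reads; the function name and the three example values are made up for this sketch.

```python
import numpy as np


def square_error_per_sample(labels, p):
    # sum of squared differences, averaged over the number of samples
    labels = np.asarray(labels, dtype=float).ravel()
    p = np.asarray(p, dtype=float).ravel()
    return np.sum((labels - p) ** 2) / labels.size


# illustrative values only: three samples with true labels 1, 0, 0
example_labels = [1.0, 0.0, 0.0]
example_probs = [0.9, 0.2, 0.1]
err = square_error_per_sample(example_labels, example_probs)
print(err)  # (0.01 + 0.04 + 0.01) / 3
```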
- The DumpNodeInfo command
DumpNodeInfo outputs all model parameters in the network, which is useful for debugging and for further processing of what the network has learned.
    DumpNodeInfo = [
        action = "dumpNode"
        printValues = true
    ]
It writes a file called “LR.dnn.__AllNodes__.txt” into the Models directory, with contents like this:
    b=LearnableParameter [1,1]   learningRateMultiplier=1.000000  NeedsGradient=true
     -12.3975668
    w=LearnableParameter [1,2]   learningRateMultiplier=1.000000  NeedsGradient=true
     2.40208364 2.66412568
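One simple form of further processing is to plug the dumped values back into the model formula. The sketch below does this in NumPy for the three Test samples quoted in the data-generation script, assuming the dumped parameters belong to a model trained on that same data; the variable names are chosen here for illustration.

```python
import numpy as np

# parameters copied from the dump above
b = np.array([[-12.3975668]])             # [1 x 1]
w = np.array([[2.40208364, 2.66412568]])  # [1 x 2]

# the three Test samples quoted in the data script, one sample per column ([2 x N])
features = np.array([
    [3.854499, 4.163941],
    [1.058121, 1.204858],
    [1.870621, 1.284107],
]).T
labels = np.array([1.0, 0.0, 0.0])

# p = Sigmoid (w * features + b), as in the network description
p = 1.0 / (1.0 + np.exp(-(w @ features + b)))  # [1 x N]
pred = (p > 0.5).astype(float).ravel()         # predict 1 when p > 0.5
print(p.ravel(), pred)
```

With these numbers the first sample lands well above 0.5 and the other two well below, which matches their labels.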
Learning algorithm
Stochastic gradient descent (SGD): SGD iteratively looks at a fixed-size subset of the training examples (called a minibatch) and, after each such step, updates the parameters along the negative gradient of the cost function, so that if it were to see the same data again, the network output would be a little closer to the desired result.
    SGD = [
        # how many examples are examined per epoch; 0 means that all of the
        # training data is examined in every epoch (an epoch can also be thought of as an iteration)
        epochSize = 0

        # within each epoch, minibatches of samples (25 in this case) are
        # examined together and their gradients are computed
        minibatchSize = 25

        # a schedule can be given instead, e.g. learningRatesPerSample = 0.05:0.02*5:0.01
        learningRatesPerSample = 0.04
        maxEpochs = 50
    ]
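The schedule string in the comment above can be unpacked with a small helper. This sketch reflects my reading of the syntax (values separated by ":", "value*count" repeating a value for that many epochs, and the last value applying to all remaining epochs); it is not CNTK's own parser, and `expand_schedule` is a hypothetical name.

```python
def expand_schedule(spec, max_epochs):
    """Expand a schedule such as '0.05:0.02*5:0.01' into one rate per epoch."""
    rates = []
    for item in spec.split(":"):
        if "*" in item:
            value, count = item.split("*")
            rates.extend([float(value)] * int(count))
        else:
            rates.append(float(item))
    # assumption: the last value is reused for every remaining epoch
    while len(rates) < max_epochs:
        rates.append(rates[-1])
    return rates[:max_epochs]


schedule = expand_schedule("0.05:0.02*5:0.01", 10)
constant = expand_schedule("0.04", 3)

# with N_train = 1000 and minibatchSize = 25, one epoch is 40 parameter updates
updates_per_epoch = 1000 // 25
print(schedule, constant, updates_per_epoch)
```

Under that reading, the constant setting used in this configuration (0.04) simply applies the same per-sample rate in all 50 epochs.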
Reading and writing data
    reader = [
        readerType = "CNTKTextFormatReader"
        file = "Train_cntk_text.txt"
        input = [
            features = [
                dim = 2
                format = "dense"
            ]
            labels = [
                dim = 1
                format = "dense"
            ]
        ]
    ]
Run: cntk configFile=lr_bs.cntk
Results
    Final Results: Minibatch[1-1]: err = 0.00688694 * 500
    COMPLETED!
Program output
    -------------------------------------------------------------------
    Build info:

            Built time: Jun 6 2016 20:06:59
            Last modified date: Sun Jun 5 04:28:32 2016
            Build type: release
            Build target: CPU-only
            With 1bit-SGD: no
            Math lib: acml
            Build Branch: HEAD
            Build SHA1: b7ed8dc9e5cd8ab35f4badae86dd42e93e9f2564
            Built by philly on 3177047f4067
            Build Path: /home/philly/jenkins/workspace/CNTK-Build-Linux
    -------------------------------------------------------------------

    Running on localhost at 2016/06/13 14:04:07
    Command line:
    cntk configFile=lr_bs.cntk

    >>>>>>>>>>>>>>>>>>>> RAW CONFIG (VARIABLES NOT RESOLVED) >>>>>>>>>>>>>>>>>>>>
    command=Train:Output:DumpNodeInfo:Test
    modelPath = "Models/LR_reg.dnn"
    deviceId = -1
    dimension = 2
    Train = [
        action = "train"
        BrainScriptNetworkBuilder = [
            SDim = $dimension$
            LDim = 1
            features = Input (SDim)
            labels = Input (LDim)
            b = Parameter (LDim, 1)
            w = Parameter (LDim, SDim)
            p = Sigmoid (w * features + b)
            lr = Logistic (labels, p)
            err = SquareError (labels, p)
            featureNodes = (features)
            labelNodes = (labels)
            criterionNodes = (lr)
            evaluationNodes = (err)
            outputNodes = (p)
        ]
        SGD = [
            epochSize = 0
            minibatchSize = 25
            learningRatesPerSample = 0.04
            maxEpochs = 50
        ]
        reader = [
            readerType = "CNTKTextFormatReader"
            file = "Train_cntk_text.txt"
            input = [
                features = [
                    dim = $dimension$
                    format = "dense"
                ]
                labels = [
                    dim = 1
                    format = "dense"
                ]
            ]
        ]
    ]
    Test = [
        action = "test"
        reader = [
            readerType = "CNTKTextFormatReader"
            file = "Test_cntk_text.txt"
            input = [
                features = [
                    dim = $dimension$
                    format = "dense"
                ]
                labels = [
                    dim = 1
                    format = "dense"
                ]
            ]
        ]
    ]
    Output = [
        action = "write"
        reader = [
            readerType = "CNTKTextFormatReader"
            file = "Test_cntk_text.txt"
            input = [
                features = [
                    dim = $dimension$
                    format = "dense"
                ]
                labels = [
                    dim = 1
                    format = "dense"
                ]
            ]
        ]
        outputPath = "LR.txt"
    ]
    DumpNodeInfo = [
        action = "dumpNode"
        printValues = true
    ]

    >>>>>>>>>>>>>>>>>>>> PROCESSED CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
    configparameters: lr_bs.cntk:command=Train:Output:DumpNodeInfo:Test
    configparameters: lr_bs.cntk:deviceId=-1
    configparameters: lr_bs.cntk:dimension=2
    configparameters: lr_bs.cntk:DumpNodeInfo=[
        action = "dumpNode"
        printValues = true
    ]
    configparameters: lr_bs.cntk:modelPath=Models/LR_reg.dnn
    configparameters: lr_bs.cntk:Output=[
        action = "write"
        reader = [
            readerType = "CNTKTextFormatReader"
            file = "Test_cntk_text.txt"
            input = [
                features = [
                    dim = 2
                    format = "dense"
                ]
                labels = [
                    dim = 1
                    format = "dense"
                ]
            ]
        ]
        outputPath = "LR.txt"
    ]
    configparameters: lr_bs.cntk:Test=[
        action = "test"
        reader = [
            readerType = "CNTKTextFormatReader"
            file = "Test_cntk_text.txt"
            input = [
                features = [
                    dim = 2
                    format = "dense"
                ]
                labels = [
                    dim = 1
                    format = "dense"
                ]
            ]
        ]
    ]
    configparameters: lr_bs.cntk:Train=[
        action = "train"
        BrainScriptNetworkBuilder = [
            SDim = 2
            LDim = 1
            features = Input (SDim)
            labels = Input (LDim)
            b = Parameter (LDim, 1)
            w = Parameter (LDim, SDim)
            p = Sigmoid (w * features + b)
            lr = Logistic (labels, p)
            err = SquareError (labels, p)
            featureNodes = (features)
            labelNodes = (labels)
            criterionNodes = (lr)
            evaluationNodes = (err)
            outputNodes = (p)
        ]
        SGD = [
            epochSize = 0
            minibatchSize = 25
            learningRatesPerSample = 0.04
            maxEpochs = 50
        ]
        reader = [
            readerType = "CNTKTextFormatReader"
            file = "Train_cntk_text.txt"
            input = [
                features = [
                    dim = 2
                    format = "dense"
                ]
                labels = [
                    dim = 1
                    format = "dense"
                ]
            ]
        ]
    ]
    <<<<<<<<<<<<<<<<<<<< PROCESSED CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
    Commands: Train Output DumpNodeInfo Test
    Precision = "float"
    CNTKModelPath: Models/LR_reg.dnn
    CNTKCommandTrainInfo: Train : 50
    CNTKCommandTrainInfo: CNTKNoMoreCommands_Total : 50

    ##############################################################################
    #                                                                            #
    # Action "train"                                                             #
    #                                                                            #
    ##############################################################################

    CNTKCommandTrainBegin: Train
    Final model exists: Models/LR_reg.dnn
    No further training is necessary.
    CNTKCommandTrainEnd: Train

    Action "train" complete.

    ##############################################################################
    #                                                                            #
    # Action "write"                                                             #
    #                                                                            #
    ##############################################################################

    Post-processing network...

    3 roots:
        err = SquareError()
        lr = Logistic()
        p = Sigmoid()

    Validating network. 9 nodes to process in pass 1.

    Validating --> labels = InputValue() : -> [1 x *]
    Validating --> w = LearnableParameter() : -> [1 x 2]
    Validating --> features = InputValue() : -> [2 x *]
    Validating --> p.z.PlusArgs[0] = Times (w, features) : [1 x 2], [2 x *] -> [1 x *]
    Validating --> b = LearnableParameter() : -> [1 x 1]
    Validating --> p.z = Plus (p.z.PlusArgs[0], b) : [1 x *], [1 x 1] -> [1 x 1 x *]
    Validating --> p = Sigmoid (p.z) : [1 x 1 x *] -> [1 x 1 x *]
    Validating --> err = SquareError (labels, p) : [1 x *], [1 x 1 x *] -> [1]
    Validating --> lr = Logistic (labels, p) : [1 x *], [1 x 1 x *] -> [1]

    Validating network. 5 nodes to process in pass 2.

    Validating network, final pass.

    4 out of 9 nodes do not share the minibatch layout with the input data.

    Post-processing network complete.

    Allocating matrices for forward and/or backward propagation.

    Memory Sharing Structure:

    (nil): {[b Gradient[1 x 1]] [err Gradient[1]] [err Value[1]] [features Gradient[2 x *]] [labels Gradient[1 x *]] [lr Gradient[1]] [lr Value[1]] [p Gradient[1 x 1 x *]] [p.z Gradient[1 x 1 x *]] [p.z.PlusArgs[0] Gradient[1 x *]] [w Gradient[1 x 2]] }
    0x1d52248: {[b Value[1 x 1]] }
    0x1d529c8: {[features Value[2 x *]] }
    0x1d52e48: {[labels Value[1 x *]] }
    0x1d53ee8: {[w Value[1 x 2]] }
    0x1dc03e8: {[p Value[1 x 1 x *]] }
    0x1dc0548: {[p.z Value[1 x 1 x *]] }
    0x1dc0e68: {[p.z.PlusArgs[0] Value[1 x *]] }

    Minibatch[0]: ActualMBSize = 500
    Written to LR.txt*
    Total Samples Evaluated = 500

    Action "write" complete.

    ##############################################################################
    #                                                                            #
    # Action "dumpNode"                                                          #
    #                                                                            #
    ##############################################################################

    Post-processing network...

    3 roots:
        err = SquareError()
        lr = Logistic()
        p = Sigmoid()

    Validating network. 9 nodes to process in pass 1.

    Validating --> labels = InputValue() : -> [1 x *1]
    Validating --> w = LearnableParameter() : -> [1 x 2]
    Validating --> features = InputValue() : -> [2 x *1]
    Validating --> p.z.PlusArgs[0] = Times (w, features) : [1 x 2], [2 x *1] -> [1 x *1]
    Validating --> b = LearnableParameter() : -> [1 x 1]
    Validating --> p.z = Plus (p.z.PlusArgs[0], b) : [1 x *1], [1 x 1] -> [1 x 1 x *1]
    Validating --> p = Sigmoid (p.z) : [1 x 1 x *1] -> [1 x 1 x *1]
    Validating --> err = SquareError (labels, p) : [1 x *1], [1 x 1 x *1] -> [1]
    Validating --> lr = Logistic (labels, p) : [1 x *1], [1 x 1 x *1] -> [1]

    Validating network. 5 nodes to process in pass 2.

    Validating network, final pass.

    4 out of 9 nodes do not share the minibatch layout with the input data.

    Post-processing network complete.

    Warning: node name '__AllNodes__' does not exist in the network. dumping all nodes instead.

    Action "dumpNode" complete.

    ##############################################################################
    #                                                                            #
    # Action "test"                                                              #
    #                                                                            #
    ##############################################################################

    Post-processing network...

    3 roots:
        err = SquareError()
        lr = Logistic()
        p = Sigmoid()

    Validating network. 9 nodes to process in pass 1.

    Validating --> labels = InputValue() : -> [1 x *2]
    Validating --> w = LearnableParameter() : -> [1 x 2]
    Validating --> features = InputValue() : -> [2 x *2]
    Validating --> p.z.PlusArgs[0] = Times (w, features) : [1 x 2], [2 x *2] -> [1 x *2]
    Validating --> b = LearnableParameter() : -> [1 x 1]
    Validating --> p.z = Plus (p.z.PlusArgs[0], b) : [1 x *2], [1 x 1] -> [1 x 1 x *2]
    Validating --> p = Sigmoid (p.z) : [1 x 1 x *2] -> [1 x 1 x *2]
    Validating --> err = SquareError (labels, p) : [1 x *2], [1 x 1 x *2] -> [1]
    Validating --> lr = Logistic (labels, p) : [1 x *2], [1 x 1 x *2] -> [1]

    Validating network. 5 nodes to process in pass 2.

    Validating network, final pass.

    4 out of 9 nodes do not share the minibatch layout with the input data.

    Post-processing network complete.

    evalNodeNames are not specified, using all the default evalnodes and training criterion nodes.

    Allocating matrices for forward and/or backward propagation.

    Memory Sharing Structure:

    (nil): {[b Gradient[1 x 1]] [err Gradient[1]] [features Gradient[2 x *2]] [labels Gradient[1 x *2]] [lr Gradient[1]] [p Gradient[1 x 1 x *2]] [p.z Gradient[1 x 1 x *2]] [p.z.PlusArgs[0] Gradient[1 x *2]] [w Gradient[1 x 2]] }
    0x1d92be8: {[err Value[1]] }
    0x1d92da8: {[lr Value[1]] }
    0x1d93108: {[p.z Value[1 x 1 x *2]] }
    0x1d93e18: {[p.z.PlusArgs[0] Value[1 x *2]] }
    0x1d93eb8: {[p Value[1 x 1 x *2]] }
    0x1e19598: {[b Value[1 x 1]] }
    0x1e19cd8: {[features Value[2 x *2]] }
    0x1e1a1a8: {[labels Value[1 x *2]] }
    0x1e1b248: {[w Value[1 x 2]] }

    BlockRandomizer::StartEpoch: epoch 0: frames [0..500] (first sequence at sample 0), data subset 0 of 1
    Final Results: Minibatch[1-1]: err = 0.00718580 * 500; lr = 0.03153573 * 500

    Action "test" complete.

    __COMPLETED__
Building a multi-class classifier
CNTK Study Notes (1)