Use opttheta to compute a(2), the hidden-layer activations, which serve as the feature representation of the labeled input data.
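Concretely, a(2) is the hidden-layer activation of the trained autoencoder. A minimal sketch, assuming W1 and b1 have already been unpacked from opttheta (exactly what feedForwardAutoencoder.m below does):

% a2 = f(W1*x + b1), where f is the sigmoid and x is one input column
a2 = 1 ./ (1 + exp(-(W1 * x + b1)));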
Train and test the softmax (multinomial logistic regression) classifier (with softmaxTrain.m, which we implemented previously) using the training-set features (trainFeatures) and labels (trainLabels). Then classify the test set: complete the code to make predictions on the test-set features (testFeatures).
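A minimal sketch of these two steps, mirroring the calls made in stlExercise.m below (softmaxTrain and softmaxPredict come from the earlier softmax exercise):

lambda = 1e-4;                        % softmax weight decay
options.maxIter = 100;
softmaxModel = softmaxTrain(hiddenSize, numLabels, lambda, ...
                            trainFeatures, trainLabels, options);
pred = softmaxPredict(softmaxModel, testFeatures);
accuracy = mean(pred(:) == testLabels(:));  % fraction classified correctly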
The difference between the two settings:

Self-taught learning: Suppose your goal is a computer vision task where you'd like to distinguish between images of cars and images of motorcycles; so, each labeled example in your training set is either an image of a car or an image of a motorcycle. Where can we get lots of unlabeled data? The easiest way would be to obtain some random collection of images, perhaps downloaded off the internet. We could then train the autoencoder on this large collection of images and obtain useful features from them. Because here the unlabeled data is drawn from a different distribution than the labeled data (i.e., perhaps some of our unlabeled images contain cars or motorcycles, but not every image downloaded is either a car or a motorcycle), we call this self-taught learning.

Semi-supervised learning: In contrast, if we happen to have lots of unlabeled images lying around that are all images of either a car or a motorcycle, but where the data is just missing its label (so you don't know which ones are cars and which ones are motorcycles), then we could use this form of unlabeled data to learn the features. This setting, where each unlabeled example is drawn from the same distribution as your labeled examples, is sometimes called the semi-supervised setting.
Exercise solutions (it is recommended to complete the exercise yourself before consulting these)
stlExercise.m
%% CS294A/CS294W Self-taught Learning Exercise
% Instructions
% ------------
%
% This file contains code that helps you get started on the
% self-taught learning exercise. You will need to complete the code in feedForwardAutoencoder.m
% You will also need to have implemented sparseAutoencoderCost.m and
% softmaxCost.m from previous exercises.
%
%% ======================================================================
% STEP 0: Here we provide the relevant parameter values that will
% allow your sparse autoencoder to get good filters; you do not need to
% change the parameters below.
inputSize = 28 * 28;
numLabels = 5;
hiddenSize = 200;
sparsityParam = 0.1; % desired average activation of the hidden units.
% (This was denoted by the Greek letter rho, which looks like a lower-case "p",
% in the lecture notes).
lambda = 3e-3; % weight decay parameter
beta = 3; % weight of sparsity penalty term
maxIter = 400;
%% ======================================================================
% STEP 1: Load data from the MNIST database
%
% This loads our training and test data from the MNIST database files.
% We have sorted the data for you so that you will not have to
% change it.
% Load MNIST database files
mnistData = loadMNISTImages('train-images-idx3-ubyte');
mnistLabels = loadMNISTLabels('train-labels-idx1-ubyte');
% Set Unlabeled Set (All Images)
% Simulate a Labeled and Unlabeled set
labeledSet = find(mnistLabels >= 0 & mnistLabels <= 4);
unlabeledSet = find(mnistLabels >= 5);
numTrain = round(numel(labeledSet)/2);
trainSet = labeledSet(1:numTrain);
testSet = labeledSet(numTrain+1:end);
unlabeledData = mnistData(:, unlabeledSet);
trainData = mnistData(:, trainSet);
trainLabels = mnistLabels(trainSet)' + 1; % Shift labels to the range 1-5
testData = mnistData(:, testSet);
testLabels = mnistLabels(testSet)' + 1; % Shift labels to the range 1-5
% Output Some Statistics
fprintf('# examples in unlabeled set: %d\n', size(unlabeledData, 2));
fprintf('# examples in supervised training set: %d\n\n', size(trainData, 2));
fprintf('# examples in supervised testing set: %d\n\n', size(testData, 2));
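% (For reference: with the standard 60,000-image MNIST training set, digits
% 5-9 give 29,404 unlabeled examples, and digits 0-4 give 30,596 labeled
% examples, split into 15,298 for training and 15,298 for testing.)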
%% ======================================================================
% STEP 2: Train the sparse autoencoder
% This trains the sparse autoencoder on the unlabeled training
% images.
% Randomly initialize the parameters
theta = initializeParameters(hiddenSize, inputSize);
%% ----------------- YOUR CODE HERE ----------------------
% Find opttheta by running the sparse autoencoder on the
% unlabeled data (unlabeledData)
opttheta = theta;
addpath minFunc/
options.Method = 'lbfgs'; % Here, we use L-BFGS to optimize our cost
% function. Generally, for minFunc to work, you
% need a function pointer with two outputs: the
% function value and the gradient. In our problem,
% sparseAutoencoderCost.m satisfies this.
options.maxIter = 400; % Maximum number of iterations of L-BFGS to run
options.display = 'on';
[opttheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...
inputSize, hiddenSize, ...
lambda, sparsityParam, ...
beta, unlabeledData), ...
theta, options);
%% -----------------------------------------------------
% Visualize weights
W1 = reshape(opttheta(1:hiddenSize * inputSize), hiddenSize, inputSize);
display_network(W1');
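% Each row of W1 holds the weights of one hidden unit; reshaped to 28x28,
% it shows the input pattern that most strongly activates that unit (the
% pen-stroke-like features the autoencoder has learned).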
%%======================================================================
%% STEP 3: Extract Features from the Supervised Dataset
%
% You need to complete the code in feedForwardAutoencoder.m so that the
% following command will extract features from the data.
trainFeatures = feedForwardAutoencoder(opttheta, hiddenSize, inputSize, ...
trainData);
testFeatures = feedForwardAutoencoder(opttheta, hiddenSize, inputSize, ...
testData);
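% trainFeatures and testFeatures are hiddenSize x (number of examples):
% column i is the hidden activation a(2) for the i-th example. These learned
% features, rather than raw pixels, are fed to the softmax classifier.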
%%======================================================================
%% STEP 4: Train the softmax classifier
softmaxModel = struct;
%% ----------------- YOUR CODE HERE ----------------------
% Use softmaxTrain.m from the previous exercise to train a multi-class
% classifier.
% Use lambda = 1e-4 for the weight regularization for softmax
% You need to compute softmaxModel using softmaxTrain on trainFeatures and
% trainLabels
lambda = 1e-4;
options.maxIter = 100;
softmaxModel = softmaxTrain(hiddenSize, numLabels, lambda, ...
                            trainFeatures, trainLabels, options);
% Note: the feature dimension of trainFeatures is hiddenSize, not inputSize,
% which is why hiddenSize is passed as the input size here.
%% -----------------------------------------------------
%%======================================================================
%% STEP 5: Testing
%% ----------------- YOUR CODE HERE ----------------------
% Compute predictions on the test set (testFeatures) using softmaxPredict
% and softmaxModel
[pred] = softmaxPredict(softmaxModel, testFeatures);
%% -----------------------------------------------------
% Classification Score
fprintf('Test Accuracy: %f%%\n', 100*mean(pred(:) == testLabels(:)));
% (note that we shift the labels by 1, so that digit 0 now corresponds to
% label 1)
%
% Accuracy is the proportion of correctly classified images.
% The results for our implementation were:
%
% Accuracy: 98.3%
%
%
feedForwardAutoencoder.m
function [activation] = feedForwardAutoencoder(theta, hiddenSize, visibleSize, data)
% theta: trained weights from the autoencoder
% visibleSize: the number of input units (here 28*28 = 784)
% hiddenSize: the number of hidden units (here 200)
% data: Our matrix containing the training data as columns. So, data(:,i) is the i-th training example.
% We first convert theta to the (W1, W2, b1, b2) matrix/vector format, so that this
% follows the notation convention of the lecture notes.
W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
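% theta is packed as [W1(:); W2(:); b1; b2] (the initializeParameters
% convention), so b1 starts right after W1 and W2. W2 and b2 are not needed
% for feature extraction.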
%% ---------- YOUR CODE HERE --------------------------------------
% Instructions: Compute the activation of the hidden layer for the sparse autoencoder.
m = size(data, 2);                  % number of training examples
z2 = W1 * data + repmat(b1, 1, m);  % note: b1 must be replicated into an m-column matrix
activation = sigmoid(z2);
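% activation is hiddenSize x m: column i holds the learned features for
% data(:,i).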
%-------------------------------------------------------------------
end
%-------------------------------------------------------------------
% Here's an implementation of the sigmoid function, which you may find useful
% in your computation of the costs and the gradients. This inputs a (row or
% column) vector (say (z1, z2, z3)) and returns (f(z1), f(z2), f(z3)).
function sigm = sigmoid(x)
sigm = 1 ./ (1 + exp(-x));
end