Working through Andrew Ng's Machine Learning course: the relevant derivations and code. Since Matlab/Octave works in matrices and vectors and I often get the dimensions mixed up, I derived everything myself and wrote out the vectorized forms, mainly the cost function and gradient descent.
See the figures below.
The first figure shows the derivation and the vectorized form of the cost function J.
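For reference, the standard vectorized cost from the course (with X the m*(n+1) design matrix including the intercept column, y the m*1 target vector, and \theta the (n+1)*1 parameter vector) is

J(\theta) = \frac{1}{2m} (X\theta - y)^{T} (X\theta - y)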
The second figure shows the vectorized form of the parameter update (there is one slip of the pen in it that I have not bothered to fix).
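The corresponding vectorized update rule, which the gradientDescentMulti code below implements, is

\theta := \theta - \frac{\alpha}{m} X^{T} (X\theta - y)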
The third figure shows the derivation and vectorized form of feature scaling.
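Applied column by column, feature scaling is

x_{norm} = \frac{x - \mu}{\sigma}

where \mu and \sigma are the mean and standard deviation of the corresponding feature column.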
Below is the code for regression with multiple variables.
%loading data
data = load('ex1data2.txt');
X = data(:, 1:2); %X : m*2
y = data(:, 3); % y : m*1
m = length(y); % number of training examples
%Scale features and set them to zero mean
fprintf('Normalizing Features ...\n');
[X, mu, sigma] = featureNormalize(X);
% Add intercept term to X
X = [ones(m, 1) X];
fprintf('Running gradient descent ...\n');
% Choose some alpha value
alpha = 0.05;
num_iters = 400;
% Init Theta and Run Gradient Descent
theta = zeros(3, 1); % theta:3*1
[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);
% Plot the convergence graph
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');
% Display gradient descent's result
fprintf('Theta computed from gradient descent: \n');
fprintf(' %f \n', theta);
fprintf('\n');
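As a quick sanity check of the learned parameters, a new example has to be normalized with the same mu and sigma before multiplying by theta. A minimal sketch, assuming (as in the course exercise) that the two features are house size and number of bedrooms and the third column is the price:

% Hypothetical example: price of a 1650 sq-ft, 3-bedroom house.
% New inputs must be normalized with the mu and sigma returned by featureNormalize.
x_new = ([1650 3] - mu') ./ sigma';   % 1*2 normalized feature row
price = [1 x_new] * theta;            % prepend the intercept term, then predict
fprintf('Predicted price: %f\n', price);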
The featureNormalize code is as follows:
function [X_norm, mu, sigma] = featureNormalize(X)
X_norm = X; % X: m*2
mu = zeros(size(X,2),1);
sigma = zeros(size(X, 2),1);
mu = mean(X)';    % n*1 (2*1 here), mean of each column
sigma = std(X)';  % n*1 (2*1 here), std of each column
X_norm = (X - mu') ./ sigma'; % broadcast: subtract column means, divide by column stds
end
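A quick check of my own (not part of the exercise), assuming data from the script above is still in the workspace: the normalized columns should have zero mean and unit standard deviation.

[Xn, mu_chk, sigma_chk] = featureNormalize(data(:, 1:2));
disp(mean(Xn)); % approximately [0 0]
disp(std(Xn));  % approximately [1 1]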
The gradientDescentMulti code is as follows:
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
error = X*theta - y; % m*1 residual vector
theta = theta - alpha/m*(X'*error); % vectorized update: theta := theta - (alpha/m)*X'*error
J_history(iter) = computeCostMulti(X, y, theta);
end
end
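For comparison, the same fit can be obtained in closed form with the normal equation, which needs no feature scaling and no learning rate. A sketch run on the raw, unnormalized features (so its theta is not directly comparable to the scaled-feature theta above):

% Normal equation: theta = (X'X)^(-1) X'y, using pinv for numerical robustness
data = load('ex1data2.txt');
y = data(:, 3);
X_raw = [ones(length(y), 1) data(:, 1:2)]; % intercept column + raw features
theta_ne = pinv(X_raw' * X_raw) * X_raw' * y;
fprintf('Theta computed from the normal equation: \n');
fprintf(' %f \n', theta_ne);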
The computeCostMulti code is as follows:
function J = computeCostMulti(X, y, theta)
m = length(y); % number of training examples
J = 0;
error = X*theta - y; %m*1
J = 1/(2*m)*sum(error .^ 2);
end
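The same cost can also be written as an inner product instead of an elementwise square and sum; a minimal variant (the name computeCostMultiAlt is mine, not from the exercise):

function J = computeCostMultiAlt(X, y, theta)
m = length(y);                 % number of training examples
err = X * theta - y;           % m*1 residual vector
J = (err' * err) / (2 * m);    % same value as sum(err .^ 2) / (2*m)
end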
For easy reference, the original formula images are attached below.