1、Gaussian Kernel
sim = exp(-sum((x1-x2).^2)/(2*sigma^2));
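The line above computes exp(-||x1 - x2||^2 / (2*sigma^2)). A quick standalone check can be sketched as follows; the anonymous function name gaussian_sim and the test vectors are chosen here for illustration only and are not part of the notes.

```octave
% Hypothetical standalone version of the kernel line above (name is illustrative).
gaussian_sim = @(x1, x2, sigma) exp(-sum((x1(:) - x2(:)).^2) / (2 * sigma^2));

x1 = [1 2 1];
x2 = [0 4 -1];
sigma = 2;

% Squared distance is 1 + 4 + 4 = 9, so the result is exp(-9/8).
sim = gaussian_sim(x1, x2, sigma);
printf('%f\n', sim);   % about 0.324652
```

Identical inputs give a similarity of exactly 1, and the value decays toward 0 as the points move apart.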
2、Parameters (C, sigma) for Dataset 3
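% Candidate values tried for both C and sigma
C_matrix = [0.01 0.03 0.1 0.3 1 3 10 30];
sigma_matrix = C_matrix;
err_min = Inf;
for i = 1:numel(C_matrix)
    C = C_matrix(i);
    for j = 1:numel(sigma_matrix)
        sigma = sigma_matrix(j);
        % Train on the training set, then score on the cross-validation set
        model = svmTrain(X, y, C, @(x1, x2) gaussianKernel(x1, x2, sigma));
        pred = svmPredict(model, Xval);
        err = mean(double(pred ~= yval));
        % Keep the (C, sigma) pair with the lowest validation error so far
        if err < err_min
            err_min = err;
            C_rlt = C;
            sigma_rlt = sigma;
        end
    end
end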
3、Email Preprocessing
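Note: the provided code has already lowercased the text and done the other preprocessing, so you only need to look up the current word (stored in str) in vocabList.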
for i = 1:length(vocabList)
    % vocabList is a cell array, so use {} to get the string itself
    if strcmp(vocabList{i}, str)
        word_indices = [word_indices; i];
        break;
    end
end
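The same lookup can probably be done without an explicit loop, since strcmp accepts a cell array and returns a logical mask. This is only a sketch: the toy vocabList and the words list below are made up for illustration, whereas in the exercise vocabList is loaded by the provided code and str is set once per token.

```octave
% Toy data for illustration only.
vocabList = {'anyone', 'know', 'how', 'much'};
words = {'know', 'zzz', 'much'};   % 'zzz' is deliberately not in the vocabulary

word_indices = [];
for k = 1:numel(words)
    str = words{k};
    % strcmp compares str against every entry; find(..., 1) returns the first match
    idx = find(strcmp(vocabList, str), 1);
    if ~isempty(idx)               % skip words that are not in the vocabulary
        word_indices = [word_indices; idx];
    end
end
disp(word_indices');               % indices of 'know' and 'much'
```

The isempty guard keeps out-of-vocabulary words from being appended, which matches the behavior of the loop with break.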
4、Email Feature Extraction
for i = 1:length(word_indices)
    x(word_indices(i)) = 1;
end
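Because indexed assignment accepts a whole vector of indices, the loop above can also be written as a single statement. A minimal sketch, with a toy vocabulary size n and toy word_indices assumed for illustration:

```octave
n = 10;                        % toy vocabulary size, for illustration only
word_indices = [2; 4; 2; 7];   % a repeated index is harmless
x = zeros(n, 1);

% Set every listed position to 1 in one step; duplicates just write 1 again.
x(word_indices) = 1;
disp(find(x)');                % positions that are set: 2 4 7
```

The result is a binary vector: a word that occurs several times in the email still contributes a single 1.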