Machine Learning Project 2 Part B
Our strategy will attempt to classify a test shape using naive Bayes classification. Based on our observations in section 1, we assume that we are able to build sufficiently reliable pairwise classifiers. Our problem is then to decide which class a test shape should belong to, given the classification results of all possible pairs.
Question A1
We denote by Ω the set of all classes. Suppose that for any two classes ω_i and ω_j, we have trained a classifier C_ij, that is, we know the conditional laws Pr[X|ω_i, C_ij] and Pr[X|ω_j, C_ij] (notice the conditioning on C_ij, which reflects that these laws are the opinion of that specific classifier, and may perfectly well differ for another). We remind that Bayesian classification assigns a class to a test pattern X according to

ω̂ = argmax_{ω_i ∈ Ω} Pr[ω_i|X]    (1)
1. The classifier C_ij must express its decision by returning Pr[ω_i|X, C_ij]. Using Bayes' rule, express this probability as a function of Pr[X|ω_i, C_ij] and Pr[X|ω_j, C_ij]. According to Bayes' law, what is Pr[ω_j|X, C_ij] anyway?
1. Answer
Based on formulas [1] and [2], and assuming equal priors within the pair, Bayes' rule gives

Pr[ω_i|X, C_ij] = Pr[X|ω_i, C_ij] / (Pr[X|ω_i, C_ij] + Pr[X|ω_j, C_ij])

and, since the two posteriors of C_ij must sum to one,

Pr[ω_j|X, C_ij] = 1 − Pr[ω_i|X, C_ij].
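As a minimal sketch (function name and test values are illustrative, not from the assignment), the pairwise posterior can be computed directly from the two class-conditional likelihoods:

```python
# Sketch: pairwise posterior of classifier C_ij from the two likelihoods,
# assuming equal priors within the pair (hypothetical helper, not from autosvm).
def pairwise_posterior(lik_i, lik_j):
    """Return (Pr[w_i|X,C_ij], Pr[w_j|X,C_ij]) from Pr[X|w_i,C_ij], Pr[X|w_j,C_ij]."""
    p_i = lik_i / (lik_i + lik_j)
    return p_i, 1.0 - p_i   # Bayes' law: the two posteriors sum to one

p_i, p_j = pairwise_posterior(0.3, 0.1)
print(p_i, p_j)   # p_i ≈ 0.75, p_j ≈ 0.25
```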
2. Still using Bayes' rule, express Pr[ω_i|X], assuming equal priors for all classifiers.
2. Answer
With equal priors for all classes, Bayes' rule gives

Pr[ω_i|X] = Pr[X|ω_i] / Σ_k Pr[X|ω_k].
3. Explain why, despite being theoretically correct, the values of Pr[ω_i|X] may lead to a very unreliable decision once plugged into (1) (hint: look at what happens when a shape is in neither class i nor class j).
3. Answer
When a test shape X belongs to neither class i nor class j, both likelihoods Pr[X|ω_i, C_ij] and Pr[X|ω_j, C_ij] are very small. However, the pairwise posterior Pr[ω_i|X, C_ij] depends only on their ratio, so after normalization C_ij may still return a confident decision for one of the two classes, even though X fits neither of them. Plugged into (1), these meaningless yet confident votes make the final decision very unreliable.
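A tiny numeric illustration of this failure mode (the likelihood values are hypothetical): both class-conditional likelihoods are negligible for an outlier shape, yet the normalized pairwise posterior looks confident.

```python
# Failure mode: an outlier shape X far from both classes of C_ij.
# Both likelihoods are tiny, but normalization discards the absolute evidence,
# so the pairwise posterior is still confident (values are illustrative).
lik_i, lik_j = 1e-9, 1e-12          # hypothetical Pr[X|w_i,C_ij], Pr[X|w_j,C_ij]
p_i = lik_i / (lik_i + lik_j)       # pairwise posterior Pr[w_i|X,C_ij]
print(p_i)                          # close to 1: a confident but meaningless vote
```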
4. Propose a remedy to this issue, possibly violating the assumption of constant and equal priors for all classifiers (hint: you may consider that your classifier should not return one, but several probabilities). Experiment with it, and show that it effectively corrects the problem.
4. Answer
A method to solve this problem is to use a threshold value γ and check whether the probabilities pass this threshold. If any of the probabilities passes, X must belong to one of the known classes, and the ω_i which maximizes Pr[ω_i|X, C_ij] is the class that X belongs to; if no probability passes the threshold, X does not belong to any of the classes.
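The remedy above can be sketched as follows. This is a hedged illustration, not the autosvm pipeline: each classifier is assumed to return its two raw likelihoods, the threshold γ and the input values are made up, and the vote aggregation is one simple choice among several.

```python
# Sketch of the threshold remedy (all names and values are illustrative).
# Each C_ij returns its two raw likelihoods instead of a single posterior;
# a classifier abstains when neither likelihood passes the threshold gamma.
def classify_with_reject(likelihoods, gamma):
    """likelihoods: dict {(i, j): (Pr[X|w_i,C_ij], Pr[X|w_j,C_ij])}.
    Returns the winning class, or None if no classifier passes the threshold."""
    votes = {}
    any_passed = False
    for (i, j), (lik_i, lik_j) in likelihoods.items():
        if max(lik_i, lik_j) < gamma:
            continue                         # no real evidence for either class
        any_passed = True
        post_i = lik_i / (lik_i + lik_j)     # pairwise posterior Pr[w_i|X,C_ij]
        winner = i if post_i >= 0.5 else j
        votes[winner] = votes.get(winner, 0.0) + max(post_i, 1.0 - post_i)
    if not any_passed:
        return None                          # X belongs to none of the classes
    return max(votes, key=votes.get)

liks = {(0, 1): (0.6, 0.1), (0, 2): (0.5, 0.2), (1, 2): (0.3, 0.4)}
print(classify_with_reject(liks, gamma=0.25))                    # → 0
print(classify_with_reject({(0, 1): (1e-6, 1e-8)}, gamma=0.25))  # → None (rejected)
```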
Question A2
Choose 5 classes (visually quite different), and train all pairs of classifiers using autosvm. Store the results in a matrix.
Answer 1
cls_mat =
10x1 struct array with fields:
SupportVectors
Alpha
Bias
KernelFunction
KernelFunctionArgs
GroupNames
SupportVectorIndices
ScaleData
FigureHandles
In the sequel, you will assume that the Euclidean distances of samples to the separating hyperplane are normally distributed, and use the relevant distributions for the probabilities Pr[X|ω_i; C_ij] and Pr[X|ω_j; C_ij]. Do the margin, or these distances, have a probabilistic meaning anyway?
Answer 2
Here the Euclidean distance of a sample to the separating hyperplane measures how confidently the corresponding classifier C_ij separates that sample. By themselves, the margin and these distances are purely geometric quantities with no probabilistic meaning. Assuming, however, that within each class the distances follow a normal distribution,

d(X) | ω_i, C_ij ~ N(μ_i, σ_i²),

we can take the likelihood Pr[X|ω_i, C_ij] to be the Gaussian density evaluated at d(X). The parameters (μ_i, σ_i²) can be estimated by a statistical method (maximum likelihood, etc.) in our example. As we know from before, the final decision then picks the class maximizing the posterior,

ω̂ = argmax_{ω_i} Pr[ω_i|X],

with Pr[ω_i|X] obtained from the pairwise posteriors of Question A1.
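A minimal sketch of this distance-to-probability mapping, assuming the Gaussian model above (the training distances and the test distance d(X) = 2.0 are hypothetical):

```python
import math

# Sketch: ML-fit a Gaussian to the signed distances of training samples to
# C_ij's hyperplane, then evaluate the density at a test shape's distance
# to obtain the likelihood Pr[X|w_i,C_ij] (all values are illustrative).
def fit_gaussian(distances):
    """ML estimates (mu, sigma^2) for a 1-D Gaussian: sample mean and variance."""
    n = len(distances)
    mu = sum(distances) / n
    var = sum((d - mu) ** 2 for d in distances) / n
    return mu, var

def gaussian_pdf(d, mu, var):
    """Gaussian density N(d; mu, var)."""
    return math.exp(-(d - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Hypothetical training distances for class w_i under classifier C_ij:
mu, var = fit_gaussian([1.8, 2.1, 2.4, 1.9, 2.3])
lik = gaussian_pdf(2.0, mu, var)   # Pr[X|w_i,C_ij] for a test distance d(X)=2.0
print(mu, var, lik)
```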
Question A3
Using the final classification methodology of question A2, classify 1 shape amongst the remaining 25% of data for each class, and report the final posterior probabilities. Choose an "outlier" shape from another class, and compare its posteriors again. Comment on your results.
Answer
Based on formulas [3] and [4] in Question A2, and the classifiers' matrix cls_mat in Answer 1 of Question A2, we use the following estimates:

d(X) | ω_i, C_ij ~ N(μ, σ²), whose parameters are estimated by the ML algorithm,

Pr[ω_i] = (number of samples of class ω_i) / (number of samples in all classes),

Pr[X|ω_i] = (number of samples like X inside class ω_i) / (number of samples in ω_i).
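The empirical estimates above can be sketched as follows. The class names, sample counts, and likelihood values are hypothetical placeholders, not the actual experiment's data; the block only shows how the prior-weighted likelihoods are normalized into final posteriors.

```python
# Sketch of the final posterior computation over 5 classes
# (counts and likelihoods are illustrative, not measured results).
counts = {'bird': 20, 'bone': 20, 'brick': 20, 'camel': 20, 'car': 20}
total = sum(counts.values())
priors = {c: n / total for c, n in counts.items()}          # Pr[w_i]

likelihoods = {'bird': 0.02, 'bone': 0.40, 'brick': 0.05,   # hypothetical Pr[X|w_i]
               'camel': 0.01, 'car': 0.02}
evidence = sum(priors[c] * likelihoods[c] for c in counts)  # Pr[X]
posteriors = {c: priors[c] * likelihoods[c] / evidence for c in counts}
print(max(posteriors, key=posteriors.get))                  # → bone
```

For an outlier shape from a sixth class, all likelihoods would be small and the resulting posteriors flat, which is what the comparison in this question is meant to expose.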