- why is it named logistic regression? (because of the logistic function)
- what is the model?
- how to solve the minimization/maximization problem?
2-class problem
// the logistic function σ(z) = 1/(1 + exp(−z)) also lies in (0, 1), so it can serve as a probability
In all, assuming we have already learned the optimal θ, classification is done by computing P(1|x,θ): if it is > 0.5, the odds p/(1−p) > 1 and we predict class 1; if it is < 0.5, then p/(1−p) < 1 and we predict class 0.
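As a minimal sketch of this decision rule (assuming the standard sigmoid model P(1|x,θ) = 1/(1 + exp(−θᵀx)); the names `sigmoid` and `predict` are illustrative, not from any particular library):

```python
import numpy as np

def sigmoid(z):
    # logistic function: its output always lies in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, x):
    # P(y=1 | x, theta) under the logistic regression model
    p = sigmoid(theta @ x)
    # threshold at 0.5, i.e. compare the odds p/(1-p) against 1
    return 1 if p > 0.5 else 0
```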
The learning objective is to maximize the probability of the entire training set, i.e. the likelihood L(θ) = ∏_i P(y_i|x_i,θ);
then gradient descent could be applied to solve the MLE problem.
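A minimal sketch of this in NumPy (gradient ascent on the average log-likelihood, equivalently gradient descent on the negative log-likelihood; the function name `fit_logistic`, the learning rate, and the iteration count are illustrative choices, not from the notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iters=1000):
    # X: (n, d) design matrix, y: (n,) labels in {0, 1}
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_iters):
        p = sigmoid(X @ theta)        # P(y=1 | x_i, theta) for every sample
        grad = X.T @ (y - p) / n      # gradient of the average log-likelihood
        theta += lr * grad            # ascend the log-likelihood (descend the NLL)
    return theta
```

In practice one would monitor the log-likelihood for convergence rather than run a fixed number of iterations.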
Multi-class problem
Of vital importance is the constraint on the class probabilities: each P(i|x,θ) must be ≥ 0 and they must sum to 1 over all classes, which is what the softmax parameterization guarantees.
The cost function is the negative log-likelihood (cross-entropy) over the training set:
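One common way to write this cost (a sketch using the indicator 1{·}, which is my notation, and the same parameters θ as above):

$$
J(\theta) = -\sum_{i=1}^{n} \sum_{k=1}^{m} \mathbf{1}\{y_i = k\}\, \log P(k \mid x_i, \theta)
$$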
SMLR - Sparse multinomial Logistic Regression
In total there are m classes, and the input vector/feature is x ∈ R^d. Because the class probabilities sum to one, the weight vector for one of the classes need not be estimated. Without loss of generality, we thus set w(m) = 0, and the only parameters to be learned are the weight vectors w(i) for i ∈ {1, …, m−1}. For the remainder of the paper, we use w to denote the (d(m−1))-dimensional vector of parameters to be learned.
For ordinary softmax regression (also named multinomial logistic regression, MLR), the probability that x belongs to class i is

$$
P(y = i \mid x, w) = \frac{\exp\big(w^{(i)\top} x\big)}{\sum_{j=1}^{m} \exp\big(w^{(j)\top} x\big)}
= \frac{\exp\big(w^{(i)\top} x\big)}{1 + \sum_{j=1}^{m-1} \exp\big(w^{(j)\top} x\big)} \quad (\text{using } w^{(m)} = 0).
$$
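A small NumPy sketch of these class probabilities (with the last class's weight vector fixed to zero as above; storing the m−1 learned weight vectors in a matrix `W` is my own illustrative layout):

```python
import numpy as np

def class_probs(W, x):
    # W: (m-1, d) learned weight vectors w(1), ..., w(m-1); w(m) = 0 is implicit
    scores = W @ x                        # w(i)^T x for i = 1, ..., m-1
    scores = np.append(scores, 0.0)       # the last class has score w(m)^T x = 0
    exps = np.exp(scores - scores.max())  # subtract the max for numerical stability
    return exps / exps.sum()              # softmax: non-negative and sums to 1
```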
On top of this, SMLR adds a sparsity constraint to the cost function through a Laplacian prior on the weights:
in SMLR, p(w) ∝ exp(−λ||w||_1).
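Taking the log of this prior turns MAP estimation into an ℓ1-penalized log-likelihood; a sketch of the resulting objective (this particular way of writing it is mine, following the definitions above):

$$
\hat{w} = \arg\max_{w} \Big[\, l(w) - \lambda \|w\|_1 \,\Big],
\qquad
l(w) = \sum_{i=1}^{n} \log P(y_i \mid x_i, w)
$$

The ℓ1 term drives many components of w exactly to zero, which is what makes the learned classifier sparse.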