(or large margin classifiers)
C= 1λ
(C is very large)
Kernels
Using Kernels and SVM to define extra features with landmarks and similarity functions to learn more complex nonlinear classifiers
Informally, the C parameter is a positive value that controls the penalty for misclassified training examples. A large C parameter tells the SVM to try to classify all the examples correctly.
Choose from different kernels,C, δ2 ,etc,whatever performs best on the cross-validation data(not test data).