Pros:
- Training is relatively easy
- No local minima: training reduces to a convex quadratic programming problem, unlike in neural networks
- It scales relatively well to high-dimensional data thanks to the kernel trick
- The tradeoff between classifier complexity and training error can be controlled explicitly, which helps achieve good generalization and resist overfitting
- Through the choice of kernel it can act like an RBF network or a feed-forward neural network
Cons:
- There is no principled way to choose the kernel
- Training speed on large datasets needs improving
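A minimal sketch of the points above, assuming scikit-learn (not named in the original): the `kernel` argument swaps the feature map (linear vs. RBF), and `C` explicitly controls the complexity/error tradeoff mentioned in the pros.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy nonlinear dataset; names and parameters here are illustrative only.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The same convex QP solver handles both kernels; only the feature map changes.
# C trades training error against margin width (classifier complexity).
for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel, C=1.0).fit(X_tr, y_tr)
    print(kernel, clf.score(X_te, y_te))
```

On this curved dataset the RBF kernel typically outperforms the linear one, illustrating why kernel choice matters (and, per the cons, why having no principled way to choose it is a real limitation).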