Machine Learning (3) - Support Vector Machine (SVM)
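SVM stands for support vector machine (Chinese: 支持向量机).
Its hallmark is the large margin; compared with logistic regression, it can reach the global optimum more easily.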
How it works
It is based on landmarks: each input x is described by how similar it is to a set of landmark points.
Hypothesis
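The hypothesis for logistic regression is
$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$$
Its cost has two parts, $-\log h_\theta(x)$ for y = 1 and $-\log(1 - h_\theta(x))$ for y = 0. The SVM approximates each part with a piecewise-linear function, $\mathrm{cost}_1(\theta^T x)$ and $\mathrm{cost}_0(\theta^T x)$.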
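As a rough illustration of that approximation, the sketch below compares the two logistic cost terms, -log(h) and -log(1 - h), with piecewise-linear stand-ins. The exact slopes of `cost1` and `cost0` are an assumption here (the standard hinge shape with slope 1); the notes only say that the two parts are approximated.

```python
import math

def logistic_cost_terms(z):
    """Return (-log h, -log(1 - h)) where h = sigmoid(z) and z = theta^T x."""
    h = 1.0 / (1.0 + math.exp(-z))
    return -math.log(h), -math.log(1.0 - h)

def cost1(z):
    # Assumed surrogate for the y = 1 term: exactly zero once z >= 1.
    return max(0.0, 1.0 - z)

def cost0(z):
    # Assumed surrogate for the y = 0 term: exactly zero once z <= -1.
    return max(0.0, 1.0 + z)

for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    log1, log0 = logistic_cost_terms(z)
    print(f"z={z:+.1f}  -log(h)={log1:.3f}  cost1={cost1(z):.3f}  "
          f"-log(1-h)={log0:.3f}  cost0={cost0(z):.3f}")
```

The flat regions (cost exactly zero for z >= 1 or z <= -1) are what push the SVM toward a large margin.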
SVM Decision Boundary
Kernel
Gaussian Kernel
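A minimal sketch of the Gaussian kernel as a similarity function between an input and a landmark, assuming the usual form exp(-||x - l||^2 / (2 * sigma^2)). The function and parameter names are illustrative, not taken from the course material.

```python
import math

def gaussian_kernel(x, landmark, sigma_sq):
    """Similarity of x to a landmark: 1 when they coincide, near 0 when far apart."""
    sq_dist = sum((xi - li) ** 2 for xi, li in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma_sq))

print(gaussian_kernel([1.0, 2.0], [1.0, 2.0], sigma_sq=1.0))  # identical points
print(gaussian_kernel([0.0, 0.0], [3.0, 4.0], sigma_sq=1.0))  # distant points
```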
Steps
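- Given training examples $(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(m)}, y^{(m)})$
- Choose landmarks $l^{(1)} = x^{(1)}, l^{(2)} = x^{(2)}, \dots, l^{(m)} = x^{(m)}$
- Map x -> f: compute $f_i = \mathrm{similarity}(x, l^{(i)})$ for $i = 1, \dots, m$, with $f_0 = 1$
- Predict "y=1" if $\theta^T f \ge 0$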
Note: Do perform feature scaling before using the Gaussian kernel.
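The steps above, including the feature-scaling note, can be sketched roughly as follows. This is an illustration only: `theta` is a hand-picked, hypothetical parameter vector rather than the result of training, standardization is just one possible choice of feature scaling, and the data values are made up.

```python
import math

def standardize(X):
    """Feature scaling: shift each column to zero mean and unit variance."""
    m, n = len(X), len(X[0])
    means = [sum(row[j] for row in X) / m for j in range(n)]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / m) or 1.0
            for j in range(n)]
    return [[(row[j] - means[j]) / stds[j] for j in range(n)] for row in X]

def features(x, landmarks, sigma_sq):
    """Map x -> f: f_0 = 1, f_i = Gaussian similarity between x and landmark i."""
    f = [1.0]
    for l in landmarks:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, l))
        f.append(math.exp(-sq_dist / (2.0 * sigma_sq)))
    return f

def predict(x, landmarks, theta, sigma_sq):
    """Predict y = 1 if theta^T f >= 0, else 0."""
    f = features(x, landmarks, sigma_sq)
    return 1 if sum(t * fi for t, fi in zip(theta, f)) >= 0 else 0

# Made-up data with very different feature ranges, hence the scaling step.
X = [[1.0, 200.0], [2.0, 180.0], [8.0, 40.0], [9.0, 20.0]]
X_scaled = standardize(X)
landmarks = X_scaled                   # one landmark per training example
theta = [0.0, 1.0, 1.0, -1.0, -1.0]    # hypothetical, NOT learned from data

print([predict(x, landmarks, theta, sigma_sq=1.0) for x in X_scaled])
```

Without scaling, the second feature would dominate the squared distance and the first would barely influence the similarity values.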
Multiclass classification:
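Use the one-vs.-all method: train K SVMs, one to distinguish y = i from the rest, for i = 1, 2, …, K, to get $\theta^{(1)}, \theta^{(2)}, \dots, \theta^{(K)}$.
For a new input x, pick the class i with the largest $(\theta^{(i)})^T x$.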
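The one-vs.-all prediction step might look like the sketch below, assuming each of the K classifiers is summarized by a parameter vector and the score for class i is the dot product of that vector with x. The vectors in `thetas` are hypothetical values chosen for illustration, not trained parameters.

```python
def one_vs_all_predict(x, thetas):
    """Return the 1-based class index whose score theta_i . x is largest."""
    scores = [sum(t * xi for t, xi in zip(theta, x)) for theta in thetas]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best + 1

# Hypothetical parameters for K = 3 classifiers; x[0] = 1 is the intercept term.
thetas = [
    [0.5, 1.0, -1.0],
    [-0.2, -1.0, 1.0],
    [0.0, 0.3, 0.3],
]
print(one_vs_all_predict([1.0, 2.0, 0.5], thetas))
print(one_vs_all_predict([1.0, 0.0, 3.0], thetas))
```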
Parameters
$C = \frac{1}{\lambda}$
- Large C (small $\lambda$): lower bias, high variance.
- Small C (large $\lambda$): higher bias, low variance.
- Large $\sigma^2$: features $f_i$ vary more smoothly. Higher bias, lower variance.
- Small $\sigma^2$: features $f_i$ vary less smoothly. Lower bias, higher variance.
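To make the smoothness effect of the Gaussian kernel width concrete, the sketch below evaluates the kernel at a fixed squared distance of 1 from a landmark for a few values of sigma^2. The specific values are arbitrary choices for illustration.

```python
import math

def kernel_value(sq_dist, sigma_sq):
    """Gaussian kernel value at a given squared distance from the landmark."""
    return math.exp(-sq_dist / (2.0 * sigma_sq))

sq_dist = 1.0
for sigma_sq in (0.1, 1.0, 10.0):
    print(f"sigma^2={sigma_sq:>4}: f={kernel_value(sq_dist, sigma_sq):.4f}")
```

With a small sigma^2 the feature drops toward 0 almost immediately away from the landmark; with a large sigma^2 it stays close to 1, so the feature changes more gradually across the input space.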
Andrew mentions that there are many other kernels, though they are used less often, including:
- Polynomial kernel
- String kernel
- Chi-square kernel
- Histogram intersection kernel
Logistic regression vs. SVMs
n = number of features ($x \in \mathbb{R}^{n+1}$), m = number of training examples
- If n is large (relative to m): Use logistic regression, or SVM without a kernel (“linear kernel”).
- If n is small, m is intermediate: Use SVM with Gaussian kernel.
- If n is small, m is large: Create/add more features, then use logistic regression or SVM without a kernel.
- Neural network likely to work well for most of these settings, but may be slower to train.
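The rules of thumb above can be written as a small helper. The numeric cut-offs are assumptions made only for this sketch: the notes do not say when n counts as "large relative to m" or when m counts as "large", so `n >= m` and the `large_m` default are placeholders to adjust.

```python
def suggest_model(n, m, large_m=50_000):
    """Suggest a model family from n (features) and m (training examples)."""
    # Assumption: "n is large relative to m" is read as n >= m.
    if n >= m:
        return "logistic regression or linear-kernel SVM"
    # Assumption: "m is large" is read as m >= large_m (placeholder threshold).
    if m >= large_m:
        return "add features, then logistic regression or linear-kernel SVM"
    # Otherwise: n is small and m is intermediate.
    return "SVM with Gaussian kernel"

print(suggest_model(n=10_000, m=1_000))
print(suggest_model(n=100, m=5_000))
print(suggest_model(n=100, m=100_000))
```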