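I've recently been studying Li Hang's "Statistical Learning Methods" (《统计学习方法》), so I'm starting this post as a simple record.
Consider the two forms of the algorithm: the primal form and the dual form.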
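For reference, here is a compact statement of the two forms. This assumes the algorithm in question is the perceptron from Chapter 2 of the book, which the post does not say explicitly; the symbols $w$, $b$, $\alpha_i$, and the learning rate $\eta$ follow the usual textbook notation and are not taken from the code below.

$$
\begin{aligned}
\text{Primal:}\quad & f(x) = \operatorname{sign}(w \cdot x + b), \\
& \text{if } y_i (w \cdot x_i + b) \le 0:\quad w \leftarrow w + \eta\, y_i x_i,\quad b \leftarrow b + \eta\, y_i \\[6pt]
\text{Dual:}\quad & w = \sum_{j=1}^{N} \alpha_j y_j x_j,\qquad f(x) = \operatorname{sign}\Big(\sum_{j=1}^{N} \alpha_j y_j\, x_j \cdot x + b\Big), \\
& \text{if } y_i \Big(\sum_{j=1}^{N} \alpha_j y_j\, x_j \cdot x_i + b\Big) \le 0:\quad \alpha_i \leftarrow \alpha_i + \eta,\quad b \leftarrow b + \eta\, y_i
\end{aligned}
$$

In the dual form the inner products $x_j \cdot x_i$ can be precomputed once as a Gram matrix.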
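Judging from the results, the dual form needs more iterations and takes longer, but the outcomes show no noticeable difference: the decision boundaries from the two forms almost completely coincide.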
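To make the comparison concrete, here is a minimal sketch of both forms, assuming the algorithm is the perceptron and that samples are swept in a fixed cyclic order. The function names `perceptron_primal` and `perceptron_dual`, the `eta` and `max_epochs` parameters, and the three-point toy data are illustrative assumptions, not the code from this post.

```python
import numpy as np

def perceptron_primal(X, y, eta=1.0, max_epochs=1000):
    """Primal form: update w and b directly. Returns (w, b, n_updates)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    n_updates = 0
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(len(y)):
            # A sample is misclassified when y_i * (w . x_i + b) <= 0
            if y[i] * (np.dot(w, X[i]) + b) <= 0:
                w += eta * y[i] * X[i]
                b += eta * y[i]
                mistakes += 1
        n_updates += mistakes
        if mistakes == 0:  # a full pass without mistakes: converged
            break
    return w, b, n_updates

def perceptron_dual(X, y, eta=1.0, max_epochs=1000):
    """Dual form: update alpha_i and b. Returns (alpha, w, b, n_updates)."""
    alpha = np.zeros(len(y))
    b = 0.0
    gram = X @ X.T  # precomputed inner products x_j . x_i
    n_updates = 0
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(len(y)):
            # Same condition as the primal form, with w = sum_j alpha_j y_j x_j
            if y[i] * (np.dot(alpha * y, gram[:, i]) + b) <= 0:
                alpha[i] += eta
                b += eta * y[i]
                mistakes += 1
        n_updates += mistakes
        if mistakes == 0:
            break
    w = (alpha * y) @ X  # recover w from the dual coefficients
    return alpha, w, b, n_updates

# Toy data (an assumption for illustration): two positive points, one negative
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1.0, 1.0, -1.0])

w_p, b_p, k_p = perceptron_primal(X, y)
alpha, w_d, b_d, k_d = perceptron_dual(X, y)
print("primal:", w_p, b_p, k_p)
print("dual:  ", w_d, b_d, k_d, alpha)
```

On this toy set both forms end at w = (1, 1), b = -3 after 7 updates, because in this sketch they visit the samples in the same order and the dual form is only a reparametrization of the primal one. Each dual check costs a sum over all samples, which is one possible reason for a longer running time. A difference in iteration counts would have to come from implementation details not shown here, such as the sample order or the stopping rule.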
Results:
accury1
1.0
accury2
1.0
The result seems… well, for data generated by a purely linear model, it is understandable.
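A small sketch of why a perfect score is plausible here: when labels are the sign of a linear score, some linear classifier separates the data exactly. The `accuracy` helper, the generating weights `true_w`, and the sample size are hypothetical and only serve the illustration.

```python
import numpy as np

def accuracy(w, b, X, y):
    """Fraction of samples whose predicted sign matches the label."""
    pred = np.where(X @ w + b > 0, 1, -1)
    return float(np.mean(pred == y))

rng = np.random.default_rng(0)
true_w = np.array([0.3, 0.7])            # hypothetical generating weights
X = rng.normal(0, 1, size=(200, 2))      # standard normal features
y = np.where(X @ true_w > 0, 1, -1)      # labels are the sign of a linear score

acc = accuracy(true_w, 0.0, X, y)
print(acc)
```

The generating weights classify every sample correctly by construction, so a learned boundary that lands close to them can score 1.0 on a held-out split as well, although that is not guaranteed.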
Here is the code:
from pandas import DataFrame, Series
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math
from sklearn.model_selection import train_test_split
import matplotlib as mpl
mpl.rcParams['font.sans-serif'] = ['FangSong']  # set the default font (needed for Chinese labels)
mpl.rcParams['axes.unicode_minus'] = False  # keep the minus sign '-' from rendering as a box in saved figures
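def creat_liner_datasets(weights, shape):
    # Normalize the weights so that they sum to 1
    weight = np.array(weights) / sum(weights)
    # Draw the features from a standard normal distribution
    random_data = np.random.normal(0, 1, shape)
    # Label each sample by the sign of its linear score: +1 or -1
    y = np.dot(random_data, weight)
    for i in range(len(y)):
        y[i] = 1 if y[i] > 0 else -1
    out_put = DataFrame(np.column_stack((random_data, y)))
    # Rename the feature columns to x1, x2, ...
    for i in range(out_put.shape[1]-1):
        out_put.rename(columns={
            out_put.columns[i]: 'x'+str(out_put.columns[i]+1)}, inplace=True)
    # Rename the last column to y
    out_put.rename(columns={
        out_put.columns[-1]: 'y'}, inplace=True)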
out_pu