Iris Classification
Iris dataset link:
http://bj.bcebos.com/v1/ai-studio-online/93e8a07d6624465c943f60a0b4ec5fd959d44b5e5453410a8b2452ed3720c32f?responseContentDisposition=attachment%3B%20filename%3Diris.data&authorization=bce-auth-v1%2F0ef6765c1e494918bc0d4c3ca3e5c6d1%2F2018-12-12T14%3A57%3A54Z%2F-1%2F%2F2cbe86672d5f2d44278cc3f76789307590c5aeffa85569803fd6e7d625b43ca2
Method 1
import numpy as np
from sklearn import model_selection as mo
from sklearn import svm
import matplotlib.pyplot as plt
from matplotlib import colors
import matplotlib as mpl
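def iris_type(s):
    # Convert the class label in the dataset from a string to an int.
    # Depending on the NumPy version and the `encoding` argument, the converter
    # receives either bytes or str, so normalize to str before the lookup.
    if isinstance(s, bytes):
        s = s.decode()
    it = {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}
    return it[s]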
data = np.loadtxt(r'D:\PycharmProjects\untitled\鸢尾花\iris.data', dtype=float, delimiter=',', converters={4:iris_type})
'''
Signature of np.loadtxt, for reference (defaults can differ between NumPy versions):
def loadtxt(fname, dtype=float, comments='#', delimiter=None,
            converters=None, skiprows=0, usecols=None, unpack=False,
            ndmin=0, encoding='bytes', max_rows=None):
'''
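The `converters={4: iris_type}` argument in the `loadtxt` call above is what turns the text label in column 4 into a number. Here is a minimal sketch of that mechanism, run on three sample rows held in memory instead of the file on disk. The names `LABELS`, `label_to_int`, `sample`, and `demo` are made up for this illustration and are not part of the original script.

```python
import io
import numpy as np

# Hypothetical lookup table for this demo, mirroring the mapping in iris_type.
LABELS = {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}

def label_to_int(s):
    # The converter may be handed bytes or str depending on NumPy version/encoding.
    if isinstance(s, bytes):
        s = s.decode()
    return LABELS[s]

# Three sample rows in the same comma-separated layout as iris.data.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)

# Column 4 goes through label_to_int; every other column is parsed as float.
demo = np.loadtxt(sample, dtype=float, delimiter=',', converters={4: label_to_int})
print(demo.shape)
print(demo[:, 4])
```

Because `dtype=float`, the integer returned by the converter is stored as a float in the resulting array, which is why the labels later come out as floats rather than ints.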
x, y = np.split(data, (4, ), axis=1)
x_train, x_test, y_train, y_test = mo.train_test_split(x, y, random_state=1, test_size=0.3)
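To see what the column split and the `train_test_split` call above do, the following sketch uses a small synthetic array in place of the real data. It is only an illustration under assumed inputs: `toy`, `toy_x`, `toy_y`, and the `a_`/`b_`/`c_` names are hypothetical and do not appear in the original script.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the loaded data: 10 rows, 4 feature columns + 1 label column.
toy = np.arange(50, dtype=float).reshape(10, 5)

# Split at column index 4: the first four columns are features, the rest is the label.
toy_x, toy_y = np.split(toy, (4,), axis=1)

# Same integer seed, same arguments: the two splits should be identical.
a_train, a_test, _, _ = train_test_split(toy_x, toy_y, random_state=1, test_size=0.3)
b_train, b_test, _, _ = train_test_split(toy_x, toy_y, random_state=1, test_size=0.3)

# An integer test_size is read as an absolute number of test samples.
c_train, c_test, _, _ = train_test_split(toy_x, toy_y, random_state=1, test_size=3)

print(toy_x.shape, toy_y.shape)
print(a_train.shape, a_test.shape)
print(np.array_equal(a_test, b_test))
print(c_test.shape)
```

Note that `np.split` keeps the label as a 2-D column of shape `(n, 1)`. Some scikit-learn estimators expect a 1-D label array and may warn about this, in which case `y.ravel()` is one way to flatten it.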
'''
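train_data: the feature array to be split (x in this script)
train_target: the label array to be split (y in this script)
test_size: a float between 0 and 1 is the fraction of samples placed in the test set; an integer is the absolute number of test samples
random_state: the seed for the random number generator.
Random seed: effectively an ID for one particular sequence of random numbers, which guarantees the same sequence when an experiment has to be repeated. Any fixed integer (1, 0, or anything else) with otherwise identical arguments produces the same split every time; only leaving it unset (None) gives a different split on each run.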
随机数的