Python matplotlib plotting tutorial (3): plotting the IRIS dataset

weixin_39849942

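Before we get to today's topic, one piece of background: matplotlib's plotting functions expect numpy.array as input, so it is best to convert your data to numpy.array before plotting; otherwise you may run into unexpected errors.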

For example, to convert a pandas.DataFrame to an np.array:

import numpy as np
import pandas

a = pandas.DataFrame(np.random.rand(4, 5), columns=list('abcde'))
a_asndarray = a.values  # the underlying numpy array

Or to convert an np.matrix to an np.array:

b = np.matrix([[1, 2], [3, 4]])
b_asarray = np.asarray(b)
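As a quick sanity check, the sketch below confirms that both conversions yield plain ndarrays. It uses `DataFrame.to_numpy()`, which pandas recommends over `.values`; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# DataFrame -> ndarray; to_numpy() is the method pandas recommends over .values
frame = pd.DataFrame(np.random.rand(4, 5), columns=list("abcde"))
frame_as_array = frame.to_numpy()

# np.matrix -> ndarray; np.asarray returns a base ndarray, not a matrix subclass
mat = np.matrix([[1, 2], [3, 4]])
mat_as_array = np.asarray(mat)

print(type(frame_as_array).__name__, frame_as_array.shape)
print(type(mat_as_array).__name__, mat_as_array.shape)
```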

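Now, using the techniques from the previous two parts, we will plot the IRIS dataset. Anyone working in data analysis or data mining has probably heard of this dataset, so let's get started.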

First, import the basic plotting packages, load the IRIS dataset, and print it to inspect its structure:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
# print(type(iris))
# iris is a sklearn.utils.Bunch; a Bunch is a Python dictionary that provides attribute-style access.

for key, _ in iris.items():  # list all keys of iris
    print(key)
# Result (the exact keys may differ between scikit-learn versions):
# data
# target
# target_names
# DESCR
# feature_names
# filename

# Print the dataset description
print(iris.DESCR)

Printing DESCR reveals some important information about the dataset:

Iris plants dataset
--------------------

**Data Set Characteristics:**

:Number of Instances: 150 (50 in each of three classes)
:Number of Attributes: 4 numeric, predictive attributes and the class
:Attribute Information:
    - sepal length in cm
    - sepal width in cm
    - petal length in cm
    - petal width in cm
    - class:
        - Iris-Setosa
        - Iris-Versicolour
        - Iris-Virginica

:Summary Statistics:

============== ==== ==== ======= ===== ====================
                Min  Max   Mean    SD   Class Correlation
============== ==== ==== ======= ===== ====================
sepal length:   4.3  7.9   5.84   0.83    0.7826
sepal width:    2.0  4.4   3.05   0.43   -0.4194
petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)
============== ==== ==== ======= ===== ====================

:Missing Attribute Values: None
:Class Distribution: 33.3% for each of 3 classes.

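From this we know that the columns of data are, in order, sepal length, sepal width, petal length, and petal width (this can also be read from feature_names), and that in target 0 stands for Iris-Setosa, 1 for Iris-Versicolour, and 2 for Iris-Virginica (this can also be read from target_names).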
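The mapping from integer codes to class names can be sketched with NumPy indexing. The two arrays below are small stand-ins for `iris.target_names` and a few entries of `iris.target`, written out by hand so the snippet runs without scikit-learn; they are assumptions for illustration, not the full dataset.

```python
import numpy as np

# Assumed stand-ins for iris.target_names and a few iris.target values
target_names = np.array(["setosa", "versicolor", "virginica"])
target = np.array([0, 0, 1, 2, 1])

# Indexing the name array with the code array replaces each code with its name
species = target_names[target]
print(species)
```

With the real dataset the same expression would read `iris.target_names[iris.target]`.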

Here is part of the output for data and target:

print(iris['data'])  # a 2-D array
# array([[5.1, 3.5, 1.4, 0.2], [4.9, 3. , 1.4, 0.2], [4.7, 3.2, 1.3, 0.2], ...])
print(iris['data'].shape)  # (150, 4): 150 samples, 4 feature columns
print(iris['target'])  # a 1-D array
# array([0, 0, ..., 1, 2, ...])
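To confirm the class distribution that DESCR reports, one option is to count the labels with `np.unique`. The `target` array here is a stand-in built to match the description (50 samples per class), so the snippet does not depend on scikit-learn.

```python
import numpy as np

# Stand-in for iris.target: 50 samples for each of the classes 0, 1, 2, as DESCR states
target = np.repeat([0, 1, 2], 50)

# return_counts=True gives the number of occurrences of each distinct label
classes, counts = np.unique(target, return_counts=True)
for cls, count in zip(classes, counts):
    print(cls, count)
```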

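OK, now that we understand the basic data structure, we want a plot that shows how petal length and petal width relate to the species of the flower. The figure we have in mind looks roughly like the one below ^_^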

The figure clearly shows that setosa has the smallest petal width and length, versicolor comes next, and virginica has the largest.

The concrete plotting steps follow, one at a time.

Step 1: get the petal_length and petal_width data

petal_length = iris['data'][:, 2]  # petal_length: index 2 is the third column
# print(petal_length)
petal_width = iris['data'][:, 3]  # petal_width: index 3 is the fourth column
# print(petal_width)
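Column slicing combines naturally with a boolean mask when only one class is of interest. The sketch below uses four illustrative rows laid out like `iris.data` (sepal length, sepal width, petal length, petal width) with matching labels; the rows are a hand-written stand-in, not the full dataset.

```python
import numpy as np

# Illustrative rows in the same column order as iris.data
data = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.3, 3.3, 6.0, 2.5],
])
target = np.array([0, 0, 1, 2])  # matching class codes

petal_length = data[:, 2]  # all rows, third column
petal_width = data[:, 3]   # all rows, fourth column

# A boolean mask selects only the rows whose class code is 0 (setosa)
is_setosa = target == 0
print(petal_length[is_setosa])
print(petal_width[is_setosa])
```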

Step 2: create the figure and axes objects

fig = plt.figure()
ax = fig.add_subplot(111)
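The same figure and axes pair can also be created in one call with `plt.subplots()`, which is a common shorthand for the two lines above. The `Agg` backend line is an assumption added so the sketch runs without a display; remove it to open a window with `plt.show()`.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; remove to view the figure interactively
import matplotlib.pyplot as plt

# With no arguments, subplots() returns a figure holding a single axes
fig, ax = plt.subplots()
print(len(fig.axes))
```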

Step 3: draw the scatter plot, which is the key step

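markers = ['s' if i == 0 else 'o' if i == 1 else 'd' for i in iris.target]
colors = ['pink' if i == 0 else 'skyblue' if i == 1 else 'lightgreen' for i in iris.target]

# scatter accepts only one marker style per call, so the points are drawn one at a time;
# zip walks through the four sequences in parallel
for x, y, c, m in zip(petal_length, petal_width, colors, markers):
    ax.scatter(x, y, c=c, marker=m)  # c is the color; marker is the shape of the point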
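Drawing one point per `scatter` call works, but it creates 150 separate artists. A common alternative, sketched here and not the approach used in this tutorial, is one `scatter` call per class with a boolean mask. The six sample points and the `styles` dictionary are illustrative assumptions, and the `Agg` backend line only keeps the sketch runnable without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; remove to view the figure interactively
import matplotlib.pyplot as plt
import numpy as np

# Illustrative stand-ins for the petal columns and iris.target
petal_length = np.array([1.4, 1.4, 4.7, 4.5, 6.0, 5.1])
petal_width = np.array([0.2, 0.2, 1.4, 1.5, 2.5, 1.9])
target = np.array([0, 0, 1, 1, 2, 2])

# Hypothetical lookup table: class code -> (marker, color), same styles as in the text
styles = {0: ("s", "pink"), 1: ("o", "skyblue"), 2: ("d", "lightgreen")}

fig, ax = plt.subplots()
for cls, (marker, color) in styles.items():
    mask = target == cls  # rows belonging to this class
    ax.scatter(petal_length[mask], petal_width[mask], marker=marker, color=color)

print(len(ax.collections))  # number of scatter artists on the axes
```

This keeps the number of artists equal to the number of classes, which tends to matter more as the dataset grows.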

Step 4: add labels describing the x and y axes

ax.set_xlabel('petal length')
ax.set_ylabel('petal width')

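Final step: draw the legend. Two empty lists, x and y, are defined, so these scatter calls add only legend entries and no extra data points. This is one common way to build a legend; experimenting with it and changing things is the best way to understand it.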

# Build the legend
x = []
y = []
ax.scatter(x, y, marker='s', label='setosa', color='pink')  # legend entry for setosa
ax.scatter(x, y, marker='o', label='versicolor', color='skyblue')  # legend entry for versicolor
ax.scatter(x, y, marker='d', label='virginica', color='lightgreen')  # legend entry for virginica
plt.legend()
plt.show()
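Another way to get legend entries without plotting data is to pass proxy artists to `legend()`. The sketch below builds one `Line2D` handle per class with the same markers and colors; the `styles` list is an illustrative assumption, and this is an alternative to the empty-scatter approach above, not a replacement for it.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; remove to view the figure interactively
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Hypothetical (label, marker, color) triples matching the styles used in the text
styles = [
    ("setosa", "s", "pink"),
    ("versicolor", "o", "skyblue"),
    ("virginica", "d", "lightgreen"),
]

fig, ax = plt.subplots()

# Each handle is an empty line that shows only its marker in the legend
handles = [
    Line2D([], [], linestyle="none", marker=marker, color=color, label=name)
    for name, marker, color in styles
]
legend = ax.legend(handles=handles)
print([text.get_text() for text in legend.get_texts()])
```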

That's all for today. Stay tuned for more ^_^
