Python matplotlib plotting tutorial (3): plotting the IRIS dataset

Latest recommended article published on 2023-08-20 23:21:26

weixin_39566493

Iris dataset, matplotlib, data visualization, scatter plot, numpy

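Before getting to today's main topic, one piece of background: matplotlib's plotting functions expect numpy.array input (or objects that can be passed to numpy.asarray). Array-like classes such as pandas data objects and np.matrix may not behave as intended, so it is best to convert data to numpy.array before plotting; otherwise unexpected errors may occur.

For example, to convert a pandas.DataFrame to an np.array:

import numpy as np
import pandas as pd

a = pd.DataFrame(np.random.rand(4, 5), columns=list('abcde'))
a_asndarray = a.values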

Or to convert an np.matrix to an np.array:

b = np.matrix([[1,2],[3,4]])

b_asarray = np.asarray(b)
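
As a quick check of the two conversions, here is a minimal sketch. The variable names are illustrative and not from the original post, and `DataFrame.to_numpy()` is used as an equivalent of `.values`.

```python
import numpy as np
import pandas as pd

# Illustrative inputs; the names are hypothetical.
df = pd.DataFrame(np.random.rand(4, 5), columns=list("abcde"))
mat = np.matrix([[1, 2], [3, 4]])

df_arr = df.to_numpy()     # same result as df.values
mat_arr = np.asarray(mat)  # plain ndarray instead of np.matrix

# Both results should now be plain numpy arrays.
print(type(df_arr).__name__, df_arr.shape)
print(type(mat_arr).__name__, mat_arr.shape)
```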

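With that out of the way, today we will use the methods from the previous two parts to plot the IRIS dataset. Anyone working in data analysis or data mining has probably heard of this dataset, so let's get started.

First, import the IRIS dataset and the basic plotting packages, then print the data to inspect its structure.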

import matplotlib.pyplot as plt

import pandas as pd

import numpy as np

from sklearn import datasets

iris = datasets.load_iris()

# print(type(iris))  # iris is a sklearn.utils.Bunch: a Python dictionary that provides attribute-style access.

for key in iris.keys():  # list all keys of iris
    print(key)

# Output:

#data

#target

#target_names

#DESCR

#feature_names

#filename

# Print the dataset description (DESCR)

print(iris.DESCR)

Printing DESCR reveals some important information about the dataset:

Iris plants dataset
--------------------

**Data Set Characteristics:**

:Number of Instances: 150 (50 in each of three classes)
:Number of Attributes: 4 numeric, predictive attributes and the class
:Attribute Information:
    - sepal length in cm
    - sepal width in cm
    - petal length in cm
    - petal width in cm
    - class:
        - Iris-Setosa
        - Iris-Versicolour
        - Iris-Virginica

:Summary Statistics:

============== ==== ==== ======= ===== ====================
                Min  Max   Mean    SD   Class Correlation
============== ==== ==== ======= ===== ====================
sepal length:   4.3  7.9   5.84   0.83    0.7826
sepal width:    2.0  4.4   3.05   0.43   -0.4194
petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)
============== ==== ==== ======= ===== ====================

:Missing Attribute Values: None
:Class Distribution: 33.3% for each of 3 classes.

This tells us that the columns of data are, in order, sepal length, sepal width, petal length and petal width (also available from feature_names), and that in target 0 stands for Iris-Setosa, 1 for Iris-Versicolour and 2 for Iris-Virginica (also available from target_names).
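
The same mapping can be read directly from the Bunch. The following is a small sketch that lists each column index with its feature name and each target code with its species name; the loop variable names are illustrative.

```python
from sklearn import datasets

iris = datasets.load_iris()

# Column index -> feature name (the order of the columns in iris.data)
for idx, name in enumerate(iris.feature_names):
    print(idx, name)

# Target code -> species name (the meaning of the values in iris.target)
for code, name in enumerate(iris.target_names):
    print(code, name)
```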

Here is part of the data and target output:

print(iris['data'])  # a 2-D array

# array([[5.1, 3.5, 1.4, 0.2],[4.9, 3. , 1.4, 0.2],[4.7, 3.2, 1.3, 0.2]...])

print(iris['data'].shape)  # (150, 4): 150 samples, 4 feature columns

print(iris['target'])  # a 1-D array

# array([0,0,...,1,2,...])
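
To confirm the shapes and the class balance stated in DESCR, one possible check (a sketch; `X`, `y`, `codes` and `counts` are names chosen here for illustration) uses `np.unique` with `return_counts=True`:

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
X, y = iris.data, iris.target

print(X.shape, y.shape)  # feature matrix and label vector

# Count how many samples carry each target code.
codes, counts = np.unique(y, return_counts=True)
print(dict(zip(codes.tolist(), counts.tolist())))
```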

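OK, now that the basic data structure is clear, we want a plot that shows how petal length and petal width relate to the species. The figure I had in mind looks roughly like this ^_^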

[Figure: scatter plot of petal width against petal length, with a distinct marker and color for each species]

The figure shows clearly that setosa has the smallest petal width and length, versicolor comes next, and virginica has the largest.
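
The ordering can also be checked numerically. As a rough sketch (the `means` dictionary and loop names are illustrative), this computes the mean petal length and width per species:

```python
from sklearn import datasets

iris = datasets.load_iris()
petal = iris.data[:, 2:4]  # columns 2 and 3: petal length, petal width

# Mean (petal length, petal width) for each species
means = {}
for code, name in enumerate(iris.target_names):
    means[str(name)] = petal[iris.target == code].mean(axis=0)

for name, (length, width) in means.items():
    print(f"{name}: mean petal length {length:.2f} cm, mean petal width {width:.2f} cm")
```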

Next, here are the concrete plotting steps, one at a time.

Step 1: get the petal_length and petal_width data

petal_length = iris['data'][:, 2]  # petal_length; index 2 is the third column

# print(petal_length)

petal_width = iris['data'][:, 3]  # petal_width; index 3 is the fourth column

# print(petal_width)

Step 2: create the figure and axes objects

fig = plt.figure()

ax = fig.add_subplot(111)
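
For a single axes, `plt.subplots()` does the same job in one call. This is a small sketch of that alternative, not the form used in the post:

```python
import matplotlib.pyplot as plt

# Creates a figure and one axes, like plt.figure() followed by add_subplot(111).
fig, ax = plt.subplots()
print(len(fig.axes))  # number of axes attached to the figure
```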

Step 3: draw the scatter plot, the key step

markers = ['s' if i == 0 else 'o' if i == 1 else 'd' for i in iris.target]

colors = ['pink' if i == 0 else 'skyblue' if i == 1 else 'lightgreen' for i in iris.target]

for x, y, c, m in zip(petal_length, petal_width, colors, markers):
    ax.scatter(x, y, c=c, marker=m)  # c is the color, marker is the point shape; zip walks the four sequences in step

Step 4: label the x and y axes

ax.set_xlabel('petal length')

ax.set_ylabel('petal width')

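Last step: draw the legend. x and y are two empty lists, so these calls only add legend entries and draw no extra data points. This is a common way to build a legend by hand; experimenting with it is the best way to understand it.

# Build the legend

x = []

y = []

ax.scatter(x, y, marker='s', label='setosa', color='pink')  # legend entry for setosa

ax.scatter(x, y, marker='o', label='versicolor', color='skyblue')  # legend entry for versicolor

ax.scatter(x, y, marker='d', label='virginica', color='lightgreen')  # legend entry for virginica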

plt.legend()

plt.show()
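
As an alternative to plotting point by point and then adding empty scatters for the legend, one possible variant draws one scatter per species and passes `label` directly, so the legend entries come from the real data. This is a sketch rather than the method used above; the `styles` list and the boolean `mask` are names introduced here for illustration, with the same markers and colors as in the steps above.

```python
import matplotlib.pyplot as plt
from sklearn import datasets

iris = datasets.load_iris()
petal_length = iris.data[:, 2]
petal_width = iris.data[:, 3]

# (marker, color) for target codes 0, 1, 2
styles = [("s", "pink"), ("o", "skyblue"), ("d", "lightgreen")]

fig, ax = plt.subplots()
for code, (marker, color) in enumerate(styles):
    mask = iris.target == code  # rows belonging to this species
    ax.scatter(petal_length[mask], petal_width[mask],
               marker=marker, color=color,
               label=str(iris.target_names[code]))

ax.set_xlabel("petal length")
ax.set_ylabel("petal width")
ax.legend()
```

Calling `plt.show()` afterwards displays the figure. This version makes three `scatter` calls instead of 150, which usually renders faster.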

That's all for today. Stay tuned for more ^_^
