Python Data Analysis and Visualization: Chapter 3 Practical Exercise

Latest recommended article published on 2022-07-25 11:16:30

小田月朔一



Category: Python Data Analysis and Visualization. Tags: data analysis, python

Article link: https://blog.csdn.net/m0_66279156/article/details/125172693


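This post walks through processing the iris dataset: importing the data, dropping the index column, defining a data type, and computing statistics. Using the NumPy library, it sorts and de-duplicates the data and computes the sum, mean, standard deviation, minimum, and maximum of the petal length, illustrating basic data cleaning and statistical operations.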
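The statistics named in the summary above can be sketched on a handful of values. This is only an illustration, not the post's own code: the `petal_length` array below is hypothetical sample data, standing in for the petal-length column that the later steps extract from iris.csv.

```python
import numpy as np

# Hypothetical petal lengths (cm); the real values would come from iris.csv.
petal_length = np.array([1.4, 1.4, 1.3, 4.7, 4.5, 6.0])

sorted_vals = np.sort(petal_length)    # ascending copy, original is unchanged
unique_vals = np.unique(petal_length)  # sorted values with duplicates removed

total = np.sum(petal_length)
mean = np.mean(petal_length)
std = np.std(petal_length)             # population standard deviation (ddof=0)
smallest = np.min(petal_length)
largest = np.max(petal_length)

print(sorted_vals)
print(unique_vals)
print(total, mean, std, smallest, largest)
```

Note that `np.std` defaults to the population standard deviation; passing `ddof=1` would give the sample standard deviation instead.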


1. Import modules

import csv
import numpy as np

2. Load the data

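iris_data=[]
# Raw string so the backslashes in the Windows path are not read as escape sequences;
# newline="" is what the csv module recommends when opening files for csv.reader
with open(r"F:\专业课程作业\python时空数据分析与可视化\iris.csv","r",newline="") as f:
    # Read the contents of f with csv.reader
    csv_reader=csv.reader(f)
    # Read the first row, which holds the column titles
    birth_header=next(csv_reader)
    # Store the remaining data rows in the list
    for row in csv_reader:
        iris_data.append(row)
iris_data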
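To see what this step produces without needing the file on disk, here is a minimal sketch that reads the same way from an in-memory string. The three sample rows and the column layout (an index column, four measurements, then the species) are assumptions made for illustration, and `sample_csv`, `header`, and `rows` are hypothetical names.

```python
import csv
import io

# Hypothetical stand-in for iris.csv: index column, four measurements, species.
sample_csv = io.StringIO(
    '"","Sepal.Length","Sepal.Width","Petal.Length","Petal.Width","Species"\n'
    '"1",5.1,3.5,1.4,0.2,"setosa"\n'
    '"2",4.9,3.0,1.4,0.2,"setosa"\n'
    '"3",4.7,3.2,1.3,0.2,"setosa"\n'
)

reader = csv.reader(sample_csv)
header = next(reader)            # consume the header row first
rows = [row for row in reader]   # the remaining rows; every field is a string

print(header)
print(rows[0])
```

Because `csv.reader` returns every field as a string, the numeric columns still have to be converted before any statistics can be computed.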

3. Data cleaning: drop the index column

# 3. Data cleaning: drop the index column
iris_list=[]
for row in iris_data:
    iris_list.append(tuple(row[1:]))
iris_list
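The reason each row is turned into a tuple is that NumPy reads the records of a structured array from tuples. The sketch below shows this on two hypothetical rows; `sample_rows`, `cleaned`, `dt`, and `arr` are illustrative names, and the dtype mirrors the one defined in step 4, written with `np.float64`.

```python
import numpy as np

# Hypothetical rows as csv.reader would return them: index first, all strings.
sample_rows = [
    ["1", "5.1", "3.5", "1.4", "0.2", "setosa"],
    ["2", "4.9", "3.0", "1.4", "0.2", "setosa"],
]

# Drop the leading index field and convert each row to a tuple.
cleaned = [tuple(row[1:]) for row in sample_rows]

# Structured dtype: four float fields and one string field of up to 40 characters.
dt = np.dtype([("Sepal.Length", np.float64),
               ("Sepal.Width", np.float64),
               ("Petal.Length", np.float64),
               ("Petal.Width", np.float64),
               ("Species", np.str_, 40)])

# Each tuple becomes one record; the numeric strings are parsed into floats.
arr = np.array(cleaned, dtype=dt)

print(cleaned[0])
print(arr["Petal.Length"])
```

Once the data is in this form, a whole column can be selected by field name, for example `arr["Petal.Length"]`.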

4. Data statistics

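# 4. Data statistics
# (1) Create the data type
# np.float64 is used because the np.float_ alias was removed in NumPy 2.0
datatype=np.dtype([("Sepal.Length",np.float64),
                   ("Sepal.Width",np.float64),
                   ("Petal.Length",np.float64),
                   ("Petal.Width",np.float64),
                   ("Species",np.str_,40)])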
print(