Introduction to pandas: Commonly Used Functions

Latest recommended article published on 2023-06-05 13:32:04

住手丶让我来

Category: Python. Tags: Introduction to pandas: Commonly Used Functions

Article link: https://blog.csdn.net/weixin_42146366/article/details/90768489

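I. Introduction to pandas

pandas is a tool built on top of NumPy, created for data analysis tasks.
It incorporates a number of libraries and some standard data models, and it provides the tools needed to work efficiently with large datasets.
pandas also offers a large set of functions and methods for processing data quickly and conveniently. It is one of the main reasons Python is a powerful and efficient environment for data analysis.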

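II. Commonly used pandas functions

[1] read_csv(): reads a CSV (comma-separated) file into a DataFrame. It also supports importing only part of a file and iterating over a file in chunks.

Example: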

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\pandaTest.csv")
# print(practice)
print(practice.dtypes)

Output:

first     int64
second    int64
three     int64
four      int64
five      int64
dtype: object
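
The partial import and iteration mentioned in the description of read_csv() can be sketched as follows. This is only a minimal sketch under assumptions: the inline CSV text and its three columns are made up for illustration (they stand in for pandaTest.csv), and io.StringIO is used so the snippet runs without a file on disk. usecols, nrows and chunksize are read_csv() parameters.

```python
import io

import pandas

# Hypothetical stand-in for pandaTest.csv so the example is self-contained.
csv_text = "first,second,three\n1,2,3\n4,5,6\n7,8,9\n10,11,12\n"

# Partial import: keep two columns and read only the first two data rows.
partial = pandas.read_csv(io.StringIO(csv_text), usecols=["first", "three"], nrows=2)
print(partial)

# Iteration: with chunksize, read_csv returns an iterator of DataFrames.
chunk_sizes = []
for chunk in pandas.read_csv(io.StringIO(csv_text), chunksize=3):
    chunk_sizes.append(len(chunk))
print(chunk_sizes)
```

Reading in chunks is mainly useful when a file is too large to load into memory at once.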

[2] head(): returns the first n rows of the DataFrame (5 by default). The row index starts at 0.

Example:

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\pandaTest.csv")
head = practice.head()
print(head)

Output:

   first  second  three  four  five  six  seven  eight  nith  ten
0      1       2      3     4     5    6      7      8     9   10
1      1       2      3     4     5    6      7      8     9   10
2      1       2      3     4     5    6      7      8     9   10
3      1       2      3     4     5    6      7      8     9   10
4      1       2      3     4     5    6      7      8     9   10

[3] tail(): returns the last n rows of the DataFrame (5 by default).

Example:

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\pandaTest.csv")
tail = practice.tail()
print(tail)

Output:

    first  second  three  four  five  six  seven  eight  nith  ten
24      1       2      3     4     5    6      7      8     9   10
25      1       2      3     4     5    6      7      8     9   10
26      1       2      3     4     5    6      7      8     9   10
27      1       2      3     4     5    6      7      8     9   10
28      1       2      3     4     5    6      7      8     9   10
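
Both head() and tail() accept the number of rows as an argument. A small sketch, assuming a made-up ten-row DataFrame rather than the author's CSV file:

```python
import pandas

# Hypothetical data: ten rows with a single column "n" holding 0..9.
frame = pandas.DataFrame({"n": range(10)})

first_three = frame.head(3)  # rows with index 0, 1, 2
last_two = frame.tail(2)     # rows with index 8, 9
print(first_three)
print(last_two)
```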

[4] shape: an attribute (not a function) that gives the dimensions of the DataFrame as a (rows, columns) tuple.

Example:

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\pandaTest.csv")
shape = practice.shape
print(shape)

Output:

(29, 10)

[5] columns: an attribute that holds the column names of the DataFrame.

Example:

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\pandaTest.csv")
col = practice.columns
print(col)

Output:

Index(['first', 'second', 'three', 'four', 'five', 'six', 'seven', 'eight',
       'nith', 'ten'],
      dtype='object')

[6] dtypes: an attribute that gives the data type of each column.

Example:

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\pandaTest.csv")
print(practice.dtypes)

Output:

first     int64
second    int64
three     int64
four      int64
five      int64
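
shape, columns and dtypes are attributes, so they are read without call parentheses. The sketch below shows all three on one small DataFrame; the data is made up for illustration and only borrows two column names from the article.

```python
import pandas

# Hypothetical two-column frame: one integer column, one float column.
frame = pandas.DataFrame({"first": [1, 2, 3], "six(g)": [6.1, 6.2, 6.3]})

rows, cols = frame.shape  # attribute access: no parentheses
print(rows, cols)
print(list(frame.columns))
print(frame.dtypes["first"], frame.dtypes["six(g)"])
```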

[7] tolist(): converts an Index or a Series (here, the column names) into a Python list.

Example:

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\pandaTest.csv")
colList = practice.columns.tolist()
print(colList)

Output:

['first', 'second', 'three', 'four', 'five', 'six(g)', 'seven', 'eight', 'nith(mg)', 'ten']
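
tolist() is available on Index and Series objects, and on the NumPy array behind a DataFrame. A minimal sketch with made-up data, showing the three common cases:

```python
import pandas

# Hypothetical frame for illustration.
frame = pandas.DataFrame({"first": [1, 2], "second": [3, 4]})

column_names = frame.columns.tolist()   # Index -> list of labels
first_values = frame["first"].tolist()  # Series -> list of values
all_rows = frame.values.tolist()        # whole frame -> list of row lists
print(column_names, first_values, all_rows)
```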

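[8] endswith(): a Python string method that checks whether a string ends with a given suffix. Here it is used to pick out column names.

Example: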

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\pandaTest.csv")
colList = practice.columns.tolist()
print(colList)
newColumns = []
for col in colList:
    if col.endswith("(g)"):
        newColumns.append(col)
print(newColumns)
print(practice[newColumns])

Output:

['first', 'second', 'three', 'four', 'five', 'six(g)', 'seven', 'eight', 'nith(mg)', 'ten']
['six(g)']
    six(g)
0      6.1
1      6.2
2      6.3
3      6.0
4      6.0
5      6.0
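
The loop above can also be written with the vectorized string methods on the column Index. This is a sketch of an alternative, not the author's method; the DataFrame is made up and only reuses three of the article's column names.

```python
import pandas

# Hypothetical frame that reuses three of the article's column names.
frame = pandas.DataFrame(
    {"first": [1.1, 1.2], "six(g)": [6.1, 6.2], "nith(mg)": [9.1, 9.2]}
)

# Vectorized suffix test on the column labels; returns a boolean array.
mask = frame.columns.str.endswith("(g)")
gram_columns = frame.loc[:, mask]
print(gram_columns.columns.tolist())
```

Note that "nith(mg)" is not selected, because it ends with "(mg)" rather than "(g)".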

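[9] sort_values(): sorts a DataFrame by the values of a column.

ascending: defaults to True (ascending order); pass False for descending order.
inplace: defaults to False, which returns a new sorted DataFrame and leaves the original unchanged. With True, the original DataFrame is sorted in place and the method returns None.

Example: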

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\pandaTest.csv")
newPractice = practice.sort_values("first",inplace=False,ascending=False)
print(practice)
print("#########")
print(newPractice)

Output:

    first  second  three  four  five  six(g)  seven  eight  nith(mg)   ten
0     1.1     2.1    3.1   4.1   5.1     6.1    7.1    8.1       9.1  10.1
1     1.2     2.2    3.2   4.2   5.2     6.2    7.2    8.2       9.2  10.2
2     1.3     2.3    3.3   4.3   5.3     6.3    7.3    8.3       9.3  10.3
#########
    first  second  three  four  five  six(g)  seven  eight  nith(mg)   ten
29   12.0    22.0    NaN   NaN   NaN     NaN    NaN    NaN       NaN   NaN
2     1.3     2.3    3.3   4.3   5.3     6.3    7.3    8.3       9.3  10.3
1     1.2     2.2    3.2   4.2   5.2     6.2    7.2    8.2       9.2  10.2
0     1.1     2.1    3.1   4.1   5.1     6.1    7.1    8.1       9.1  10.1
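
sort_values() also accepts a list of columns and a na_position argument that controls where missing values go. A minimal sketch, assuming a made-up DataFrame with one missing value:

```python
import pandas

# Hypothetical data with one missing value in "first".
frame = pandas.DataFrame(
    {"first": [1.2, None, 1.1, 1.2], "second": [2.0, 9.0, 5.0, 1.0]}
)

# Sort by "first" descending, break ties by "second" ascending,
# and place rows whose "first" is NaN at the top.
ordered = frame.sort_values(
    ["first", "second"], ascending=[False, True], na_position="first"
)
print(ordered)
```

Because inplace is left at its default, frame itself keeps its original row order.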

[10] isnull(): checks each value for missing data and returns True where the value is null (NaN) and False otherwise.

Example:

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\pandaTest.csv")
print(practice["first"].isnull())
# print(practice.isnull())

Output:

23    False
24    False
25     True
26    False
27    False
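
The boolean result of isnull() is often used to count or select missing values. A small sketch with made-up data:

```python
import pandas

# Hypothetical data with missing values in both columns.
frame = pandas.DataFrame({"first": [1.0, None, 3.0], "second": [None, None, 6.0]})

# Summing booleans counts the True values, i.e. the missing entries per column.
missing_per_column = frame.isnull().sum()
# The boolean Series can also be used as a row filter.
rows_missing_first = frame[frame["first"].isnull()]
print(missing_per_column)
print(rows_missing_first)
```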

[11] len(): Python's built-in function; applied to a DataFrame, it returns the number of rows.

Example:

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\pandaTest.csv")
print(len(practice))

Output: the number of rows in the DataFrame, as an integer.
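
For comparison, the sketch below contrasts len() with shape[0] and with count(), which skips missing values. The one-column DataFrame is made up for illustration.

```python
import pandas

# Hypothetical column with one missing value.
frame = pandas.DataFrame({"first": [1.0, None, 3.0]})

row_count = len(frame)             # counts every row, including NaN rows
non_null = frame["first"].count()  # counts only non-missing values
print(row_count, frame.shape[0], non_null)
```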

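[12] dropna(): cleans data by removing entries that contain NaN.

axis=0 means that rows (rather than columns) are dropped.
subset=["first"] names the columns to check: a row is dropped when its value in "first" is NaN.

Example: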

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\pandaTest.csv")
drop = practice.dropna(axis=0,subset=["first"])
print(drop)

Output:

    first  second  three  four  five  six(g)  seven  eight  nith(mg)   ten
1     1.2     2.2    3.2   4.2   5.2     6.2    7.2    8.2       9.2  10.2
2     1.3     2.3    3.3   4.3   5.3     6.3    7.3    8.3       9.3  10.3
3     1.0     2.0    3.0   4.0   5.0     6.0    7.0    8.0       9.0  10.0
4     1.0     2.0    3.0   4.0   5.0     6.0    7.0    8.0       9.0  10.0
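
The effect of subset is easiest to see next to a call without it. A minimal sketch, assuming a made-up DataFrame with missing values in two columns:

```python
import pandas

# Hypothetical data: row 0 lacks "second", row 1 lacks "first".
frame = pandas.DataFrame({"first": [1.0, None, 3.0], "second": [None, 5.0, 6.0]})

by_subset = frame.dropna(axis=0, subset=["first"])  # only "first" is checked
any_missing = frame.dropna(axis=0, how="any")       # drop rows with any NaN
print(by_subset)
print(any_missing)
```

dropna() returns a new DataFrame here, so frame still contains all three rows.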

[13] to_datetime(): converts a column of the data into standard datetime values.

Example:

import pandas
practice = pandas.read_csv("C:\\Users\\Lenovo\\Desktop\\practice.csv")
# Parse the "Data" column into datetime values
dates = pandas.to_datetime(practice["Data"])
print(dates)

Output:

0    1998-06-05
1    1998-06-06
2    1998-06-07
3    1998-06-08
4    1998-06-09
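
When the input may contain malformed dates, to_datetime() can be given an explicit format and told to coerce failures to NaT. This is a sketch under assumptions: the Series below is made up, and its second entry is deliberately invalid.

```python
import pandas

# Hypothetical date strings; the second one is deliberately invalid.
raw = pandas.Series(["1998-06-05", "not a date", "1998-06-07"])

# errors="coerce" turns unparseable values into NaT instead of raising.
dates = pandas.to_datetime(raw, format="%Y-%m-%d", errors="coerce")
print(dates)

# The .dt accessor exposes date parts of the parsed values.
years = dates.dropna().dt.year.tolist()
print(years)
```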
