Using pandas

伍肆柒547

Category: python. Tags: pandas, python

Article link: https://blog.csdn.net/qq_37102150/article/details/119453798

Differences from NumPy

NumPy arrays have one dtype for the entire array, while pandas DataFrames have one dtype per column. pandas is built on top of NumPy and makes NumPy-centric applications simpler to write.
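The dtype difference can be checked directly. The following is a minimal sketch; the column names `count`, `price` and `name` are made up for illustration.

```python
import numpy as np
import pandas as pd

# A NumPy array holds a single dtype: mixing ints and floats upcasts everything.
arr = np.array([1, 2.5, 3])
print(arr.dtype)  # float64

# A DataFrame keeps a separate dtype for each column (hypothetical columns).
df = pd.DataFrame(
    {"count": [1, 2, 3], "price": [0.5, 1.5, 2.5], "name": ["a", "b", "c"]}
)
print(df.dtypes)
```

Here `count` stays an integer column and `price` a float column, even though they live in the same table.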

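Importing data with pandas

Importing all the data

pd.read_csv(filename): import data from a CSV file

pd.read_excel(filename): import data from an Excel file

pd.read_table(filename): import data from a delimited text file (the default delimiter is a tab)

import pandas as pd

# Pass the path directly and give the delimiter by keyword (sep=",");
# recent pandas versions no longer accept it as a positional argument.
content = pd.read_table("C:/Users/Thinkpad/Desktop/Data/信息表.txt", sep=",", encoding="utf8")
print(content)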
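To try these readers without a file on disk, the text can be wrapped in `io.StringIO`. This is only a sketch: the sample data below is invented, and it assumes the file-like object stands in for a real path.

```python
import io
import pandas as pd

# Hypothetical sample data, comma-separated and tab-separated.
csv_text = "name,age\nAlice,30\nBob,25\n"
tsv_text = "name\tage\nAlice\t30\nBob\t25\n"

# read_csv defaults to a comma delimiter.
df_csv = pd.read_csv(io.StringIO(csv_text))

# read_table defaults to a tab delimiter, so a comma must be passed via sep=.
df_table = pd.read_table(io.StringIO(csv_text), sep=",")
df_tab = pd.read_table(io.StringIO(tsv_text))

print(df_csv)
```

All three calls should produce the same two-row table, which shows that `read_table` is essentially `read_csv` with a different default delimiter.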

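Reading the first 5 rows

Method 1: parse only the first 5 rows of the file

data = pd.read_csv('data.csv', nrows=5)

Method 2: load the whole file, then take the first rows with head()

titanic = pd.read_csv("data/titanic.csv")

titanic.head(5)  # head() returns 5 rows by default; pass another number, e.g. head(8), for more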
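The two methods give the same preview but differ in how much of the file is parsed. A small sketch with generated data (the `id` and `value` columns are made up) illustrates this:

```python
import io
import pandas as pd

# Build a hypothetical 20-row CSV in memory.
csv_text = "id,value\n" + "".join(f"{i},{i * 10}\n" for i in range(20))

# Method 1: stop parsing after 5 data rows.
first_five = pd.read_csv(io.StringIO(csv_text), nrows=5)

# Method 2: parse everything, then slice off the first 5 rows.
full = pd.read_csv(io.StringIO(csv_text))
preview = full.head()

print(len(first_five), len(full), len(preview))  # 5 20 5
```

For large files, `nrows` is usually the cheaper option because the rest of the file is never read.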

Selecting specific rows and columns

https://pandas.pydata.org/docs/getting_started/intro_tutorials/03_subset_data.html

When selecting subsets of data, square brackets [] are used.
Inside these brackets, you can use a single column/row label, a list of column/row labels, a slice of labels, a conditional expression or a colon.
Select specific rows and/or columns using loc when using the row and column names (loc is label-based indexing).
Select specific rows and/or columns using iloc when using the positions in the table (iloc is purely position-based indexing).
You can assign new values to a selection based on loc/iloc
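The selection rules above can be sketched on a tiny DataFrame. The data and the index labels `a`, `b`, `c` are invented for illustration.

```python
import pandas as pd

# Hypothetical data with string row labels so loc and iloc are easy to tell apart.
df = pd.DataFrame(
    {"Name": ["Ann", "Bob", "Cid"], "Age": [22, 35, 58], "Sex": ["F", "M", "M"]},
    index=["a", "b", "c"],
)

names = df["Name"]                         # single column label -> Series
subset = df[["Name", "Age"]]               # list of column labels -> DataFrame
older = df[df["Age"] > 30]                 # conditional expression filters rows

by_label = df.loc[df["Age"] > 30, "Name"]  # loc: rows by condition, column by name
by_pos = df.iloc[0:2, 0:2]                 # iloc: first two rows, first two columns

df.iloc[0, 1] = 23                         # assign through a position-based selection
df.loc["c", "Name"] = "anonymous"          # assign through a label-based selection
print(df)
```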

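df.index: view the row labels (the index)

df.columns: view the column labels

Rename column labels with DataFrame.rename, for example df.rename(columns={'old': 'new'})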
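A short sketch of these three attributes and methods, using made-up column names `a` and `b`:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})  # hypothetical data

print(df.index)    # row labels
print(df.columns)  # column labels

# rename returns a new DataFrame by default; the original is left unchanged.
renamed = df.rename(columns={"a": "alpha", "b": "beta"})
print(renamed.columns)
```

Because `rename` does not modify `df` in place by default, the result has to be assigned to a name to be kept.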

A detailed guide to merge

https://blog.csdn.net/Asher117/article/details/84725199
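For readers who do not follow the link, here is a minimal sketch of `pd.merge`. It is not taken from the linked article; the tables and the `key` column are invented for illustration.

```python
import pandas as pd

# Two hypothetical tables sharing a "key" column.
left = pd.DataFrame({"key": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"key": [2, 3, 4], "score": [80, 90, 70]})

# Inner join (the default): keep only keys present in both tables.
inner = pd.merge(left, right, on="key")

# Outer join: keep every key, filling missing values with NaN.
outer = pd.merge(left, right, on="key", how="outer")

print(inner)
print(outer)
```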

How to swap or reorder columns in pandas

import pandas as pd    
employee = {'EmployeeID' : [0,1,2],
     'FirstName' : ['a','b','c'],
     'LastName' : ['a','b','c'],
     'MiddleName' : ['a','b', None],
     'Contact' : ['(M) 133-245-3123', '(F)a123@gmail.com', '(F)312-533-2442 jimmy234@gmail.com']}

df = pd.DataFrame(employee)

A basic approach is to list the columns in the order you want and reindex:

neworder = ['EmployeeID', 'FirstName', 'MiddleName', 'LastName', 'Contact']
df = df.reindex(columns=neworder)

Swapping two columns

cols = list(df.columns)
# Find the positions of the two columns, then swap them in the list
a, b = cols.index('LastName'), cols.index('MiddleName')
cols[b], cols[a] = cols[a], cols[b]
df = df[cols]
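The swap can be wrapped in a small helper so it is reusable. `swap_columns` is a hypothetical name, not a pandas function, and the DataFrame below is a trimmed copy of the employee example.

```python
import pandas as pd


def swap_columns(df, first, second):
    """Return a copy of df with the positions of two columns exchanged."""
    cols = list(df.columns)
    i, j = cols.index(first), cols.index(second)
    cols[i], cols[j] = cols[j], cols[i]
    return df[cols]


df = pd.DataFrame(
    {
        "EmployeeID": [0, 1],
        "FirstName": ["a", "b"],
        "LastName": ["a", "b"],
        "MiddleName": ["a", None],
    }
)

swapped = swap_columns(df, "LastName", "MiddleName")
print(list(swapped.columns))
# ['EmployeeID', 'FirstName', 'MiddleName', 'LastName']
```

`cols.index` raises `ValueError` if a column name is missing, which is a reasonable failure mode for a helper like this.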

Reordering columns with two swaps

cols = list(df.columns)
a, b, c, d = cols.index('LastName'), cols.index('MiddleName'), cols.index('Contact'), cols.index('EmployeeID')
# Swap LastName with MiddleName, and Contact with EmployeeID, in one assignment
cols[a], cols[b], cols[c], cols[d] = cols[b], cols[a], cols[d], cols[c]
df = df[cols]

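Swapping multiple columns

At this point it comes down to list slicing:

cols = list(df.columns)
# Columns at odd positions first, followed by columns at even positions
cols = cols[1::2] + cols[::2]
df = df[cols]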
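To see what the slice expression produces, here is a sketch that applies it to the employee columns in their original order. The single row of values is a placeholder.

```python
import pandas as pd

# One placeholder row; only the column order matters here.
df = pd.DataFrame(
    [[0, "a", "a", "a", "(M) 133-245-3123"]],
    columns=["EmployeeID", "FirstName", "LastName", "MiddleName", "Contact"],
)

cols = list(df.columns)
odd = cols[1::2]    # positions 1, 3 -> ['FirstName', 'MiddleName']
even = cols[::2]    # positions 0, 2, 4 -> ['EmployeeID', 'LastName', 'Contact']
reordered = df[odd + even]

print(list(reordered.columns))
# ['FirstName', 'MiddleName', 'EmployeeID', 'LastName', 'Contact']
```

The result depends on the starting order, so running the same slice after an earlier reorder gives a different arrangement.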
