Easy data type conversion with pandas

大Py

Article link: https://blog.csdn.net/a_pinkpig/article/details/106095116

First, a quick look at the pandas data types:

Pandas dtype	Python type	NumPy type	Usage
object	str or mixed	string_, unicode_, mixed types	Text or mixed numeric and non-numeric values
int64	int	int_, int8, int16, int32, int64, uint8, uint16, uint32, uint64	Integer numbers
float64	float	float_, float16, float32, float64	Floating point numbers
bool	bool	bool_	True/False values
datetime64	NA	datetime64[ns]	Date and time values
timedelta64[ns]	NA	NA	Differences between two datetimes
category	NA	NA	Finite list of text values

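Note: the object dtype can hold any Python object, strings included, so an object column is not guaranteed to contain only text. For that reason pandas 1.0.0 added a new dtype, StringDtype, dedicated to string data, which is a welcome improvement.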
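To make the difference concrete, here is a minimal sketch that contrasts an object column with one that explicitly requests the string dtype. The sample values and the variable names mixed and names are made up for illustration.

```python
import pandas as pd

# An object column can mix arbitrary Python objects, not just text.
mixed = pd.Series(["Jack", 1, 2.5])

# Explicitly request the dedicated string dtype (available since pandas 1.0).
names = pd.Series(["Jack", "Tony", None], dtype="string")

print(mixed.dtype)                              # dtype of the mixed column
print(isinstance(names.dtype, pd.StringDtype))  # is it the dedicated string dtype?
print(names.isna().tolist())                    # None is stored as a missing value
```

With the string dtype, a missing entry is tracked as a missing value rather than as an arbitrary Python object sitting in the column.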

Two important conversion methods
1. astype

DataFrame.astype(dtype, copy=True, errors='raise')
or
Series.astype(dtype, copy=True, errors='raise')

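astype can cast a whole object to a single dtype, or, for a DataFrame, take a dict whose keys are column names and whose values are the target dtypes.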
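The dict form can be sketched as follows. The column names and the chosen target dtypes are only illustrative; columns that are not listed in the dict keep their dtype.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [0.55, 0.66, 1.55],
    "c": ["Jack", "Tony", "Posi"],
})

# Map column name -> target dtype; column "c" is not listed and is left alone.
converted = df.astype({"a": np.int32, "b": "float32"})

print(converted.dtypes)
```

astype returns a new object, so the result has to be assigned if the converted data is to be kept.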

2. convert_dtypes

DataFrame.convert_dtypes(infer_objects=True, convert_string=True, convert_integer=True, convert_boolean=True)
or
Series.convert_dtypes(infer_objects=True, convert_string=True, convert_integer=True, convert_boolean=True)

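convert_dtypes infers, for each column, the best dtype that supports pd.NA and converts to it. In my experience it is mostly useful for recognizing strings. For integers it keeps the existing width rather than downcasting, so an int64 column simply becomes the nullable Int64 and memory use stays high.
Example: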

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [0.55, 0.66, 1.55], 'c': ['Jack', 'Tony', 'Posi']})
df.dtypes
a      int64
b    float64
c     object
dtype: object

df['a'] = df['a'].astype(np.int32)
df.dtypes
a      int32
b    float64
c     object
dtype: object

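df.convert_dtypes().dtypes
a      Int32
b    float64
c     string
dtype: object

Because column a was cast to int32 above, convert_dtypes turns it into the nullable Int32; an untouched int64 column would become Int64. The output shown is from pandas 1.0. From pandas 1.2 on, convert_dtypes also has a convert_floating option that defaults to True, so b becomes Float64 there.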
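The automatic inference matters most once a column contains missing values. The following is a small sketch with made-up data: a float column that only holds whole numbers plus NaN, an object column of names with a None, and an object column of booleans with a None.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, 2.0, np.nan],       # whole numbers stored as float because of NaN
    "c": ["Jack", None, "Posi"],   # text with a missing entry
    "d": [True, False, None],      # booleans with a missing entry
})

out = df.convert_dtypes()

print(out.dtypes)
print(out["a"].isna().tolist())
```

Here convert_dtypes moves each column to a nullable extension dtype, so the missing entries no longer force the integers into float or the booleans into object.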

Reference:
https://pandas.pydata.org/pandas-docs/stable/getting_started/basics.html#basics-dtypes
