What does pd mean in Python, and when should you use pd.to_numeric versus astype(np.float64) in Python?...

I have a pandas DataFrame object named xiv which has a column of int64 Volume measurements.

In[]: xiv['Volume'].head(5)

Out[]:

0    252000
1    484000
2     62000
3    168000
4    232000
Name: Volume, dtype: int64

I have read other posts (like this and this) that suggest the following solutions. But when I use either approach, it doesn't appear to change the dtype of the underlying data:

In[]: xiv['Volume'] = pd.to_numeric(xiv['Volume'])

In[]: xiv['Volume'].dtypes

Out[]:

dtype('int64')

Or...

In[]: xiv['Volume'] = pd.to_numeric(xiv['Volume'])

Out[]: ###omitted for brevity###

In[]: xiv['Volume'].dtypes

Out[]:

dtype('int64')

In[]: xiv['Volume'] = xiv['Volume'].apply(pd.to_numeric)

In[]: xiv['Volume'].dtypes

Out[]:

dtype('int64')

I've also tried making a separate pandas Series, applying the methods listed above to that Series, and reassigning the result to the xiv['Volume'] object, which is a pandas.core.series.Series object.

I have, however, found a solution to this problem using the numpy package's float64 type - this works but I don't know why it's different.

In[]: xiv['Volume'] = xiv['Volume'].astype(np.float64)

In[]: xiv['Volume'].dtypes

Out[]:

dtype('float64')

Can someone explain how to accomplish with the pandas library what the numpy library seems to do easily with its float64 class; that is, convert the column in the xiv DataFrame to float64 in place?

Solution

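If the column already has a numeric dtype (int8|16|32|64, float64, boolean), you can convert it to another numeric dtype with the pandas .astype() method. pd.to_numeric() leaves such a column unchanged because its data is already numeric, which is why the dtype stayed int64 in the question.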

Demo:

In [90]: df = pd.DataFrame(np.random.randint(10**5,10**7,(5,3)),columns=list('abc'), dtype=np.int64)

In [91]: df

Out[91]:

         a        b        c
0  9059440  9590567  2076918
1  5861102  4566089  1947323
2  6636568   162770  2487991
3  6794572  5236903  5628779
4   470121  4044395  4546794

In [92]: df.dtypes

Out[92]:

a int64

b int64

c int64

dtype: object

In [93]: df['a'] = df['a'].astype(float)

In [94]: df.dtypes

Out[94]:

a float64

b int64

c int64

dtype: object
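The situation from the question can be reproduced with a minimal sketch. The xiv DataFrame below is a hypothetical stand-in built from the five Volume values shown in the question, not the asker's real data. It suggests that pd.to_numeric() returns an already-numeric column unchanged, while .astype() converts to whatever dtype is requested:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's xiv DataFrame (first five Volume values).
xiv = pd.DataFrame({"Volume": [252000, 484000, 62000, 168000, 232000]}, dtype=np.int64)

# The column is already numeric, so pd.to_numeric has nothing to convert.
unchanged = pd.to_numeric(xiv["Volume"])
print(unchanged.dtype)

# astype asks for a specific target dtype, so the conversion actually happens.
xiv["Volume"] = xiv["Volume"].astype("float64")
print(xiv["Volume"].dtype)
```

Passing "float64", float, or np.float64 to .astype() should all select the same 64-bit float dtype here.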

This does not work for object (string) columns that contain values which cannot be converted to numbers:

In [95]: df.loc[1, 'b'] = 'XXXXXX'

In [96]: df

Out[96]:

           a        b        c
0  9059440.0  9590567  2076918
1  5861102.0   XXXXXX  1947323
2  6636568.0   162770  2487991
3  6794572.0  5236903  5628779
4   470121.0  4044395  4546794

In [97]: df.dtypes

Out[97]:

a float64

b object

c int64

dtype: object

In [98]: df['b'].astype(float)

...

skipped

...

ValueError: could not convert string to float: 'XXXXXX'
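The failure above can be handled explicitly. The following is a small sketch with hypothetical Series (b and parseable are illustrative names, not part of the original demo). It shows that .astype(float) copes with object columns whose strings all look like numbers, and raises ValueError as soon as one value cannot be parsed:

```python
import pandas as pd

# Hypothetical object column whose strings can all be parsed as numbers.
parseable = pd.Series(["1.5", "2"], dtype=object)
print(parseable.astype(float).dtype)

# Hypothetical object column with one value that is not a number.
b = pd.Series([9590567, "XXXXXX", 162770], dtype=object)

try:
    b.astype(float)
    failed = False
except ValueError as exc:
    # astype is all-or-nothing: one bad value aborts the whole conversion.
    failed = True
    print(f"astype(float) failed: {exc}")
```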

In that case, use the pd.to_numeric() function with errors='coerce', which replaces every value that cannot be parsed with NaN:

In [99]: df['b'] = pd.to_numeric(df['b'], errors='coerce')

In [100]: df

Out[100]:

           a          b        c
0  9059440.0  9590567.0  2076918
1  5861102.0        NaN  1947323
2  6636568.0   162770.0  2487991
3  6794572.0  5236903.0  5628779
4   470121.0  4044395.0  4546794

In [101]: df.dtypes

Out[101]:

a float64

b float64

c int64

dtype: object
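Because errors='coerce' silently turns bad values into NaN, it can be useful to check which rows were affected. This is a sketch under the assumption that the column resembles column b from the demo; the names converted and bad_rows are illustrative only:

```python
import pandas as pd

# Hypothetical column modelled on column b of the demo above.
b = pd.Series([9590567, "XXXXXX", 162770, 5236903, 4044395], dtype=object)

# Unparseable values become NaN, which forces the result to float64.
converted = pd.to_numeric(b, errors="coerce")
print(converted.dtype)

# Values that were present before but are NaN now are the ones that failed to parse.
bad_rows = b[converted.isna() & b.notna()]
print(bad_rows)
```

Inspecting bad_rows before overwriting the original column makes it easier to tell data-entry errors apart from values that were genuinely missing.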
