python判断变量是否为nan_高效地检查Python / numpy / pandas中的任意对象是否为NaN?...

My numpy arrays use np.nan to designate missing values. As I iterate over the data set, I need to detect such missing values and handle them in special ways.

Naively I used numpy.isnan(val), which works well unless val isn't among the subset of types supported by numpy.isnan(). For example, missing data can occur in string fields, in which case I get:

>>> np.isnan('some_string')

Traceback (most recent call last):

File "", line 1, in

TypeError: Not implemented for this type

Other than writing an expensive wrapper that catches the exception and returns False, is there a way to handle this elegantly and efficiently?

解决方案

pandas.isnull() (also pd.isna(), in newer versions) checks for missing values in both numeric and string/object arrays. From the documentation, it checks for:

NaN in numeric arrays, None/NaN in object arrays

Quick example:

import pandas as pd

import numpy as np

s = pd.Series(['apple', np.nan, 'banana'])

pd.isnull(s)

Out[9]:

0 False

1 True

2 False

dtype: bool

The idea of using numpy.nan to represent missing values is something that pandas introduced, which is why pandas has the tools to deal with it.

Datetimes too (if you use pd.NaT you won't need to specify the dtype)

In [24]: s = Series([Timestamp('20130101'),np.nan,Timestamp('20130102 9:30')],dtype='M8[ns]')

In [25]: s

Out[25]:

0 2013-01-01 00:00:00

1 NaT

2 2013-01-02 09:30:00

dtype: datetime64[ns]``

In [26]: pd.isnull(s)

Out[26]:

0 False

1 True

2 False

dtype: bool

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值