[Python] Removing NaN from a list / DataFrame / pandas

Latest recommended article published 2024-07-06 16:31:56

License: CC 4.0 BY-SA

Tags: #python #pandas #numpy #remove-nan #list


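This article shows how to remove NaN values from a list when working with pandas. It covers several approaches: the dropna() method, NumPy's isnan() function, and a pure-Python implementation. Each suits a different situation, and all of them leave the remaining values unchanged.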

0. Problem description

While working on a project with pandas, I ran into a problem. I have a list that contains a nan value, but I cannot remove it:

incoms=data['int_income'].unique().tolist()
incoms.remove('nan')

Error:
ValueError: list.remove(x): x not in list

The list incoms looks like this:

[75000.0, 50000.0, 0.0, 200000.0, 100000.0, 25000.0, nan, 10000.0, 175000.0, 150000.0, 125000.0]
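To make the cause of the error concrete, here is a minimal sketch that reproduces it. The small DataFrame is a hypothetical stand-in for the article's `data`, and the names `nan_items` and `error_message` are introduced only for illustration. It shows that the missing value in the list is a float NaN rather than the string 'nan', which is why `remove('nan')` finds nothing.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the article's `data` DataFrame.
data = pd.DataFrame({"int_income": [75000.0, 50000.0, np.nan, 0.0, 50000.0]})

incoms = data["int_income"].unique().tolist()

# NaN is the only float that is not equal to itself, so `x != x` detects it.
nan_items = [x for x in incoms if isinstance(x, float) and x != x]

# Removing the *string* 'nan' fails because no element equals that string.
try:
    incoms.remove("nan")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(incoms)
print(len(nan_items))
print(error_message)
```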

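1. Solutions:

The list holds the float NaN, not the string 'nan', so remove('nan') has nothing to match. Drop the NaN values before building the list: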

incoms = data['int_income'].dropna().unique().tolist()
print(incoms)
[75000.0, 50000.0, 0.0, 200000.0, 100000.0, 25000.0, 10000.0, 175000.0, 150000.0, 125000.0]

If all the values are whole numbers, you can also cast them to int:

incoms = data['int_income'].dropna().astype(int).unique().tolist()
print(incoms)
[75000, 50000, 0, 200000, 100000, 25000, 10000, 175000, 150000, 125000]
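The order of the calls matters here: NaN has no integer representation, so the cast only works once the missing values are gone. The sketch below illustrates this on a hypothetical sample DataFrame (not the article's real data); `cast_failed` is a name chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the article's `data`.
data = pd.DataFrame({"int_income": [75000.0, np.nan, 50000.0, 75000.0, 0.0]})

# Casting a column that still contains NaN raises an error
# (pandas raises a ValueError or a subclass of it).
try:
    data["int_income"].astype(int)
    cast_failed = False
except ValueError:
    cast_failed = True

# Dropping NaN first makes the cast safe; unique() keeps first-seen order.
incoms = data["int_income"].dropna().astype(int).unique().tolist()

print(cast_failed)  # True
print(incoms)       # [75000, 50000, 0]
```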

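Alternatively, remove NaN by selecting only the non-NaN values with numpy.isnan:

import numpy as np

a = data['int_income'].unique()
# ~np.isnan(a) is a boolean mask that is True for every non-NaN element
incoms = a[~np.isnan(a)].tolist()
print(incoms)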
[75000.0, 50000.0, 0.0, 200000.0, 100000.0, 25000.0, 10000.0, 175000.0, 150000.0, 125000.0]

As integers:

a = data['int_income'].unique()
incoms = a[~np.isnan(a)].astype(int).tolist()
print(incoms)
[75000, 50000, 0, 200000, 100000, 25000, 10000, 175000, 150000, 125000]
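As a rough illustration of how the mask works, the sketch below applies `np.isnan` to a small hand-made array (an assumption, not the article's data). It also shows one caveat worth knowing: `np.isnan` only accepts numeric arrays, so for an object-dtype array a more general check such as `pd.isna` is needed. The variable names are hypothetical.

```python
import numpy as np
import pandas as pd

a = np.array([75000.0, 50000.0, np.nan, 0.0])

mask = np.isnan(a)            # True where the element is NaN
incoms = a[~mask].tolist()    # keep only the elements where the mask is False

# np.isnan does not support object-dtype arrays (e.g. mixed strings and floats).
mixed = np.array([1.0, "x", np.nan], dtype=object)
try:
    np.isnan(mixed)
    isnan_failed = False
except TypeError:
    isnan_failed = True

# pd.isna handles object arrays element by element.
mixed_mask = pd.isna(mixed)

print(mask.tolist())        # [False, False, True, False]
print(incoms)               # [75000.0, 50000.0, 0.0]
print(isnan_failed)         # True
print(mixed_mask.tolist())  # [False, False, True]
```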

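A pure Python solution, which is slower if the DataFrame is large:

import pandas as pd

# set() removes duplicates but does not preserve the original order
incoms = [x for x in set(data['int_income']) if pd.notnull(x)]
print(incoms)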
[0.0, 100000.0, 200000.0, 25000.0, 125000.0, 50000.0, 10000.0, 150000.0, 175000.0, 75000.0]

As integers:

incoms = [int(x) for x in set(data['int_income']) if pd.notnull(x)]
print(incoms)
[0, 100000, 200000, 25000, 125000, 50000, 10000, 150000, 175000, 75000]
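The set-based version returns the values in a different order from the column, because a set does not keep insertion order. If order matters, one possible variant (a sketch using only the standard library, with a hypothetical `values` list in place of the column) filters with `math.isnan` and deduplicates with `dict.fromkeys`, which keeps the first occurrence of each value.

```python
import math

# Hypothetical values standing in for list(data['int_income']).
values = [75000.0, 50000.0, float("nan"), 0.0, 50000.0, float("nan")]

# Filter out NaN first, then deduplicate while keeping first-seen order.
incoms = list(dict.fromkeys(x for x in values if not math.isnan(x)))

# Integer version of the same result.
incoms_int = [int(x) for x in incoms]

print(incoms)      # [75000.0, 50000.0, 0.0]
print(incoms_int)  # [75000, 50000, 0]
```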

2. Plain list approach

If you already have the list, you can build a cleaned copy that leaves out every value whose string form is 'nan'.

The code would be:

incoms = [incom for incom in incoms if str(incom) != 'nan']
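The string comparison works because `str(float('nan'))` is `'nan'`. A small sketch with made-up lists shows the behaviour, along with one side effect to be aware of: a genuine string `'nan'` in the list is dropped as well. The type-based filter at the end is an alternative offered here as an assumption, not something from the original article.

```python
import math

incoms = [75000.0, float("nan"), 0.0]

# The article's approach: compare the string form of each value.
by_string = [x for x in incoms if str(x) != "nan"]

# Side effect: a real string 'nan' is removed too.
mixed = [1.0, "nan", float("nan")]
by_string_mixed = [x for x in mixed if str(x) != "nan"]

# Alternative (assumption): only drop values that are float NaN.
by_type = [x for x in mixed if not (isinstance(x, float) and math.isnan(x))]

print(by_string)        # [75000.0, 0.0]
print(by_string_mixed)  # [1.0]
print(by_type)          # [1.0, 'nan']
```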