Python lambda string length: filtering string data by string length in Pandas

Latest recommended article published on 2022-06-15 09:16:52

weixin_39610956


I would like to filter out rows whose string length is not equal to 10.

To drop any row where the string in column A or column B is not exactly 10 characters long, I tried this:

import numpy as np

import pandas as pd

df=pd.read_csv('filex.csv')

df.A=df.A.apply(lambda x: x if len(x)== 10 else np.nan)

df.B=df.B.apply(lambda x: x if len(x)== 10 else np.nan)

df=df.dropna(subset=['A','B'], how='any')

This is slow, but it works.

However, it sometimes raises an error when the data in A is not a string but a number (that is, when read_csv interprets the column as numeric while reading the input file).

File "", line 1, in

TypeError: object of type 'float' has no len()
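The error can be reproduced without a file. The following is a minimal sketch, assuming an in-memory CSV (the sample data and the name `csv_text` are made up for illustration): when every non-empty cell in a column parses as a number, read_csv gives the column a numeric dtype, so len() fails on its elements.

```python
import io

import pandas as pd

# Hypothetical sample data: column A holds only digits plus one empty cell.
csv_text = "A,B\n123,abc\n,abcd\n1234567890,abcdefghij\n"
df = pd.read_csv(io.StringIO(csv_text))

# A is inferred as a float column because of the missing value.
print(df["A"].dtype)

# Calling len() on a numeric element raises TypeError.
try:
    df["A"].apply(lambda x: len(x))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Column B still contains strings here, which is why the original code only fails on some inputs.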

I believe there should be a more efficient and elegant way to do this.

Based on the answers and comments below, the simplest solutions I found are:

df=df[df.A.apply(lambda x: len(str(x))==10)]

df=df[df.B.apply(lambda x: len(str(x))==10)]

or

df=df[(df.A.apply(lambda x: len(str(x))==10)) & (df.B.apply(lambda x: len(str(x))==10))]

or

df=df[(df.A.astype(str).str.len()==10) & (df.B.astype(str).str.len()==10)]
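One caveat with converting to str after the file has been read: if a numeric-looking column was parsed as float (for example because it has an empty cell), the text form gains a trailing ".0", and leading zeros are lost. The sketch below uses made-up in-memory data to show the effect and one possible workaround, passing dtype=str to read_csv so each cell stays as written. It is an illustration under those assumptions, not part of the original answers.

```python
import io

import pandas as pd

# Hypothetical data: row 0 has a leading zero, row 1 has an empty cell in A.
csv_text = "A,B\n0123456789,abcdefghij\n,abcdefghij\n1234567890,abcdefghij\n"

# Default parsing: A becomes a float column, so its text form changes
# and none of the resulting lengths equals 10.
as_float = pd.read_csv(io.StringIO(csv_text))
float_lengths = as_float["A"].astype(str).str.len().tolist()
print(float_lengths)

# dtype=str keeps each cell exactly as written in the file;
# the empty cell stays missing and fails the length comparison.
as_text = pd.read_csv(io.StringIO(csv_text), dtype=str)
mask = (as_text["A"].str.len() == 10) & (as_text["B"].str.len() == 10)
kept = as_text.loc[mask]
print(kept)
```

With dtype=str, the rows with "0123456789" and "1234567890" are both kept, while the row with the empty cell is dropped.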

Solution

import pandas as pd

df = pd.read_csv('filex.csv')

df['A'] = df['A'].astype('str')

df['B'] = df['B'].astype('str')

mask = (df['A'].str.len() == 10) & (df['B'].str.len() == 10)

df = df.loc[mask]

print(df)

Applied to filex.csv:

A,B

123,abc

1234,abcd

1234567890,abcdefghij

the code above prints:

A B

2 1234567890 abcdefghij
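If more than two columns need the same check, the mask can be built in a loop. This is a rough sketch under that assumption: the helper name `filter_by_length` is hypothetical, and the CSV is held in memory with the same three rows as filex.csv so the example runs without a file.

```python
import io

import pandas as pd


def filter_by_length(df, columns, length):
    """Keep rows where every listed column has exactly `length` characters."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        # Convert to str first so numeric columns do not raise on .str access.
        mask &= (df[col].astype(str).str.len() == length)
    return df.loc[mask]


# Same rows as filex.csv above, held in memory for the example.
csv_text = "A,B\n123,abc\n1234,abcd\n1234567890,abcdefghij\n"
df = pd.read_csv(io.StringIO(csv_text))

result = filter_by_length(df, ["A", "B"], 10)
print(result)
```

The helper leaves the original column dtypes untouched, since the str conversion is used only to build the mask.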
