Selecting columns by dtype in pandas

The select_dtypes() method selects a subset of columns based on their dtype.

import numpy as np
import pandas as pd

df = pd.DataFrame({'string': list('abc'),
                    'int64': list(range(1, 4)),
                    'uint8': np.arange(3, 6).astype('u1'),
                    'float64': np.arange(4.0, 7.0),
                    'bool1': [True, False, True],
                    'bool2': [False, True, False],
                    'dates': pd.date_range('now', periods=3),
                    'category': pd.Series(list("ABC")).astype('category')})
df['tdeltas'] = df.dates.diff()
df['uint64'] = np.arange(3, 6).astype('u8')
df['other_dates'] = pd.date_range('20130101', periods=3)
df['tz_aware_dates'] = pd.date_range('20130101', periods=3, tz='US/Eastern')
df
  string  int64  uint8  float64  bool1  bool2                      dates category tdeltas  uint64 other_dates            tz_aware_dates
0      a      1      3      4.0   True  False 2019-12-01 22:00:58.958571        A     NaT       3  2013-01-01 2013-01-01 00:00:00-05:00
1      b      2      4      5.0  False   True 2019-12-02 22:00:58.958571        B  1 days       4  2013-01-02 2013-01-02 00:00:00-05:00
2      c      3      5      6.0   True  False 2019-12-03 22:00:58.958571        C  1 days       5  2013-01-03 2013-01-03 00:00:00-05:00
df.dtypes
string                                object
int64                                  int64
uint8                                  uint8
float64                              float64
bool1                                   bool
bool2                                   bool
dates                         datetime64[ns]
category                            category
tdeltas                      timedelta64[ns]
uint64                                uint64
other_dates                   datetime64[ns]
tz_aware_dates    datetime64[ns, US/Eastern]
dtype: object

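select_dtypes() takes two parameters, include and exclude. Columns whose dtype matches include are kept, and columns whose dtype matches exclude are dropped.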

df.select_dtypes(include=[bool])
   bool1  bool2
0   True  False
1  False   True
2   True  False
df.select_dtypes(include=['bool'])
   bool1  bool2
0   True  False
1  False   True
2   True  False
df.select_dtypes(include=['number', 'bool'], exclude=['unsignedinteger'])
   int64  float64  bool1  bool2 tdeltas
0      1      4.0   True  False     NaT
1      2      5.0  False   True  1 days
2      3      6.0   True  False  1 days
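
The calls above pass dtypes as strings or Python types. As a small sketch of how the two parameters interact, the snippet below uses NumPy types instead and also shows that leaving both parameters empty raises an error. The frame `demo` and its column names are made up for illustration and are not part of the original example.

```python
import numpy as np
import pandas as pd

# Illustrative frame (hypothetical names): one column per dtype of interest.
demo = pd.DataFrame({
    'i': np.arange(3, dtype='int64'),
    'u': np.arange(3, dtype='uint8'),
    'f': np.arange(3, dtype='float64'),
    'b': [True, False, True],
})

# exclude can be used on its own: drop every numeric column.
only_exclude = list(demo.select_dtypes(exclude=[np.number]).columns)

# include and exclude together: numeric columns, minus the unsigned ones.
signed_numeric = list(
    demo.select_dtypes(include=[np.number],
                       exclude=[np.unsignedinteger]).columns
)

# At least one of include / exclude has to be given.
try:
    demo.select_dtypes()
    raised = False
except ValueError:
    raised = True

print(only_exclude)
print(signed_numeric)
print(raised)
```

Note that the bool column is not treated as numeric here, which is why the original example lists 'bool' explicitly next to 'number'.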

To select string columns, you must use the object dtype:

df.select_dtypes(include=['object'])
  string
0      a
1      b
2      c
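
Strings are not the only non-numeric case. The following is a rough sketch, on a made-up frame with hypothetical column names, of how the datetime, timezone-aware datetime, timedelta and categorical columns from the earlier example could be picked out using the string aliases that select_dtypes() accepts. It uses tz='UTC' rather than 'US/Eastern' only to keep the snippet independent of the local timezone database.

```python
import pandas as pd

# Illustrative frame (hypothetical names), mirroring the column kinds above.
demo = pd.DataFrame({
    'naive': pd.date_range('20130101', periods=3),
    'aware': pd.date_range('20130101', periods=3, tz='UTC'),
    'cat': pd.Series(list('ABC')).astype('category'),
})
demo['delta'] = demo['naive'].diff()

# Each alias matches one family of dtypes.
naive_cols = list(demo.select_dtypes(include=['datetime64']).columns)
aware_cols = list(demo.select_dtypes(include=['datetimetz']).columns)
delta_cols = list(demo.select_dtypes(include=['timedelta64']).columns)
cat_cols = list(demo.select_dtypes(include=['category']).columns)

print(naive_cols, aware_cols, delta_cols, cat_cols)
```

Timezone-naive and timezone-aware datetimes are matched by different aliases, so selecting one does not pull in the other.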

To see all the child dtypes of a generic dtype such as numpy.number, you can define a function that returns a tree of child dtypes:

def subdtypes(dtype):
    subs = dtype.__subclasses__()
    if not subs:
        return dtype
    return [dtype,[subdtypes(dt) for dt in subs]]
subdtypes(np.generic)
[numpy.generic,
 [[numpy.number,
   [[numpy.integer,
     [[numpy.signedinteger,
       [numpy.int8,
        numpy.int16,
        numpy.int32,
        numpy.int32,
        numpy.int64,
        numpy.timedelta64]],
      [numpy.unsignedinteger,
       [numpy.uint8,
        numpy.uint16,
        numpy.uint32,
        numpy.uint32,
        numpy.uint64]]]],
    [numpy.inexact,
     [[numpy.floating,
       [numpy.float16, numpy.float32, numpy.float64, numpy.float64]],
      [numpy.complexfloating,
       [numpy.complex64, numpy.complex128, numpy.complex128]]]]]],
  [numpy.flexible,
   [[numpy.character, [numpy.bytes_, numpy.str_]],
    [numpy.void, [numpy.record]]]],
  numpy.bool_,
  numpy.datetime64,
  numpy.object_]]
subdtypes(np.number)
[numpy.number,
 [[numpy.integer,
   [[numpy.signedinteger,
     [numpy.int8,
      numpy.int16,
      numpy.int32,
      numpy.int32,
      numpy.int64,
      numpy.timedelta64]],
    [numpy.unsignedinteger,
     [numpy.uint8, numpy.uint16, numpy.uint32, numpy.uint32, numpy.uint64]]]],
  [numpy.inexact,
   [[numpy.floating,
     [numpy.float16, numpy.float32, numpy.float64, numpy.float64]],
    [numpy.complexfloating,
     [numpy.complex64, numpy.complex128, numpy.complex128]]]]]]
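
When only a yes/no answer is needed rather than the whole tree, np.issubdtype() can check whether one type sits below another in the same hierarchy. The sketch below is one way to do that. The helper name `is_under` is made up for illustration.

```python
import numpy as np

def is_under(child, parent):
    """Return True if `child` is at or below `parent` in NumPy's scalar type tree."""
    return bool(np.issubdtype(child, parent))

# Spot checks against the tree printed above.
timedelta_is_number = is_under(np.timedelta64, np.number)
datetime_is_number = is_under(np.datetime64, np.number)
bool_is_number = is_under(np.bool_, np.number)
uint8_is_unsigned = is_under(np.uint8, np.unsignedinteger)
float64_is_inexact = is_under(np.float64, np.inexact)

print(timedelta_is_number, datetime_is_number, bool_is_number,
      uint8_is_unsigned, float64_is_inexact)
```

numpy.timedelta64 sits under numpy.signedinteger in the tree, which is consistent with the tdeltas column appearing in the include=['number', 'bool'] result earlier, while numpy.datetime64 and numpy.bool_ sit outside numpy.number.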
