Pandas Notes: Data Discretization (one-hot)

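import pandas as pd

data = pd.Series([176, 174, 160, 180, 159, 163, 192, 184],
                 index=["No1:176", "No2:174", "No3:160", "No4:180", "No5:159", "No6:163", "No7:192", "No8:184"])
print(data)
# Automatic grouping: qcut splits the data into 3 quantile-based bins
groups = pd.qcut(data, 3)  # named "groups" to avoid shadowing the built-in str
print()  # blank line separating the two outputs
print(pd.get_dummies(groups, prefix="height"))  # one-hot
# Custom grouping: cut uses the bin edges given explicitly
bins = [150, 165, 180, 195]
groups = pd.cut(data, bins)
print(groups)
print(groups.value_counts())
print(pd.get_dummies(groups, prefix="身高"))  # "身高" means "height"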
 
No1:176    176
No2:174    174
No3:160    160
No4:180    180
No5:159    159
No6:163    163
No7:192    192
No8:184    184
dtype: int64
 
         height_(158.999, 166.667]  ...  height_(178.667, 192.0]
No1:176                          0  ...                        0
No2:174                          0  ...                        0
No3:160                          1  ...                        0
No4:180                          0  ...                        1
No5:159                          1  ...                        0
No6:163                          1  ...                        0
No7:192                          0  ...                        1
No8:184                          0  ...                        1
 
[8 rows x 3 columns]
No1:176    (165, 180]
No2:174    (165, 180]
No3:160    (150, 165]
No4:180    (165, 180]
No5:159    (150, 165]
No6:163    (150, 165]
No7:192    (180, 195]
No8:184    (180, 195]
dtype: category
Categories (3, interval[int64]): [(150, 165] < (165, 180] < (180, 195]]
(165, 180]    3
(150, 165]    3
(180, 195]    2
dtype: int64
         身高_(150, 165]  身高_(165, 180]  身高_(180, 195]
No1:176              0              1              0
No2:174              0              1              0
No3:160              1              0              0
No4:180              0              1              0
No5:159              1              0              0
No6:163              1              0              0
No7:192              0              0              1
No8:184              0              0              1
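The first one-hot table above is truncated, so the bin edges chosen by qcut are only partly visible. The following is a minimal sketch, not part of the original note, of one way to inspect them: it assumes the same eight height values (without the custom index) and uses the `retbins=True` option of `pd.qcut` to return the edges alongside the groups.

```python
import pandas as pd

heights = pd.Series([176, 174, 160, 180, 159, 163, 192, 184])

# qcut derives the bin edges from the data's quantiles;
# retbins=True returns those edges together with the grouped data.
groups, edges = pd.qcut(heights, 3, retbins=True)

print(edges)                            # quantile-based bin edges
print(groups.value_counts(sort=False))  # group sizes, listed in bin order
```

With these values the three groups hold 3, 2, and 3 entries, which agrees with the one-hot table above. qcut aims for groups of similar size, whereas cut keeps whatever edges it is given.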
 
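The interval-style column names such as `(150, 165]` can be awkward to work with later. As a hedged sketch (the label names "short", "medium", and "tall" are hypothetical, chosen only for illustration), `pd.cut` accepts a `labels` argument, and `pd.get_dummies` accepts `dtype` to request 0/1 integers, which matters because newer pandas versions return True/False by default.

```python
import pandas as pd

heights = pd.Series([176, 174, 160, 180, 159, 163, 192, 184])

# Hypothetical label names, used here only for illustration.
labels = ["short", "medium", "tall"]

# Same custom bin edges as in the note; intervals are right-closed by default,
# so 180 falls into (165, 180].
groups = pd.cut(heights, bins=[150, 165, 180, 195], labels=labels)

# dtype=int requests 0/1 columns instead of booleans.
one_hot = pd.get_dummies(groups, prefix="height", dtype=int)
print(one_hot)
```

Each row of `one_hot` contains exactly one 1, in the column of the group that the height belongs to.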
 

 

Reposted from: https://www.cnblogs.com/jumpkin1122/p/11509777.html
