One-hot encoding in Python: two ways to implement one-hot encoding

Two ways to implement one-hot encoding:

One-hot encoding with pandas:

# Transform a given column into one-hot columns. Use prefix when encoding several columns, so their dummies stay distinguishable.
>>> import pandas as pd
>>> df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': ['b', 'a', 'c']})
>>> df
   A  B
0  a  b
1  b  a
2  c  c
>>> # Get the one-hot encoding of column B (dtype=int gives 0/1; pandas 2.0+ defaults to True/False)
>>> one_hot = pd.get_dummies(df['B'], dtype=int)
>>> # Drop column B, as it is now encoded
>>> df = df.drop('B', axis=1)
>>> # Join the encoded columns back onto df
>>> df = df.join(one_hot)
>>> df
   A  a  b  c
0  a  0  1  0
1  b  1  0  0
2  c  0  0  1
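The drop-and-join steps above can likely be collapsed into a single call, because `pd.get_dummies` also accepts a whole DataFrame together with a `columns` list. A minimal sketch on the same toy frame; `dtype=int` is an addition here (to get 0/1 instead of booleans), and the `B_` prefix comes from pandas' default naming:

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': ['b', 'a', 'c']})

# Encode column B directly on the frame: B is dropped and replaced
# by the prefixed dummy columns B_a, B_b, B_c.
encoded = pd.get_dummies(df, columns=['B'], dtype=int)
print(encoded)
```

Compared with the manual version, the dummy columns carry the source column name as a prefix, which avoids clashes when several columns share the same category values.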

A demo of dummy-encoding categorical (qualitative) features:

def one_hot(df, cols):
    """
    @param df pandas DataFrame
    @param cols a list of columns to encode
    @return a DataFrame with one-hot encoding
    """
    for each in cols:
        dummies = pd.get_dummies(df[each], prefix=each, drop_first=False)
        df = pd.concat([df, dummies], axis=1)
    return df
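A short usage sketch for the helper above. The two-column frame is made-up sample data and `dtype=int` is an illustrative addition, neither comes from the original demo. Note that the helper keeps the source columns alongside the new dummy columns:

```python
import pandas as pd

def one_hot(df, cols):
    """Append one-hot (dummy) columns for each column named in cols."""
    for each in cols:
        # dtype=int is an assumption made here to get 0/1 rather than booleans.
        dummies = pd.get_dummies(df[each], prefix=each, drop_first=False, dtype=int)
        df = pd.concat([df, dummies], axis=1)
    return df

# Hypothetical sample data for illustration only.
df = pd.DataFrame({'color': ['red', 'blue', 'red'], 'size': ['S', 'M', 'L']})
encoded = one_hot(df, ['color', 'size'])
print(encoded.columns.tolist())
```

If the source columns are not wanted afterwards, calling `encoded.drop(columns=['color', 'size'])` is one way to remove them.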

Dummy-encoding feature variables with sklearn:

>>> from sklearn.preprocessing import OneHotEncoder
>>> enc = OneHotEncoder()
>>> enc.fit([[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]])
OneHotEncoder(categorical_features='all', dtype=<class 'numpy.float64'>,
       handle_unknown='error', n_values='auto', sparse=True)
>>> # n_values_ and feature_indices_ exist only in scikit-learn releases before 0.22
>>> enc.n_values_
array([2, 3, 4])
>>> enc.feature_indices_
array([0, 2, 5, 9])
>>> enc.transform([[0, 1, 1]])
<1x9 sparse matrix of type '<class 'numpy.float64'>'
    with 3 stored elements in Compressed Sparse Row format>
>>> enc.transform([[0, 1, 1]]).toarray()
array([[ 1.,  0.,  0.,  1.,  0.,  0.,  1.,  0.,  0.]])
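The transcript above reflects an older scikit-learn release. In scikit-learn 0.20 and later the fitted categories are exposed through `categories_`, so the per-feature counts and column offsets can be derived from it. A rough equivalent under that assumption; the names `n_values` and `feature_indices` are local variables chosen here to mirror the old attributes, not library API:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

X = [[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]]

enc = OneHotEncoder(handle_unknown='error')
enc.fit(X)

# Number of distinct values seen per feature (what n_values_ used to report).
n_values = [len(c) for c in enc.categories_]

# Start offset of each feature's block in the encoded row
# (what feature_indices_ used to report).
feature_indices = np.concatenate([[0], np.cumsum(n_values)])

# transform() returns a sparse matrix by default; toarray() makes it dense.
encoded = enc.transform([[0, 1, 1]]).toarray()
print(n_values, feature_indices, encoded)
```

Calling `.toarray()` on the default sparse result sidesteps the constructor argument that controls dense output, whose name has changed between scikit-learn versions.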

A demo with a LabelBinarizer kept at global scope:

from sklearn.preprocessing import LabelBinarizer

label_binarizer = LabelBinarizer()
label_binarizer.fit(all_your_labels_list)  # must be global, or otherwise kept around, so it can be reused later

def one_hot_encode(x):
    """
    One-hot encode a list of sample labels. Return a one-hot encoded vector for each label.
    : x: List of sample labels
    : return: NumPy array of one-hot encoded labels
    """
    return label_binarizer.transform(x)
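To make the demo above concrete, here is a minimal runnable sketch in which a small hypothetical label list stands in for `all_your_labels_list`. The binarizer is fitted once on the full label set, and the same fitted object is then reused for every batch, so the column order stays stable:

```python
from sklearn.preprocessing import LabelBinarizer

# Hypothetical label set standing in for all_your_labels_list.
all_labels = ['cat', 'dog', 'bird']

# Fit once at module level so every later call shares the same class order.
label_binarizer = LabelBinarizer()
label_binarizer.fit(all_labels)

def one_hot_encode(x):
    """Return one one-hot row per label in x, using the shared binarizer."""
    return label_binarizer.transform(x)

classes = label_binarizer.classes_.tolist()  # classes are stored in sorted order
encoded = one_hot_encode(['dog', 'bird'])
print(classes)
print(encoded)
```

One caveat worth knowing: when the fitted label set has only two distinct classes, `LabelBinarizer` returns a single 0/1 column rather than two one-hot columns.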
