sklearn.preprocessing.Binarizer

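The Binarizer class and the binarize function binarize features against a given threshold: values less than or equal to the threshold are set to 0, and values greater than the threshold are set to 1. In both cases the threshold parameter defaults to 0.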
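To make the rule concrete, here is a minimal sketch on a small hand-made array. The array `X` and the variable names are illustrative and do not come from the original session. Note that a value exactly equal to the threshold maps to 0.

```python
import numpy as np
from sklearn.preprocessing import binarize

# Illustrative data, not taken from the original post.
X = np.array([[-1.0, 0.0, 0.5],
              [2.0, 1.5, 1.0]])

default_result = binarize(X)                # threshold defaults to 0.0
custom_result = binarize(X, threshold=1.0)  # the value 1.0 itself maps to 0

print(default_result)
print(custom_result)
```

With the default `copy=True`, the input `X` is left unchanged and a new array is returned.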

① The binarize function: sklearn.preprocessing.binarize(X, threshold=0.0, copy=True)

a. For dense (non-sparse) matrices, threshold can be any float

In [1]: from sklearn import preprocessing
   ...: from sklearn import datasets
   ...: import numpy as np
   ...: # Note: load_boston was removed in scikit-learn 1.2; this session ran on an older release.
   ...: data = datasets.load_boston()
   ...: # Values <= the mean become 0; all others become 1.
   ...: new_target = preprocessing.binarize(data.target[:, np.newaxis],
   ...:     threshold=data.target.mean()).astype(int)
   ...: print(type(preprocessing.binarize(data.target[:, np.newaxis],
   ...:     threshold=data.target.mean())))
   ...: new_target[:5]
   ...:
<class 'numpy.ndarray'>
Out[1]:
array([[1],
       [0],
       [1],
       [1],
       [1]])

In [2]: preprocessing.binarize(data.target[:, np.newaxis],
   ...:     threshold=-1).astype(int)[:5]
Out[2]:
array([[1],
       [1],
       [1],
       [1],
       [1]])
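
Because `load_boston` is no longer available in scikit-learn 1.2 and later, the following sketch repeats the two dense cases above on a synthetic target. The random data, the seed, and all variable names are assumptions made for illustration, so the values differ from the outputs shown in the session.

```python
import numpy as np
from sklearn.preprocessing import binarize

# Synthetic stand-in for data.target (an assumption, not the Boston data).
rng = np.random.default_rng(0)
target = rng.uniform(5.0, 50.0, size=20)
column = target[:, np.newaxis]  # binarize expects a 2-D array

# Threshold at the mean: values <= mean become 0, the rest become 1.
above_mean = binarize(column, threshold=target.mean()).astype(int)

# The same rule written by hand, for comparison.
manual = (column > target.mean()).astype(int)

# A negative threshold is accepted for dense input; every value here exceeds -1.
all_ones = binarize(column, threshold=-1).astype(int)

print(above_mean[:5].ravel())
print(all_ones[:5].ravel())
```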
b. For sparse matrices, threshold must be a float greater than or equal to 0

In [3]: from scipy.sparse import coo
   ...: from sklearn import preprocessing
   ...: spar = coo.coo_matrix(np.random.binomial(1,0.25,100))
   ...: preprocessing.binarize(spar,threshold=-1)
   ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-ff778f656a6b> in <module>()
      2 from sklearn import preprocessing
      3 spar = coo.coo_matrix(np.random.binomial(1,0.25,100))
----> 4 preprocessing.binarize(spar,threshold=-1)

d:\softwore\python\lib\site-packages\sklearn\preprocessing\data.py in binarize(X, threshold, copy)
   1470     if sparse.issparse(X):
   1471         if threshold < 0:
-> 1472             raise ValueError('Cannot binarize a sparse matrix with threshold '
   1473                              '< 0')
   1474         cond = X.data > threshold

ValueError: Cannot binarize a sparse matrix with threshold < 0

In [4]: preprocessing.binarize(spar,threshold=0)
Out[4]:
<1x100 sparse matrix of type '<class 'numpy.int32'>'
        with 24 stored elements in Compressed Sparse Row format>
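
The traceback above shows that, for sparse input, only the stored entries (`X.data`) are compared with the threshold. A plausible reading is that a negative threshold is rejected because the implicit zeros would also have to become 1, which would make the result dense. The sketch below uses a tiny hand-made matrix (an illustrative assumption) to show both the accepted and the rejected case.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import binarize

# Illustrative 1x4 sparse matrix with two stored entries.
X = sparse.csr_matrix(np.array([[0.0, 0.5, 2.0, 0.0]]))

# A non-negative threshold works: only 2.0 exceeds 1.0.
result = binarize(X, threshold=1.0)
print(result.toarray())

# A negative threshold raises ValueError for sparse input.
try:
    binarize(X, threshold=-1)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```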

② The Binarizer class: sklearn.preprocessing.Binarizer(threshold=0.0, copy=True)

a. For dense (non-sparse) matrices, threshold can be any float

In [5]: from sklearn import preprocessing
   ...: from sklearn import datasets
   ...: import numpy as np
   ...: data = datasets.load_boston()
   ...: bz = preprocessing.Binarizer(threshold=data.target.mean())
   ...: new_target = bz.fit_transform(data.target[:,np.newaxis]).astype(int)
   ...: print(bz)
   ...: new_target[:5]
   ...:
Binarizer(copy=True, threshold=22.532806324110677)
Out[5]:
array([[1],
       [0],
       [1],
       [1],
       [1]])

In [6]: preprocessing.Binarizer(threshold=-1).fit_transform(
   ...:     data.target[:, np.newaxis]).astype(int)[:5]
Out[6]:
array([[1],
       [1],
       [1],
       [1],
       [1]])
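
The class and the function apply the same rule; the class form is convenient wherever an estimator object is expected, for example as a step in a `Pipeline`. A minimal sketch of the equivalence, on illustrative data that is not from the original session:

```python
import numpy as np
from sklearn.preprocessing import Binarizer, binarize

# Illustrative data, not taken from the original post.
X = np.array([[0.2, 3.0],
              [1.0, -0.5]])

bz = Binarizer(threshold=0.5)
from_class = bz.fit_transform(X)        # estimator interface
from_func = binarize(X, threshold=0.5)  # plain function

print(from_class)
print(np.array_equal(from_class, from_func))
```

Passing `threshold` by keyword keeps the call valid on scikit-learn releases where the constructor arguments are keyword-only.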

b. For sparse matrices, threshold must likewise be a float greater than or equal to 0

In [7]: from scipy.sparse import coo
   ...: spar = coo.coo_matrix(np.random.binomial(1,0.25,100))
   ...: preprocessing.Binarizer(threshold= -1).fit_transform(spar)
   ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-fc5a78d3b8c5> in <module>()
      1 from scipy.sparse import coo
      2 spar = coo.coo_matrix(np.random.binomial(1,0.25,100))
----> 3 preprocessing.Binarizer(threshold= -1).fit_transform(spar)

d:\softwore\python\lib\site-packages\sklearn\base.py in fit_transform(self, X, y, **fit_params)
    492         if y is None:
    493             # fit method of arity 1 (unsupervised transformation)
--> 494             return self.fit(X, **fit_params).transform(X)
    495         else:
    496             # fit method of arity 2 (supervised transformation)

d:\softwore\python\lib\site-packages\sklearn\preprocessing\data.py in transform(self, X, y, copy)
   1549         """
   1550         copy = copy if copy is not None else self.copy
-> 1551         return binarize(X, threshold=self.threshold, copy=copy)
   1552
   1553

d:\softwore\python\lib\site-packages\sklearn\preprocessing\data.py in binarize(X, threshold, copy)
   1470     if sparse.issparse(X):
   1471         if threshold < 0:
-> 1472             raise ValueError('Cannot binarize a sparse matrix with threshold '
   1473                              '< 0')
   1474         cond = X.data > threshold

ValueError: Cannot binarize a sparse matrix with threshold < 0
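
If a negative threshold really is needed for data stored in sparse form, one possible workaround is to convert the matrix to a dense array first, at the cost of the memory savings. This is a sketch under that assumption, not something the original post does; the seed and variable names are illustrative.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import Binarizer

# Random 0/1 data similar in spirit to the session above (seed is an assumption).
rng = np.random.default_rng(0)
spar = sparse.csr_matrix(rng.binomial(1, 0.25, size=(1, 100)))

# threshold >= 0 is accepted for sparse input and the result stays sparse.
sparse_result = Binarizer(threshold=0).fit_transform(spar)

# For a negative threshold, densify first; every entry then exceeds -1.
dense_result = Binarizer(threshold=-1).fit_transform(spar.toarray())

print(sparse.issparse(sparse_result))
print(dense_result.min(), dense_result.shape)
```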





