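Python sklearn.Binarizer(): usage and code examples

sklearn.preprocessing.Binarizer() is a transformer class in scikit-learn's preprocessing module. It discretizes continuous feature values into two classes, 0 and 1, by comparing each value against a threshold.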

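Example 1:

The pixel values of an 8-bit grayscale image are continuous data ranging from 0 (black) to 255 (white), and the image needs to be pure black and white. Using Binarizer() with a threshold of 127, pixel values from 0 to 127 are converted to 0 and values from 128 to 255 are converted to 1.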
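To make the first example concrete, here is a minimal sketch. The 3x3 array stands in for an image and its pixel values are invented for illustration; only the threshold of 127 follows from the ranges described above.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Hypothetical 3x3 8-bit grayscale "image" (values invented for illustration)
image = np.array([[0, 64, 127],
                  [128, 200, 255],
                  [90, 127, 130]], dtype=np.uint8)

# Values <= 127 become 0 (black), values > 127 become 1 (white)
binarizer = Binarizer(threshold=127)
black_white = binarizer.fit_transform(image)

print(black_white)
```

Because the comparison is strictly "greater than", a pixel value of exactly 127 ends up as 0.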

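Example 2:

A machine record has "Success Percentage" as a feature. These values are continuous, ranging from 10% to 99%, but a researcher only wants to use this data to predict the pass or fail status of the machine based on other given parameters.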
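The second example could be sketched as follows. The percentages and the 60% cut-off are assumptions chosen for illustration; the original text does not say where the pass/fail boundary lies.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Hypothetical success percentages, one machine per row (invented values)
success_pct = np.array([[10.0], [45.5], [60.0], [75.0], [99.0]])

# Assumed cut-off: above 60% counts as a pass (1), otherwise a fail (0)
pass_fail = Binarizer(threshold=60.0).fit_transform(success_pct)

print(pass_fail.ravel())
```

The resulting 0/1 column can then serve as the pass/fail label that a classifier predicts from the other parameters.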

Syntax:

sklearn.preprocessing.Binarizer(*, threshold=0.0, copy=True)

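Parameters:

threshold: [float, optional] Values less than or equal to threshold are mapped to 0, all other values to 1. The default threshold is 0.0.

copy: [boolean, optional] If set to False, a copy is avoided and the binarization is done in place where possible. The default is True.

Returns:

The binarized feature values (returned by transform() or fit_transform())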
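The parameter defaults can be checked with a short sketch. The input values here are invented for illustration; the point is only to show the default threshold of 0.0 and that, with the default copy=True, the input array is not modified.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Illustrative input: one sample with four feature values
X = np.array([[-1.5, 0.0, 0.3, 2.0]])

# Default threshold is 0.0: values <= 0 map to 0, values > 0 map to 1
default_result = Binarizer().fit_transform(X)

print(default_result)

# With the default copy=True, the original array is left untouched
print(X)
```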

Download the dataset:

Go to the link and download Data.csv

Below is Python code that demonstrates sklearn.preprocessing.Binarizer():

# Python code explaining how
# to binarize feature values

""" PART 1
Importing libraries """
import pandas as pd
from sklearn.preprocessing import Binarizer

""" PART 2
Importing data """
# Adjust this path to wherever the downloaded CSV file is stored
data_set = pd.read_csv(
    'C:\\Users\\dell\\Desktop\\Data_for_Feature_Scaling.csv')
print(data_set.head())

# Here the features, the Age and Salary columns,
# are taken using slicing
# to binarize their values
age = data_set.iloc[:, 1].values
salary = data_set.iloc[:, 2].values
print("\nOriginal age data values:\n", age)
print("\nOriginal salary data values:\n", salary)

""" PART 3
Binarizing values """
# Binarizer expects a 2-D array, so reshape each feature into a single row
x = age.reshape(1, -1)
y = salary.reshape(1, -1)

# For age, let the threshold be 35
# For salary, let the threshold be 61000
# (recent scikit-learn versions require threshold as a keyword argument)
binarizer_1 = Binarizer(threshold=35)
binarizer_2 = Binarizer(threshold=61000)

# Transformed features
print("\nBinarized age:\n", binarizer_1.fit_transform(x))
print("\nBinarized salary:\n", binarizer_2.fit_transform(y))

Output:

   Country  Age  Salary  Purchased
0   France   44   72000          0
1    Spain   27   48000          1
2  Germany   30   54000          0
3    Spain   38   61000          0
4  Germany   40    1000          1

Original age data values:
[44 27 30 38 40 35 78 48 50 37]

Original salary data values:
[72000 48000 54000 61000 1000 58000 52000 79000 83000 67000]

Binarized age:
[[1 0 0 1 1 0 1 1 1 1]]

Binarized salary:
[[1 0 0 0 0 0 0 1 1 1]]
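Since the printed output lists all ten age and salary values, the result can be reproduced without the CSV file. The sketch below rebuilds only those two columns in memory and, as a deliberate variation on the original code, keeps each feature as a column of shape (n_samples, 1), which is the layout scikit-learn normally expects.

```python
import pandas as pd
from sklearn.preprocessing import Binarizer

# Age and Salary values copied from the printed output above
data_set = pd.DataFrame({
    "Age": [44, 27, 30, 38, 40, 35, 78, 48, 50, 37],
    "Salary": [72000, 48000, 54000, 61000, 1000,
               58000, 52000, 79000, 83000, 67000],
})

# One column per feature: each array has shape (n_samples, 1)
age = data_set[["Age"]].to_numpy()
salary = data_set[["Salary"]].to_numpy()

# Same thresholds as in the original example
binarized_age = Binarizer(threshold=35).fit_transform(age)
binarized_salary = Binarizer(threshold=61000).fit_transform(salary)

print(binarized_age.ravel())
print(binarized_salary.ravel())
```

Note that the age of exactly 35 and the salary of exactly 61000 both map to 0, because only values strictly greater than the threshold become 1.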

