scikit-learn Tutorials (1)

Statistical learning: the setting and the estimator object in scikit-learn

Datasets

Scikit-learn deals with learning information from one or more datasets that are represented as 2D arrays. They can be understood as a list of multi-dimensional observations. We say that the first axis of these arrays is the samples axis, while the second is the features axis.

A simple example shipped with scikit-learn: the iris dataset

>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> data = iris.data
>>> data.shape
(150, 4)

It is made of 150 observations of irises, each described by 4 features: their sepal and petal length and width, as detailed in iris.DESCR.

When the data is not initially in the (n_samples, n_features) shape, it needs to be preprocessed in order to be used by scikit-learn.

An example of data that needs reshaping is the digits dataset.

[Figure: the last image of the digits dataset]

The digits dataset is made of 1797 8x8 images of hand-written digits:

>>> digits = datasets.load_digits()
>>> digits.images.shape
(1797, 8, 8)
>>> import matplotlib.pyplot as plt
>>> plt.imshow(digits.images[-1], cmap=plt.cm.gray_r)
<matplotlib.image.AxesImage object at ...>

To use this dataset with scikit-learn, we transform each 8x8 image into a feature vector of length 64:

>>> data = digits.images.reshape((digits.images.shape[0], -1))
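
As a quick check of the reshape above, the following sketch confirms the resulting shape and compares the result with `digits.data`, the already-flattened array that `load_digits` returns alongside `digits.images`. The variable name `same` is only illustrative.

```python
import numpy as np
from sklearn import datasets

digits = datasets.load_digits()

# Flatten each 8x8 image into a row of 64 pixel values;
# -1 lets NumPy infer the second dimension (8 * 8 = 64).
data = digits.images.reshape((digits.images.shape[0], -1))

print(data.shape)  # (n_samples, n_features)

# load_digits also provides the flattened form directly.
same = np.array_equal(data, digits.data)
print(same)
```

In practice this means `digits.data` can be passed to an estimator as-is, while the manual reshape remains useful for image data that does not come pre-flattened.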

Estimator objects (predictive models)

Fitting data: the main API implemented by scikit-learn is that of the estimator. An estimator is any object that learns from data; it may be a classification, regression or clustering algorithm or a transformer that extracts/filters useful features from raw data.

All estimator objects expose a fit method that takes a dataset (usually a 2-d array):

>>> estimator.fit(data)
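
To make the `estimator.fit(data)` call concrete, here is a minimal sketch that uses `StandardScaler` on the iris data. The choice of `StandardScaler` is an assumption made for illustration, as one example of a transformer; any other estimator would follow the same pattern.

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

data = datasets.load_iris().data  # shape (150, 4)

scaler = StandardScaler()
returned = scaler.fit(data)  # learns per-feature mean and standard deviation

# fit returns the estimator itself, so calls can be chained.
print(returned is scaler)

# The fitted transformer can now be applied to the data.
scaled = scaler.transform(data)
print(scaled.shape)

# After scaling, each feature has (numerically) zero mean.
print(np.allclose(scaled.mean(axis=0), 0.0))
```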

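Note: for many estimators, fit runs an iterative optimization such as gradient descent, but this is not universal; some estimators use closed-form solutions or other algorithms.

Estimator parameters: All the parameters of an estimator can be set when it is instantiated or by modifying the corresponding attribute: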

>>> estimator = Estimator(param1=1, param2=2)
>>> estimator.param1
1
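
The placeholder `Estimator` above can be replaced by any real estimator. The sketch below uses `KMeans`, picked here only as an example, with its `n_clusters` constructor parameter; it also shows `set_params` and `get_params`, which scikit-learn estimators provide for the same purpose.

```python
from sklearn.cluster import KMeans

# Set a parameter at instantiation time.
estimator = KMeans(n_clusters=3)
print(estimator.n_clusters)

# Change it afterwards by assigning to the attribute...
estimator.n_clusters = 4

# ...or through set_params, which returns the estimator itself.
estimator.set_params(n_clusters=5)

# get_params returns the current parameters as a dict.
params = estimator.get_params()
print(params["n_clusters"])
```

No data is involved at this stage: these parameters configure the estimator before `fit` is ever called.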

Estimated parameters: When data is fitted with an estimator, parameters are estimated from the data at hand. All the estimated parameters are attributes of the estimator object ending with an underscore:

Note: these estimated parameters are what is often written as theta, the values the model learns during fitting.

>>> estimator.estimated_param_
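
As an illustration of the trailing-underscore convention, the sketch below fits `LinearRegression` to a tiny synthetic dataset and reads back its `coef_` and `intercept_` attributes. The data (points on the line y = 2x + 1) and the variable names are invented for this example and are not part of the original tutorial.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data, invented for illustration: y = 2 * x + 1, no noise.
X = np.array([[0.0], [1.0], [2.0], [3.0]])  # shape (n_samples, n_features)
y = 2.0 * X[:, 0] + 1.0

model = LinearRegression()

# Estimated parameters do not exist until fit has been called.
fitted_before = hasattr(model, "coef_")

model.fit(X, y)

# After fitting, the learned values are available as attributes
# whose names end with an underscore.
print(model.coef_)       # slope, one entry per feature
print(model.intercept_)  # intercept
```

Because the data is noise-free, the fitted slope and intercept should match 2 and 1 up to floating-point error.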