Getting Started with scikit-learn, a Python Machine Learning Toolkit

Rilakkuma

Article link: https://blog.csdn.net/CRISPY_RICE/article/details/26490385

1. Download

https://github.com/scikit-learn/scikit-learn

Official site: http://scikit-learn.org/stable/

2. Installation

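According to the official documentation, scikit-learn requires numpy and scipy. I first tried installing directly from the source directory: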

sudo python setup.py install

Importing sklearn afterwards failed with the following error:

>>> import sklearn
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sklearn/__init__.py", line 37, in <module>
    from . import __check_build
  File "sklearn/__check_build/__init__.py", line 46, in <module>
    raise_build_error(e)
  File "sklearn/__check_build/__init__.py", line 41, in raise_build_error
    %s""" % (e, local_dir, ''.join(dir_content).strip(), msg))
ImportError: No module named _check_build
___________________________________________________________________________
Contents of sklearn/__check_build:
__init__.py               __init__.pyc              _check_build.c
_check_build.pyx          setup.py                  setup.pyc
___________________________________________________________________________
It seems that scikit-learn has not been built correctly.

If you have installed scikit-learn from source, please do not forget
to build the package before using it: run `python setup.py install` or
`make` in the source directory.

If you have used an installer, please check that it is suited for your
Python version, your operating system and your platform.
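
A relative path such as `sklearn/__init__.py` in a traceback often means the package was picked up from the current directory (the unbuilt source checkout) rather than from site-packages. That is only a guess about what happened here, but it is cheap to check. The sketch below assumes Python 3.4 or later (the session in this post used Python 2.7), and the helper names `locate_package` and `is_shadowed_by_cwd` are made up for illustration:

```python
import importlib.util
import os


def locate_package(name):
    """Return the absolute path a top-level package would load from, or None."""
    spec = importlib.util.find_spec(name)
    # Built-in, frozen, and namespace packages have no file location.
    if spec is None or not spec.has_location:
        return None
    return os.path.abspath(spec.origin)


def is_shadowed_by_cwd(name):
    """True if the package resolves to a file inside the current directory.

    This only detects shadowing when the current directory is on sys.path,
    which is the case in an interactive session started with plain `python`.
    """
    path = locate_package(name)
    if path is None:
        return False
    cwd = os.path.abspath(os.getcwd())
    return path.startswith(cwd + os.sep)


if __name__ == "__main__":
    print("sklearn resolves to:", locate_package("sklearn"))
    print("shadowed by current directory:", is_shadowed_by_cwd("sklearn"))
```

If the second line prints `True`, changing to another directory before starting Python may be enough to import the installed build.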

While trying to reinstall numpy and scipy, I discovered that macOS already ships with quite a few Python libraries, as listed below:

CoreGraphics/                              
OpenSSL/                                   
PyObjC/                                    
Twisted-12.2.0-py2.7.egg-info/             
altgraph/                                  
altgraph-0.10.1-py2.7.egg-info/            
bdist_mpkg/                                
bdist_mpkg-0.4.4-py2.7.egg-info/           
bonjour/                                   
dateutil/                                  
macholib/                                  
macholib-1.5-py2.7.egg-info/               
matplotlib/                                
modulegraph/                               
modulegraph-0.10.1-py2.7.egg-info/         
mpl_toolkits/                              
numpy/                                     
py2app/                                    
py2app-0.7.1-py2.7.egg-info/               
python_dateutil-1.5-py2.7.egg-info/        
pytz/                                      
pytz-2012d-py2.7.egg-info/                 
scipy/                                     
setuptools/                                
setuptools-0.6c12dev_r88846-py2.7.egg-info/
twisted/                                   
xattr/                                     
xattr-0.6.4-py2.7.egg-info/                
zope/                                      
zope.interface-3.8.0-py2.7.egg-info/
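
Since the system already provides numpy and scipy, it helps to know which copies Python actually imports and what versions they are. Here is a minimal sketch, assuming Python 3; the function name `check_dependencies` is hypothetical and not part of any library:

```python
import importlib


def check_dependencies(names=("numpy", "scipy")):
    """Map each module name to (version, file path), or None if it cannot be imported."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            # Not every module defines __version__ or __file__, so use fallbacks.
            version = getattr(module, "__version__", "unknown")
            location = getattr(module, "__file__", None)
            found[name] = (version, location)
    return found


if __name__ == "__main__":
    for name, info in check_dependencies().items():
        if info is None:
            print(name, "-> NOT INSTALLED")
        else:
            print(name, "->", info[0], "from", info[1])
```

The printed path shows whether a module comes from the system-provided directory or from a copy installed later.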

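I then tried several other approaches, including pip and easy_install, and each of them failed with its own error. In the end I deleted the old files under site-packages and reinstalled, and that worked. (The earlier failures may have been because I had not restarted the terminal and started a fresh Python session.)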

3. Testing and Learning

➜  ~  python
Python 2.7.5 (default, Sep 12 2013, 21:33:34) 
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> digits = datasets.load_digits()
>>> print(digits.data)
[[  0.   0.   5. ...,   0.   0.   0.]
 [  0.   0.   0. ...,  10.   0.   0.]
 [  0.   0.   0. ...,  16.   9.   0.]
 ..., 
 [  0.   0.   1. ...,   6.   0.   0.]
 [  0.   0.   2. ...,  12.   0.   0.]
 [  0.   0.  10. ...,  12.   1.   0.]]
>>>
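
The printed array is truncated, so inspecting shapes gives a clearer picture of what the two datasets contain. This is a small sketch, assuming a working scikit-learn installation on Python 3; the variable name `first_image` is my own:

```python
from sklearn import datasets

iris = datasets.load_iris()
digits = datasets.load_digits()

# Each dataset is a dictionary-like object: .data is the feature matrix
# (one row per sample) and .target holds the class labels.
print(iris.data.shape)      # (150, 4): 150 flowers, 4 measurements each
print(digits.data.shape)    # (1797, 64): 1797 images, 64 pixel values each
print(digits.images.shape)  # (1797, 8, 8): the same pixels as 8x8 images

# A row of digits.data is just the flattened 8x8 image.
first_image = digits.data[0].reshape(8, 8)
print((first_image == digits.images[0]).all())  # True
```

The first row shown in the session above, `[0. 0. 5. ... 0. 0. 0.]`, is therefore the 64 flattened pixel intensities of the first digit image.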

4. Next Steps

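I plan to work through the examples that ship with scikit-learn and write a follow-up summary of the common machine learning algorithms. The examples are good learning material, for instance: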

http://scikit-learn.org/stable/auto_examples/feature_selection_pipeline.html