scikit-learn
I've posted a few times about a library I'm working on called category encoders. It aims to provide a complete toolbox of scikit-learn compatible transformers for encoding categorical variables in different ways. If that sounds interesting, you can find much more in-depth posts here and here.
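To make "scikit-learn compatible transformer" a little more concrete, here is a minimal, dependency-free sketch of the fit/transform convention applied to a simple ordinal encoding. The class name `SimpleOrdinalEncoder` and its `unknown_value` parameter are hypothetical and exist only for illustration; this is not the library's code. Real scikit-learn transformers typically work on 2D arrays or DataFrames and inherit from `BaseEstimator` and `TransformerMixin`, which this sketch leaves out.

```python
class SimpleOrdinalEncoder:
    """Hypothetical sketch of a fit/transform style categorical encoder."""

    def __init__(self, unknown_value=-1):
        # Code returned for categories that were not seen during fit.
        self.unknown_value = unknown_value

    def fit(self, X, y=None):
        # Learn one integer code per distinct category,
        # assigned in order of first appearance.
        self.mapping_ = {}
        for value in X:
            if value not in self.mapping_:
                self.mapping_[value] = len(self.mapping_)
        return self

    def transform(self, X):
        # Apply the learned mapping; unseen categories get unknown_value.
        return [self.mapping_.get(value, self.unknown_value) for value in X]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


encoder = SimpleOrdinalEncoder()
train_codes = encoder.fit_transform(["red", "green", "red", "blue"])
test_codes = encoder.transform(["blue", "purple"])
```

The point of the convention is that the mapping is learned once on training data and then reused unchanged on new data, which is what lets such an encoder slot into a larger modeling workflow.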
Scikit-learn is an extremely popular Python package that builds on NumPy and SciPy to provide rich machine learning functionality. It's one of the most active Python open source projects and has a reputation for being extremely high quality.
In the past year or so, some of the core scikit-learn developers started a project called scikit-learn-contrib, which focuses on providing a collection of scikit-learn compatible libraries that are both easy to use and easy to install. Unlike scikit-learn itself, algorithms implemented in contrib libraries may be experimental or less mature.
The projects currently in scikit-learn-contrib are:
lightning
Large-scale linear classification, regression and ranking.
Maintained by Mathieu Blondel and Fabian Pedregosa.
py-earth
A Python implementation of Jerome Friedman’s Multivariate Adaptive Regression Splines.
Maintained by Jason Rudy and Mehdi.
imbalanced-learn
Python module to perform under-sampling and over-sampling with various techniques.
Maintained by Guillaume Lemaitre, Fernando Nogueira, Dayvid Oliveira and Christos Aridas.
polylearn
Factorization machines and polynomial networks for classification and regression in Python.
Maintained by Vlad Niculae.
forest-confidence-interval
Confidence intervals for scikit-learn forest algorithms.
Maintained by Ariel Rokem, Kivan Polimis and Bryna Hazelton.
hdbscan
A high performance implementation of HDBSCAN clustering.
Maintained by Leland McInnes, jc-healy, c-north and Steve Astels.
And now:
Category Encoders!
Check it out in its new home, take a look at the other great projects, and if you want to help keep pushing it forward, let me know.
Translated from: https://www.pybloggers.com/2016/11/category-encoders-accepted-into-scikit-learn-contrib/