15 Undiscovered Open-Source Machine Learning Frameworks You Need to Know in 2020

cumian9828

Original article: https://www.freecodecamp.org/news/15-undiscovered-open-source-machine-learning-frameworks-you-need-to-know-in-2020/

Machine Learning (ML) is one of the fastest-growing technologies today, and its application to different areas of computing is rapidly gaining popularity.

This is not only because of cheap and powerful hardware. It's also because of the increasing availability of free and open-source machine learning frameworks, which let developers implement machine learning easily.

This wide range of open-source machine learning frameworks lets data scientists and machine learning engineers build, deploy, and maintain machine learning systems, start new projects, and create new and impactful applications.

Choosing a machine learning framework or library means assessing what is right for your use case. Several factors are important for this assessment, such as:

Ease of use.
Support in the market (community).
Running speed.
Openness.
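
One way to make this assessment concrete is a simple weighted scoring matrix. The sketch below is only an illustration: the weights, the 0-5 ratings, and the names `framework_a` and `framework_b` are made-up placeholders, not measurements of any real framework.

```python
# Hypothetical weights for the four factors above; they sum to 1.
WEIGHTS = {"ease_of_use": 0.4, "community": 0.3, "speed": 0.2, "openness": 0.1}


def weighted_score(ratings, weights=WEIGHTS):
    """Return the weighted sum of the 0-5 ratings for one framework."""
    return sum(weights[name] * ratings[name] for name in weights)


def rank_frameworks(candidates, weights=WEIGHTS):
    """Return candidate names sorted from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


# Made-up ratings for two placeholder frameworks.
candidates = {
    "framework_a": {"ease_of_use": 5, "community": 2, "speed": 3, "openness": 4},
    "framework_b": {"ease_of_use": 3, "community": 5, "speed": 4, "openness": 4},
}

ranking = rank_frameworks(candidates)
print(ranking)
```

Changing the weights is the point of the exercise: a team that cares mostly about running speed would raise that weight and may end up with a different ranking.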

Who's this article for?

This article is for those who want to put their knowledge into practice after learning the theory.

It's also for those who want to explore other potential open-source machine learning frameworks for their future ML projects.

Here is the list of undiscovered open-source frameworks and libraries that businesses and individuals can use to build machine learning systems.

1. Blocks

Blocks is a framework that helps you build neural network models on top of Theano. Currently, it supports constructing parametrized Theano operations called "bricks", pattern matching to select variables and bricks in large models, algorithms to optimize your model, and saving and resuming of training. (Source: Blocks repository)

You can also learn about Fuel, the data processing engine developed primarily for Blocks.

Programming Language: Python
GitHub link: https://github.com/mila-iqia/blocks

2. Analytics Zoo

Analytics Zoo provides a unified data analytics and AI platform that seamlessly unites TensorFlow, Keras, PyTorch, Spark, Flink, and Ray programs into an integrated pipeline, which can transparently scale from a laptop to large clusters to process production big data. (Source: Analytics Zoo repository)

Consider using Analytics Zoo to develop your AI solution when:

You want to easily prototype AI models.
Scaling matters to you.
You want to add automation, such as feature engineering and model selection, to your machine learning pipeline.

This project is maintained by Intel-analytics.

Programming Language: Python
GitHub link: https://github.com/intel-analytics/analytics-zoo

3. ml5.js

ml5.js aims to make machine learning approachable for a broad audience of artists, creative coders, and students. The library provides access to machine learning algorithms and models in the browser, building on top of TensorFlow.js. (Source: ml5.js repository)

ml5.js is inspired by Processing and p5.js.

This open-source project is developed and maintained by NYU's Interactive Telecommunications/Interactive Media Arts program and by artists, designers, students, technologists, and developers across the world.

NOTE: This project is currently in development.

Programming Language: JavaScript
GitHub link: https://github.com/ml5js/ml5-library

4. AdaNet

AdaNet is a lightweight TensorFlow-based framework for automatically learning high-quality models with minimal expert intervention. AdaNet builds on recent AutoML efforts to be fast and flexible while providing learning guarantees. Importantly, AdaNet provides a general framework for not only learning a neural network architecture but also for learning to ensemble to obtain even better models. (Source: AdaNet repository)

AdaNet provides a familiar API, like Keras, for training, evaluating, and serving your models in production.

Programming Language: Python
GitHub link: https://github.com/tensorflow/adanet

5. Mljar

If you are looking for a platform to create prototype models and a deployment service, Mljar is a good choice. Mljar searches different algorithms and performs hyper-parameter tuning to find the best model.

It also provides quick results by running all computations in the cloud and finally creating ensemble models. Then it creates markdown reports from the AutoML training.
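
The general idea of searching over hyper-parameters and then ensembling the best candidates can be sketched in plain Python. This is not Mljar's API or its actual search algorithm: the tiny k-nearest-neighbour regressor, the toy data, and all function names below are assumptions chosen only to keep the example self-contained.

```python
def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def validation_mse(train, valid, k):
    """Mean squared error of the k-NN regressor on the validation set."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in valid) / len(valid)


def search(train, valid, k_values):
    """Evaluate every candidate k and return (k, error) pairs, best first."""
    results = [(k, validation_mse(train, valid, k)) for k in k_values]
    return sorted(results, key=lambda pair: pair[1])


def ensemble_predict(train, x, ks):
    """Average the predictions of several tuned models (a simple ensemble)."""
    return sum(knn_predict(train, x, k) for k in ks) / len(ks)


# Toy data following y = 2x; validation points sit halfway between training points.
train = [(float(x), 2.0 * x) for x in range(10)]
valid = [(x + 0.5, 2.0 * x + 1.0) for x in range(9)]

leaderboard = search(train, valid, k_values=[1, 2, 5])
best_k = leaderboard[0][0]
top_two = [k for k, _ in leaderboard[:2]]
print("best k:", best_k)
print("ensemble prediction at x=4.5:", ensemble_predict(train, 4.5, top_two))
```

An AutoML tool automates this loop over many algorithms and much larger search spaces; the sketch only shows the shape of the loop.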

Mljar can train ML models for:

Binary classification.
Multi-class classification.
Regression.

Mljar provides two types of interfaces:

A Python wrapper over the Mljar API.
Running machine learning models in your web browser.

Programming Language: Python
GitHub link: https://github.com/mljar/mljar-supervised

6. ConvNetJS

Deep Learning in JavaScript. Train Convolutional Neural Networks (or ordinary ones) in your browser. (Source: ConvNetJS repository)

Like TensorFlow.js, ConvNetJS is a JavaScript library that supports training different deep learning models in your web browser. You don't need GPUs or other heavy software.

ConvNetJS supports:

Neural network modules.
Training convolutional networks for images.
Regression and classification cost functions.
A reinforcement learning module, based on Deep Q Learning.

Note: Not actively maintained.

Programming Language: JavaScript
GitHub link: https://github.com/karpathy/convnetjs

7. NNI (Neural Network Intelligence)

NNI (Neural Network Intelligence) is a lightweight but powerful toolkit that helps users automate feature engineering, neural architecture search, hyper-parameter tuning, and model compression. The tool manages automated machine learning (AutoML) experiments, and dispatches and runs the trial jobs generated by tuning algorithms to search for the best neural architecture and/or hyper-parameters in different training environments such as a local machine, remote servers, OpenPAI, Kubeflow, and other cloud options. (Source: NNI repository)

When you should consider using NNI:

If you want to try different AutoML algorithms.
If you want to run AutoML trial jobs in different environments.
If you want to support AutoML in your platform.

NOTE: NNI is an open-source project by Microsoft.

Programming Language: Python
GitHub link: https://github.com/Microsoft/nni

8. Datumbox

The Datumbox Machine Learning Framework is an open-source framework written in Java that allows the rapid development of machine learning and statistical applications. The main focus of the framework is to include a large number of machine learning algorithms and statistical methods and to be able to handle large datasets. (Source: Datumbox repository)

Datumbox provides a number of pre-trained models for different tasks such as spam detection, sentiment analysis, language detection, topic classification, and so on.

Programming Language: Java
GitHub link: https://github.com/datumbox/datumbox-framework

9. XAI (An eXplainability toolbox for ML)

XAI is a machine learning library that is designed with AI explainability at its core. XAI contains various tools that enable analysis and evaluation of data and models. The XAI library is maintained by The Institute for Ethical AI & ML, and it was developed based on the 8 principles for Responsible Machine Learning. (Source: XAI repository)

The 8 principles for Responsible Machine Learning include:

Human augmentation
Bias evaluation
Explainability by justification
Reproducible operations
Displacement strategy
Practical accuracy
Trust by privacy
Data risk awareness

To learn more about XAI, you can check out this talk at TensorFlow London. It contains insight into the definitions and principles of this library.

XAI is currently in early-stage development; the current version is 0.05 (Alpha).

Programming Language: Python
GitHub link: https://github.com/EthicalML/xai

10. Plato

Plato is a flexible framework for developing conversational AI agents in different environments. Plato was designed both for users with a limited background in conversational AI and for seasoned researchers in the field. It provides a clean and understandable design, integrates with existing deep learning and Bayesian optimization frameworks, and reduces the need to write code.

It supports interactions through text, speech, and dialogue acts. To learn how the Plato Research Dialogue System works, read the article here.

NOTE: Plato is an open-source project by Uber.

Programming Language: Python
GitHub link: https://github.com/uber-research/plato-research-dialogue-system

11. DeepDetect

DeepDetect is a machine learning API and server written in C++. It makes state-of-the-art machine learning easy to work with and integrate into existing applications.

DeepDetect implements support for supervised and unsupervised deep learning of images, text, time series, and other data, with a focus on simplicity and ease of use, test, and connection into existing applications. It supports classification, object detection, segmentation, regression, and autoencoders. (Source: DeepDetect repository)

DeepDetect relies on external machine learning libraries such as:

The gradient boosting library XGBoost.
Deep learning libraries (Caffe, TensorFlow, Caffe2, Torch, NCNN, and Dlib).
Clustering with t-SNE.
Similarity search with Annoy and FAISS.

DeepDetect is designed, implemented, and supported by Jolibrain with the help of other contributors.

Programming Language: C++
GitHub link: https://github.com/jolibrain/deepdetect

12. Streamlit

Streamlit: the fastest way to build custom ML tools.

Streamlit is a tool that allows data scientists, ML engineers, and developers to quickly build highly interactive web applications for their machine learning projects.

Streamlit doesn't require any knowledge of web development. If you know Python, then you're good to go!

It also supports hot-reloading, which means your app updates live while you're editing and saving your files.

Programming Language: JavaScript & Python
GitHub link: https://github.com/streamlit/streamlit

13. Dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. It aims to fill the need for a small, easily grokked codebase in which users can freely experiment with wild ideas (speculative research). (Source: Dopamine repository)

The design principles for Dopamine include:

Easy experimentation.
Flexible development.
Compact and reliable.
Reproducible.

In 2019, Dopamine switched its network definitions to use tf.keras.Model. The previous tf.contrib.slim based networks have been removed.

To learn how to use Dopamine, check out the Colaboratory notebooks.

Note: Dopamine is an open-source project from Google.

Programming Language: Python
GitHub link: https://github.com/google/dopamine

14. TuriCreate

TuriCreate is an open-source toolset for creating custom Core ML models.

With TuriCreate you can accomplish different ML tasks such as image classification, sound classification, object detection, style transfer, activity classification, image similarity recommendation, text classification, and clustering.

The framework is simple to use, flexible, and visual. It works on large datasets and is ready to deploy. The trained models can be used right away in iOS, macOS, tvOS, and watchOS apps without any extra conversion.

Check out the TuriCreate talks at WWDC 2019 and WWDC 2018 to learn more about TuriCreate.

NOTE: TuriCreate is an open-source project by Apple.

Programming Language: Python
GitHub link: https://github.com/apple/turicreate

15. Flair

Flair is a simple natural language processing (NLP) framework, developed and open-sourced by the Humboldt University of Berlin. Flair is an official part of the PyTorch ecosystem and is used in hundreds of industrial and academic projects.

Flair allows you to apply its state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS), sense disambiguation, and classification. (Source: Flair repository)

Flair outperforms the previous best methods on a range of NLP tasks: named entity recognition, part-of-speech tagging, and chunking.

[Table: Flair compared with the previous best methods, reported as F1 scores]

Note: The F1 score is an evaluation metric primarily used for classification tasks. It is the harmonic mean of precision and recall, so it takes the distribution of the classes into account and is more informative than plain accuracy when the classes are imbalanced.
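
To make the metric concrete, here is a minimal sketch of the F1 computation for a binary task. The labels are made up for illustration, and the function is a plain-Python stand-in, not the evaluation code used by Flair (sequence-labelling benchmarks usually count whole entities or chunks rather than single labels).

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up, imbalanced labels: 3 positives and 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = f1_score(y_true, y_pred)
print("accuracy:", accuracy)
print("F1:", score)
```

Here the classifier gets 6 of 8 labels right, but it finds only 2 of the 3 positives and raises 1 false alarm, which the F1 score reflects and the accuracy partly hides.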

Learn how to perform text classification using Flair embeddings in this article.

Programming Language: Python
GitHub link: https://github.com/flairNLP/flair

Conclusion

Before you start to build a machine learning application, you need to select one ML framework from the many options out there. This can be a difficult task.

Therefore, it's important to evaluate several options before making a final decision. The open-source machine learning frameworks mentioned above can help anyone build machine learning models efficiently and easily.

Are you wondering what the most popular machine learning frameworks are? Here is the list that most data scientists and machine learning engineers use most of the time.