Pure Python vs NumPy vs TensorFlow Performance Comparison

Python has a design philosophy that stresses allowing programmers to express concepts readably and in fewer lines of code. This philosophy makes the language suitable for a diverse set of use cases: simple scripts for web, large web applications (like YouTube), scripting language for other platforms (like Blender and Autodesk’s Maya), and scientific applications in several areas, such as astronomy, meteorology, physics, and data science.

It is technically possible to implement scalar and matrix calculations using Python lists. However, this can be unwieldy, and performance is poor when compared to languages suited for numerical computation, such as MATLAB or Fortran, or even some general purpose languages, such as C or C++.
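
To make the point about lists concrete, here is a minimal sketch of a matrix-vector product written with nested Python lists. The helper name `matvec` and the sample values are illustrative assumptions, not part of the original article.

```python
# Pure-Python matrix-vector product using nested lists (illustrative sketch).

def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    if any(len(row) != len(vector) for row in matrix):
        raise ValueError("each row must have as many entries as the vector")
    # One Python-level multiply-and-add per element: readable, but slow at scale.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


A = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
v = [10.0, 1.0]

result = matvec(A, v)
print(result)  # [12.0, 34.0, 56.0]
```

Every multiplication and addition here runs as interpreted Python, which is where much of the performance gap relative to compiled numerical code comes from.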

To circumvent this deficiency, several libraries have emerged that maintain Python’s ease of use while lending the ability to perform numerical calculations in an efficient manner. Two such libraries worth mentioning are NumPy (one of the pioneer libraries to bring efficient numerical computation to Python) and TensorFlow (a more recently rolled-out library focused more on deep learning algorithms).

  • NumPy provides support for large multidimensional arrays and matrices along with a collection of mathematical functions to operate on these elements. The project relies on well-known packages implemented in other languages (like Fortran) to perform efficient computations, bringing the user both the expressiveness of Python and a performance similar to MATLAB or Fortran.
  • TensorFlow is an open-source library for numerical computation originally developed by researchers and engineers working at the Google Brain team. The main focus of the library is to provide an easy-to-use API to implement practical machine learning algorithms and deploy them to run on CPUs, GPUs, or a cluster.
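
As a rough illustration of the NumPy description above, the following sketch performs a few array operations without any explicit Python loops. The array values are made up for the example.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
v = np.array([10.0, 1.0])

product = A @ v              # matrix-vector product, computed in compiled code
scaled = 2.0 * A + 1.0       # element-wise arithmetic, no explicit loops
col_means = A.mean(axis=0)   # mean of each column

print(product)    # [12. 34. 56.]
print(col_means)  # [3. 4.]
```

The expressions stay close to the mathematical notation, while the per-element work is delegated to NumPy's compiled routines.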

But how do these schemes compare? How much faster does the application run when implemented with NumPy instead of pure Python? What about TensorFlow? The purpose of this article is to begin to explore the improvements you can achieve by using these libraries.

To compare the performance of the three approaches, you’ll build a basic regression with native Python, NumPy, and TensorFlow.

Engineering the Test Data

To test the performance of the libraries, you’ll consider a simple two-parameter linear regression problem. The model has two parameters: an intercept term, w_0, and a single coefficient, w_1.

Given N pairs of inputs x and desired outputs d, the idea is to model the relationship between the outputs and the inputs using a linear model y = w_0 + w_1 * x where the output of the model y is approximately equal to the desired output d for every pair (x, d).
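
One way to read "approximately equal" is through an error measure. The sketch below evaluates the linear model in plain Python and scores it with the mean squared error. The choice of mean squared error, the function names `predict` and `mean_squared_error`, and the tiny noise-free data set are assumptions made for illustration.

```python
def predict(w0, w1, xs):
    """Evaluate the linear model y = w0 + w1 * x for every input x."""
    return [w0 + w1 * x for x in xs]


def mean_squared_error(ys, ds):
    """Average squared difference between model outputs and desired outputs."""
    return sum((y - d) ** 2 for y, d in zip(ys, ds)) / len(ds)


xs = [0.0, 1.0, 2.0]
ds = [3.0, 5.0, 7.0]  # exactly 3 + 2 * x, with no noise

good = mean_squared_error(predict(3.0, 2.0, xs), ds)  # the true parameters
bad = mean_squared_error(predict(0.0, 1.0, xs), ds)   # a poor guess

print(good)  # 0.0
print(bad)   # clearly larger than the error for the true parameters
```

Parameters that fit the pairs (x, d) well give a small error, so a fitting procedure can search for the w_0 and w_1 that make this number as small as possible.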

Technical Detail: The intercept term, w_0, is technically just a coefficient like w_1, but it can be interpreted as a coefficient that multiplies elements of a vector of 1s.
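
The following sketch shows this interpretation numerically: stacking a column of 1s next to x turns the model into a single matrix-vector product. The sample inputs and parameter values are arbitrary choices for the example.

```python
import numpy as np

x = np.array([0.0, 0.5, 1.0, 2.0])
w0, w1 = 3.0, 2.0

# Direct form: intercept added separately.
y_direct = w0 + w1 * x

# Matrix form: w0 multiplies a column of 1s, w1 multiplies the column x.
X = np.column_stack((np.ones_like(x), x))
w = np.array([w0, w1])
y_matrix = X @ w

print(X.shape)                          # (4, 2)
print(np.allclose(y_direct, y_matrix))  # True
```

Treating the intercept this way means both parameters can be handled uniformly as one weight vector.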

To generate the training set of the problem, use the following program:

    import numpy as np

    np.random.seed(444)

    N = 10000
    sigma = 0.1
    noise = sigma * np.random.randn(N)
    x = np.linspace(0, 2, N)
    d = 3 + 2 * x + noise
    d.shape = (N, 1)

    # We need to prepend a column vector of 1s to `x`.
    X = np.column_stack((np.ones(N, dtype=x.dtype), x))
    print(X.shape)  # prints (10000, 2)
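
As an optional sanity check on the generated data (not part of the original walkthrough), one could fit the two parameters with NumPy's built-in least-squares solver and confirm that they land near the values 3 and 2 used to build d. The sketch below regenerates the data so that it runs on its own, and it keeps d one-dimensional for simplicity, which is a small departure from the program above.

```python
import numpy as np

# Regenerate the training set so this check is self-contained.
np.random.seed(444)
N = 10000
sigma = 0.1
noise = sigma * np.random.randn(N)
x = np.linspace(0, 2, N)
d = 3 + 2 * x + noise  # kept 1-D here (assumption made for brevity)
X = np.column_stack((np.ones(N, dtype=x.dtype), x))

# Closed-form least-squares fit; used only as a check on the data.
w_hat, *_ = np.linalg.lstsq(X, d, rcond=None)
residual_std = np.std(d - X @ w_hat)

print(w_hat)         # expected to be close to [3, 2]
print(residual_std)  # expected to be close to sigma = 0.1
```

If the estimates were far from 3 and 2, that would point to a mistake in how the training set was built rather than in any later fitting code.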

This program creates a set of 10,000 inputs x evenly spaced over the interval from 0 to 2. It then creates a set of desired outputs d = 3 + 2 * x + noise, where noise is taken from a Gaussian (normal) distribution with zero mean and standard deviation sigma = 0.1.
