python - NumPy performance differences between Linux and Windows

I am trying to run sklearn.decomposition.TruncatedSVD() on 2 different computers and understand the performance difference.

Computer 1 (Windows 7, physical computer)

OS Name Microsoft Windows 7 Professional

System Type x64-based PC

Processor Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 3401 Mhz, 4 Core(s), 8 Logical Processor(s)

Installed Physical Memory (RAM) 8.00 GB

Total Physical Memory 7.89 GB

Computer 2 (Debian, Amazon cloud)

Architecture: x86_64

CPU op-mode(s): 32-bit, 64-bit

Byte Order: Little Endian

CPU(s): 8

width: 64 bits

capabilities: ldt16 vsyscall32

*-core

description: Motherboard

physical id: 0

*-memory

description: System memory

physical id: 0

size: 29GiB

*-cpu

product: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz

vendor: Intel Corp.

physical id: 1

bus info: cpu@0

width: 64 bits

Computer 3 (Windows 2008 R2, Amazon cloud)

OS Name Microsoft Windows Server 2008 R2 Datacenter

Version 6.1.7601 Service Pack 1 Build 7601

System Type x64-based PC

Processor Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz, 2500 Mhz, 4 Core(s), 8 Logical Processor(s)

Installed Physical Memory (RAM) 30.0 GB

The computers run Python 3.2 and the same versions of sklearn, numpy, and scipy.

I ran cProfile as follows:

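import cProfile
from sklearn.decomposition import TruncatedSVD

# vectors is the input matrix being decomposed
print(vectors.shape)
# (7500, 2042)

_decomp = TruncatedSVD(n_components=680, random_state=1)

# runctx evaluates the statement string in the given namespaces,
# so the estimator is exposed under a global name
global _o
_o = _decomp

cProfile.runctx('_o.fit_transform(vectors)', globals(), locals(), sort=1)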
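For anyone who wants to reproduce this kind of report without my data: the `sort=1` argument above is the integer form of sorting by internal time (`tottime`). Here is a minimal stdlib-only sketch of the same profiling pattern. `busy_sum` and `profile_by_internal_time` are hypothetical names made up for illustration, and the toy loop only stands in for `_o.fit_transform(vectors)`.

```python
import cProfile
import io
import pstats


def busy_sum(n):
    """Toy workload standing in for the real fit_transform call."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_by_internal_time(func, *args, top=5):
    """Profile func(*args); return (result, text report sorted by tottime)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "tottime" is the named equivalent of the integer sort key 1.
    stats.sort_stats("tottime").print_stats(top)
    return result, buffer.getvalue()


result, report = profile_by_internal_time(busy_sum, 100000)
print(report)
```

Capturing the report in a string makes it easier to save one file per machine and diff them afterwards.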

Computer 1 output

>>> 833 function calls in 1.710 seconds

Ordered by: internal time

ncalls tottime percall cumtime percall filename:lineno(function)

1 0.767 0.767 0.782 0.782 decomp_svd.py:15(svd)

1 0.249 0.249 0.249 0.249 {method 'enable' of '_lsprof.Profiler' objects}

1 0.183 0.183 0.183 0.183 {method 'normal' of 'mtrand.RandomState' objects}

6 0.174 0.029 0.174 0.029 {built-in method csr_matvecs}

6 0.123 0.021 0.123 0.021 {built-in method csc_matvecs}

2 0.110 0.055 0.110 0.055 decomp_qr.py:14(safecall)

1 0.035 0.035 0.035 0.035 {built-in method dot}

1 0.020 0.020 0.589 0.589 extmath.py:185(randomized_range_finder)

2 0.018 0.009 0.019 0.010 function_base.py:532(asarray_chkfinite)

24 0.014 0.001 0.014 0.001 {method 'ravel' of 'numpy.ndarray' objects}

1 0.007 0.007 0.009 0.009 twodim_base.py:427(triu)

1 0.004 0.004 1.710 1.710 extmath.py:232(randomized_svd)

Computer 2 output

>>> 858 function calls in 40.145 seconds

Ordered by: internal time

ncalls tottime percall cumtime percall filename:lineno(function)

2 32.116 16.058 32.116 16.058 {built-in method dot}

1 6.148 6.148 6.156 6.156 decomp_svd.py:15(svd)

2 0.561 0.281 0.561 0.281 decomp_qr.py:14(safecall)

6 0.561 0.093 0.561 0.093 {built-in method csr_matvecs}

1 0.337 0.337 0.337 0.337 {method 'normal' of 'mtrand.RandomState' objects}

6 0.202 0.034 0.202 0.034 {built-in method csc_matvecs}

1 0.052 0.052 1.633 1.633 extmath.py:183(randomized_range_finder)

1 0.045 0.045 0.054 0.054 _methods.py:73(_var)

1 0.023 0.023 0.023 0.023 {method 'argmax' of 'numpy.ndarray' objects}

1 0.023 0.023 0.046 0.046 extmath.py:531(svd_flip)

1 0.016 0.016 40.145 40.145 <string>:1(<module>)

24 0.011 0.000 0.011 0.000 {method 'ravel' of 'numpy.ndarray' objects}

6 0.009 0.002 0.009 0.002 {method 'reduce' of 'numpy.ufunc' objects}

2 0.008 0.004 0.009 0.004 function_base.py:532(asarray_chkfinite)

Computer 3 output

>>> 858 function calls in 2.223 seconds

Ordered by: internal time

ncalls tottime percall cumtime percall filename:lineno(function)

1 0.956 0.956 0.972 0.972 decomp_svd.py:15(svd)

2 0.306 0.153 0.306 0.153 {built-in method dot}

1 0.274 0.274 0.274 0.274 {method 'normal' of 'mtrand.RandomState' objects}

6 0.205 0.034 0.205 0.034 {built-in method csr_matvecs}

6 0.151 0.025 0.151 0.025 {built-in method csc_matvecs}

2 0.133 0.067 0.133 0.067 decomp_qr.py:14(safecall)

1 0.032 0.032 0.043 0.043 _methods.py:73(_var)

1 0.030 0.030 0.030 0.030 {method 'argmax' of 'numpy.ndarray' objects}

24 0.026 0.001 0.026 0.001 {method 'ravel' of 'numpy.ndarray' objects}

2 0.019 0.010 0.020 0.010 function_base.py:532(asarray_chkfinite)

1 0.019 0.019 0.773 0.773 extmath.py:183(randomized_range_finder)

1 0.019 0.019 0.049 0.049 extmath.py:531(svd_flip)

Note the {built-in method dot} difference, from 0.035 s/call to 16.058 s/call: roughly 450 times slower!!

-------+---------+---------+---------+---------+---------------------------+-----------
ncalls | tottime | percall | cumtime | percall | filename:lineno(function) | HARDWARE
-------+---------+---------+---------+---------+---------------------------+-----------
     1 |   0.035 |   0.035 |   0.035 |   0.035 | {built-in method dot}     | Computer 1
     2 |  32.116 |  16.058 |  32.116 |  16.058 | {built-in method dot}     | Computer 2
     2 |   0.306 |   0.153 |   0.306 |   0.153 | {built-in method dot}     | Computer 3

I understand that there should be some performance difference, but should it be this large?

Is there a way I can debug this performance issue further?
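One thing that might narrow it down is timing `dot` on its own, outside sklearn, with the same script on every machine. Below is a rough sketch of that idea. The matrix sizes are placeholders I picked so it finishes quickly, not the operand shapes sklearn actually passes to `dot`, and `time_dot` is a hypothetical helper name.

```python
import timeit

import numpy as np


def time_dot(rows=500, inner=400, cols=300, repeats=3, seed=1):
    """Time a dense matrix product; return (best seconds, result shape)."""
    rng = np.random.RandomState(seed)
    a = rng.normal(size=(rows, inner))
    b = rng.normal(size=(inner, cols))
    # number=1 times a single product per run; repeat gives several samples.
    timings = timeit.repeat(lambda: np.dot(a, b), number=1, repeat=repeats)
    return min(timings), np.dot(a, b).shape


best, shape = time_dot()
print("best of 3: %.4f s for a result of shape %s" % (best, shape))
```

If this small benchmark shows the same kind of gap between the machines, the slowdown is probably in the NumPy installation itself and not in sklearn or in my data.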

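EDIT

I tested a new computer, Computer 3, whose hardware is similar to Computer 2 but which runs a different operating system.

The result is 0.153 s per call of {built-in method dot}, which is still about 100 times faster than Linux!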
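To make the ratios explicit, here is the arithmetic on the per-call times reported in the profiles above. The only things assumed are the dictionary layout and the hypothetical helper name `slowdown`.

```python
# Per-call times (seconds) of {built-in method dot}, copied from the profiles.
per_call = {"Computer 1": 0.035, "Computer 2": 16.058, "Computer 3": 0.153}


def slowdown(slow, fast):
    """How many times longer one dot call takes on `slow` than on `fast`."""
    return per_call[slow] / per_call[fast]


c2_vs_c1 = slowdown("Computer 2", "Computer 1")
c2_vs_c3 = slowdown("Computer 2", "Computer 3")
print("Computer 2 vs Computer 1: %.1fx" % c2_vs_c1)
print("Computer 2 vs Computer 3: %.1fx" % c2_vs_c3)
```

That gives roughly 459x and 105x. One caveat: Computer 1 reports a single `dot` call while the other two report two, so the per-call numbers may not cover exactly the same operations.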

EDIT 2

Computer 1 numpy configuration

>>> np.__config__.show()

lapack_opt_info:

libraries = ['mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd', 'mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd']

library_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/lib/intel64']

define_macros = [('SCIPY_MKL_H', None)]

include_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/include']

blas_opt_info:

libraries = ['mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd']

library_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/lib/intel64']

define_macros = [('SCIPY_MKL_H', None)]

include_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/include']

openblas_info:

NOT AVAILABLE

lapack_mkl_info:

libraries = ['mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd', 'mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd']

library_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/lib/intel64']

define_macros = [('SCIPY_MKL_H', None)]

include_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/include']

blas_mkl_info:

libraries = ['mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd']

library_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/lib/intel64']

define_macros = [('SCIPY_MKL_H', None)]

include_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/include']

mkl_info:

libraries = ['mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd']

library_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/lib/intel64']

define_macros = [('SCIPY_MKL_H', None)]

include_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/include']

Computer 2 numpy configuration

>>> np.__config__.show()

lapack_info:

NOT AVAILABLE

lapack_opt_info:

NOT AVAILABLE

blas_info:

libraries = ['blas']

library_dirs = ['/usr/lib']

language = f77

atlas_threads_info:

NOT AVAILABLE

atlas_blas_info:

NOT AVAILABLE

lapack_src_info:

NOT AVAILABLE

openblas_info:

NOT AVAILABLE

atlas_blas_threads_info:

NOT AVAILABLE

blas_mkl_info:

NOT AVAILABLE

blas_opt_info:

libraries = ['blas']

library_dirs = ['/usr/lib']

language = f77

define_macros = [('NO_ATLAS_INFO', 1)]

atlas_info:

NOT AVAILABLE

lapack_mkl_info:

NOT AVAILABLE

mkl_info:

NOT AVAILABLE
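Comparing the two listings, Computer 1 reports MKL libraries under blas_opt_info, while Computer 2 reports only a generic 'blas' in /usr/lib with NO_ATLAS_INFO defined. That may be relevant to the `dot` timings, although the listings alone do not prove it. Here is a small sketch that makes the comparison mechanical. The keyword list and the helper name `classify_blas` are my own assumptions for illustration and are not part of numpy.

```python
# Assumption: substrings that suggest an optimized BLAS build; not exhaustive.
OPTIMIZED_HINTS = ("mkl", "openblas", "atlas")


def classify_blas(libraries):
    """Return the first optimized-BLAS hint found in the names, else 'generic'."""
    names = [name.lower() for name in libraries]
    for hint in OPTIMIZED_HINTS:
        if any(hint in name for name in names):
            return hint
    return "generic"


# 'libraries' entries of blas_opt_info, copied from the two listings above.
computer1 = ['mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64',
             'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd']
computer2 = ['blas']

print("Computer 1:", classify_blas(computer1))
print("Computer 2:", classify_blas(computer2))
```

Getting the same `np.__config__.show()` output from Computer 3 would show whether the pattern holds there too.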
