Python computation speed on Windows and Linux: numpy performance differences between Linux and Windows

Latest recommended article published on 2021-05-10 09:45:31

weixin_39588679


I am trying to run sklearn.decomposition.TruncatedSVD() on two different computers and understand the performance difference between them.

Computer 1 (Windows 7, physical computer)

OS Name Microsoft Windows 7 Professional

System Type x64-based PC

Processor Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 3401 Mhz, 4 Core(s), 8 Logical Processor(s)

Installed Physical Memory (RAM) 8.00 GB

Total Physical Memory 7.89 GB

Computer 2 (Debian, on the Amazon cloud)

^{pr2}$

Computer 3 (Windows 2008 R2, on the Amazon cloud)

OS Name Microsoft Windows Server 2008 R2 Datacenter

Version 6.1.7601 Service Pack 1 Build 7601

System Type x64-based PC

Processor Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz, 2500 Mhz, 4 Core(s), 8 Logical Processor(s)

Installed Physical Memory (RAM) 30.0 GB

Both computers run Python 3.2 and identical versions of sklearn, numpy, and scipy.

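I ran cProfile as follows:

import cProfile

from sklearn.decomposition import TruncatedSVD

# vectors is the input matrix; its construction is not shown here

print(vectors.shape)  # prints (7500, 2042)

_decomp = TruncatedSVD(n_components=680, random_state=1)

# expose the estimator as a global so the profiled statement can reach it

global _o

_o = _decomp

# sort=1 orders the report by internal time

cProfile.runctx('_o.fit_transform(vectors)', globals(), locals(), sort=1)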

Computer 1 output:

833 function calls in 1.710 seconds

Ordered by: internal time

ncalls tottime percall cumtime percall filename:lineno(function)

1 0.767 0.767 0.782 0.782 decomp_svd.py:15(svd)

1 0.249 0.249 0.249 0.249 {method 'enable' of '_lsprof.Profiler' objects}

1 0.183 0.183 0.183 0.183 {method 'normal' of 'mtrand.RandomState' objects}

6 0.174 0.029 0.174 0.029 {built-in method csr_matvecs}

6 0.123 0.021 0.123 0.021 {built-in method csc_matvecs}

2 0.110 0.055 0.110 0.055 decomp_qr.py:14(safecall)

1 0.035 0.035 0.035 0.035 {built-in method dot}

1 0.020 0.020 0.589 0.589 extmath.py:185(randomized_range_finder)

2 0.018 0.009 0.019 0.010 function_base.py:532(asarray_chkfinite)

24 0.014 0.001 0.014 0.001 {method 'ravel' of 'numpy.ndarray' objects}

1 0.007 0.007 0.009 0.009 twodim_base.py:427(triu)

1 0.004 0.004 1.710 1.710 extmath.py:232(randomized_svd)

Computer 2 output:

858 function calls in 40.145 seconds

Ordered by: internal time

ncalls tottime percall cumtime percall filename:lineno(function)

2 32.116 16.058 32.116 16.058 {built-in method dot}

1 6.148 6.148 6.156 6.156 decomp_svd.py:15(svd)

2 0.561 0.281 0.561 0.281 decomp_qr.py:14(safecall)

6 0.561 0.093 0.561 0.093 {built-in method csr_matvecs}

1 0.337 0.337 0.337 0.337 {method 'normal' of 'mtrand.RandomState' objects}

6 0.202 0.034 0.202 0.034 {built-in method csc_matvecs}

1 0.052 0.052 1.633 1.633 extmath.py:183(randomized_range_finder)

1 0.045 0.045 0.054 0.054 _methods.py:73(_var)

1 0.023 0.023 0.023 0.023 {method 'argmax' of 'numpy.ndarray' objects}

1 0.023 0.023 0.046 0.046 extmath.py:531(svd_flip)

1 0.016 0.016 40.145 40.145 <string>:1(<module>)

24 0.011 0.000 0.011 0.000 {method 'ravel' of 'numpy.ndarray' objects}

6 0.009 0.002 0.009 0.002 {method 'reduce' of 'numpy.ufunc' objects}

2 0.008 0.004 0.009 0.004 function_base.py:532(asarray_chkfinite)

Computer 3 output:

858 function calls in 2.223 seconds

Ordered by: internal time

ncalls tottime percall cumtime percall filename:lineno(function)

1 0.956 0.956 0.972 0.972 decomp_svd.py:15(svd)

2 0.306 0.153 0.306 0.153 {built-in method dot}

1 0.274 0.274 0.274 0.274 {method 'normal' of 'mtrand.RandomState' objects}

6 0.205 0.034 0.205 0.034 {built-in method csr_matvecs}

6 0.151 0.025 0.151 0.025 {built-in method csc_matvecs}

2 0.133 0.067 0.133 0.067 decomp_qr.py:14(safecall)

1 0.032 0.032 0.043 0.043 _methods.py:73(_var)

1 0.030 0.030 0.030 0.030 {method 'argmax' of 'numpy.ndarray' objects}

24 0.026 0.001 0.026 0.001 {method 'ravel' of 'numpy.ndarray' objects}

2 0.019 0.010 0.020 0.010 function_base.py:532(asarray_chkfinite)

1 0.019 0.019 0.773 0.773 extmath.py:183(randomized_range_finder)

1 0.019 0.019 0.049 0.049 extmath.py:531(svd_flip)
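When comparing profiles like the three above, it can help to pull out a single row programmatically instead of reading it by eye. The following is a minimal sketch using only the standard library; the helper name `per_call_time` and the stand-in `workload` function are hypothetical and not part of the original question, and the filter string would be `"dot"` for the case discussed here.

```python
import cProfile
import pstats


def per_call_time(profiler, name_fragment):
    """Return {label: (ncalls, tottime, tottime per call)} for every
    profiled function whose name contains name_fragment."""
    stats = pstats.Stats(profiler)
    rows = {}
    # stats.stats maps (filename, lineno, funcname) to
    # (primitive calls, total calls, tottime, cumtime, callers).
    for (filename, lineno, funcname), (cc, nc, tt, ct, callers) in stats.stats.items():
        if name_fragment in funcname:
            label = f"{filename}:{lineno}({funcname})"
            rows[label] = (nc, tt, tt / nc if nc else 0.0)
    return rows


def workload():
    # Stand-in workload (hypothetical); replace with the call under investigation.
    return sum(sorted(range(50_000), reverse=True))


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

for label, (ncalls, tottime, percall) in per_call_time(profiler, "sorted").items():
    print(f"{ncalls:6d} {tottime:8.3f} {percall:8.3f}  {label}")
```

Running the same script on each machine and comparing the printed rows gives the per-call figures without scanning the full report.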

Note the difference in {built-in method dot}: from 0.035 s/call to 16.058 s/call, roughly 450 times slower!

ncalls | tottime | percall | cumtime | percall | filename:lineno(function) | machine
-------+---------+---------+---------+---------+---------------------------+-----------
1      | 0.035   | 0.035   | 0.035   | 0.035   | {built-in method dot}     | Computer 1
2      | 32.116  | 16.058  | 32.116  | 16.058  | {built-in method dot}     | Computer 2
2      | 0.306   | 0.153   | 0.306   | 0.153   | {built-in method dot}     | Computer 3
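The slowdown factors can be checked directly from the per-call times in the table above. This is only arithmetic on the reported numbers; the dictionary name `percall` is chosen here for illustration.

```python
# Per-call internal time (seconds) of {built-in method dot},
# copied from the comparison table.
percall = {
    "Computer 1": 0.035,
    "Computer 2": 16.058,
    "Computer 3": 0.153,
}

slowest = percall["Computer 2"]
for machine, seconds in percall.items():
    ratio = slowest / seconds
    print(f"{machine}: {seconds:.3f} s/call, Computer 2 is {ratio:.0f}x slower")
```

This gives about 459x relative to Computer 1 and about 105x relative to Computer 3. Note that Computer 1 reports one call to dot while Computers 2 and 3 report two, so the per-call comparison is approximate.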

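I understand that there should be some performance difference, but should it be this large?

Is there a way I can debug this performance issue further?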
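One possible way to narrow this down, offered as a sketch rather than a definitive diagnosis, is to time a dense `np.dot` in isolation on each machine and print which BLAS/LAPACK libraries numpy was built against. The function name `time_dot` and the default matrix sizes are assumptions chosen for illustration; they are not taken from the profiled run.

```python
import time

import numpy as np


def time_dot(rows=2000, inner=1000, cols=500, repeats=3, seed=1):
    """Return the best wall-clock time (seconds) of one dense np.dot call."""
    rng = np.random.RandomState(seed)
    a = rng.normal(size=(rows, inner))
    b = rng.normal(size=(inner, cols))
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Report the BLAS/LAPACK build information for this numpy install.
    np.show_config()
    print(f"best np.dot time: {time_dot():.3f} s")
```

If the gap between machines shows up in this benchmark alone, that would suggest the difference lies in the linear algebra library numpy is linked against rather than in sklearn itself.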

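Edit

I tested a new computer, Computer 3, whose hardware is similar to Computer 2's but which runs a different operating system.

The result is 0.153 s/call for {built-in method dot}, still about 100 times faster than on Linux!

Edit 2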

>>> np.__config__.show()

lapack_opt_info:

libraries = ['mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd', 'mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd']

library_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/lib/intel64']

define_macros = [('SCIPY_MKL_H', None)]

include_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/include']

blas_opt_info:

libraries = ['mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd']

library_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/lib/intel64']

define_macros = [('SCIPY_MKL_H', None)]

include_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/include']

openblas_info:

NOT AVAILABLE

lapack_mkl_info:

library_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/lib/intel64']