Test environment:
IDE: Jupyter Notebook
Java: SciJava
C++: xeus-cling
Python: Python 3.7
Julia: Julia 1.4.2
Installing the C++ kernel:
conda activate base
conda install anaconda
conda install xeus-cling -c conda-forge
1. Recursive computation of the Fibonacci sequence
1.1. Java
public static int fib(int n){
    if(n == 0)
        return 0;
    if(n == 1)
        return 1;
    else
        return fib(n-1) + fib(n-2);
}
long startTime=System.currentTimeMillis(); // start time
fib(40); // code under test
long endTime=System.currentTimeMillis(); // end time
Output: 102334155
Java time: 1.626s
1.2. C++
int fib(int n) {
    if(n == 0)
        return 0;
    if(n == 1)
        return 1;
    else
        return fib(n-1) + fib(n-2);
}
%timeit fib(40);
900 ms ± 10.5 ms per loop (mean ± std. dev. of 7 runs 1 loop each)
1.3. Python
import time
# recursive method
def fib(n):
    if n <= 1:
        return n
    else:
        return fib(n-1) + fib(n-2)
time_cost = 0
for _ in range(100):
    # time cost of the recursive method
    t1 = time.time()
    print(fib(40))
    t2 = time.time()
    time_cost += (t2-t1)
print('python time:%.3fs' % (time_cost/100))  # average seconds per call
The pure-Python version turns out to be extremely slow:
102334155
python time:43.716s
Let's see what Cython does:
%load_ext Cython
%%cython
def fib_cython(n):
    if n<2:
        return n
    return fib_cython(n-1)+fib_cython(n-2)
%timeit fib_cython(40)
11.4 s ± 227 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Adding numba's @jit gives a dramatic speedup:
import time
from numba import jit
# recursive method
@jit
def fib(n):
    if n <= 1:
        return n
    else:
        return fib(n-1) + fib(n-2)
time_cost = 0
for _ in range(100):
    # time cost of the recursive method
    t1 = time.time()
    print(fib(40))
    t2 = time.time()
    time_cost += (t2-t1)
print('python time:%.3fs' % (time_cost/100))  # average seconds per call
102334155
python time:0.950s
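Java is fast for the same reason: its JIT compiler is enabled by default.
With static compilation the Python version gets slightly faster still: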
102334155
python time:0.932s
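A JIT-compiled function is compiled on its first call, so that call is slower than the ones that follow. Below is a minimal sketch of a timing helper that makes one untimed warm-up call and then averages the remaining runs. The name `average_seconds` and its parameters are hypothetical, introduced here for illustration and not part of the original benchmark, and a plain Python `fib` stands in for the @jit version:

```python
import time


def average_seconds(func, *args, repeats=5):
    """Average wall-clock seconds per call, excluding one warm-up call."""
    func(*args)  # warm-up: for a JIT-compiled function this triggers compilation
    start = time.perf_counter()
    for _ in range(repeats):
        func(*args)
    return (time.perf_counter() - start) / repeats


def fib(n):
    # plain recursive version; decorate it with numba's @jit to time the compiled one
    return n if n <= 1 else fib(n - 1) + fib(n - 2)


print('average: %.6fs' % average_seconds(fib, 20))
```

Keeping the print call outside the timed region also avoids counting output time as compute time.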
For this particular problem there is a very effective speedup: memoizing the function with the lru_cache decorator from functools:
from functools import lru_cache as cache
@cache(maxsize=None)
def fib_cache(n):
    if n<2:
        return n
    return fib_cache(n-1)+fib_cache(n-2)
%time fib_cache(40)
CPU times: user 27 µs, sys: 9 µs, total: 36 µs
Wall time: 37.9 µs
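The cached version is this much faster because the naive recursion recomputes the same subproblems exponentially often, while memoization evaluates each argument only once. A small sketch that counts how many times the function body runs in each case. The helper names `count_naive_calls` and `count_cached_calls` are hypothetical, and n = 20 is used to keep the run short:

```python
from functools import lru_cache


def count_naive_calls(n):
    """Return (fib(n), number of times the naive recursive body ran)."""
    calls = 0

    def fib(k):
        nonlocal calls
        calls += 1
        if k < 2:
            return k
        return fib(k - 1) + fib(k - 2)

    return fib(n), calls


def count_cached_calls(n):
    """Same recursion, memoized: the body runs once per distinct argument."""
    calls = 0

    @lru_cache(maxsize=None)
    def fib(k):
        nonlocal calls
        calls += 1
        if k < 2:
            return k
        return fib(k - 1) + fib(k - 2)

    return fib(n), calls


print(count_naive_calls(20))   # (6765, 21891)
print(count_cached_calls(20))  # (6765, 21)
```

For this reason the lru_cache timing measures a different algorithm and is not directly comparable with the other results in this section.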
1.4 Julia
The code is almost identical to the Python version:
function fib(n)
    if n <= 1
        return n
    else
        return fib(n-1) + fib(n-2)
    end
end
@time fib(40)
julia time: 0.684886 seconds
102334155
1.5 Summary
Without any tricks, Julia reaches C++-level speed. Java benefits from its JIT here, but that advantage does not carry over to the numerical examples below.
2. for-loop arithmetic
2.1 Java
public static int forjava(int n){
    int s = 0;
    for(int i=0;i<=n;i++){
        s+=i*i;
    }
    return s;
}
long startTime=System.currentTimeMillis(); // start time
for(int j=0;j<1000;j++){
    forjava(1000000); // code under test
}
long endTime=System.currentTimeMillis(); // end time
(endTime - startTime)/1000.0 // average milliseconds per call over 1000 runs
0.438ms
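One caveat about the Java code above: for n = 1000000 both i*i and the running sum exceed the range of a 32-bit int, so the value returned has wrapped around. A minimal Python sketch of that wraparound follows. `wrap_int32` and `sum_squares_int32` are hypothetical helpers that emulate Java's two's-complement int arithmetic and are not part of the original benchmark:

```python
def wrap_int32(x):
    """Map an arbitrary integer onto the signed 32-bit range, as Java's int does."""
    return (x + 2**31) % 2**32 - 2**31


def sum_squares_int32(n):
    """Emulate the Java loop: every multiplication and addition wraps to 32 bits."""
    s = 0
    for i in range(n + 1):
        s = wrap_int32(s + wrap_int32(i * i))
    return s


# Python integers do not overflow, so this is the mathematically exact sum.
exact = sum(i * i for i in range(1_000_001))
print(exact, sum_squares_int32(1_000_000))
```

The timings remain meaningful as a measure of loop speed, but declaring s as long (and widening i*i) would be needed for a correct sum.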
2.2 C++
int forcpp(int n) {
    int sum = 0;
    for(int i=0;i<=n;i++){
        sum += i*i;
    }
    return sum;
}
%timeit forcpp(1000000);
2.26 ms ± 36.1 us per loop (mean ± std. dev. of 7 runs 100 loops each)
2.3 Python
def forpython(n):
    total = 0  # avoid shadowing the built-in sum()
    for i in range(n+1):
        total += i*i
    return total
%timeit forpython(1000000)
111 ms ± 1.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
With static compilation the speedup is dramatic:
189 ns ± 2.59 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
2.4 Julia
function forjulia(n::Int64)
    s = 0
    for i in 1:n
        s += i*i
    end
    s
end
@time forjulia(1000000)
0.000000 seconds
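Timings as small as 189 ns or 0.000000 seconds are far below the cost of executing one million loop iterations. One plausible explanation (an assumption, not verified in the original) is that the LLVM-based compilers behind numba and Julia replace the loop with the closed-form sum n(n+1)(2n+1)/6. A sketch that checks this identity against the loop, with hypothetical function names:

```python
def sum_squares_loop(n):
    """Sum of i*i for i = 0..n, computed one term at a time."""
    total = 0
    for i in range(n + 1):
        total += i * i
    return total


def sum_squares_closed_form(n):
    """Same sum in constant time: n(n+1)(2n+1)/6."""
    return n * (n + 1) * (2 * n + 1) // 6


for n in (0, 1, 10, 1000):
    assert sum_squares_loop(n) == sum_squares_closed_form(n)
print(sum_squares_closed_form(1_000_000))
```

If that is what happens, the compiled results in this section measure the optimizer rather than raw loop throughput.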
3. Floating-point computation
3.1 Java
double[] d = new double[1000000];
for(int i=0;i<1000000;i++){
    d[i] = Math.random();
}
public static double cjava(double[] d){
    double s = 0; // must be double: an int accumulator would truncate every addition
    for(int i=0;i<1000000;i++){
        s+=d[i]*d[i];
    }
    return s;
}
long startTime=System.currentTimeMillis(); // start time
for(int j=0;j<1000;j++){
    cjava(d); // code under test
}
long endTime=System.currentTimeMillis(); // end time
(endTime - startTime)/1000.0 // average milliseconds per call over 1000 runs
3.613ms
3.2 C++
#include <cstdio>
#include <cstdlib>
#include <ctime>
double *genArray(long n) {
    double *arr = new double[n]; // allocate an array of n elements
    srand(1024); // random seed
    for (int i = 0; i < n; i++){
        arr[i] = rand()/ double(RAND_MAX);
    }
    return arr;
}
double ccpp(double* x){
    double s = 0.0;
    for(int i=0;i< 1000000;i++){
        s+=x[i]*x[i];
    }
    return s;
}
clock_t tStart, tFinish;
tStart = clock();
double* ddd = genArray(1000000);
ccpp(ddd);
tFinish = clock();
printf("%.8f",(float)(tFinish-tStart)/CLOCKS_PER_SEC);
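0.01426600 s, i.e. 14.3 ms. Note that the timed interval also covers the call to genArray, so this figure includes generating the array as well as the summation.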
3.3 Python
import numpy as np
d = 1000000
np.random.seed(1024) # make reproducible
xb = np.random.random(d)
%timeit np.linalg.norm(xb,2)
def cpython():
    s = 0.0
    for x in xb:
        s += x*x
    return np.sqrt(s)
%timeit cpython()
The timings are 171 µs (numpy) and 328 ms (pure-Python loop), respectively.
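Both Python measurements compute the same quantity, the Euclidean norm, which is the square root of the sum of squares (the Java and C++ versions above return the sum without the square root). A sketch confirming that the vectorized and loop forms agree, using a smaller array to keep the loop quick. `norm_loop` is a hypothetical name for the loop version:

```python
import numpy as np

np.random.seed(1024)  # make reproducible
xb = np.random.random(10_000)


def norm_loop(values):
    """Euclidean norm computed element by element in the interpreter."""
    s = 0.0
    for v in values:
        s += v * v
    return np.sqrt(s)


vectorized = np.linalg.norm(xb, 2)
dot_based = np.sqrt(np.dot(xb, xb))  # same quantity via a dot product
print(vectorized, dot_based, norm_loop(xb))
```

The numpy calls run the whole loop in compiled code, which is where the gap of roughly three orders of magnitude comes from.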
3.4 Julia
using PyCall
np = pyimport("numpy") # replaces the deprecated @pyimport macro
d = 1000000
np.random.seed(1024)
xb = np.random.random(d)
function cjulia(xb)
    s = 0.0
    for d in xb
        s += d*d
    end
    np.sqrt(s)
end
cjulia(xb) # first call triggers compilation
@time cjulia(xb)
1.04ms
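4. Summary
Julia delivers very good performance with almost no tuning. Surprisingly, Java was faster than C++ in the loop and floating-point tests. Python can be sped up by orders of magnitude with numpy or numba.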
Test | Java | C++ | Python | Python (jit) | Cython | Julia |
---|---|---|---|---|---|---|
fib (s) | 1.63 | 0.90 | 43.72 | 0.95 | 11.40 | 0.68 |
for loop (ms) | 0.44 | 2.26 | 111.00 | 0.00 | 84 | 0.00 |
float (ms) | 3.61 | 14.3 | 328 (loop) / 0.17 (numpy) | 1.11 | / | 1.04 |