OpenCV vs. Armadillo vs. Eigen on Linux revisited

Original post: http://nghiaho.com/?p=954

This is a quick revisit to my recent post comparing 3 different libraries with matrix support. As suggested by one of the comments to the last post, I’ve turned off any debugging options that each library may have. In practice you would have them on most of the time for safety reasons, but for this test I thought it would be interesting to see the effect of turning them off.

Armadillo and Eigen use the defines ARMA_NO_DEBUG and NDEBUG respectively to turn off error checking. I could not find an immediate way to do the same thing in OpenCV without editing the source code, which I chose not to do, so keep that in mind. I also changed the number of iterations for each of the 5 operations so the averages are slightly more accurate. Fast operations like add, multiply, transpose and invert run for more iterations to get a better average than SVD, which is quite slow.
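The averaging described above can be sketched roughly as follows. This is not the code from test_matrix_lib.cpp; the helper name `average_ms` is hypothetical and only the C++ standard library is used, so the operation being timed is left as a generic callable.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (an assumption, not from the original benchmark):
// call `op` `iterations` times and return the mean wall-clock time per
// call in milliseconds.
template <typename Op>
double average_ms(Op op, std::size_t iterations) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        op();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

A fast operation such as add would be passed a large iteration count and SVD a small one. At -O3 the compiler may remove work whose result is never used, so the callable should write its result somewhere observable.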

On with the results …

Add

Performing C = A + B

Raw data

Results in ms   OpenCV      Armadillo   Eigen
4×4             0.00093     0.00008     0.00007
8×8             0.00039     0.00006     0.00015
16×16           0.00066     0.00030     0.00059
32×32           0.00139     0.00148     0.00194
64×64           0.00654     0.00619     0.00712
128×128         0.02454     0.02738     0.03225
256×256         0.09144     0.11315     0.10920
512×512         0.47997     0.57668     0.47382

Normalised

Speed up over slowest   OpenCV      Armadillo   Eigen
4×4                     1.00x       12.12x      14.35x
8×8                     1.00x       6.53x       2.63x
16×16                   1.00x       2.19x       1.13x
32×32                   1.39x       1.31x       1.00x
64×64                   1.09x       1.15x       1.00x
128×128                 1.31x       1.18x       1.00x
256×256                 1.24x       1.00x       1.04x
512×512                 1.20x       1.00x       1.22x
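The normalised figures appear to be the slowest time in each row divided by each library's time, so the slowest library always scores 1.00x. Here is a small sketch of that calculation, assuming all timings are positive; the function name `speedup_over_slowest` is made up for illustration.

```cpp
#include <algorithm>
#include <vector>

// Given one row of timings in ms, return slowest / t for every entry.
// The slowest entry therefore maps to exactly 1.0.
// Assumes every timing is strictly positive.
std::vector<double> speedup_over_slowest(const std::vector<double>& times_ms) {
    std::vector<double> result;
    if (times_ms.empty()) {
        return result;
    }
    const double slowest = *std::max_element(times_ms.begin(), times_ms.end());
    result.reserve(times_ms.size());
    for (double t : times_ms) {
        result.push_back(slowest / t);
    }
    return result;
}
```

For the 512×512 row (0.47997, 0.57668, 0.47382) this gives roughly 1.20, 1.00 and 1.22, which agrees with the table. The raw timings are rounded to five decimals, so the rows for the smallest matrices will not reproduce exactly.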

Multiply

Performing C = A * B

Raw data

Results in ms   OpenCV      Armadillo   Eigen
4×4             0.00115     0.00017     0.00086
8×8             0.00195     0.00078     0.00261
16×16           0.00321     0.00261     0.00678
32×32           0.01865     0.01947     0.02130
64×64           0.15366     0.33080     0.07835
128×128         1.87008     1.72719     0.35859
256×256         15.76724    3.70212     2.70168
512×512         119.09382   24.08409    22.73524

Normalised

Speed up over slowest   OpenCV      Armadillo   Eigen
4×4                     1.00x       6.74x       1.34x
8×8                     1.34x       3.34x       1.00x
16×16                   2.11x       2.60x       1.00x
32×32                   1.14x       1.09x       1.00x
64×64                   2.15x       1.00x       4.22x
128×128                 1.00x       1.08x       5.22x
256×256                 1.00x       4.26x       5.84x
512×512                 1.00x       4.94x       5.24x

Transpose

Performing C = A^T

Raw data

Results in ms   OpenCV      Armadillo   Eigen
4×4             0.00067     0.00004     0.00003
8×8             0.00029     0.00006     0.00008
16×16           0.00034     0.00028     0.00028
32×32           0.00071     0.00068     0.00110
64×64           0.00437     0.00592     0.00500
128×128         0.01552     0.06537     0.03486
256×256         0.08828     0.40813     0.20032
512×512         0.52455     1.51452     0.77584

Normalised

Speed up over slowest   OpenCV      Armadillo   Eigen
4×4                     1.00x       17.61x      26.76x
8×8                     1.00x       4.85x       3.49x
16×16                   1.00x       1.20x       1.21x
32×32                   1.56x       1.61x       1.00x
64×64                   1.35x       1.00x       1.18x
128×128                 4.21x       1.00x       1.88x
256×256                 4.62x       1.00x       2.04x
512×512                 2.89x       1.00x       1.95x

Inversion

Performing C = A^-1

Raw data

Results in ms   OpenCV      Armadillo   Eigen
4×4             0.00205     0.00046     0.00271
8×8             0.00220     0.00417     0.00274
16×16           0.00989     0.01255     0.01094
32×32           0.06101     0.05146     0.05023
64×64           0.41286     0.25769     0.27921
128×128         3.60347     3.76052     1.88089
256×256         33.72502    23.10218    11.62692
512×512         285.03784   126.70175   162.74253

Normalised

Speed up over slowest   OpenCV      Armadillo   Eigen
4×4                     1.32x       5.85x       1.00x
8×8                     1.90x       1.00x       1.52x
16×16                   1.27x       1.00x       1.15x
32×32                   1.00x       1.19x       1.21x
64×64                   1.00x       1.60x       1.48x
128×128                 1.04x       1.00x       2.00x
256×256                 1.00x       1.46x       2.90x
512×512                 1.00x       2.25x       1.75x

SVD

Performing full SVD, [U,S,V] = SVD(A)

Raw data

Results in ms   OpenCV      Armadillo   Eigen
4×4             0.01220     0.22080     0.01620
8×8             0.01760     0.05760     0.03340
16×16           0.10700     0.16560     0.25540
32×32           0.51480     0.70230     1.13900
64×64           3.63780     3.43520     6.63350
128×128         27.04300    23.01600    64.27500
256×256         240.11000   210.70600   675.84100
512×512         1727.44000  1586.66400  6934.32300

Normalised

Speed up over slowest   OpenCV      Armadillo   Eigen
4×4                     18.10x      1.00x       13.63x
8×8                     3.27x       1.00x       1.72x
16×16                   2.39x       1.54x       1.00x
32×32                   2.21x       1.62x       1.00x
64×64                   1.82x       1.93x       1.00x
128×128                 2.38x       2.79x       1.00x
256×256                 2.81x       3.21x       1.00x
512×512                 4.01x       4.37x       1.00x

Discussion

Overall, the average running time has decreased for all operations, which is a good sign. Even OpenCV runs faster; maybe NDEBUG has an effect there too, since it’s a standardised define.
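To show why a standardised define like NDEBUG can change running time, here is a minimal sketch of a check that is compiled out when NDEBUG is defined. This is the same convention the standard assert() macro follows. The accessor `element_at` and the helper `checks_enabled` are hypothetical and not taken from any of the three libraries, and I have not verified which OpenCV checks, if any, respond to NDEBUG.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical accessor (an assumption for illustration only).
// The bounds check exists only when NDEBUG is NOT defined, so building
// with -DNDEBUG removes the branch from every call.
inline double element_at(const std::vector<double>& data, std::size_t i) {
#ifndef NDEBUG
    if (i >= data.size()) {
        throw std::out_of_range("element_at: index out of range");
    }
#endif
    return data[i];
}

// Reports whether this translation unit was compiled with the check.
inline bool checks_enabled() {
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}
```

In a tight loop over a small matrix a per-element branch like this can take a noticeable share of the time, which would fit the larger gains seen at the smallest sizes.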

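The results from the addition test show all 3 libraries giving more or less the same result from 32×32 upwards, where they stay within about 1.4x of each other; at the smallest sizes Armadillo and Eigen are noticeably faster than OpenCV. The similarity is probably not a surprise, since adding matrices is a very straightforward O(N) task, with N the number of elements.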

The multiply test is a bit more interesting. For matrices 64×64 or larger, there is a noticeable gap between the libraries. Eigen is very fast, with Armadillo coming in second for matrices 256×256 or greater. I’m guessing that for larger matrices Eigen and Armadillo leverage the extra CPU cores, because I did see all the CPU cores utilised briefly during benchmarking.
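As an illustration of how a multiply can spread across cores, here is a plain row-major multiply with an OpenMP pragma on the outer loop. It is a sketch only and not how Eigen or Armadillo are implemented internally; the function name `multiply` and the flat std::vector storage are assumptions. Without -fopenmp the pragma is skipped and the loop runs on one core.

```cpp
#include <cstddef>
#include <vector>

// C = A * B for row-major n x n matrices stored in flat vectors.
// Each outer iteration writes a different row of C, so the rows can be
// computed by different threads without any locking.
std::vector<double> multiply(const std::vector<double>& a,
                             const std::vector<double>& b,
                             std::size_t n) {
    std::vector<double> c(n * n, 0.0);
    const long long rows = static_cast<long long>(n);
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (long long i = 0; i < rows; ++i) {
        const std::size_t row = static_cast<std::size_t>(i);
        for (std::size_t k = 0; k < n; ++k) {
            const double aik = a[row * n + k];
            // The innermost loop walks B and C along rows, which keeps
            // memory access contiguous.
            for (std::size_t j = 0; j < n; ++j) {
                c[row * n + j] += aik * b[k * n + j];
            }
        }
    }
    return c;
}
```

For very small matrices the cost of starting threads would likely outweigh any gain, which may be one reason the gap between the libraries only opens up at larger sizes.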

The transpose test involves shuffling memory around, so it is affected by the CPU’s caching mechanism. OpenCV does a good job as the matrix size increases.
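One common way to make a transpose friendlier to the cache is to work in small square tiles, so the rows being read and the columns being written both stay in cache for a while. The sketch below shows the idea; I am not claiming it is what OpenCV does, and the name `transpose_blocked` and the default tile size of 32 are arbitrary assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Transpose a row-major rows x cols matrix, visiting it in block x block
// tiles. The result is a row-major cols x rows matrix.
std::vector<double> transpose_blocked(const std::vector<double>& src,
                                      std::size_t rows,
                                      std::size_t cols,
                                      std::size_t block = 32) {
    if (block == 0) {
        block = 1;  // guard against a zero step in the tile loops
    }
    std::vector<double> dst(rows * cols);
    for (std::size_t i0 = 0; i0 < rows; i0 += block) {
        const std::size_t i1 = std::min(i0 + block, rows);
        for (std::size_t j0 = 0; j0 < cols; j0 += block) {
            const std::size_t j1 = std::min(j0 + block, cols);
            // Copy one tile: element (i, j) of src becomes (j, i) of dst.
            for (std::size_t i = i0; i < i1; ++i) {
                for (std::size_t j = j0; j < j1; ++j) {
                    dst[j * rows + i] = src[i * cols + j];
                }
            }
        }
    }
    return dst;
}
```

Whether tiling helps depends on the matrix size relative to the cache, so the tile size would need to be tuned by measurement.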

The inversion test is a bit of a mixed bag. OpenCV seems to be the slowest of the three at most sizes from 32×32 upwards.

The SVD test is interesting. From 16×16 upwards OpenCV and Armadillo are clearly faster, and Eigen lags behind by quite a bit as the matrix size increases.

Conclusion

In practice, if you just want a matrix library and nothing more, then Armadillo or Eigen is probably the way to go. If you want something that is very portable with minimal effort then choose Eigen, because the entire library is header-based, so no library linking is required. If you want the fastest matrix code possible then you can be adventurous and try combining the best of each library.

Download

test_matrix_lib.cpp

Code compiled with:

g++ test_matrix_lib.cpp -o test_matrix_lib -lopencv_core -larmadillo -lgomp -fopenmp \
-march=native -O3 -DARMA_NO_DEBUG -DNDEBUG