求次幂是个老生常谈的话题了,在任何一门语言里都是最基础的部分。在这里主要以JS为例。
首先,最容易想到的是暴力法,也就是连续乘上n个x;但是这个的性能显然很差,我们一般不会考虑这么写。而且,为什么不用内置函数呢?调内置函数没什么难度,随便举个例子吧:
var myPow = function(x, n) {
return Math.pow(x, n);
};
确实,这个没什么难度,并且内置函数一般得到了充分的优化,性能也是很有保障的。但是,内置函数一定就快吗?还能进一步优化吗?还有没有别的方法?这些问题是值得思考的。
这是利用数学性质来进行计算的,基于一个很简单的思路, x n = 2 n l o g 2 x x^n=2^{nlog_2x} xn=2nlog2x:
var myPow = function (x, n) {
if (n == 0) return 1;
let tmp = Math.exp(n * Math.log(Math.abs(x)));
if (x < 0 && n % 2 == 1) tmp *= -1;
return tmp;
};
而这个用的是快速幂乘法,需要注意的是,要用Math.trunc
而不是Math.floor
,因为Math.floor
是向下取整,在面对负数的时候也会向下取整,导致无限递归;而Math.trunc
则是截断,相当于向0取整:
function pow(x, n) {
if (n == 0) {
return 1;
}
const tmp = pow(x, Math.trunc(n / 2));
return n % 2 == 0 ? tmp * tmp : tmp * tmp * x;
}
var myPow = function (x, n) {
return n > 0 ? pow(x, n) : 1 / pow(x, n);
};
我们已经看到了其他的写法,按理来说,到这里已经可以结束了,但是我发现,在执行次数很少的情况下,手写的代码反而可能比原生的快,这就很奇怪了。所以我打算看看V8的源码,能不能看出点东西来:
CodeStubAssembler::Node* MathBuiltinsAssembler::MathPow(Node* context,
Node* base,
Node* exponent) {
Node* base_value = TruncateTaggedToFloat64(context, base);
Node* exponent_value = TruncateTaggedToFloat64(context, exponent);
Node* value = Float64Pow(base_value, exponent_value);
return ChangeFloat64ToTagged(value);
}
Node* Float64Pow(Node* a, Node* b) {
return AddNode(machine()->Float64Pow(), a, b);
}
Node* Float64Exp(Node* a) { return AddNode(machine()->Float64Exp(), a); }
Node* Float64Log(Node* a) { return AddNode(machine()->Float64Log(), a); }
得,话都说到这份上了,用的是硬件自带的运算,咱也没辙。所以只能进一步深入,看看能不能从汇编层面看出点东西来,用d8编译一下,看看输出的结果。
内容很长,放到最后面了。简单说一下结论:在执行次数很少的情况下(我这里测试大概是10000次以下),原生代码可能不会成为所谓的“热点”,也就是不会被V8编译成汇编指令;而手写的代码会被编译成汇编指令。所以,在这个情况下,是有可能存在手写比原生快的情况的;而随着执行次数的增加,原生代码也会变成汇编指令,并且条数更少,自然也就更快了。
附录
原生
Instructions (size = 460)
00000061000C2E60 0 488d1df9ffffff REX.W leaq rbx,[rip+0xfffffff9]
00000061000C2E67 7 483bd9 REX.W cmpq rbx,rcx
00000061000C2E6A a 7418 jz 00000061000C2E84 <+0x24>
00000061000C2E6C c 48ba6800000000000000 REX.W movq rdx,0000000000000068
00000061000C2E76 16 49bae0c0d5cdfb7f0000 REX.W movq r10,00007FFBCDD5C0E0 (Abort) ;; off heap target
00000061000C2E80 20 41ffd2 call r10
00000061000C2E83 23 cc int3l
00000061000C2E84 24 8b59d0 movl rbx,[rcx-0x30]
00000061000C2E87 27 4903dd REX.W addq rbx,r13
00000061000C2E8A 2a f6430701 testb [rbx+0x7],0x1
00000061000C2E8E 2e 740d jz 00000061000C2E9D <+0x3d>
00000061000C2E90 30 49baa0f9cacdfb7f0000 REX.W movq r10,00007FFBCDCAF9A0 (CompileLazyDeoptimizedCode) ;; off heap target
00000061000C2E9A 3a 41ffe2 jmp r10
00000061000C2E9D 3d 55 push rbp
00000061000C2E9E 3e 4889e5 REX.W movq rbp,rsp
00000061000C2EA1 41 56 push rsi
00000061000C2EA2 42 57 push rdi
00000061000C2EA3 43 48ba4200000000000000 REX.W movq rdx,0000000000000042
00000061000C2EAD 4d 4c8b15c4ffffff REX.W movq r10,[rip+0xffffffc4]
00000061000C2EB4 54 41ffd2 call r10
00000061000C2EB7 57 cc int3l
00000061000C2EB8 58 4883ec10 REX.W subq rsp,0x10
00000061000C2EBC 5c 488975b0 REX.W movq [rbp-0x50],rsi
00000061000C2EC0 60 488b55d0 REX.W movq rdx,[rbp-0x30]
00000061000C2EC4 64 f6c201 testb rdx,0x1
00000061000C2EC7 67 0f8516010000 jnz 00000061000C2FE3 <+0x183>
00000061000C2ECD 6d 81fa002d3101 cmpl rdx,0x1312d00
00000061000C2ED3 73 0f8c0b000000 jl 00000061000C2EE4 <+0x84>
00000061000C2ED9 79 488b45d8 REX.W movq rax,[rbp-0x28]
00000061000C2EDD 7d 488be5 REX.W movq rsp,rbp
00000061000C2EE0 80 5d pop rbp
00000061000C2EE1 81 c20800 ret 0x8
00000061000C2EE4 84 48b9b93a240861000000 REX.W movq rcx,0000006108243AB9 ;; object: 0x006108243ab9 <Object map = 0000006108280A09>
00000061000C2EEE 8e 8b4903 movl rcx,[rcx+0x3]
00000061000C2EF1 91 4903cd REX.W addq rcx,r13
00000061000C2EF4 94 8b495b movl rcx,[rcx+0x5b]
00000061000C2EF7 97 49ba0000000001000000 REX.W movq r10,0000000100000000
00000061000C2F01 a1 4c3bd1 REX.W cmpq r10,rcx
00000061000C2F04 a4 7715 ja 00000061000C2F1B <+0xbb>
00000061000C2F06 a6 48ba0200000000000000 REX.W movq rdx,0000000000000002
00000061000C2F10 b0 4c8b1561ffffff REX.W movq r10,[rip+0xffffff61]
00000061000C2F17 b7 41ffd2 call r10
00000061000C2F1A ba cc int3l
00000061000C2F1B bb bff13f2408 movl rdi,0000000008243FF1 ;; (compressed) object: 0x006108243ff1 <JSFunction pow (sfi = 00000061081C7729)>
00000061000C2F20 c0 3bf9 cmpl rdi,rcx
00000061000C2F22 c2 0f85c7000000 jnz 00000061000C2FEF <+0x18f>
00000061000C2F28 c8 488bca REX.W movq rcx,rdx
00000061000C2F2B cb d1f9 sarl rcx, 1
00000061000C2F2D cd 83c101 addl rcx,0x1
00000061000C2F30 d0 0f80c5000000 jo 00000061000C2FFB <+0x19b>
00000061000C2F36 d6 493b6560 REX.W cmpq rsp,[r13+0x60] (external value (StackGuard::address_of_jslimit()))
00000061000C2F3A da 0f8713000000 ja 00000061000C2F53 <+0xf3>
00000061000C2F40 e0 e940000000 jmp 00000061000C2F85 <+0x125>
00000061000C2F45 e5 660f1f840000000000 nop
00000061000C2F4E ee 6690 nop
00000061000C2F50 f0 488bca REX.W movq rcx,rdx
00000061000C2F53 f3 81f980969800 cmpl rcx,0x989680
00000061000C2F59 f9 0f8d17000000 jge 00000061000C2F76 <+0x116>
00000061000C2F5F ff 488bd1 REX.W movq rdx,rcx
00000061000C2F62 102 83c201 addl rdx,0x1
00000061000C2F65 105 0f809c000000 jo 00000061000C3007 <+0x1a7>
00000061000C2F6B 10b 493b6560 REX.W cmpq rsp,[r13+0x60] (external value (StackGuard::address_of_jslimit()))
00000061000C2F6F 10f 77df ja 00000061000C2F50 <+0xf0>
00000061000C2F71 111 e942000000 jmp 00000061000C2FB8 <+0x158>
00000061000C2F76 116 48b80008000000000000 REX.W movq rax,0000000000000800
00000061000C2F80 120 e958ffffff jmp 00000061000C2EDD <+0x7d>
00000061000C2F85 125 48894da8 REX.W movq [rbp-0x58],rcx
00000061000C2F89 129 33c0 xorl rax,rax
00000061000C2F8B 12b 48bea10d240861000000 REX.W movq rsi,0000006108240DA1 ;; object: 0x006108240da1 <NativeContext[247]>
00000061000C2F95 135 48bbd0ec24cffb7f0000 REX.W movq rbx,00007FFBCF24ECD0 ;; external reference (Runtime::StackGuard)
00000061000C2F9F 13f 488bd0 REX.W movq rdx,rax
00000061000C2FA2 142 488bfe REX.W movq rdi,rsi
00000061000C2FA5 145 49ba00a8edcdfb7f0000 REX.W movq r10,00007FFBCDEDA800 (CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit) ;; off heap target
00000061000C2FAF 14f 41ffd2 call r10
00000061000C2FB2 152 488b4da8 REX.W movq rcx,[rbp-0x58]
00000061000C2FB6 156 eb9b jmp 00000061000C2F53 <+0xf3>
00000061000C2FB8 158 488955a8 REX.W movq [rbp-0x58],rdx
00000061000C2FBC 15c 488b1dd4ffffff REX.W movq rbx,[rip+0xffffffd4]
00000061000C2FC3 163 33c0 xorl rax,rax
00000061000C2FC5 165 48bea10d240861000000 REX.W movq rsi,0000006108240DA1 ;; object: 0x006108240da1 <NativeContext[247]>
00000061000C2FCF 16f 4c8b15d1ffffff REX.W movq r10,[rip+0xffffffd1]
00000061000C2FD6 176 41ffd2 call r10
00000061000C2FD9 179 488b55a8 REX.W movq rdx,[rbp-0x58]
00000061000C2FDD 17d e96effffff jmp 00000061000C2F50 <+0xf0>
00000061000C2FE2 182 90 nop
00000061000C2FE3 183 49c7c500000000 REX.W movq r13,0x0
00000061000C2FEA 18a e851f00300 call 0000006100102040 ;; eager deoptimization bailout
00000061000C2FEF 18f 49c7c501000000 REX.W movq r13,0x1
00000061000C2FF6 196 e845f00300 call 0000006100102040 ;; eager deoptimization bailout
00000061000C2FFB 19b 49c7c502000000 REX.W movq r13,0x2
00000061000C3002 1a2 e839f00300 call 0000006100102040 ;; eager deoptimization bailout
00000061000C3007 1a7 49c7c503000000 REX.W movq r13,0x3
00000061000C300E 1ae e82df00300 call 0000006100102040 ;; eager deoptimization bailout
00000061000C3013 1b3 49c7c504000000 REX.W movq r13,0x4
00000061000C301A 1ba e821f00700 call 0000006100142040 ;; lazy deoptimization bailout
00000061000C301F 1bf 49c7c505000000 REX.W movq r13,0x5
00000061000C3026 1c6 e815f00700 call 0000006100142040 ;; lazy deoptimization bailout
00000061000C302B 1cb 90 nop
手写
Instructions (size = 580)
0000028E000C2F20 0 488d1df9ffffff REX.W leaq rbx,[rip+0xfffffff9]
0000028E000C2F27 7 483bd9 REX.W cmpq rbx,rcx
0000028E000C2F2A a 7418 jz 0000028E000C2F44 <+0x24>
0000028E000C2F2C c 48ba6800000000000000 REX.W movq rdx,0000000000000068
0000028E000C2F36 16 49bae0c0d5cdfb7f0000 REX.W movq r10,00007FFBCDD5C0E0 (Abort) ;; off heap target
0000028E000C2F40 20 41ffd2 call r10
0000028E000C2F43 23 cc int3l
0000028E000C2F44 24 8b59d0 movl rbx,[rcx-0x30]
0000028E000C2F47 27 4903dd REX.W addq rbx,r13
0000028E000C2F4A 2a f6430701 testb [rbx+0x7],0x1
0000028E000C2F4E 2e 740d jz 0000028E000C2F5D <+0x3d>
0000028E000C2F50 30 49baa0f9cacdfb7f0000 REX.W movq r10,00007FFBCDCAF9A0 (CompileLazyDeoptimizedCode) ;; off heap target
0000028E000C2F5A 3a 41ffe2 jmp r10
0000028E000C2F5D 3d 55 push rbp
0000028E000C2F5E 3e 4889e5 REX.W movq rbp,rsp
0000028E000C2F61 41 56 push rsi
0000028E000C2F62 42 57 push rdi
0000028E000C2F63 43 48ba4200000000000000 REX.W movq rdx,0000000000000042
0000028E000C2F6D 4d 4c8b15c4ffffff REX.W movq r10,[rip+0xffffffc4]
0000028E000C2F74 54 41ffd2 call r10
0000028E000C2F77 57 cc int3l
0000028E000C2F78 58 4883ec10 REX.W subq rsp,0x10
0000028E000C2F7C 5c 488975b0 REX.W movq [rbp-0x50],rsi
0000028E000C2F80 60 488b55d0 REX.W movq rdx,[rbp-0x30]
0000028E000C2F84 64 f6c201 testb rdx,0x1
0000028E000C2F87 67 0f8576010000 jnz 0000028E000C3103 <+0x1e3>
0000028E000C2F8D 6d 81fa002d3101 cmpl rdx,0x1312d00
0000028E000C2F93 73 0f8c0b000000 jl 0000028E000C2FA4 <+0x84>
0000028E000C2F99 79 488b45d8 REX.W movq rax,[rbp-0x28]
0000028E000C2F9D 7d 488be5 REX.W movq rsp,rbp
0000028E000C2FA0 80 5d pop rbp
0000028E000C2FA1 81 c20800 ret 0x8
0000028E000C2FA4 84 48b9b93a24088e020000 REX.W movq rcx,0000028E08243AB9 ;; object: 0x028e08243ab9 <Object map = 0000028E08280A09>
0000028E000C2FAE 8e 8b7903 movl rdi,[rcx+0x3]
0000028E000C2FB1 91 4903fd REX.W addq rdi,r13
0000028E000C2FB4 94 448b472f movl r8,[rdi+0x2f]
0000028E000C2FB8 98 49ba0000000001000000 REX.W movq r10,0000000100000000
0000028E000C2FC2 a2 4d3bd0 REX.W cmpq r10,r8
0000028E000C2FC5 a5 7715 ja 0000028E000C2FDC <+0xbc>
0000028E000C2FC7 a7 48ba0200000000000000 REX.W movq rdx,0000000000000002
0000028E000C2FD1 b1 4c8b1560ffffff REX.W movq r10,[rip+0xffffff60]
0000028E000C2FD8 b8 41ffd2 call r10
0000028E000C2FDB bb cc int3l
0000028E000C2FDC bc 8b7f43 movl rdi,[rdi+0x43]
0000028E000C2FDF bf 4c8b15d4ffffff REX.W movq r10,[rip+0xffffffd4]
0000028E000C2FE6 c6 4c3bd7 REX.W cmpq r10,rdi
0000028E000C2FE9 c9 7712 ja 0000028E000C2FFD <+0xdd>
0000028E000C2FEB cb 488b15d7ffffff REX.W movq rdx,[rip+0xffffffd7]
0000028E000C2FF2 d2 4c8b153fffffff REX.W movq r10,[rip+0xffffff3f]
0000028E000C2FF9 d9 41ffd2 call r10
0000028E000C2FFC dc cc int3l
0000028E000C2FFD dd 8b490b movl rcx,[rcx+0xb]
0000028E000C3000 e0 4c8b15b3ffffff REX.W movq r10,[rip+0xffffffb3]
0000028E000C3007 e7 4c3bd1 REX.W cmpq r10,rcx
0000028E000C300A ea 7712 ja 0000028E000C301E <+0xfe>
0000028E000C300C ec 488b15b6ffffff REX.W movq rdx,[rip+0xffffffb6]
0000028E000C3013 f3 4c8b151effffff REX.W movq r10,[rip+0xffffff1e]
0000028E000C301A fa 41ffd2 call r10
0000028E000C301D fd cc int3l
0000028E000C301E fe 41b909412408 movl r9,0000000008244109 ;; (compressed) object: 0x028e08244109 <JSFunction abs (sfi = 0000028E081C7919)>
0000028E000C3024 104 443bc9 cmpl r9,rcx
0000028E000C3027 107 0f85e2000000 jnz 0000028E000C310F <+0x1ef>
0000028E000C302D 10d b9493f2408 movl rcx,0000000008243F49 ;; (compressed) object: 0x028e08243f49 <JSFunction log (sfi = 0000028E081C7639)>
0000028E000C3032 112 3bcf cmpl rcx,rdi
0000028E000C3034 114 0f85e1000000 jnz 0000028E000C311B <+0x1fb>
0000028E000C303A 11a b9bd3e2408 movl rcx,0000000008243EBD ;; (compressed) object: 0x028e08243ebd <JSFunction exp (sfi = 0000028E081C7571)>
0000028E000C303F 11f 413bc8 cmpl rcx,r8
0000028E000C3042 122 0f85df000000 jnz 0000028E000C3127 <+0x207>
0000028E000C3048 128 488bca REX.W movq rcx,rdx
0000028E000C304B 12b d1f9 sarl rcx, 1
0000028E000C304D 12d 83c101 addl rcx,0x1
0000028E000C3050 130 0f80dd000000 jo 0000028E000C3133 <+0x213>
0000028E000C3056 136 493b6560 REX.W cmpq rsp,[r13+0x60] (external value (StackGuard::address_of_jslimit()))
0000028E000C305A 13a 0f8713000000 ja 0000028E000C3073 <+0x153>
0000028E000C3060 140 e940000000 jmp 0000028E000C30A5 <+0x185>
0000028E000C3065 145 660f1f840000000000 nop
0000028E000C306E 14e 6690 nop
0000028E000C3070 150 488bca REX.W movq rcx,rdx
0000028E000C3073 153 81f980969800 cmpl rcx,0x989680
0000028E000C3079 159 0f8d17000000 jge 0000028E000C3096 <+0x176>
0000028E000C307F 15f 488bd1 REX.W movq rdx,rcx
0000028E000C3082 162 83c201 addl rdx,0x1
0000028E000C3085 165 0f80b4000000 jo 0000028E000C313F <+0x21f>
0000028E000C308B 16b 493b6560 REX.W cmpq rsp,[r13+0x60] (external value (StackGuard::address_of_jslimit()))
0000028E000C308F 16f 77df ja 0000028E000C3070 <+0x150>
0000028E000C3091 171 e942000000 jmp 0000028E000C30D8 <+0x1b8>
0000028E000C3096 176 48b80008000000000000 REX.W movq rax,0000000000000800
0000028E000C30A0 180 e9f8feffff jmp 0000028E000C2F9D <+0x7d>
0000028E000C30A5 185 48894da8 REX.W movq [rbp-0x58],rcx
0000028E000C30A9 189 33c0 xorl rax,rax
0000028E000C30AB 18b 48bea10d24088e020000 REX.W movq rsi,0000028E08240DA1 ;; object: 0x028e08240da1 <NativeContext[247]>
0000028E000C30B5 195 48bbd0ec24cffb7f0000 REX.W movq rbx,00007FFBCF24ECD0 ;; external reference (Runtime::StackGuard)
0000028E000C30BF 19f 488bd0 REX.W movq rdx,rax
0000028E000C30C2 1a2 488bfe REX.W movq rdi,rsi
0000028E000C30C5 1a5 49ba00a8edcdfb7f0000 REX.W movq r10,00007FFBCDEDA800 (CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit) ;; off heap target
0000028E000C30CF 1af 41ffd2 call r10
0000028E000C30D2 1b2 488b4da8 REX.W movq rcx,[rbp-0x58]
0000028E000C30D6 1b6 eb9b jmp 0000028E000C3073 <+0x153>
0000028E000C30D8 1b8 488955a8 REX.W movq [rbp-0x58],rdx
0000028E000C30DC 1bc 488b1dd4ffffff REX.W movq rbx,[rip+0xffffffd4]
0000028E000C30E3 1c3 33c0 xorl rax,rax
0000028E000C30E5 1c5 48bea10d24088e020000 REX.W movq rsi,0000028E08240DA1 ;; object: 0x028e08240da1 <NativeContext[247]>
0000028E000C30EF 1cf 4c8b15d1ffffff REX.W movq r10,[rip+0xffffffd1]
0000028E000C30F6 1d6 41ffd2 call r10
0000028E000C30F9 1d9 488b55a8 REX.W movq rdx,[rbp-0x58]
0000028E000C30FD 1dd e96effffff jmp 0000028E000C3070 <+0x150>
0000028E000C3102 1e2 90 nop
0000028E000C3103 1e3 49c7c500000000 REX.W movq r13,0x0
0000028E000C310A 1ea e831ef0300 call 0000028E00102040 ;; eager deoptimization bailout
0000028E000C310F 1ef 49c7c501000000 REX.W movq r13,0x1
0000028E000C3116 1f6 e825ef0300 call 0000028E00102040 ;; eager deoptimization bailout
0000028E000C311B 1fb 49c7c502000000 REX.W movq r13,0x2
0000028E000C3122 202 e819ef0300 call 0000028E00102040 ;; eager deoptimization bailout
0000028E000C3127 207 49c7c503000000 REX.W movq r13,0x3
0000028E000C312E 20e e80def0300 call 0000028E00102040 ;; eager deoptimization bailout
0000028E000C3133 213 49c7c504000000 REX.W movq r13,0x4
0000028E000C313A 21a e801ef0300 call 0000028E00102040 ;; eager deoptimization bailout
0000028E000C313F 21f 49c7c505000000 REX.W movq r13,0x5
0000028E000C3146 226 e8f5ee0300 call 0000028E00102040 ;; eager deoptimization bailout
0000028E000C314B 22b 49c7c506000000 REX.W movq r13,0x6
0000028E000C3152 232 e8e9ee0700 call 0000028E00142040 ;; lazy deoptimization bailout
0000028E000C3157 237 49c7c507000000 REX.W movq r13,0x7
0000028E000C315E 23e e8ddee0700 call 0000028E00142040 ;; lazy deoptimization bailout
0000028E000C3163 243 90 nop