Translated from: Transitivity of Auto-Specialization in GHC
From the docs for GHC 7.6:
"[Y]ou often don't even need the SPECIALIZE pragma in the first place. When compiling a module M, GHC's optimiser (with -O) automatically considers each top-level overloaded function declared in M, and specialises it for the different types at which it is called in M. The optimiser also considers each imported INLINABLE overloaded function, and specialises it for the different types at which it is called in M."
and
"Moreover, given a SPECIALIZE pragma for a function f, GHC will automatically create specialisations for any type-class-overloaded functions called by f, if they are in the same module as the SPECIALIZE pragma, or if they are INLINABLE; and so on, transitively."
So GHC should automatically specialize some/most/all(?) functions marked INLINABLE without a pragma, and if I use an explicit pragma, the specialization is transitive. My question is: is the auto-specialization transitive?
Specifically, here's a small example:
Main.hs:
import Data.Vector.Unboxed as U
import Foo

main =
    let y = Bar $ Qux $ U.replicate 11221184 0 :: Foo (Qux Int)
        (Bar (Qux ans)) = iterate (plus y) y !! 100
    in putStr $ show $ foldl1' (*) ans
Foo.hs:
module Foo (Qux(..), Foo(..), plus) where
import Data.Vector.Unboxed as U
newtype Qux r = Qux (Vector r)
-- GHC inlines `plus` if I remove the bangs or the Baz constructor
data Foo t = Bar !t
           | Baz !t

instance (Num r, Unbox r) => Num (Qux r) where
    {-# INLINABLE (+) #-}
    (Qux x) + (Qux y) = Qux $ U.zipWith (+) x y

{-# INLINABLE plus #-}
plus :: (Num t) => (Foo t) -> (Foo t) -> (Foo t)
plus (Bar v1) (Bar v2) = Bar $ v1 + v2
GHC specializes the call to plus, but does not specialize (+) in the Qux Num instance, which kills performance.
However, an explicit pragma
{-# SPECIALIZE plus :: Foo (Qux Int) -> Foo (Qux Int) -> Foo (Qux Int) #-}
results in transitive specialization as the docs indicate, so (+) is specialized and the code is 30x faster (both compiled with -O2). Is this expected behavior? Should I only expect (+) to be specialized transitively with an explicit pragma?
UPDATE
The docs for 7.8.2 haven't changed, and the behavior is the same, so this question is still relevant.
Answer:
Short answers:
The question's key points, as I understand them, are the following:
- Is the auto-specialization transitive?
- Should I only expect (+) to be specialized transitively with an explicit pragma?
- (apparently intended) Is this a bug in GHC? Is it inconsistent with the documentation?
AFAIK, the answers are: no; mostly yes, but there are other means; and no.
Inlining and specialization to a type application are a trade-off between speed (execution time) and code size. The default level gets some speedup without bloating the code. Choosing a more exhaustive level is left to the programmer's discretion via the SPECIALISE pragma.
Explanation:
"The optimiser also considers each imported INLINABLE overloaded function, and specialises it for the different types at which it is called in M."
Suppose f is a function whose type includes a type variable a constrained by a type class C a. By default, GHC specializes f with respect to a type application (substituting a concrete type t for a) if f is called with that type application in the source code of (a) any function in the same module, or (b) if f is marked INLINABLE, any other module A that imports f from its defining module B. Thus, auto-specialization is not transitive: it only touches INLINABLE functions that are imported and called in the source code of A.
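As a rough illustration of case (a), here is a minimal single-module sketch. The name sumSquares is hypothetical and does not come from the question. Whether GHC really emits a specialised copy depends on the optimisation level and compiler version, so this only shows where the call-site types come from and is not a guarantee of what the optimiser does.

```haskell
-- Hypothetical overloaded function: its type mentions `a` constrained by `Num a`.
-- Compiled with -O, GHC may create one specialised copy per type at which
-- it is called in this module (here: Int and Double).
sumSquares :: Num a => [a] -> a
sumSquares = foldr (\x acc -> x * x + acc) 0
-- INLINABLE keeps the unfolding in the interface file, which is what would
-- let an importing module specialise sumSquares at its own call-site types.
{-# INLINABLE sumSquares #-}

main :: IO ()
main = do
  print (sumSquares [1, 2, 3 :: Int])       -- call at Int, prints 14
  print (sumSquares [1.5, 2.5 :: Double])   -- call at Double, prints 8.5
```

Only the types that appear at direct call sites (Int and Double here) are candidates; functions reached indirectly through a dictionary are a different matter, as the rest of the answer explains.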
In your example, if you rewrite the instance of Num as follows:
instance (Num r, Unbox r) => Num (Qux r) where
    (+) = quxAdd
quxAdd (Qux x) (Qux y) = Qux $ U.zipWith (+) x y
- quxAdd is not specifically imported by Main. Main imports the instance dictionary of Num (Qux Int), and this dictionary contains quxAdd in the record for (+). However, although the dictionary is imported, the contents used in the dictionary are not.
- plus does not call quxAdd; it uses the function stored for the (+) record in the instance dictionary of Num t. This dictionary is set at the call site (in Main) by the compiler.
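The two points above can be made concrete by writing the dictionary passing out by hand. This is only a sketch of the idea, not GHC's actual Core output. NumDict, quxDict and plusD are hypothetical names invented for illustration, and Qux is simplified to a monomorphic wrapper around a list of Int so that the example needs no extra packages.

```haskell
-- Hand-written stand-in for a Num instance dictionary (hypothetical type).
-- Only the field corresponding to (+) is modelled.
data NumDict t = NumDict { add :: t -> t -> t }

-- Simplified stand-in for the question's Qux (a list instead of a Vector).
newtype Qux = Qux [Int] deriving (Eq, Show)

data Foo t = Bar !t | Baz !t deriving (Eq, Show)

-- Plays the role of quxAdd: the function stored in the (+) record.
quxAdd :: Qux -> Qux -> Qux
quxAdd (Qux x) (Qux y) = Qux (zipWith (+) x y)

-- Plays the role of the instance dictionary that Main imports.
quxDict :: NumDict Qux
quxDict = NumDict { add = quxAdd }

-- Desugared view of `plus :: Num t => Foo t -> Foo t -> Foo t`:
-- the constraint becomes an ordinary argument. Note that plusD never
-- mentions quxAdd; it only selects the `add` field of whatever it is given.
plusD :: NumDict t -> Foo t -> Foo t -> Foo t
plusD d (Bar v1) (Bar v2) = Bar (add d v1 v2)
plusD _ _ _ = error "plusD: only the Bar case is handled, as in the question"

main :: IO ()
main = do
  let y = Bar (Qux [1, 2, 3])
  -- The call site, not plusD, chooses which dictionary is passed.
  print (plusD quxDict y y)   -- prints Bar (Qux [2,4,6])
```

Seen this way, specialising plusD for quxDict does not by itself touch quxAdd: the function behind the add field is only reached through the dictionary, which matches the behavior described above.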