Back to Basics: You aren't smarter than the compiler. (plus fun with Microbenchmarks)

Microbenchmarks are evil. Ya, I said it. Folks spend hours in tight loops measuring things trying to find out the "best way" to do something and forget that while they are changing 5ms between two techniques they've missed the 300ms Database Call or the looming N+1 selects issue that has their ORM quietly making even more database calls.

My friend Sam Saffron says we should "take global approach to optimizations." Sam cautions us to avoid trying to be too clever.

// You think you're slick. You're not.
// faster than .Count()? Stop being clever.
var count = (stuff as ICollection<int>).Count;

All that said, let's ~~argue~~ microbenchmark, shall we? ;)

I did a blog post a few months back called "Back to Basics: Moving beyond for, if and switch" and as with all blog posts where one makes a few declarative statements (or shows ANY code at all, for that matter) it inspired some spirited comments. The best of them was from Chris Rigter in defense of LINQ.

I started the post by showing this little bit of counting code:

var biggerThan10 = new List<int>();
for (int i = 0; i < array.Length; i++){
    if (array[i] > 10)
        biggerThan10.Add(array[i]);
}

and then changed it into LINQ, which can be either of these one-liners:

var a = from x in array where x > 10 select x; 
var b = array.Where(x => x > 10);

and a few questions came up like this one from Teusje:

"does rewriting code to one line make your code faster or slower or is it not worth talking about these nanoseconds?"

The short answer is, measure it. The longer answer is measure it yourself. You have the power to profile your code. If you don't know what's happening, profile it. There's some interesting discussion on benchmarking small code samples over on this StackOverflow question.
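
If you want a starting point for measuring it yourself, here is a minimal sketch using Stopwatch. The Measure helper, its warm-up pass, and the choice to keep the fastest of five runs are assumptions made for illustration; none of them come from the original post, and a profiler or a dedicated benchmarking tool will tell you more than this does.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper (not from the post): calls `action` once untimed so that
// JIT compilation is not counted, then returns the fastest of `runs` timed calls.
TimeSpan Measure(Action action, int runs = 5)
{
    action(); // warm-up call, not timed
    var best = TimeSpan.MaxValue;
    for (int i = 0; i < runs; i++)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        if (sw.Elapsed < best) best = sw.Elapsed;
    }
    return best;
}

int[] data = Enumerable.Range(0, 1_000_000).ToArray();
int loopCount = 0, linqCount = 0;

// Candidate 1: plain for/if over the array.
TimeSpan loopTime = Measure(() =>
{
    int c = 0;
    for (int i = 0; i < data.Length; i++)
        if (data[i] > 10) c++;
    loopCount = c;
});

// Candidate 2: the LINQ one-liner.
TimeSpan linqTime = Measure(() => linqCount = data.Count(x => x > 10));

Console.WriteLine($"for loop: {loopTime.TotalMilliseconds:F2} ms (count {loopCount})");
Console.WriteLine($"Count():  {linqTime.TotalMilliseconds:F2} ms (count {linqCount})");
```

Both candidates store their result so you can check that they agree before comparing times. The absolute numbers will vary by machine, runtime, and build configuration.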

Now, with all kinds of code like this folks go and do microbenchmarks. This usually means doing something trivial a million times in a tight loop. That's lots of fun and I'm going to do JUST that very thing right now with Chris's good work, but it's important to remember that your code is usually NOT doing something trivial a million times in a tight loop. Unless it is.

Knaģis says:

"Unfortunately LINQ has now created a whole generation of coders who completely ignores any perception of writing performant code. for/if are compiled into nice machine code, whereas .Where() creates instances of enumerator class and then iterates through that instance using MoveNext method...Please, please do not advocate for using LINQ to produce shorter, nicer to read etc. code unless it is accompanied by warning that it affects performance"

I think that LINQ above could probably be replaced with "datagrids" or "pants" or "google" or any number of conveniences but I get the point. Some code is shown in the comments where LINQ appears to be 10x slower. I can't reproduce his result.
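
To make the mechanics in that comment concrete, here is a small sketch of what consuming a Where() query looks like when the enumerator calls are written out by hand, next to the plain for/if version. The sample array is made up for illustration, and this shows only the shape of the work, not how much it costs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] array = { 4, 15, 8, 23, 42, 10 };

// Where() does no filtering yet. It returns a lazy sequence object.
IEnumerable<int> query = array.Where(x => x > 10);

// Roughly what `foreach (var x in query)` expands to:
// get an enumerator, then call MoveNext() and read Current until it returns false.
var viaEnumerator = new List<int>();
using (IEnumerator<int> e = query.GetEnumerator())
{
    while (e.MoveNext())
        viaEnumerator.Add(e.Current);
}

// The hand-written for/if equivalent.
var viaLoop = new List<int>();
for (int i = 0; i < array.Length; i++)
    if (array[i] > 10) viaLoop.Add(array[i]);

Console.WriteLine(string.Join(", ", viaEnumerator));      // prints "15, 23, 42"
Console.WriteLine(viaEnumerator.SequenceEqual(viaLoop));  // prints "True"
```

Both paths produce the same result. Whether the extra object and method calls matter for your workload is exactly the thing to measure.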

Let's take Chris's comment and deconstruct it. First, taking an enumerable Range as an array and spinning through it.

var enumerable = Enumerable.Range(0, 9999999);
var sw = new Stopwatch();
int c = 0;

// approach 1

sw.Start();
var array = enumerable.ToArray();
for (int i = 0; i < array.Length; i++)
{
    if (array[i] > 10)
        c++;
}
sw.Stop();
Console.WriteLine("Enumerable.ToArray()");
Console.WriteLine(c.Dump());
Console.WriteLine(sw.ElapsedMilliseconds.Dump());

The "ToArray()" part takes 123ms and the for loop takes 9ms on my system. Arrays are super fast.
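
The listing above starts the stopwatch before ToArray(), so the single number it prints covers both steps. One way to get the two figures separately, sketched here with hypothetical variable names (toArrayMs, loopMs), is to stop and restart the stopwatch between them:

```csharp
using System;
using System.Diagnostics;
using System.Linq;

var enumerable = Enumerable.Range(0, 9999999);

// Step 1: time only the materialization into an array.
var sw = Stopwatch.StartNew();
int[] array = enumerable.ToArray();
sw.Stop();
long toArrayMs = sw.ElapsedMilliseconds;

// Step 2: time only the for loop over the finished array.
int c = 0;
sw.Restart();
for (int i = 0; i < array.Length; i++)
{
    if (array[i] > 10)
        c++;
}
sw.Stop();
long loopMs = sw.ElapsedMilliseconds;

Console.WriteLine($"ToArray(): {toArrayMs} ms");
Console.WriteLine($"for loop:  {loopMs} ms (count {c})");
```

Your timings will differ from the ones quoted here, but the split should show where the time actually goes.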

Starting from the enumerable itself (not the array!) we can try the Count() one-liner:

// approach 2
Console.WriteLine("Enumerable.Count()");
sw.Restart();
c = enumerable.Count(x => x > 10);
sw.Stop();
Console.WriteLine(c.Dump());
Console.WriteLine(sw.ElapsedMilliseconds.Dump());

It takes 86ms.

I can try it easily in Parallel over 12 processors but it's not a large enough sample nor is it doing enough work to justify the overhead.

// approach 3
Console.WriteLine("Enumerable.AsParallel() (12 procs)");
sw.Restart();
c = enumerable.AsParallel().Where(x => x > 10).Count();
sw.Stop();
Console.WriteLine(c.Dump());
Console.WriteLine(sw.ElapsedMilliseconds.Dump());

It adds overhead and takes 129ms. However, you see how easy it was to try a naïve parallel loop in this case. Now you know how to try it (and measure it!) in your own tests.

Next, let's do something stupid and tell LINQ that everything is an object so we are forced to do a bunch of extra work. You'd be surprised (or maybe you wouldn't) how often you find code like this in production. This is an example of coercing types back and forth and as you can see, you'll pay the price if you're not paying attention. It always seems like a good idea at the time, doesn't it?

//Approach 4 - Type Checking?
Console.WriteLine("Enumerable.OfType(object) ");
var objectEnum = enumerable.OfType<object>().Concat(new[] { "Hello" });
// Reset the counter and the stopwatch so earlier approaches don't leak into this one
c = 0;
sw.Restart();
var objectArray = objectEnum.ToArray();
for (int i = 0; i < objectArray.Length; i++)
{
    int outVal;
    // Deliberately wasteful: round-trip every element through a string, then convert again
    var isInt = int.TryParse(objectArray[i].ToString(), out outVal);
    if (isInt && Convert.ToInt32(objectArray[i]) > 10)
        c++;
}
sw.Stop();
Console.WriteLine(c.Dump());
Console.WriteLine(sw.ElapsedMilliseconds.Dump());

That whole thing cost over 4 seconds. 4146ms in fact. Avoid conversions. Tell the compiler as much as you can up front so it can be more efficient, right?
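
For contrast, here is a sketch of a cheaper check when you really are stuck with an object[]: a type pattern tests and unboxes in one step, with no string round-trip. This is not code the post measured, so no timing is claimed for it. It also behaves slightly differently from the version above, since it counts only boxed ints and would skip a string such as "42". The sample size is a made-up 1,000 items.

```csharp
using System;
using System.Linq;

// Boxed ints plus one string, mirroring the shape of the data above on a smaller scale.
object[] objectArray = Enumerable.Range(0, 1000)
    .Cast<object>()
    .Concat(new object[] { "Hello" })
    .ToArray();

int c = 0;
foreach (object o in objectArray)
{
    // `is int n` checks the runtime type and unboxes into n in one step.
    if (o is int n && n > 10)
        c++;
}

Console.WriteLine(c); // prints "989"
```

The better fix is still the one the post is pointing at: keep the sequence typed as int so there is nothing to check in the first place.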

What if we enumerate over the types with a little hint of extra information?

// approach 5
Console.WriteLine("Enumerable.OfType(int) ");
sw.Restart();
c = enumerable.OfType<int>().Count(x => x > 10);
sw.Stop();
Console.WriteLine(c.Dump());
Console.WriteLine(sw.ElapsedMilliseconds.Dump());

Nope, the type check wasn't necessary in this case. It took 230ms and added overhead. What if this was parallel?

// approach 6
Console.WriteLine("Enumerable.AsParallel().OfType(int) ");
sw.Restart();
c = enumerable.AsParallel().OfType<int>().Where(x => x > 10).Count();
sw.Stop();
Console.WriteLine(c.Dump());
Console.WriteLine(sw.ElapsedMilliseconds.Dump());

That's 208ms, consistently. Slightly faster, but ultimately I shouldn't be doing unnecessary work.

In this simple example of looping over something simple, my best bet turned out to be either the Array (super fast if it was an Array to start) or a simple Count() with LINQ. I measured, so I would know what was happening, but in this case the simplest thing also performed the best.

What's the moral of this story? Measure and profile and make a good judgment. Microbenchmarks are fun and ALWAYS good for an argument but ultimately they exist only so you can know your options, try a few, and pick the one that does the least work. More often than not (not always, but usually) the compiler creators aren't idiots and more often than not the simplest syntax will be the best one for you.

Network access, database access, unnecessary serializations, unneeded marshaling, boxing and unboxing, type coercion - these things all take up time. Avoid doing them, and when you do do them, don't just know why you're doing them, but also that you are doing them.

Is it fair to say "LINQ is evil and makes things slow?" No, it's fair to say that code in general can be unintuitive if you don't know what's going on. There can be subtle side-effects whose time can get multiplied inside of a loop. This includes type checking, type conversion, boxing, threads and more.

The Rule of Scale: The less you do, the more you can do of it.
