Summing Ranges: An Exercise in Optimization

weixin_26705651

Original article: https://medium.com/globant/summing-ranges-an-exercise-in-optimization-3fd141a113bf

In a past article, Better Form Design through Dynamic Programming, we developed a search algorithm that looked for an optimum layout for web forms. At (many!) points in the algorithm, a simple calculation was needed: summing all the values in a given array of numbers, between two given positions. In this article we’ll consider several ways to optimize it, by applying different techniques to get a maximum performance solution, thus providing a basis to optimize other functions you may have.

The original problem

In the Form Design article we mentioned, we had a list of field lengths (an array of numbers) and we frequently needed the sum of the values in a range of positions (from x to y, both inclusive) of the array. That function, totalWidth(x,y), had to be as fast as possible so it wouldn’t hurt the performance of the overall algorithm. So, let’s see how we can go about optimizing it; we’ll produce seven different versions of the code so we can compare styles and techniques!

A first (slow!) solution: looping

Let’s start with the simplest possible solution. To sum the values in a given range (from … to) of a given array arr, we just do a loop summing elements, and return the sum. This is a very basic, rookie-level programming question!

const range1 = (arr, from, to) => {
  let sum = 0;
  for (let i = from; i <= to; i++) {
    sum += arr[i];
  }
  return sum;
};

This is clear, short, and correct — but if the array’s length is n, the algorithm’s performance is O(n)… not very good since we’ll be repeatedly calling this function. (In fact, we should write Θ(n) but let’s go with the more common, if not equally precise, usage.) The function does the job, but if called again and again with the same arguments, it just redoes all calculations, wasting time… how can we make this better?
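
To make the inclusive bounds concrete, here is a small usage sketch of the loop version. The fieldWidths array and its values are made up for illustration; they are not taken from the Form Design article.

```javascript
// Same loop-based function as above.
const range1 = (arr, from, to) => {
  let sum = 0;
  for (let i = from; i <= to; i++) {
    sum += arr[i];
  }
  return sum;
};

// Hypothetical field widths, for illustration only.
const fieldWidths = [12, 8, 20, 5, 15];

console.log(range1(fieldWidths, 1, 3)); // 8 + 20 + 5 = 33
console.log(range1(fieldWidths, 2, 2)); // a single element: 20
console.log(range1(fieldWidths, 0, 4)); // the whole array: 60
```

Every call walks the range again, so asking for the whole array costs one addition per element each time.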

A second solution: memoizing

In the forms design article I mentioned, and also in a previous article, Memoize JavaScript Promises for Performance, we used a Functional Programming technique to optimize pure functions and avoid redoing work: memoizing. Basically, a memoized function, whenever called, checks an internal cache to see if it was already called with the same arguments. If so, the result is returned directly from the cache, without further ado; if not, the function does whatever is needed and stores the calculated result in the cache for the sake of possible future calls.

Given this idea, we could write a second version of our range summing function, as follows.

const memoize = (f) => {
  const cache = {};
  return (...args) => {
    const params = JSON.stringify(args);
    return params in cache 
      ? cache[params] 
      : (cache[params] = f(...args));
  };
};

const range2 = memoize((arr, from, to) => {
  let sum = 0;
  for (let i = from; i <= to; i++) {
    sum += arr[i];
  }
  return sum;
});

Let’s comment on the code above before considering the benefits of this solution. We are using a memoize(…) function of our own, but you could have used a widely available and high performance library such as fast-memoize, instead. Also, range2(…) could have been defined as follows — but I preferred having the function depend on no other functions.

const range2 = memoize(range1);

How does this new version perform? If (a big if!) we call the function two or more times with the same arguments, it will return the needed sum instantly, but if we call it with different arguments every time, there will be no speedup, and rather it will perform worse than before, because of the extra cache work!
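
A quick way to see the "big if" in action is to count how many times the summing loop actually runs. The sketch below reuses the memoize(…) helper from above; the loopRuns counter and the widths array are hypothetical additions for illustration only.

```javascript
// Same memoize helper as in the article.
const memoize = (f) => {
  const cache = {};
  return (...args) => {
    const params = JSON.stringify(args);
    return params in cache ? cache[params] : (cache[params] = f(...args));
  };
};

let loopRuns = 0; // hypothetical counter: how often the real work is done

const range2 = memoize((arr, from, to) => {
  loopRuns++;
  let sum = 0;
  for (let i = from; i <= to; i++) {
    sum += arr[i];
  }
  return sum;
});

const widths = [3, 1, 4, 1, 5, 9, 2, 6]; // made-up data

range2(widths, 2, 5); // 4 + 1 + 5 + 9 = 19, computed by the loop
range2(widths, 2, 5); // same arguments: answered from the cache
range2(widths, 1, 5); // different arguments: the loop runs again

console.log(loopRuns); // 2
```

Note that this helper builds its cache key with JSON.stringify(args), which serializes the whole array on every call, so even a cache hit costs time proportional to the array's length. That is part of the "extra cache work" mentioned above.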

And what about memory: could you run out? In the context of this article we are just considering computing time, not memory issues, but we will discuss memory consumption and more intelligent caches in a future article!

Here’s a hint about the performance problem: if you had calculated the sum of range of positions 10 through 20, and also the sum of range of positions 21 through 30, you would be able to quickly find the sum of range 10 through 30… but memoizing isn’t good enough for this; let’s turn to a Dynamic Programming solution.

A third solution: Dynamic Programming

The basic idea of Dynamic Programming algorithms is to reduce a problem to a collection of smaller ones, of the same type, whose solutions can be used to solve the original problem. If we want to optimize our range-summing algorithm in this fashion, we need to define the sum of a range in terms of sums of previous ranges. How can we do that? The following definition would work.

If we want to calculate the sum of positions p through p (so, just a single element) of array arr, the result is simply arr[p].
If we want to calculate the sum of positions 0 through p (with p>0), we can do this recursively by calculating the sum of positions 0 through p-1 and adding arr[p] to that sum. Note that if we had already calculated the previous result, all we’d need now is a single new addition.
Finally, if we want to calculate the sum of positions p through q (with q≥p>0), we can take advantage of previous calculations: subtract the sum of range 0 through p-1 from the sum of range 0 through q, which takes just a single subtraction.
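
The rules above rest on a simple identity: the sum of p through q equals the sum of 0 through q minus the sum of 0 through p-1. Here is a minimal check of that identity with a plain loop; the odds array is made-up data, chosen so the prefix sums are easy to verify by hand.

```javascript
// Plain loop sum, used only to check the identities below.
const sumRange = (arr, from, to) => {
  let sum = 0;
  for (let i = from; i <= to; i++) {
    sum += arr[i];
  }
  return sum;
};

// Hypothetical data: the first 31 odd numbers, so sum(0..q) is (q + 1) squared.
const odds = Array.from({ length: 31 }, (_, i) => 2 * i + 1);

const direct = sumRange(odds, 10, 30); // 861
const bySubtraction = sumRange(odds, 0, 30) - sumRange(odds, 0, 9); // 961 - 100
const byAddition = sumRange(odds, 10, 20) + sumRange(odds, 21, 30); // 341 + 520

console.log(direct, bySubtraction, byAddition); // 861 861 861
```

The subtraction form is the one that matters here, because every range can be answered from sums that all start at position 0.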

We can implement this solution by using memoizing to keep track of previously calculated sums.

const range3 = memoize((arr, from, to) => {
  if (from === to) {
    return arr[from];
  } else if (from === 0) {
    return range3(arr, 0, to - 1) + arr[to];
  } else {
    return range3(arr, 0, to) - range3(arr, 0, from - 1);
  }
});

Or, if you’d rather go for “one-liners” — even though we need several lines to show it, because of reduced space…

const range4 = memoize((arr, from, to) =>
  from === to
    ? arr[from]
    : from === 0
    ? range4(arr, 0, to - 1) + arr[to]
    : range4(arr, 0, to) - range4(arr, 0, from - 1)
);

This works better than previous solutions. For instance, let’s redo our example of ranges 10 through 20 and 21 through 30. When calculating the sum of range 10 through 20, all the sums from 0 to 0, 0 to 1, etc., up to 0 to 20 get cached. When you ask for the sum of range 21 through 30, it recursively calls for the sums of ranges 0 through 30, then 0 through 29, etc., but on reaching range 0 through 20 it goes no further, because that result is already cached and available. And, to cap it all, if at a later point you ask for the range 10 through 30, it just does a single subtraction, because the needed ranges (0 to 9 and 0 to 30) have already been computed; nice!

Note that range4(…), being memoized, still provides the advantage of instantaneous response when called again with the same arguments — but now it also optimizes calls with arguments that hadn’t been used yet.
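
To watch the cache fill up, we can count cache misses for the three calls from the example. This is only a sketch: the misses counter, the countMisses(…) helper, and the data array are hypothetical additions, and memoize(…) is rewritten slightly just so it can count.

```javascript
let misses = 0; // hypothetical counter of cache misses

// Same caching logic as the article's memoize, plus the counter.
const memoize = (f) => {
  const cache = {};
  return (...args) => {
    const params = JSON.stringify(args);
    if (!(params in cache)) {
      misses++;
      cache[params] = f(...args);
    }
    return cache[params];
  };
};

const range3 = memoize((arr, from, to) => {
  if (from === to) {
    return arr[from];
  } else if (from === 0) {
    return range3(arr, 0, to - 1) + arr[to];
  } else {
    return range3(arr, 0, to) - range3(arr, 0, from - 1);
  }
});

const data = Array.from({ length: 31 }, (_, i) => i); // made-up data: 0, 1, ..., 30

// Resets the counter, runs one query, and reports how many misses it caused.
const countMisses = (from, to) => {
  misses = 0;
  range3(data, from, to);
  return misses;
};

const first = countMisses(10, 20); // 22: the query itself, plus 0..0 up to 0..20
const second = countMisses(21, 30); // 11: the query itself, plus 0..21 up to 0..30
const third = countMisses(10, 30); // 1: only the query itself is new

console.log(first, second, third); // 22 11 1
```

The third call finds both of its prefix sums already cached, which is the single-subtraction case described above.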

As to performance, note that basically we still have an O(n) algorithm, but over time, as more and more sums get cached, calls end up needing only O(1) additions. (Initial calls tend to be slow, later calls fast.) Can we do any better? Instead of having all those recursive calls and memoizing, we could attempt another solution: precompute values, accept an extra initial performance cost, and go for an amortized performance solution.

A fourth solution: precomputing

Why use an implied cache as in memoizing? We could precompute all the possible sums for all ranges from 0 to any other position p, and then all future range calls would be answered in constant O(1) time. The initial precomputation would be O(n), to be sure, but that cost would be amortized over all future calls, to achieve an average O(1) result. (This assumes there will be many calls, certainly more than n; if you just needed a very few calls to sum ranges, precomputing would be a bad solution.) We can use an object to encapsulate the needed partial sums, as follows.

class Ranger {
  #partial = [0];

  constructor(arr) {
    arr.forEach((v, i) => {
      this.#partial[i + 1] = this.#partial[i] + v;
    });
  }

  range(from, to) {
    return this.#partial[to + 1] - this.#partial[from];
  }
}

Note the use of private fields (#partial) for better encapsulation: this feature is available in current browsers and in Node 12 or later; for older environments you would need a transpiler such as Babel. The #partial array has an extra zero at the beginning to simplify the rest of the logic; check for yourself that the range(…) method works properly in all cases.
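
One way to do that check is by brute force: compare the class against a plain loop for every possible range of a small array. This is a sketch under assumptions; the naive(…) helper and the sample values are made up, and the sample uses integers so the comparison is exact (with floating-point values, the two approaches could differ by rounding).

```javascript
class Ranger {
  #partial = [0];

  constructor(arr) {
    arr.forEach((v, i) => {
      this.#partial[i + 1] = this.#partial[i] + v;
    });
  }

  range(from, to) {
    return this.#partial[to + 1] - this.#partial[from];
  }
}

// Reference implementation: the plain loop from the first solution.
const naive = (arr, from, to) => {
  let sum = 0;
  for (let i = from; i <= to; i++) {
    sum += arr[i];
  }
  return sum;
};

const sample = [7, -2, 0, 11, 4, 9]; // made-up integer data
const summer = new Ranger(sample);

// Try every (from, to) pair with from <= to.
let mismatches = 0;
for (let from = 0; from < sample.length; from++) {
  for (let to = from; to < sample.length; to++) {
    if (summer.range(from, to) !== naive(sample, from, to)) {
      mismatches++;
    }
  }
}

console.log(mismatches); // 0
```

The from=0 case is where the extra leading zero pays off: #partial[0] is 0, so no special handling is needed.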

Did you notice that we also applied Dynamic Programming when computing partial[i+1] in terms of partial[i]? This makes the constructor run in O(n) time instead of in O(n²) time, as a simple implementation would have cost.
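
To see where the O(n²) figure comes from, we can count additions in both ways of building the partial sums. Both functions below are illustrative sketches (their names and the sample data are made up); buildLinear follows the constructor's logic, while buildQuadratic recomputes every prefix from scratch.

```javascript
// Simple approach: recompute each prefix sum from scratch.
const buildQuadratic = (arr) => {
  let additions = 0;
  const partial = [0];
  for (let i = 0; i < arr.length; i++) {
    let sum = 0;
    for (let j = 0; j <= i; j++) {
      sum += arr[j];
      additions++;
    }
    partial[i + 1] = sum;
  }
  return { partial, additions };
};

// Dynamic Programming approach: reuse the previous prefix sum.
const buildLinear = (arr) => {
  let additions = 0;
  const partial = [0];
  arr.forEach((v, i) => {
    partial[i + 1] = partial[i] + v;
    additions++;
  });
  return { partial, additions };
};

const sample = [7, -2, 0, 11, 4, 9]; // made-up data
const slow = buildQuadratic(sample);
const fast = buildLinear(sample);

console.log(fast.partial); // [0, 7, 5, 5, 16, 20, 29]
console.log(slow.additions, fast.additions); // 21 6
```

For n elements the simple version needs n(n+1)/2 additions, against n for the incremental one, and both produce the same array.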

How would we use this class? The following is the normal pattern.

// Initialization:
const summer = new Ranger(myArray);

// and then something like this:
width = summer.range(10, 30);

We are getting somewhere! The constructor does the heavy lifting needed to calculate all the partial sums, and afterwards every range(…) call needs just a single subtraction to produce the result.

You might want to optimize the range(…) method further by handling the from===to case separately to avoid a subtraction, but be sure to measure whether the extra comparison costs more than it saves!

So, now we have a fast range sum calculation that, at the cost of some initialization, performs future requests in constant time. Can we do anything else? We may also want (for variety!) to implement this same sort of solution just with functions, instead of classes and objects; let’s finish with that.

A fifth solution: functions and closures

We need not use classes in order to get private fields or initial (constructor-like) calculations; common functions using closures work as well. We could write a makeRanger(…) function that would take an array as input, and produce the corresponding range-summing function as a result.

const makeRanger = (arr) => {
  const partial = [0];
  arr.forEach((v, i) => {
    partial[i + 1] = partial[i] + v;
  });
  return (from, to) => partial[to + 1] - partial[from];
};

Usage would be similar to our class-based solution from above.

// Initialization:
const range6 = makeRanger(myArray);

// and afterwards:
width = range6(10, 30);

Calling makeRanger(arr) returns a function that uses an internal partial array, whose values were pre-computed in terms of the input arr argument. Essentially, this works in the same way as the #partial private field — but using a closure instead of an object.
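
Since each call to makeRanger(…) creates its own closure, every returned function carries its own private partial array. The sketch below illustrates this with two made-up arrays (rowA and rowB are hypothetical names), and also shows a consequence of precomputing: the sums are a snapshot taken at creation time.

```javascript
const makeRanger = (arr) => {
  const partial = [0];
  arr.forEach((v, i) => {
    partial[i + 1] = partial[i] + v;
  });
  return (from, to) => partial[to + 1] - partial[from];
};

const rowA = [10, 20, 30]; // made-up data
const rowB = [1, 2, 3, 4]; // made-up data

// Each ranger closes over its own partial array.
const rangeA = makeRanger(rowA);
const rangeB = makeRanger(rowB);

console.log(rangeA(0, 2)); // 10 + 20 + 30 = 60
console.log(rangeB(1, 3)); // 2 + 3 + 4 = 9

// The partial sums were computed when makeRanger ran,
// so changing the input array later is not reflected.
rowA[0] = 1000;
console.log(rangeA(0, 2)); // still 60
```

If the underlying array changes, a new ranger has to be built; the same holds for the class-based version.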

Just for the sake of it, we can show another version of the code above, with an IIFE (immediately invoked function expression), which is another common programming pattern… though admittedly harder to understand!

const range7 = ((arr) => {
  const partial = [0];
  arr.forEach((v, i) => {
    partial[i + 1] = partial[i] + v;
  });
  return (from, to) => partial[to + 1] - partial[from];
})(myArray);

// and then...
width = range7(10, 30);

This style, eschewing a separate implementation for makeRanger(…) and putting it inline instead, would only be useful if you wanted just a single range function — but I decided to include it anyway just to highlight IIFEs, which are often found in JavaScript code!

Summary

In this article we have gone over a simple basic algorithm — summing numbers in a range of an array — and we’ve found several optimizations, making its performance go from a (bad) O(n) initial version to an essentially O(1) constant result over time.

More important than the specific results here are the special techniques we applied: memoizing, Dynamic Programming, and pre-computing, with functional and OOP versions of the final code. These techniques may be successfully applied to many kinds of algorithms, and should be well known tools for all developers, for quick, easy wins!
