Data Processing Programming (2): Code Complexity

杨_明

Original article: https://medium.com/@nazar.merza/data-processing-programming-2-code-complexity-cddbde141cd0

Part I: Conceptual Framework

In this part, some of the most important concepts of programming and design are introduced. They lay a foundation and a starting point for the subject. In subsequent parts, more concepts and principles will be introduced as the topic requires.

I think that instead of opening a new subject with a bunch of definitions, it is more useful to develop it by going through the thought process together with the reader or audience. (In general, one has to be very careful with definitions: they can be useful when used properly, but in many other cases they inhibit the exposition of a subject, since by setting the boundaries of a problem they at once limit its expansion. Anyway, this is a question in the theory of definitions in philosophy, and we will not dwell on it here.)

So, let's begin. When programming, the right way to think about the code is to look at it from another person's perspective. This idea can be stated as the following principle:

Write the code such that someone else, completely unfamiliar with it, needs the least amount of effort to understand it.

In other words, the programmer should aim for simplicity: the program should be simple. Thus, simplicity is the first concept encountered so far in trying to formulate a conceptual framework for data-processing programming (DPP). For now, no proof is given for the above statement; let's accept it as an axiom. It will be justified later, throughout this writing.

Okay, then what is simplicity? It turns out that defining simplicity directly is not easy (without merely restating the same thing in other words), nor is it that useful to do so. A better approach is to understand it through its opposite: complexity. At least in the context of coding, as will be shown, simplicity is nothing but avoiding complexity, and avoiding complexity is possible only by understanding it. Now that we have the second important concept in our investigation of programming, complexity, let's understand what it is.

What is Complexity?

Complexity may refer to different things depending on the context, so it needs to be clarified from the outset which meaning is intended in our exposition. But first, let's describe which meanings of the term are NOT intended here.

It is NOT about Computational Complexity

Perhaps the most widespread use of the term complexity in computer science and related fields is computational complexity, which is often called just complexity. Computational complexity has a very specialized meaning: it is a measure of the runtime efficiency of an algorithm, that is, the amount of resources the algorithm requires to run. The two most important and most commonly considered resources in complexity analysis are time and space. For example, there are different algorithms for sorting arrays, such as Selection Sort, Bubble Sort, Insertion Sort, Merge Sort, Quick Sort, and Heap Sort. For the same array, these algorithms can require different amounts of space and time to execute. The resource consumption of each algorithm determines its computational complexity: the more resources an algorithm uses, the more complex it is. Mathematically, computational complexity is usually expressed in Big O notation.
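To make the idea of resource consumption concrete before setting it aside, here is a minimal sketch that counts element comparisons for two of the sorting algorithms named above, run on the same array. The function names, the array size of 64, and the choice of comparisons as the counted resource are assumptions made for illustration only.

```python
import random


def selection_sort(values):
    """Sort a copy of values; return (sorted_list, number_of_comparisons)."""
    items = list(values)
    comparisons = 0
    for i in range(len(items)):
        smallest = i
        for j in range(i + 1, len(items)):
            comparisons += 1
            if items[j] < items[smallest]:
                smallest = j
        items[i], items[smallest] = items[smallest], items[i]
    return items, comparisons


def merge_sort(values):
    """Sort a copy of values; return (sorted_list, number_of_comparisons)."""
    items = list(values)
    if len(items) <= 1:
        return items, 0
    middle = len(items) // 2
    left, left_count = merge_sort(items[:middle])
    right, right_count = merge_sort(items[middle:])
    merged = []
    comparisons = left_count + right_count
    i = j = 0
    # Merge the two sorted halves, counting one comparison per step.
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


random.seed(0)  # fixed seed so the run is repeatable
data = [random.randint(0, 999) for _ in range(64)]
selection_result, selection_comparisons = selection_sort(data)
merge_result, merge_comparisons = merge_sort(data)
print(selection_comparisons, merge_comparisons)
```

Both functions return the same sorted array, but selection sort always performs n(n-1)/2 comparisons, while merge sort needs far fewer for the same input.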

Computational complexity is not of interest to us and is not our subject of discussion. As stated earlier, computational complexity measures the runtime behavior of an algorithm, which is a fundamental property of that algorithm. Instead, we are interested in the complexity of the program's code. From this point onward, we will completely forget about computational complexity.

Now let's move on to the next topic, cyclomatic complexity.

Cyclomatic Complexity

According to Wikipedia,

Cyclomatic complexity is a software metric used to indicate the complexity of a program. It is a quantitative measure of the number of linearly independent paths through a program's source code.

To put it simply, according to this metric, the more possible paths of execution a piece of code has, the more complex it is. For example, if the following is the whole code under consideration,

z = x * 2

its complexity is 1, since there is only one path through this code. For the following code, the complexity increases to 2:

if (condition is true) then
    z = x * 2
else
    z = x * 4

This is because, depending on whether the condition evaluates to true or false, there are two paths along which the program can run. Thus, the second program is more complex than the first one.
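The counting described above can be approximated mechanically. The sketch below uses Python's standard `ast` module to count decision points and adds 1. It is a deliberately simplified approximation, not a full cyclomatic complexity tool: it ignores boolean operators, exception handlers, and other constructs that real tools also count. The function name and the set of counted node types are assumptions for illustration.

```python
import ast

# Node types treated as decision points in this simplified count.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp)


def approximate_cyclomatic_complexity(source):
    """Return 1 plus the number of decision points found in source."""
    tree = ast.parse(source)
    decisions = sum(isinstance(node, DECISION_NODES) for node in ast.walk(tree))
    return 1 + decisions


straight_line = "z = x * 2"
branching = """
if condition:
    z = x * 2
else:
    z = x * 4
"""

print(approximate_cyclomatic_complexity(straight_line))  # 1
print(approximate_cyclomatic_complexity(branching))      # 2
```

The source strings are only parsed, never executed, so `x` and `condition` do not need to be defined.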

[Image: photo from Craftofcoding]

The concept of cyclomatic complexity is useful, as it provides insight into the nature of code complexity. Any serious programmer has to know it. Yet, as important as it is, it is not broad enough and does not explain the types of complexity that commonly arise in data-processing programming. Why? Because it is focused on measuring the control flow of the program. Control flow is an important factor in many programming paradigms, but in DPP many computations are declarative in nature (as in SQL): there is no control flow, and their complexity comes from other sources. Even in most cases of if-else, switch, or other conditional statements, the semantics can be recast so that the code becomes declarative, eliminating the control flow concern (this will be a major topic in Part III). To understand the type of complexity associated with DPP, a new concept is introduced, which I call structural complexity.
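As a small illustration of recasting a conditional declaratively, the sketch below expresses the earlier if-else example as a table lookup. This is only one possible recasting, chosen here for brevity; it is not necessarily the approach the author develops in Part III. The function and table names are hypothetical.

```python
def scale_with_branches(x, condition):
    """Imperative form: two execution paths."""
    if condition:
        z = x * 2
    else:
        z = x * 4
    return z


# Declarative form: the rule is data (a lookup table), not control flow.
MULTIPLIER_BY_CONDITION = {True: 2, False: 4}


def scale_with_lookup(x, condition):
    """Same result, expressed as a table lookup with a single path."""
    return x * MULTIPLIER_BY_CONDITION[bool(condition)]


print(scale_with_branches(3, True), scale_with_lookup(3, True))    # 6 6
print(scale_with_branches(3, False), scale_with_lookup(3, False))  # 12 12
```

The second version has no branch to trace; changing the rule means editing the table rather than the logic.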

Structural complexity

A good way to explain structural complexity may be to start from somewhere else: an analogy from mathematics, where objects are distinct and clear. Although the examples in the analogy may at first seem too trivial, the idea is analytically quite powerful and helps in understanding code complexity in DPP, and perhaps complexity in general.

In mathematics, linear equations are the simplest of all equation types. They are the simplest because they are the easiest to understand and to solve. That is why they are the first type of equation one learns at school. (Note: the equations and examples are intentionally simple, only to make the point and not to distract us from our subject of inquiry.)

Anyway, the following is a linear equation:

24 = 2x                                     (1)

Easy to solve, isn't it? Just divide both sides by 2 and the solution is x = 12. Now, consider the next equation.

24 = 2x + 5x - 3x + 6x + 4x                 (2)

This equation appears more complicated than the first one: it is longer, has more terms, and so on. But it is not. It can be reduced to

24 = 14x

Therefore, in spite of its more complicated appearance, in terms of complexity it is exactly equal to the first equation. In other words, they have the same degree of complexity.

Now consider the following quadratic equation:

24 = 2x^2                                     (3)

Although it looks simpler (shorter and more compact) than equation (2), it is in fact of a higher degree of complexity. It is not as intuitive and easy to understand as the linear equation. It can no longer be solved by means of the basic arithmetic operations (+, -, *, /) alone. Quadratic equations in general require a separate method (the quadratic formula) and more mathematical devices, such as radicals, irrational numbers, and multiple solutions. (In fact, the modern formulation of the quadratic formula had to wait until 1637, with René Descartes.) In terms of structural complexity, (2x^2) is a more complex structure than (2x + 5x - 3x + 6x + 4x), as it is more difficult to understand and to solve.
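The difference can be seen in a minimal sketch that solves equations (1) and (3). The function names are hypothetical, and the quadratic solver assumes the pure form b = a * x^2 with b / a non-negative, as in equation (3). Because equation (3) has no linear term, a square root suffices here; a general quadratic would need the full quadratic formula.

```python
import math


def solve_linear(a, b):
    """Solve b = a * x for x using a single division."""
    return b / a


def solve_pure_quadratic(a, b):
    """Solve b = a * x**2 for x; needs a square root and yields two roots."""
    root = math.sqrt(b / a)
    return (-root, root)


print(solve_linear(2, 24))          # 12.0
print(solve_pure_quadratic(2, 24))  # two roots: minus and plus sqrt(12)
```

The linear case needs one arithmetic operation and has one answer. The quadratic case already brings in a radical, an irrational value, and two solutions.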

Taking this analogy to the field of programming, the following nested loop

for i in array1:
    for j in array2:
        Do something involving i and j at once

is more complex than the next code snippet, which consists of two separate loops.

for i in array1:
    Do something with i
for j in array2:
    Do something with j

Why? Because in the first case, one needs to trace the values of both indices, i and j, together with all the variables and logic associated with them, at once as both loops progress (and it gets more complicated as the loops and the operations within them get more complex). The two loops are intertwined, and one cannot be separated from the other. In the second case, each loop is independent of the other, and one only has to trace the progression of one loop at a time. Hence simpler!
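A runnable version of the two shapes, with small sample arrays and placeholder operations chosen purely for illustration, might look like this:

```python
array1 = [1, 2, 3]
array2 = [10, 20]

# Nested: each step depends on i and j together, so the reader has to
# track both indices at once.
pair_sums = []
for i in array1:
    for j in array2:
        pair_sums.append(i + j)

# Sequential: each loop can be read and checked on its own.
doubled = []
for i in array1:
    doubled.append(i * 2)

halved = []
for j in array2:
    halved.append(j // 2)

print(pair_sums)  # [11, 21, 12, 22, 13, 23]
print(doubled)    # [2, 4, 6]
print(halved)     # [5, 10]
```

To verify `pair_sums` by hand, one has to walk through every combination of i and j. Each of the other two lists can be checked by following a single index.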

Now let's take this idea one step further, or one step closer to our subject of study, with an example from data-processing programming. A nested query involving three layers of SELECT is more complex than three SELECT queries in sequence.
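As an illustration of this contrast, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns, the sample rows, and the threshold of 100 are all hypothetical; this is not the query from the original post, only an example of the same shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 80), ("ann", 50), ("bob", 40), ("cy", 200)],
)

# Nested form: three SELECT layers that must be read inside-out together.
nested_count = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT customer FROM (
            SELECT customer, SUM(amount) AS total
            FROM orders GROUP BY customer
        ) AS totals
        WHERE total > 100
    ) AS big_customers
    """
).fetchone()[0]

# Sequential form: the same three SELECTs, each named and readable alone.
conn.execute(
    "CREATE TEMP TABLE totals AS "
    "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
)
conn.execute(
    "CREATE TEMP TABLE big_customers AS "
    "SELECT customer FROM totals WHERE total > 100"
)
sequential_count = conn.execute(
    "SELECT COUNT(*) FROM big_customers"
).fetchone()[0]

print(nested_count, sequential_count)  # 2 2
```

Both forms compute the same answer. In the sequential form, each intermediate result has a name and can be inspected on its own, which is the kind of separation the two-loop example shows.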

As a last example to reinforce the notion of structural code complexity, we now know that the following code is complex, and why.

Structurally more complex code involves more dimensions to consider at once; hence it can also be called dimensional complexity.

Emphasis: it is not being suggested here that things like nested loops or nested queries are always bad and to be avoided; it would be naive to say so.

In the same way that equations of different degrees and types are necessary in mathematics, these constructs are part of programming and exist for a reason. The problem is not with the language constructs themselves but with how they are used. In later parts, these cases will be put in context. Here they are used only to illustrate the idea of structural code complexity.

<< Previous: Data Processing Programming (1): Introduction

To be continued …

Originally published at https://objectacademy.com.

Translated from: https://medium.com/@nazar.merza/data-processing-programming-2-code-complexity-cddbde141cd0
