Python illustrated (part 2)


cunchi8090


Original article: https://www.experts-exchange.com/articles/6589/Python-illustrated-part-2.html


Less strange, but still an introduction

This introduction was added (1st August, 2011) to reflect some reactions. Firstly, the term basics in the title of the article... Like any other word, it is a symbol whose meaning is attached by some agreement. Still, we always have to think in some context to understand the agreement. You may say that what is described in this and the previous article is not related to Python basics at all. That could be because you expect a text written in the context you met when reading some other "basics". The article series could be renamed to Python foundation or Python internals. However, the latter terms are more technical, and I do not think they would help you understand what I want to write about. Instead, I am asking you to change the context of thinking about it. Tutorials often start with the simplest example that shows a working program (say "Hello, World!"; it is also a kind of unspoken agreement bound to the "creation of a tutorial"). This is understandable. There are many ways to attract a beginner's attention. The context of "basics" in this series is based on "what you should know to understand". I am focusing on "mental pictures of what is done" rather than on "how to write the block of code". For that, I also need the text of this second part. I am aware that you may want to skip these two articles in disagreement ("Not related to Python at all!"). At the same time, I believe that the information will be useful for those who find some knowledge gaps when reading the next part 3.

The previous part 1 (http:A_5354-Python-basics-illustrated-part-1.html) explained where the term "variable" came from, how we can think about it, what it represents, and how variables are related to mathematics on one side and to computers on the other. This part (2) focuses on further details that you have to understand if you really want to be good at thinking about variables (creating a mental picture) and at using them in programming languages.

To explain the principles of working with variables, we need an excursion outside the Python point of view. This part describes variables in the context of traditional compiled languages to show some details. Let's talk about pointers and references. A good understanding of them will be necessary to digest part 3.

Variables in the (old) compiled languages

Let's think about the equation a + b = 5 as a Boolean expression that returns, for the given content of the variables a and b, the Boolean value that indicates whether the values are a solution of the equation.
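The idea can be sketched in Python, the language this series is heading towards. The function name is_solution is only an illustrative assumption, not something from the original article.

```python
def is_solution(a, b):
    """Return True when a and b satisfy the equation a + b = 5."""
    # The equation is written as a comparison, so the result is a Boolean.
    return a + b == 5


# Two candidate pairs: the first one solves the equation, the second does not.
checks = [is_solution(2, 3), is_solution(1, 1)]
```

The names a and b exist only in the source text; what the expression needs at run time are the values they stand for.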


After compilation, the names of the variables are totally replaced by addresses, by register names/numbers, or the values are optimized into parts of the generated machine instructions. There is no trace of strings like "a", "b", or "result" in the generated executable. The only exception is when you generate a so-called debug version of the executable and explicitly tell the compiler to remember the original names from the source code; however, the computer does not need that. Only the human, the programmer, wants the names to be present in the debug info to make the compiled code more readable when debugging.

From that point of view, the name of a variable is to a human programmer what the address of a memory space is to a processor. The name of the variable is directly bound to the address. Either you work with the name as a string (in the source code), or the compiler works with the address that results from compiling the name of the variable (in the binary, executable code).

Technical representation of a variable

Technically, a variable needs memory space and identification to be useful for a program. Identification without memory space makes no sense: one cannot store anything inside. Memory space without identification is useless too: one cannot access the memory space.

We were speaking about variables in the previous part. However, the same holds for objects, i.e. for the memory space that is used to store the data of the object. Better said, anything in Python is an object that can be identified, including the things that are called variables in some languages. But stay tuned, the details will be explained later. The topic is not as easy as you may think at first. (There are more situations like that. Say, the strings, the topic "well known from the time of written history". Or not? See http://diveintopython3.org/strings.html).

The memory address serves as a good technical identifier. It is unambiguous. Moreover, you need no transformation to find out where the memory is located. CPython, the reference Python implementation, also takes that approach: the address is equal to the technical identification of any variable (or of any object). For any object obj in Python, you can apply the built-in function id(obj) to get the identification. (However, other Python implementations are free to use a different kind of identification. Think about a distributed computing environment where there is no single, shared memory address space. The address is ambiguous in such systems unless it is extended by some extra information about the location of the memory/computer.)
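A minimal sketch of the built-in id() function follows. The variable names are illustrative assumptions; the code relies only on the identification being the same for one object and different for two distinct live objects, not on it being a memory address.

```python
obj = [1, 2, 3]
same = obj            # a second name for the very same object
other = [1, 2, 3]     # an equal, but separately created object

identity = id(obj)

# Both names lead to one object, so the identification is the same.
shares_identity = id(same) == identity

# Equal content does not imply the same identification.
equal_but_distinct = (other == obj) and (other is not obj)
```

The is operator compares the identities of two objects, which is why it is used here instead of comparing the id() values by hand.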


What about the size of the reserved memory space? When you need to store a Boolean value, one bit would be enough. As you cannot address one single bit, the smallest possible piece of memory, one byte, is often used. If you want to store an integer, you usually think about 4 or more bytes. It depends on the programming language, on the compiler, and also on the hardware. (In Python, integer variables are not that limited. The space for one integer value may vary depending on the actual value.)
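The remark about Python integers can be checked with sys.getsizeof from the standard library. This is only a sketch: the exact numbers of bytes depend on the implementation and platform, so none are stated here.

```python
import sys

small = 1
huge = 10 ** 100      # far beyond what fits into 4 or 8 bytes

# getsizeof reports the number of bytes the integer object occupies.
small_size = sys.getsizeof(small)
huge_size = sys.getsizeof(huge)

# The big value still behaves like an ordinary integer.
still_exact = (huge + 1) - huge == 1
```

In CPython the larger value occupies more bytes than the smaller one, which illustrates that the space for one integer value may vary with the actual value.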


To summarize, the memory space depends on the type of the value and sometimes also on the actual value (think about strings of different lengths in whatever language). In compiled languages, the size of trivial types is known. Therefore, the size of the memory is related to the type. However, the information about the type is used/processed only during the compilation. Similarly to variable names, the type is used only to check statically whether things are done in a correct way. During the compilation, the type-name information is lost, and the related size of memory is present as numbers in the executable.

Where is the memory space located?

Think about a simple situation. You have a normal computer with one processor and with one RAM (Random Access Memory). The RAM addresses go (for simplicity) from zero to 4,000,000,000. The variable needs, say, eight bytes. Where are the eight bytes to be located?

"I don't care." And you are right. The machine and the compiler should care. Anyway, the code needs to know the location. Where is the knowledge about the location stored?

The compiled-language sources are processed by compilers that consume the source texts and convert them to machine code. In such a case, the knowledge about the location must be hidden somewhere in the code. Otherwise the code would not be able to access that portion of memory (i.e. what once was the name of the variable). In other words, the address of the memory must be hidden in the code. But how?

Roughly said (I do not want to go into too much detail):

- The address can be a constant, and the related memory was once (in the source text) called a static variable. Such a variable keeps its content during the lifetime of the program, and its content changes only when it is explicitly changed by the code.

- The address is computed relatively, by adding a numeric offset to some base address (in some register) of the memory subspace. (This is done as part of the low-level instruction behaviour; there is no human-level programming like "take the register content, add 5, and assign...".) Think about a function that uses its own local variables. When the function is called, it gets a block of memory big enough for storing its local variables. Each local variable is located relative to the beginning of the allocated block. The block is released when the function returns.

- The memory is allocated dynamically, at run time, by calling some function (like malloc() in C) or via some other action bound to a dedicated keyword (like new). In such a case, the address is known only after the data space has been created (when the program is already running), and the address must be stored somewhere for later reference. No name is bound to such memory space, even in compiled languages. The size is either deduced from the prescribed type (new) or given explicitly to the function (malloc()). In the first case, the compiler keeps track of the size derived from the type. In the latter case (malloc()), the programmer is responsible for working correctly within the allocated space.

Scripting languages

There was a time when the so-called scripting languages appeared (think about the Unix shell scripts or the Windows batch files). A command processor (say bash or cmd) takes the source text, called a script, line by line and interprets the commands written in the file. These languages are also called interpreted because of when and how the source text is processed when the script is launched. There are no binary native instructions stored in the executed script file. The source text is read and interpreted immediately. Of course, the script has to be interpreted by some binary executable program, the interpreter, that is capable of performing the actions prescribed in the script. (The interpreter is found via an association with the script extension or via a special command at the beginning of the script.) Because of the way they work, the interpreted languages are not as fast or as powerful as compiled languages, and their data types are also somewhat more limited. Often, a part of the name of a variable also indicates its type.

Simply said, working with variables is a bit magical in the scripting languages. You never need to know their memory addresses. As you usually write only simple scripts, you usually do not want to build more complex data structures. You do not care how the memory is allocated. (Anyway, the memory must be allocated dynamically.)

The interpreted languages may often be weakly typed. Simply said, if a string value looks like a number, it can be treated as a number. Because of that, you can use such a value with numeric operators, for example.
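For contrast, here is a small sketch of how Python behaves in the same situation. It is an addition to the original text and only illustrates that Python does not silently treat a number-like string as a number; the variable names are assumptions.

```python
text = "2"

# Python refuses to mix a string with a number in an arithmetic expression.
try:
    implicit = text + 3
except TypeError:
    implicit = None   # no implicit conversion took place

# The conversion has to be requested explicitly.
explicit = int(text) + 3
```

A weakly typed scripting language would typically accept the first expression and produce a number.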


The simpler languages often play tit for tat. And we sometimes need something in between the simple scripting languages and the extremely powerful (fast running and expressive) compiled languages. This can be done, and Python is an example of such a language. Before speaking about the details of the so-called dynamic languages, we need to learn something more about indirect access to memory space (i.e. to the stored values). It is a natural feature in compiled languages too; however, it is essential for dynamic languages.

Pointers

Oh, the bloody pointers! This is the topic where many programmers with less formal education and/or without enough hunger for knowledge fail. When looking at, say, C++ source code without that knowledge, many students just give up and switch off their brains. Possibly they have never been given a satisfactory explanation. Let's enhance our imagination using the following pictures. Let's start the hard way, from the magical C++ source code to the abstract pictures. (No problem if you do not know the C++ language. I will explain the necessary things.)

Let's start with:


int var;
int *ptr;

The first line declares that there will be a variable named var of the type int (meaning integer). The second line looks almost the same. There will be a variable ptr somehow related to the type int. But what does the star mean? The star changes the meaning of the declaration of the variable. The variable will not be of the int type. It will be of the pointer to int type instead. Have a look at the following picture:

Both the var and ptr variables need their memory space. In other words, a pointer variable is also a variable. A pointer variable also stores a value. The value could be called a pointer value. Basically, it is the address of another piece of memory. As explained earlier, any variable needs memory space and identification. Any pointer variable needs enough space to store an address. If the hardware is capable of directly addressing, say, 4 GB (i.e. 2^32 bytes), then the variable needs 32 bits (4 bytes) of memory to store the address. However, if we use a 64-bit operating system, then we can (theoretically) directly address 2^64 bytes of memory. I do not believe you have that much memory in your computer. Anyway, the pointer variables need 8 bytes in such a case.

Let's assign the values to the variables to see how to work with pointers:


The first line assigns the value 2 to the var variable. The earlier picture shows that the four bytes at the address 12 are filled with the binary representation of the value 2. Notice that the variable was given the memory space at the address 12. The notation &var means the address of the variable. Now, the ptr variable is ready to store the address. This way, the address 12 can be assigned to the pointer variable (see the second line on the picture).

Having the pointer value (i.e. the address 12) inside the pointer variable that is located at the address 56 (see the picture above), we can also access the memory space of the var variable indirectly, through the pointer. To do that, we have to read the content of the variable ptr located at the address 56, and then we have to use its content (12) to locate the memory at that address. That's all! No extra magic. Well, the syntax may look magical...

The *ptr (with the star in front of the name of the variable) says: take the content of ptr and use it for indirect access to another part of memory. The process is named dereferencing. The pointer variable refers to some other memory, and we want to access it. As we have to explicitly use the star in front of the pointer variable, it is called explicit dereferencing.

Having access to a part of memory also means write access. The last line on the picture shows that the new value 3 is assigned to the memory space. If you now read the variable var, you get the value 3. The memory space belongs to the var variable, but from now on the variable is not the only way to access it.
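Python itself has no pointer syntax, but the steps described above can be mimicked with the standard ctypes module. This is only a sketch for experimenting, not part of the original article; the C++ statements in the comments are reconstructed from the surrounding text, and the real addresses will not be the illustrative 12 and 56.

```python
import ctypes

var = ctypes.c_int(2)        # roughly: int var;  var = 2;
ptr = ctypes.pointer(var)    # roughly: int *ptr; ptr = &var;

# The address of var, and the address that the pointer holds.
address_of_var = ctypes.addressof(var)
address_in_ptr = ctypes.addressof(ptr.contents)

# Explicit dereferencing with write access, roughly: *ptr = 3;
ptr.contents.value = 3

# Reading var directly now yields the value written through the pointer.
value_after = var.value
```

Here ptr.contents plays the role of the star in *ptr: it follows the stored address to the memory space of var.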

Does the pointer theme look so difficult to you now? Is there anything more to be said? Thinking about the principle, then nothing. But, well, yes.


When talking about variables in compiled languages, we said that the compiler converts the name to the address, deduces the size from the type of the variable, and uses the type for static checking during the compilation. Is there anything about types in the last picture? Well, yes.

The *ptr = 3; assigns the integer value. The integer value can be assigned only to an integer variable. How does the compiler know that this assignment is correct? The answer is not that difficult. It is more difficult to spot that the check must be done by the compiler.

The ptr was declared as a pointer to int. This way, it says that it points to memory of a size capable of storing an int value. In other words, pointers in compiled languages are usually bound to some type. And again, the type is used only for checking during compilation (hence the name statically typed languages). The information about the type disappears when the binary executable is generated. Anyway, the pointers are typed in the compiled languages.

What about untyped pointers? Are they possible? Is it possible to have a pointer to any type? The short answer is yes. The pointer variable itself always requires the same amount of memory (4 bytes on a 32-bit OS, 8 bytes on a 64-bit OS). It simply stores the address. The problem is that the compiler then does not know how big the pointed-to block of memory is. The pointer variable stores no knowledge about the size of the target. The programmer has to get the information from somewhere else. One of the possibilities is to store the size of the pointed-to structure at the beginning of the structure.
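The same ctypes module offers an untyped pointer type, c_void_p, which makes the point easy to try out. Again this is just a sketch added for illustration; the variable names are assumptions.

```python
import ctypes

value = ctypes.c_int(42)

# The size of any pointer: typically 4 bytes on 32-bit, 8 bytes on 64-bit builds.
pointer_size = ctypes.sizeof(ctypes.c_void_p)

# An untyped pointer keeps only the address, not the type or size of the target.
untyped = ctypes.cast(ctypes.pointer(value), ctypes.c_void_p)
stored_address = untyped.value

# To read the target, the programmer has to supply the type again.
typed = ctypes.cast(untyped, ctypes.POINTER(ctypes.c_int))
recovered = typed.contents.value
```

Casting the untyped pointer to a wrong type would not be detected; the responsibility stays with the programmer, as the text says.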


What else? NULL, or a similar name. This is the special value that can be assigned to a pointer variable of any type. This means that you can also assign it, say, to a pointer to int. This is the only pointer-value constant available. It says: "the pointer points to nowhere." Naturally, you cannot de-reference such a pointer. The "nowhere" can never be accessed. If you try, the screaming sound... kidding, not screaming actually... the error is somehow announced.
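A NULL pointer can also be tried out with ctypes. This sketch shows only how ctypes announces the error; a compiled C or C++ program would usually react differently, for example by crashing.

```python
import ctypes

IntPointer = ctypes.POINTER(ctypes.c_int)   # the "pointer to int" type

# Created without a target, the pointer is NULL.
null_ptr = IntPointer()
is_null = not bool(null_ptr)

# De-referencing NULL is refused; ctypes raises an exception instead.
try:
    null_ptr.contents
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

A NULL pointer evaluates as false in a Boolean context, which is the usual way to test for it before de-referencing.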


Drawing variables and pointers symbolically

Our brain likes thinking in abstractions. "No boring numbers, please. I tend to forget them." We always solve a (sub)problem in our head first, and only then do we write the idea down. We use our imagination. If it is too much to keep in our brain, we draw it on paper...

Also, we need some graphical abstraction to be able to share what we mean with others. We want to keep the picture as abstract as possible, as simple as possible, but not simpler than needed.


the pointer). And the pointer should be pointed, right? ;) To summarize, the pointer value is drawn as a fat dot with an arrow pointed to the other memory-space rectangle.

We also do not care how NULL is implemented. The only interesting feature is that it points to nowhere. The "electric ground" mark is usually used. That mark was very common and well known to those who worked with the first computers, not counting Charles Babbage (http://en.wikipedia.org/wiki/Babbage) and the "World's First Computer Programmer" Ada Lovelace (http://en.wikipedia.org/wiki/Ada_Lovelace).

Now, repeat "the pointers" until you really know what they are about.


References

The term references often causes confusion. The main reason is that it can be used in different contexts.


The more abstract point of view is based on the idea that the reference value allows you to access some data (memory space) indirectly. From that abstract point of view, the plain old pointers are also references.


However, the terms references and pointers are often discussed together, and in that case some differences between them are emphasized. Usually, the references are said to be de-referenced automatically when used (unlike the pointers).

The confusion has roots also in the fact that different programming languages think differently about references. Some languages do not use the term references, some languages do not know pointers. Some languages do not have any special syntax for references, and the fact of working with references internally may be completely hidden (this is also the case of Python -- see later).


Back to the old compiled languages, here C++.


The ref variable must be initialized when declared. The reason is that it must always contain a reference to some already existing variable, i.e. to its allocated memory space.

Here the ref variable was located by the compiler/linker/loader at the address 75 (not important). What is important is that its memory space was filled with the address of the var variable immediately. The C++ restriction is that you cannot make the reference variable refer to another variable later. Any usage of ref has the same effect as using the variable var.

You can think about the reference variable as if it were another name for the pointed-to variable. In the example, the reference variable is named ref, which forces you to think this way about it. But imagine it was named myVar. When you use a reference variable, it looks the same as if you worked with a normal, simple variable. That is because the automatic de-reference is done (by the compiler, for you). Anyway, the target memory space is still accessed indirectly.

In other words, when you do not see the reference-variable declaration, you are not able to say if you work with a simple variable or (indirectly) with another variable.


References for Python

Python uses references internally a lot. Let's define the term reference in the way we need to explain the Python internals.

References for explaining Python internals -- graphically.

However, Python is a dynamic language where memory space for the objects is always allocated at run time, and where the name of any variable is intentionally kept in string form inside Python's internal data structures.
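That variable names survive as strings can be observed with the built-in locals() function. The function and variable names below are only illustrative assumptions.

```python
def show_local_names():
    count = 3
    label = "abc"
    # locals() returns a dictionary that maps name strings to objects.
    return locals()


namespace = show_local_names()

# The names of the local variables are available as ordinary strings.
names = sorted(namespace)
```

Unlike in a compiled executable, the names "count" and "label" are still present while the program runs.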


Why call them references and not pointers? This is because you will find no explicit de-referencing when working with Python. But that's not all! There is one more level of indirection when working with variables. The topic will be discussed more in part 3.
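A brief sketch of what the missing explicit de-referencing looks like in practice (the details belong to part 3; the names are illustrative):

```python
first = [1, 2]
second = first        # the reference is copied, not the list

# No star, no special syntax: the target object is reached automatically.
second.append(3)

same_object = first is second
```

After the append, first also shows three items, because both names refer to one list object.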


A side note: There is nothing like a NULL value for Python references. It would break the concept of references. However, there is a single object named None that plays the same role. Whenever another object refers to None, it means that there is nothing more interesting there (if the fact is not interesting on its own).
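The single None object can be observed directly. A minimal sketch, with illustrative names:

```python
def do_nothing():
    pass              # no return statement, so the call yields None


result = do_nothing()
a = None
b = None

# Every reference to None leads to the same, single object.
same_none = (a is b) and (result is None)
same_identity = id(a) == id(b)
```

Because None is a real object, a reference to it can always be followed safely; the test is written as "x is None".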


Topics to be discussed in part 3

- Python as a dynamic language

- everything in Python is an object

- trivial Python built-in types

- container built-in types

- what is behind the assignment operation in Python