Python's small data pool

左魏

Article link: https://blog.csdn.net/z_bright/article/details/84637852

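Strengthening yourself is the only path to happiness; it is a long-term pursuit, not momentary fun!

First, a word about "==", "is", and "id", because the discussion of the small data pool below relies on them.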

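>>> s1 = 300
>>> s2 = 300
>>> print(s1 == s2)           # == tests whether the two objects have equal values; no surprises here
True
>>> print(s1 is s2)           # is tests whether both names refer to the same object (same memory address)
False
>>> id(s1)
4360482576
>>> id(s2)                    # both are 300, so why do they point to different memory addresses?
4360482608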


>>> s1 = 100
>>> s2 = 100
>>> print(s1 == s2)
True
>>> print(s1 is s2)
True
>>> id(s1)
4356850720
>>> id(s2)                   # with the value 100, why are the memory addresses the same this time?
4356850720
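
The difference between equality and identity can also be seen without any caching involved, by using mutable objects. The following is a minimal sketch; the variable names are made up for illustration.

```python
# Two separately built lists: equal contents, but different objects.
first = [1, 2, 3]
second = [1, 2, 3]
alias = first  # a second name bound to the same list object

values_equal = first == second      # == compares the contents
same_object = first is second       # is compares identity
alias_same = alias is first         # both names refer to one object
ids_match = id(alias) == id(first)  # is agrees with comparing id()

print(values_equal, same_object, alias_same, ids_match)
# prints: True False True True
```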

Code blocks

A Python program is constructed from code blocks. A block is a piece of Python program text that is 
executed as a unit. 
The following are blocks: a module, a function body, and a class definition. Each command typed
interactively is a block.
A script file (a file given as standard input to the interpreter or specified as a command line argument
to the interpreter) is a code block. A script command
(a command specified on the interpreter command line with the '-c' option) is a code block. 
The string argument passed to the built-in functions eval() and exec() is a code block.
A code block is executed in an execution frame. A frame contains some administrative information
(used for debugging) and determines where and how execution continues after the code block’s execution
has completed.

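In short, the passage above says:

A Python program is constructed from code blocks. A block is a piece of Python program text that is executed as a unit.

Code blocks: a module, a function body, a class definition, and a script file are each a code block.

Each command entered interactively is also a code block.

What does "interactively" mean? It means starting the Python interpreter from a terminal such as cmd; every command typed at the prompt is its own code block. For example: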

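>>> s1 = 300        # this is one code block
>>> s2 = 300        # this is another code block

>>> def func1():
...     pass
...                   # the function body here is a code block
>>> def func2():
...     print("123")
...     print("zw")
...                   # the function body here is also a code block

➜  ~ ls | grep '\.py$'
awa.py                # this file is a code block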
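
One way to see these blocks is to compile a small piece of source and look at the code objects it produces. This is only a sketch under the assumption that you are running CPython, where a function body and a class body each become a separate code object stored among the constants of the enclosing module; the source text and the "<demo>" label are invented for the example.

```python
import types

# A tiny module containing one function and one class.
source = "def func1():\n    pass\n\nclass Demo:\n    pass\n"

# Compiling the module yields one code object for the module itself.
module_code = compile(source, "<demo>", "exec")

# The function body and the class body appear as nested code objects.
nested = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]
nested_names = sorted(c.co_name for c in nested)

print(nested_names)
```

Each name printed corresponds to one nested code block inside the module's own block.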

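Caching within a code block

To save memory and improve efficiency, Python reuses objects where it can. Within a single code block, when Python is about to create an object for a value, it first checks whether an object with that value already exists in the block: if it does, the existing object is reused; if not, a new one is created. Across code blocks, the small data pool does a similar job. That is why s1 and s2 above, both set to 100, have the same memory address, that is, the same id. So why were two separate objects created when the value was 300? Because these caching mechanisms only apply within certain limits:

Code block caching applies to: int (float), str, and bool.

int (float): equal numbers within the same code block are reused.

bool: True and False behave as 1 and 0 when used as dictionary keys, and are always reused.

str: almost all strings are covered by the caching mechanism.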
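
The reuse inside one code block can be observed by compiling two equal literals together. This is a hedged sketch that assumes CPython, where the compiler stores equal constants of one code block only once; the source string, the "<demo>" label, and the variable names are illustrative.

```python
# Two equal literals that belong to ONE code block.
source = "a = 300\nb = 300\nsame = a is b\n"
code = compile(source, "<demo>", "exec")

namespace = {}
exec(code, namespace)

# Within one block, both names end up bound to the same int object.
same_block_identity = namespace["same"]

# The compiled block stores the constant 300 a single time.
count_of_300 = code.co_consts.count(300)

print(same_block_identity, count_of_300)
```

Because this is an implementation detail, programs should not rely on it; it only explains why `is` sometimes returns True for values outside the small data pool.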

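The small data pool

The small data pool is also known as the small integer cache, or as the interning mechanism.

Premise: the small data pool likewise applies only to int, str, and bool (float values are not pooled across code blocks).

The small data pool is the caching mechanism that works across different code blocks!

This is how the pooling of integers and strings is described:

# For integers, the Python documentation says: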
The current implementation keeps an array of integer objects for all integers between -5 and 256, when you
create an int in that range you actually just get back a reference to the existing object. So it should be
possible to change the value of 1. I suspect the behaviour of Python in this case is undefined.

# For strings:
In computer science, string interning is a method of storing only one copy of each distinct string value,
which must be immutable. Interning strings makes some string processing tasks more time- or space-efficient
at the cost of requiring more time when the string is created or interned. The distinct values are stored
in a string intern pool.

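Python automatically caches the integers from -5 to 256. When you assign one of these integers to a variable, no new object is created; the already cached object is used instead.

Python also keeps one copy of strings that follow certain rules in a string intern pool. When you assign such a string to a variable, no new object is created; the object already in the intern pool is used instead.

In fact, both the integer cache and the string intern pool are optimizations made by Python: the integers from -5 to 256 and strings that follow certain rules are placed in a "pool" (a container, or dictionary). Whichever variables in the program point to integers or strings in that range, they reference the object in the pool directly; in other words, only one copy is created in memory.

Advantage: it improves the time and space performance of some string and integer processing tasks. When a string or integer with the same value is needed, it is taken straight from the pool, which avoids frequent creation and destruction, improves efficiency, and saves memory.

int: as described above, the small data pool covers -5 to 256. If several variables point to the same number within this range, they all point to one memory address.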
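
The boundaries of the integer pool can be probed by creating integers at run time, so that no literal sharing inside a code block gets in the way. This is a small sketch assuming CPython; make_int is a hypothetical helper written for this example.

```python
def make_int(text):
    # Parsing at run time builds the int when called,
    # so no compile-time constant is shared between calls.
    return int(text)

upper_in_pool = make_int("256") is make_int("256")  # inside the pool
above_pool = make_int("257") is make_int("257")     # just outside
lower_in_pool = make_int("-5") is make_int("-5")    # inside the pool
below_pool = make_int("-6") is make_int("-6")       # just outside

print(upper_in_pool, above_pool, lower_in_pool, below_pool)
# on CPython prints: True False True False
```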

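str: strings are covered from the following angles:

When a string is multiplied by 1:

Only upper- and lowercase letters, digits, and underscores: interned by default.

Contains other characters, length <= 1: interned by default.

Contains other characters, length > 1: interned by default.

When a string is multiplied by 2 or more:

Only upper- and lowercase letters, digits, and underscores, total length <= 20: interned by default.

Explicit interning: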

>>> from sys import intern
>>> s1 = intern("zw左魏!#@*lp")
>>> s2 = intern("zw左魏!#@*lp")
>>> print(s1 is s2)
True
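
Explicit interning also works for strings that are only assembled at run time. The sketch below assumes nothing beyond the standard sys.intern function; the string parts and variable names are illustrative.

```python
import sys

# Build the same text twice at run time.
parts = ["zw", "!#@*", "lp"]
first = "".join(parts)
second = "".join(parts)

equal_values = first == second

# sys.intern returns the single pooled copy for a given value,
# so both calls hand back the same object.
interned_same = sys.intern(first) is sys.intern(second)

print(equal_values, interned_same)
# prints: True True
```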

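A string that satisfies the rules above falls under the small data pool.

bool: the values are True and False; no matter how many variables you create that point to True or False, only one of each exists in memory.

Let's see how efficient the small data pool (interning) is:

Clearly, it saves a great deal of memory. For string comparison, comparing non-interned strings is O(n), while interned strings can be compared by identity in O(1).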
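
A rough way to measure the difference is to time both kinds of comparison with timeit. This is only a sketch: the string size and repeat count are arbitrary choices, the variable names are made up, and the absolute numbers will vary from machine to machine.

```python
import sys
import timeit

size = 1_000_000
first = "x" * size   # built at run time
second = "x" * size  # equal value, built separately

# Comparing equal values character by character grows with the length.
distinct_time = timeit.timeit(lambda: first == second, number=200)

# After interning, both names refer to one object,
# so an identity check does not depend on the length.
pooled_first = sys.intern(first)
pooled_second = sys.intern(second)
interned_time = timeit.timeit(lambda: pooled_first is pooled_second, number=200)

print("value comparison:   ", distinct_time)
print("identity comparison:", interned_time)
```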

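Summary:

Within the same code block, the code block caching mechanism applies.

Across different code blocks, the interning mechanism of the small data pool applies.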

>>> s1 = 300
>>> s2 = 300                  #  run directly in the Python interpreter: s1 is one code block, s2 is another
>>> print(s1 is s2)
False                         #  different code blocks use the small data pool; 300 is out of range, hence False



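# Create a file awa.py outside the interpreter
➜  ~ echo -e "s1 = 300\ns2 = 300\nprint(s1 is s2)" >> awa.py
➜  ~ python3.6 awa.py             # the file is a single code block
True                              #  so code block caching applies, hence True

# Note: if two functions are defined and each one creates the value 300,
# each function body is its own code block, and Python allocates temporary space (a block of memory)
# for each function that is released when the function finishes, so the result is False
# (this is an implementation detail of the interpreter and may differ between Python versions).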

Much of this article is excerpted from 太白金星's blog on cnblogs (博客园):

https://www.cnblogs.com/jin-xin/articles/9439483.html
