Thoughts Prompted by Python's Shallow Copy and Deep Copy

Latest recommended article published 2024-11-05 15:28:12

weixin_30879833

Tags: python, data structures and algorithms, c#

Original link: http://www.cnblogs.com/ant-colonies/p/6592570.html

First, look at the help text of the copy module:

>>> import copy
>>> help(copy)
Help on module copy:
NAME
    copy - Generic (shallow and deep) copying operations.
DESCRIPTION
    Interface summary:
            import copy
            x = copy.copy(y)        # make a shallow copy of y
            x = copy.deepcopy(y)    # make a deep copy of y
    For module specific errors, copy.Error is raised.
    The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or
    class instances).

    - A shallow copy constructs a new compound object and then (to the extent possible) inserts *the same objects* into it that the
      original contains.

    - A deep copy constructs a new compound object and then, recursively, inserts *copies* into it of the objects found in the original.

... (rest of the help output omitted)

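From the above:

1. In common: both produce a new compound object.

2. Difference: a shallow copy inserts references to the objects found in the source list into the new object;

a deep copy recursively copies everything found in the source list, producing a fully independent object.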
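The same distinction can be checked on a compound object other than a list. The following is a minimal sketch in Python 3 syntax; the `profile` dictionary and its keys are made up for illustration and do not come from the original post.

```python
import copy

# Hypothetical data: a dict holding a nested list.
profile = {"name": "Will", "skills": ["Python", "C#"]}

shallow = copy.copy(profile)
deep = copy.deepcopy(profile)

# Both calls build a new outer dict.
assert shallow is not profile
assert deep is not profile

# The shallow copy refers to the very same nested list ...
assert shallow["skills"] is profile["skills"]
# ... while the deep copy owns an equal but separate one.
assert deep["skills"] is not profile["skills"]
assert deep["skills"] == profile["skills"]

# A change to the shared nested list is visible through the shallow copy only.
profile["skills"].append("JavaScript")
print(shallow["skills"])  # ['Python', 'C#', 'JavaScript']
print(deep["skills"])     # ['Python', 'C#']
```

The `is` operator compares object identity, so it answers the "same object or a copy?" question directly.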

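This raises a new question: what does copying actually mean?

Copying a piece of data or a variable in a computer means the system must allocate new memory to hold the copy.

Assignment is a useful starting point, because assigning one variable to another copies only the memory address stored for the variable, not the data it refers to.

First, look at what happens during assignment: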

Python 2.6.6 (r266:84292, Aug 18 2016, 15:13:37) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 3
>>> b = a
>>> id(a)
7488264
>>> id(b)
7488264
>>> a = 4
>>> id(a)
7488240
>>> id(b)               # note: b did not change along with a
7488264
>>> b
3

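Explaining this requires understanding how a variable is represented.

In Python a variable is a name bound to an object. Creating a variable does not reserve a slot that holds the value itself; it adds an entry to a namespace.

Conceptually that entry has two parts: the first is the name, and the second is a reference to an object (in CPython, the object's memory address). The type and the value belong to the object, not to the name.

Unlike in C, a name never exists in an uninitialized state holding a garbage value: reading a name that has not been bound raises NameError. Once the name is bound, the second part holds the address of the object it was bound to.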

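For example, a = 3 above proceeds as follows:

First the interpreter obtains an int object with the value 3 (CPython keeps small integers preallocated, so no new object is created here), then it stores the address of that object as the reference part of a. That completes the assignment to a.

For b = a, the name b is created and the reference part of a is copied into the reference part of b. No new int object is created.

The return value of id() is the identity of the object a name refers to; in CPython this is exactly that memory address.

So evaluating a means following the reference to fetch the object stored at that address.

When a = 4 is executed, the reference part of a becomes the address of the int object 4, so id(a) changes while id(b) stays the same.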
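The walkthrough above can be checked without reading raw id() values by using the `is` operator, which compares object identity. This is a small sketch in Python 3 syntax; the list example at the end is an addition for contrast and is not part of the original session.

```python
a = 3
b = a
assert a is b          # both names refer to the same int object

a = 4                  # rebinds the name a; the object 3 is untouched
assert b == 3
assert a is not b

# Contrast (illustrative): with a mutable object, an in-place change is
# visible through every name bound to it, because no rebinding happens.
x = [1, 2]
y = x
x.append(3)            # mutates the one shared list
assert y == [1, 2, 3]
assert x is y

x = [0]                # rebinding x leaves y bound to the old list
assert y == [1, 2, 3]
```

The difference between rebinding a name and mutating an object in place is what makes shallow and deep copies behave differently on nested lists.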

As shown in the figure below:

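Back to shallow and deep copying:

Shallow copy: inserts references to the objects found in the source list into the newly created object.

Deep copy: recursively copies everything found in the source list, producing a fully independent object.

Shallow copy example: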

#!/usr/bin/python
# Originally run under Python 2; the values in the comments come from that run.
import copy

will = ["Will", 28, ["Python", "C#", "JavaScript"]]
wilber = copy.copy(will)

print(id(will))                     # 140337318319672
print(will)                         # ['Will', 28, ['Python', 'C#', 'JavaScript']]
print([id(ele) for ele in will])    # [140337318374208, 13394096, 140337318282160]
print('============================')
print(id(will[2]))                  # 140337318282160
print(id(will[2][0]))               # 140337318677600
print(id(wilber[2][0]))             # 140337318677600
print(id(wilber))                   # 140337318386216
print(wilber)                       # ['Will', 28, ['Python', 'C#', 'JavaScript']]
print([id(ele) for ele in wilber])  # [140337318374208, 13394096, 140337318282160]

will[0] = "Wilber"                  # rebinds only will's first slot
will[2].append("CSS")               # mutates the inner list shared with wilber
print(id(will))                     # 140337318319672
print(will)                         # ['Wilber', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
print([id(ele) for ele in will])    # [140337318374448, 13394096, 140337318282160]
print(id(wilber))                   # 140337318386216
print(wilber)                       # ['Will', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
print([id(ele) for ele in wilber])  # [140337318374208, 13394096, 140337318282160]

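A shallow copy only creates a new outer object; the elements it refers to, and the relationships between them, are unchanged.

In a shallow copy the lists will and wilber are given different addresses, but only the first level of will is copied: the references will[0], will[1], and will[2]. As a result wilber[0] and will[0], wilber[1] and will[1], and wilber[2] and will[2] point to the same memory addresses. This explains the output: will[0] = "Wilber" rebinds a slot in will only, so wilber[0] is still 'Will', whereas will[2].append("CSS") modifies the shared inner list, so 'CSS' also appears in wilber.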
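For lists, copy.copy is not the only way to obtain a shallow copy. The sketch below, in Python 3 syntax (list.copy requires Python 3.3 or later), reuses the `will` list and checks that the common alternatives share element objects in the same way. The `copies` dictionary and its labels are illustrative names only.

```python
import copy

will = ["Will", 28, ["Python", "C#", "JavaScript"]]

# Four ways to get a shallow copy of a list.
copies = {
    "copy.copy": copy.copy(will),
    "list()": list(will),
    "slice": will[:],
    "list.copy": will.copy(),
}

for label, c in copies.items():
    assert c is not will                          # new outer list
    assert all(x is y for x, y in zip(c, will))   # same element objects
    print(label, "shares will[2]:", c[2] is will[2])

will[2].append("CSS")   # visible through every shallow copy
will[0] = "Wilber"      # affects only will's own first slot
```

After the last two statements every copy still starts with "Will" but ends with an inner list that contains "CSS".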

As shown in the figure below:

Deep copy example:

#!/usr/bin/python
# Originally run under Python 2; the values in the comments come from that run.
import copy

will = ["Will", 28, ["Python", "C#", "JavaScript"]]
wilber = copy.deepcopy(will)

print(id(will))                    # 139899797283040
print(will)                        # ['Will', 28, ['Python', 'C#', 'JavaScript']]
print([id(ele) for ele in will])   # [139899797338992, 11432112, 139899797246896]
print('=============')
print(id(will[2]))                 # 139899797246896
print(id(wilber[2]))               # 139899797351024
print(id(will[2][0]))              # 139899797642336
print(id(wilber[2][0]))            # 139899797642336
print(id(wilber[2][1]))            # 139899797339088
print(id(wilber))                  # 139899797349296
print(wilber)                      # ['Will', 28, ['Python', 'C#', 'JavaScript']]
print([id(ele) for ele in wilber]) # [139899797338992, 11432112, 139899797351024]

will[0] = "Wilber"                 # rebinds only will's first slot
will[2].append("CSS")              # mutates will's own inner list
print(id(will))                    # 139899797283040
print(will)                        # ['Wilber', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
print([id(ele) for ele in will])   # [139899797339280, 11432112, 139899797246896]
print(id(wilber))                  # 139899797349296
print(wilber)                      # ['Will', 28, ['Python', 'C#', 'JavaScript']]
print([id(ele) for ele in wilber]) # [139899797338992, 11432112, 139899797351024]

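A deep copy recursively (level by level) copies the structure of the list.

During the deep copy every container level of will is duplicated: will and wilber, and likewise will[2] and wilber[2], are given different addresses. The leaf elements will[0], will[1], will[2][0], will[2][1], and will[2][2] are immutable strings and integers, so deepcopy reuses them instead of duplicating them.

Hence wilber[0] and will[0], wilber[1] and will[1], will[2][0] and wilber[2][0], will[2][1] and wilber[2][1], and will[2][2] and wilber[2][2] still point to the same memory addresses. This sharing is harmless: immutable objects cannot be modified in place, so will[0] = "Wilber" and will[2].append("CSS") leave wilber unchanged.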
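The identity relationships described above can be asserted directly. This is a minimal sketch in Python 3 syntax using the same data; the reuse of immutable leaves is how CPython's copy module behaves and is treated here as an implementation detail rather than a guarantee for every object type.

```python
import copy

will = ["Will", 28, ["Python", "C#", "JavaScript"]]
wilber = copy.deepcopy(will)

# Containers are duplicated at every level.
assert wilber is not will
assert wilber[2] is not will[2]

# Immutable leaves (str, int) are reused, not duplicated.
assert wilber[0] is will[0]
assert all(a is b for a, b in zip(wilber[2], will[2]))

# Rebinding or mutating on one side does not reach the other.
will[0] = "Wilber"
will[2].append("CSS")
print(wilber)  # ['Will', 28, ['Python', 'C#', 'JavaScript']]
```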

Appendix: code for plain assignment between lists:

#!/usr/bin/python
#
will = ["Will", 28, ["Python", "C#", "JavaScript"]]
wilber = will                                  # no copy: a second name for the same list
print(id(will))
print(will)
print([id(ele) for ele in will])
print(id(wilber))
print(wilber)
print([id(ele) for ele in wilber])

will[0] = "Wilber"
will[2].append("CSS")
print(id(will))
print(will)
print([id(ele) for ele in will])               # the same object is being operated on
print(id(wilber))
print(wilber)
print([id(ele) for ele in wilber])
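To put the three cases side by side, the sketch below (Python 3.7 or later, so that dictionary order is preserved) records for each of them whether the outer list and the nested list are shared with the original. The helper `sharing` and the `results` dictionary are hypothetical names introduced only for this comparison.

```python
import copy

def sharing(original, other):
    """Return (same outer object?, same nested list?) for two lists.

    Hypothetical helper for this sketch; assumes index 2 holds a list.
    """
    return (other is original, other[2] is original[2])

will = ["Will", 28, ["Python", "C#", "JavaScript"]]
alias = will  # plain assignment

results = {
    "assignment": sharing(will, alias),
    "copy.copy": sharing(will, copy.copy(will)),
    "copy.deepcopy": sharing(will, copy.deepcopy(will)),
}

for label, (same_outer, same_inner) in results.items():
    print(label, same_outer, same_inner)
# assignment True True
# copy.copy False True
# copy.deepcopy False False
```

Assignment shares everything, a shallow copy shares only the nested objects, and a deep copy shares no mutable container at all.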