NumPy study notes: view vs copy

毫无公主范的17

Last modified: 2023-04-11 22:44:12

Tags: numpy, learning, python

First published: 2023-04-11 17:43:54

Article link: https://blog.csdn.net/m0_51795847/article/details/130087002

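Every time a numpy instruction is executed, it either creates a copy or provides a view. A copy is data that physically lives somewhere else in memory, whereas a view refers to the same memory as the original array.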
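As a minimal sketch of this distinction (the array names are illustrative and not from the original notes), a slice gives a view that writes through to the original, while an explicit `.copy()` gives independent data:

```python
import numpy as np

a = np.arange(5)       # [0, 1, 2, 3, 4]
v = a[1:4]             # view: refers to the same memory as a
c = a[1:4].copy()      # copy: separate memory, holds [1, 2, 3]

v[0] = 100             # writes through to a[1]
c[1] = -1              # changes only the copy

print(a.tolist())      # [0, 100, 2, 3, 4]
print(c.tolist())      # [1, -1, 3]
```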

Index and Fancy Indexing

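[Basic indexing](https://docs.scipy.org/doc/numpy/user/basics.indexing.html) with slices returns a view, whereas [fancy indexing](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing) returns a copy. In other words, modifying the result of basic indexing also modifies the original data, while modifying the result of fancy indexing does not.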

What is fancy indexing

Fancy indexing is one way of indexing arrays. Instead of indexing with a slice (:) or a single index (a single number), fancy indexing uses arrays of indices. With fancy indexing we can pick out specific elements or sets of elements.

import numpy as np

# Create an array
a = np.array([1, 2, 3, 4, 5])

# Select elements at indices 1 and 3
b = a[[1, 3]]

# Print the result
print(b) # Output: [2 4]

Fancy indexing also works on multi-dimensional arrays.

# For multi-dimensional arrays we pass multiple index arrays, one for each dimension of the array.
# Create a 2D array
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# The index arrays are paired element-wise: this selects (row 0, column 1) and (row 2, column 2)
b = a[[0, 2], [1, 2]]

# Print the result
print(b) # Output: [2 9]
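Because the index arrays are paired element-wise, the result above has two elements rather than a 2x2 block. If the goal is every combination of rows {0, 2} and columns {1, 2}, one option is `np.ix_`; the sketch below contrasts the two (the names `paired` and `block` are illustrative only):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Paired index arrays: picks the elements at (0, 1) and (2, 2)
paired = a[[0, 2], [1, 2]]

# np.ix_ builds an open mesh: rows {0, 2} crossed with columns {1, 2}
block = a[np.ix_([0, 2], [1, 2])]

print(paired.tolist())  # [2, 9]
print(block.tolist())   # [[2, 3], [8, 9]]
```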

What is the difference between basic indexing and fancy indexing

import numpy as np

Z = np.zeros(5)
Z_view = Z[:3] 
Z_view[...] = 1 #Z_view modifies the base array 'Z'
print(Z)
Z = np.zeros(5)
Z_copy = Z[[0,1,2]]
Z_copy[...] = 1 #Z_copy does not modify the base array 'Z'
print(Z)

#[1. 1. 1. 0. 0.]
#[0. 0. 0. 0. 0.]
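One point worth keeping apart from the example above: the copy is made when a fancy index is used to read. When a fancy index appears on the left-hand side of an assignment, numpy writes straight into the indexed array. A small sketch (the names `Z_assign`, `Z_read` and `tmp` are illustrative only):

```python
import numpy as np

Z_assign = np.zeros(5)
Z_assign[[0, 1, 2]] = 1     # assignment through a fancy index writes into the array itself

Z_read = np.zeros(5)
tmp = Z_read[[0, 1, 2]]     # reading through a fancy index makes a copy
tmp[...] = 1                # only the copy changes

print(Z_assign.tolist())    # [1.0, 1.0, 1.0, 0.0, 0.0]
print(Z_read.tolist())      # [0.0, 0.0, 0.0, 0.0, 0.0]
```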

Telling whether a result is a view or a copy

The base attribute of an array can be used to tell them apart.

import numpy as np
Z = np.random.uniform(0,1,(5,5)) #draws sample from a uniform distribution
Z1 = Z[:3,:]
#print("Z1",Z1)
Z2 = Z[[0,1,2], :]
#print("Z2",Z2)
print(np.allclose(Z1,Z2)) # True if the two arrays are element-wise equal within a tolerance
print(Z1.base is Z) # True if Z1 is a view whose base array is Z
print(Z2.base is Z) # True if Z2 is a view whose base array is Z
print(Z2.base is None) # True if Z2 owns its own data, i.e. it is not a view

#True
#True
#False
#True

Z2 is a copy, so Z2.base is Z returns False: Z2 does not share memory with Z.
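As a complementary check, and not something the original notes use, `np.shares_memory` asks directly whether two arrays overlap in memory, which avoids reasoning about which array `base` points to. A minimal sketch on the same arrays (`W` is an extra illustrative name):

```python
import numpy as np

Z = np.random.uniform(0, 1, (5, 5))
Z1 = Z[:3, :]          # basic slicing
Z2 = Z[[0, 1, 2], :]   # fancy indexing

view_shares = np.shares_memory(Z, Z1)   # the slice overlaps Z
copy_shares = np.shares_memory(Z, Z2)   # the fancy-indexed result does not

# It also works between two views of the same data
W = Z[1:, :]
views_overlap = np.shares_memory(Z1, W)  # rows 1 and 2 are in both

print(view_shares, copy_shares, views_overlap)  # True False True
```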

Ravel and Flatten

Some numpy functions return a view whenever possible (ravel, for example), while others always return a copy (flatten, for example).

import numpy as np

Z = np.zeros((5,5))
print("Z:\n",Z)
print("Z.ravel().base:\n",Z.ravel().base)
print("Z.ravel().base is Z:",Z.ravel().base is Z) #returns true if memory of Z.ravel() is shared with Z

# ravel returns a view when possible, but not in all cases: a strided slice is not contiguous, so ravel has to copy
print("\nZ[::2,::2].ravel():\n",Z[::2,::2].ravel())
print("\nZ[::2,::2].base is Z:",Z[::2,::2].base is Z)#returns true if memory of Z[::2,::2] is shared with Z
print("\nZ[::2,::2].ravel().base is Z:",Z[::2,::2].ravel().base is Z)#returns true if memory of Z[::2,::2].ravel() is shared with Z

print("\nZ.flatten()\n:",Z.flatten())
print("Z.flatten.base is Z:",Z.flatten().base is Z)#returns true if memory of Z.flatten() is shared with Z

'''
output
Z:
 [[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
Z.ravel().base:
 [[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
Z.ravel().base is Z: True

Z[::2,::2].ravel():
 [0. 0. 0. 0. 0. 0. 0. 0. 0.]

Z[::2,::2].base is Z: True

Z[::2,::2].ravel().base is Z: False

Z.flatten()
: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0.]
Z.flatten.base is Z: False
'''
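The practical consequence is that writing into the result of ravel can change the original array, while writing into the result of flatten never does. The sketch below shows this and applies the same "view when possible" check to reshape, which is not covered in the notes above and is included here only as a related illustration:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)

r = A.ravel()      # A is contiguous, so this is a view
f = A.flatten()    # always a copy
r[0] = 10          # visible in A
f[1] = 20          # not visible in A
print(A.tolist())  # [[10, 1, 2], [3, 4, 5]]

# reshape shares memory when the new shape fits the existing layout
same_layout = np.shares_memory(A, A.reshape(3, 2))
# flattening the transpose in C order needs the elements reordered, so it copies
transposed = np.shares_memory(A, A.T.reshape(-1))
print(same_layout, transposed)  # True False
```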

Temporary copy

More often, copies are created implicitly. For example, when arrays are used in arithmetic:

import numpy as np
X = np.ones(10, dtype=int)  # create an array X of 10 ones (np.int was removed in NumPy 1.24)
Y = np.ones(10, dtype=int)  # create an array Y of 10 ones
A = 2*X + 2*Y #store 2*X + 2*Y in A
print("X:",X)
print("Y:",Y)
print("A=2*X + 2*Y :\nA:",A)

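In the example above, three intermediate arrays (copies) are created, holding 2*X, 2*Y and 2*X+2*Y respectively. The arrays here are tiny, but with large data you need to think about how to optimize this. One solution is: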

import numpy as np
X = np.ones(10, dtype=int)  # create an array X of 10 ones (np.int was removed in NumPy 1.24)
Y = np.ones(10, dtype=int)  # create an array Y of 10 ones
print("X:",X,"Y:",Y,"\n np.multiply(X, 2, out=X)")
np.multiply(X, 2, out=X) # multiply X with 2 and store the result in X
print("X:",X,"Y:",Y,"\n np.multiply(Y, 2, out=Y)")
np.multiply(Y, 2, out=Y)# multiply Y with 2 and store the result in Y
print("X:",X,"Y:",Y,"\n np.add(X, Y, out=X)")
np.add(X, Y, out=X)# add X and Y and store the result in X
print("X:",X,"Y:",Y)
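Note that the version above overwrites both X and Y. If the inputs must be preserved, one possible variant (a sketch, not part of the original notes) is to allocate a single output buffer and accumulate into it, so that only one extra array is ever created:

```python
import numpy as np

X = np.ones(10, dtype=int)
Y = np.ones(10, dtype=int)

A = np.empty_like(X)        # the only extra allocation
np.multiply(X, 2, out=A)    # A = 2*X
np.add(A, Y, out=A)         # A = 2*X + Y
np.add(A, Y, out=A)         # A = 2*X + 2*Y

print(A.tolist())           # ten 4s
print(X.tolist(), Y.tolist())  # X and Y are still all ones
```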

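This removes the temporaries from the arithmetic expression above, at the cost of overwriting X and Y in place. In many other situations, though, such copies do have to be created (as you may need them).
A small experiment shows how this affects performance: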

import numpy as np
#region tool function
def timeit(stmt, globals):
    import timeit as _timeit
    import numpy as np
    
    # Rough approximation of a single run
    trial = _timeit.timeit(stmt, globals=globals, number=1)
    
    # Maximum duration
    duration = 1.0
    
    # Number of repeat
    repeat = 3
    
    # Compute rounded number of trials
    number = max(1,int(10**np.floor(np.log(duration/trial/repeat)/np.log(10))))
    
    # Only report best run
    best = min(_timeit.repeat(stmt, globals=globals, number=number, repeat=repeat))

    units = {"usec": 1, "msec": 1e3, "sec": 1e6}
    precision = 3
    usec = best * 1e6 / number
    if usec < 1000:
        print("%d loops, best of %d: %.*g usec per loop" % (number, repeat,
                                                            precision, usec))
    else:
        msec = usec / 1000
        if msec < 1000:
            print("%d loops, best of %d: %.*g msec per loop" % (number, repeat,
                                                                precision, msec))
        else:
            sec = msec / 1000
            print("%d loops, best of %d: %.*g sec per loop" % (number, repeat,
                                                               precision, sec))
   # Display results
    # print("%d loops, best of %d: %g sec per loop" % (number, repeat, best/number))
#endregion
    

def test(X,Y):
  timeit("Z=X + 2.0*Y", globals()) #time taken by Z=X + 2.0*Y
  timeit("Z = X + 2*Y", globals()) #time taken by Z=X + 2*Y
  timeit("np.add(X, Y, out=X); np.add(X, Y, out=X)", globals()) #time taken by np.add(X, Y, out=X); np.add(X, Y, out=X)
   
X = np.ones(100000000, dtype=int)  # np.int was removed in NumPy 1.24; use the builtin int
Y = np.ones(100000000, dtype=int)
test(X,Y)

'''
output
1 loops, best of 3: 293 msec per loop
1 loops, best of 3: 248 msec per loop
1 loops, best of 3: 171 msec per loop
'''
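For a quicker check on a machine with less memory, the same comparison can be sketched with the standard-library `timeit` module and smaller arrays. The function names, the array size and the repeat counts below are arbitrary choices for illustration, and the timings will differ from machine to machine:

```python
import timeit
import numpy as np

n = 1_000_000
X = np.ones(n, dtype=np.float64)
Y = np.ones(n, dtype=np.float64)
out = np.empty_like(X)   # reusable output buffer

def with_temporaries():
    # allocates a temporary for 2.0*Y and a new array for the sum
    return X + 2.0 * Y

def with_out_buffer():
    # writes every intermediate result into the preallocated buffer
    np.multiply(Y, 2.0, out=out)
    np.add(X, out, out=out)
    return out

loops = 10
t_temp = min(timeit.repeat(with_temporaries, number=loops, repeat=3)) / loops
t_out = min(timeit.repeat(with_out_buffer, number=loops, repeat=3)) / loops
print(f"with temporaries: {t_temp * 1e3:.3g} msec per loop")
print(f"with out buffer:  {t_out * 1e3:.3g} msec per loop")
```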
