I was reading The Hacker's Guide to Python this afternoon. In the chapter on Performances and Optimizations, the author writes about memoryview in Python.
Dealing with a large amount of data takes a lot of memory, especially when we need to copy, slice, or modify it. In the cases I've met, I just copied the original data into a new variable, and the memory usage doubled. What we can do in Python instead is keep a single copy of the data in memory and perform these operations directly on that buffer.
In Snippet 1, the remaining data is copied over and over again until the socket has sent all of it. In Snippet 2, the data is created once, and each slice of the memoryview only references the same memory while it is being sent.
Snippet 1
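import socket

s = socket.socket(...)
s.connect(...)
# Build about 100 MB of data to send
data = b"a" * (1024 * 100000)
while data:
    # send() may transmit only part of the data; it returns the byte count
    sent = s.send(data)
    # Slicing bytes copies the unsent remainder into a new object
    data = data[sent:]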
Snippet 2
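import socket

s = socket.socket(...)
s.connect(...)
# Build about 100 MB of data to send
data = b"a" * (1024 * 100000)
# Wrap the data in a memoryview so that slices reference it without copying
mv = memoryview(data)
while mv:
    # send() may transmit only part of the data; it returns the byte count
    sent = s.send(mv)
    # Slicing a memoryview yields a new view over the same buffer, not a copy
    mv = mv[sent:]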
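To convince myself of the difference between the two snippets without opening a socket, here is a minimal sketch that compares a bytes slice with a memoryview slice. The variable names (tail_copy, tail_view, buf, view) are my own and do not come from the book, and the data is kept small on purpose.

```python
data = b"a" * 1024

# Slicing bytes builds a new bytes object that holds a copy of the tail.
tail_copy = data[512:]

# Slicing a memoryview builds a new view over the same underlying object.
mv = memoryview(data)
tail_view = mv[512:]

print(tail_view.obj is data)          # the view still points at the original bytes
print(len(tail_view))                 # length of the slice, in bytes
print(bytes(tail_view) == tail_copy)  # same content, but no copy until bytes() is called

# With a writable buffer such as bytearray, a write through the view
# is visible in the original object, which shows the memory is shared.
buf = bytearray(b"hello world")
view = memoryview(buf)[6:]
view[0] = ord("W")
print(buf)
```

The last part only works because bytearray is mutable. A memoryview over bytes, as in Snippet 2, is read-only.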
Besides, we can use readinto to write incoming data directly into a buffer that already exists in memory, instead of allocating a new object for every read.
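As a rough illustration of readinto, the sketch below fills a preallocated bytearray in two steps through memoryview slices. io.BytesIO is only a stand-in I chose so the example runs anywhere; a real file opened in binary mode offers the same readinto method. The names source, buf and view are hypothetical.

```python
import io

# Stand-in for a binary file; assumed here only to keep the sketch self-contained.
source = io.BytesIO(b"abcdefghij")

# Preallocate the destination once and wrap it in a memoryview.
buf = bytearray(8)
view = memoryview(buf)

# Each call writes straight into the chosen region of buf and
# returns the number of bytes actually read.
n = source.readinto(view[:4])   # fills buf[0:4]
m = source.readinto(view[4:])   # fills buf[4:8]

print(n, m, buf)
```

Slicing the view lets us choose where in the buffer the data lands without creating any intermediate bytes objects.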
Reference
10.7 Achieving zero copy with the buffer protocol
Less copies in Python with the buffer protocol and memoryviews