Say I have a running CPython session,
Is there a way to run the data (bytes) from a pyc file directly?
(without having the data on-disk necessarily, and without having to write a temporary pyc file)
Example script to show a simple use-case:
if foo:
    # Intentionally ambiguous, since the data source
    # is a detail and answers shouldn't depend on this detail.
    data = read_data_from_somewhere()
else:
    data = open("bar.pyc", 'rb').read()
assert(type(data) is bytes)
code = bytes_to_code(data)
# call a method from the loaded code
code.call_function()
Exact use isn't important, but generating code dynamically and copying over a network to execute is one use-case (for the purpose of thinking about this question).
Here are some example use-cases, which made me curious to know how this can be done:
Checking Python scripts for malicious code.
If a single command can access a larger body of code hidden in binary data, what would that command look like?
Dynamically generate code and cache it for re-use (not necessarily on disk, could use a database for example).
Ability to send pre-compiled byte-code to a process, for example to control an application which embeds Python.
Solution
Is there a way to run the data from a pyc file directly?
The compiled code object can be serialized to bytes using marshal:
import marshal
data = marshal.dumps(eggs)
The bytes can be converted back to a code object and executed:
eggs = marshal.loads(data)
exec(eggs)
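To make the round trip concrete, here is a minimal sketch. The source string, the "<generated>" filename and the call_function name are made up for illustration; the answer does not say where eggs comes from, so this assumes it was produced by compile().

```python
import marshal

# Hypothetical source; any code object produced by compile() would do.
source = "def call_function():\n    return 'spam'\n"
eggs = compile(source, "<generated>", "exec")

# Serialize the code object to bytes (these could be sent or cached).
data = marshal.dumps(eggs)
assert isinstance(data, bytes)

# Restore the code object and run it in its own namespace so that
# the names it defines can be looked up afterwards.
restored = marshal.loads(data)
namespace = {}
exec(restored, namespace)

result = namespace["call_function"]()
print(result)
```

Executing into an explicit dict keeps the loaded names out of the caller's globals. Note that the marshal format is specific to the Python version that wrote it, and marshal.loads should not be fed data from an untrusted source.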
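A pyc file is a marshaled code object with a header.
The header size depends on the Python version: 12 bytes for Python 3.3 through 3.6 (magic number, modification timestamp, source size), and 16 bytes from Python 3.7 onward, where PEP 552 added a 4-byte flags field. Skip the header and pass the remaining data to marshal.loads.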
At the simple level, a .pyc file is a binary file containing only three things:
A four-byte magic number,
A four-byte modification timestamp, and
A marshalled code object.
Note: the link references Python 2, but the layout is almost the same in Python 3; only the header is larger than Python 2's 8 bytes (12 bytes in Python 3.3 through 3.6, 16 bytes from Python 3.7 onward).
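Putting this together, the bytes_to_code function from the question could be sketched as follows. This is an illustration under assumptions, not a definitive implementation: pyc_header_size is a hypothetical helper, the header is skipped rather than validated field by field, and the demo builds pyc-shaped bytes in memory with a zeroed header instead of reading a real file.

```python
import importlib.util
import marshal
import sys
import types


def pyc_header_size():
    """Header length in bytes for the running interpreter (assumed helper)."""
    if sys.version_info >= (3, 7):
        return 16  # magic + flags + timestamp/hash fields (PEP 552)
    if sys.version_info >= (3, 3):
        return 12  # magic + timestamp + source size
    return 8       # magic + timestamp


def bytes_to_code(data):
    """Turn the raw contents of a pyc file into a code object."""
    if data[:4] != importlib.util.MAGIC_NUMBER:
        raise ValueError("pyc data was written by a different Python version")
    return marshal.loads(data[pyc_header_size():])


# Demo: build pyc-shaped bytes in memory (zeroed header fields after the
# magic number), so nothing has to be written to disk.
source = "def call_function():\n    return 'spam'\n"
payload = marshal.dumps(compile(source, "<generated>", "exec"))
demo_pyc = (
    importlib.util.MAGIC_NUMBER
    + b"\x00" * (pyc_header_size() - 4)
    + payload
)

# Execute the code inside a fresh module object so its functions can be
# called as attributes, as in the question's example.
module = types.ModuleType("loaded")
exec(bytes_to_code(demo_pyc), module.__dict__)
print(module.call_function())
```

For a real file, the same function would be called as bytes_to_code(open("bar.pyc", "rb").read()). The magic number check matters because marshaled code objects are not portable between Python versions.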