Python struct.calcsize result differs from the actual size

Latest recommended article published on 2023-06-03 16:42:16

Linkzero Tsang

Tags: python struct calcsize

Article link: https://blog.csdn.net/weixin_31076173/article/details/114938533

I am trying to read one short and one long from a binary file using Python's struct module.

But the combined size is not what I expect:

import struct
print(struct.calcsize("hl")) # o/p 16

That looks wrong: it should be 2 bytes for the short plus 8 bytes for the long, 10 in total. I am not sure whether I am using the struct module the wrong way.

When I print the size of each type on its own, I get:

print(struct.calcsize("h")) # o/p 2
print(struct.calcsize("l")) # o/p 8

Is there a way to force Python to keep the exact size of each data type, without the extra bytes?

# Answer 1

Under struct's default (native) alignment rules, 16 is the correct answer. Each field is aligned to a multiple of its own size, so you end up with two bytes for the short, then six bytes of padding (to reach the next address that is a multiple of eight), then eight bytes for the long.
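To see where the padding goes, a minimal sketch such as the following packs a short and a long in native mode and inspects the result. The sample values are arbitrary, and the numbers printed depend on the platform; on a system like the one in the question (8-byte longs), the padding should come out as six bytes.

```python
import struct

# Sizes of the individual fields in native mode (platform dependent).
short_size = struct.calcsize("h")
long_size = struct.calcsize("l")

# Whatever is left over in the combined size is alignment padding.
total_size = struct.calcsize("hl")
padding = total_size - short_size - long_size

# Pack arbitrary sample values; the pad bytes sit between the two fields.
packed = struct.pack("hl", 1, 2)
pad_bytes = packed[short_size:short_size + padding]

print(short_size, padding, long_size, total_size)
print(pad_bytes)  # the padding is filled with zero bytes
```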

You can use a byte order prefix (any of =, <, > or ! disables padding; @ is the native default and keeps it), but these prefixes also disable machine-native sizes: struct.calcsize("=l") is a fixed 4 bytes on all systems, and struct.calcsize("=hl") is 6 bytes on all systems, not 10, even on systems with 8-byte longs.
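The effect of each prefix can be checked directly. This sketch uses only the standard struct module; the native results (no prefix, or @) vary by platform, while the standard-mode results should be the same everywhere.

```python
import struct

fmt = "hl"

# Native mode: native sizes and native alignment (platform dependent).
print("native  ", struct.calcsize(fmt), struct.calcsize("@" + fmt))

# Standard mode: fixed sizes (h = 2, l = 4) and no padding, so always 6.
standard_sizes = {prefix: struct.calcsize(prefix + fmt) for prefix in "=<>!"}
print("standard", standard_sizes)
```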

If you want to compute struct sizes for arbitrary structures using machine native types with non-default padding rules, you'll need to go to the ctypes module, define your ctypes.Structure subclass with the desired _pack_ setting, then use ctypes.sizeof to check the size, e.g.:

from ctypes import Structure, c_long, c_short, sizeof

class HL(Structure):
    _pack_ = 1  # Disables padding for field alignment
    # Defines (unnamed) fields, a short followed by long
    _fields_ = [("", c_short),
                ("", c_long)]

print(sizeof(HL))

which outputs 10 on a system with 8-byte longs, as desired.
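To see what _pack_ actually changes, it can help to give the fields names and look at their offsets. In this sketch the class names Native and Packed and the field names h and l are made up for illustration, and the native numbers depend on the platform.

```python
from ctypes import Structure, c_long, c_short, sizeof

class Native(Structure):
    # Default layout: each field is aligned as the C compiler would align it
    _fields_ = [("h", c_short), ("l", c_long)]

class Packed(Structure):
    _pack_ = 1  # Cap alignment at 1 byte, so no padding is inserted
    _fields_ = [("h", c_short), ("l", c_long)]

for cls in (Native, Packed):
    # Each field descriptor exposes the byte offset of that field
    print(cls.__name__, "h at", cls.h.offset, "l at", cls.l.offset,
          "total", sizeof(cls))
```

In the packed version the long starts right after the short; in the native version it is pushed out to its own alignment boundary.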

This could be factored out into a utility function if needed (this is a simplified example that doesn't handle every struct format code, or repeat counts, but you can expand it as needed):

from ctypes import *

FMT_TO_TYPE = dict(zip("cb?hHiIlLqQnNfd",
                       (c_char, c_byte, c_bool, c_short, c_ushort, c_int, c_uint,
                        c_long, c_ulong, c_longlong, c_ulonglong,
                        c_ssize_t, c_size_t, c_float, c_double)))

def calcsize(fmt, pack=None):
    '''Compute size of a format string with arbitrary padding (defaults to native)'''
    class _(Structure):
        if pack is not None:
            _pack_ = pack
        _fields_ = [("", FMT_TO_TYPE[c]) for c in fmt]
    return sizeof(_)

which, once defined, lets you compute sizes padded or unpadded like so:

>>> calcsize("hl") # Defaults to native "natural" alignment padding
16
>>> calcsize("hl", 1) # pack=1 means no alignment padding between members
10

# Answer 2

This is what the documentation says:

By default, the result of packing a given C struct includes pad bytes in order to maintain proper alignment for the C types involved; similarly, alignment is taken into account when unpacking. This behavior is chosen so that the bytes of a packed struct correspond exactly to the layout in memory of the corresponding C struct. To handle platform-independent data formats or omit implicit pad bytes, use standard size and alignment instead of native size and alignment.

Switching from native to standard mode is easy: just prepend the = prefix to the format characters.

print(struct.calcsize("=hl")) # 6 on all systems

EDIT

Since switching from native to standard mode changes some default sizes (l becomes 4 bytes), you have two options:

Keep native mode but reorder the format characters: struct.calcsize("lh"). As in C, the order of the members inside the struct matters. Here the long has to start at an offset that is a multiple of 8 bytes, so putting it first means no padding is needed before the short, and struct adds no padding at the end.

Use standard mode with format characters of the right size: struct.calcsize("=hq"), since q is always 8 bytes.
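Both options can be checked quickly, and the second one fits the original goal of reading a short and an 8-byte integer from a binary file. The sketch below uses an in-memory io.BytesIO in place of a real file, assumes the data is little-endian (hence "<hq" rather than "=hq"), and uses made-up sample values.

```python
import io
import struct

# Option 1: native mode with the long first; no padding is needed before the short.
print(struct.calcsize("lh"))

# Option 2: standard mode with q, which is 8 bytes on every platform.
record = struct.Struct("<hq")  # "<" assumes little-endian data
print(record.size)  # 10

# Stand-in for a binary file holding one short followed by one 8-byte integer.
stream = io.BytesIO(record.pack(-2, 2**40))

# Read exactly one record and unpack it.
short_value, long_value = record.unpack(stream.read(record.size))
print(short_value, long_value)  # -2 1099511627776
```

Note that reordering the fields (option 1) only helps if the file was actually written in that order; for data with a fixed on-disk layout, an explicit byte order prefix is usually the safer choice.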