Python: comparing the size of numbers in arrays; element sizes differ between a NumPy array and a list

I am using 32-bit Python 3.4 on Windows 7.

I found that an integer in a NumPy array takes 4 bytes, but in a list it seems to take 10 bytes.

import sys
import numpy as np

s = 10
lt = [None] * s
cnt = 0

# Fill the list one slot at a time.
for i in range(0, s):
    lt[cnt] = i
    cnt += 1

# Drop any slots that were never filled, then build the array.
lt = [x for x in lt if x is not None]
a = np.array(lt)

print("len(a) is " + str(len(a)) + " size is " + str(sys.getsizeof(a))
      + " bytes " + " a.itemsize is " + str(a.itemsize) + " total size is "
      + str(a.itemsize * len(a)) + " Bytes , len(lt) is "
      + str(len(lt)) + " size is " + str(sys.getsizeof(lt)) + " Bytes "
      + "the first element has " + str(sys.getsizeof(lt[0])) + " Bytes")

len(a) is 10 size is 40 bytes a.itemsize is 4 total size is 40 Bytes , len(lt) is 10 size is 100 Bytes the first element has 12 Bytes

Is this because each element in a list has to keep a pointer to the next element?

If I assign a string to the list:

lt[cnt] = "A";

len(a) is 10 size is 40 bytes a.itemsize is 4 total size is 40 Bytes , len(lt) is 10 size is 100 Bytes the first element has 30 Bytes

So in the array each element takes 4 bytes, and in the list it takes 30 bytes.

But if I try:

lt[cnt] = "AB";

len(a) is 10 size is 40 bytes a.itemsize is 8 total size is 80 Bytes , len(lt) is 10 size is 100 Bytes the first element has 33 Bytes

In the array each element takes 8 bytes, but in the list it takes 33 bytes.

If I try:

lt[cnt] = "csedvserb revrvrrw gvrgrwgervwe grujy oliulfv qdqdqafwg5u u56i78k8 awdwfw"; # 73 characters long

len(a) is 10 size is 40 bytes a.itemsize is 292 total size is 2920 Bytes , len(lt) is 10 size is 100 Bytes the first element has 246 Bytes

In the array each element takes 292 bytes (= 73 * 4), but in the list it takes 246 bytes?

Any explanation will be appreciated.

Solution

The element size in arrays is easy: it is determined by the dtype and, as your code shows, can be found with .itemsize. 4 bytes is common, for example for np.int32 and np.float32 (np.float64 takes 8). Unicode strings are also allocated 4 bytes per character, sized for the longest string in the array, whereas encodings such as UTF-8 use a variable number of bytes per character.
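As a rough illustration of how the dtype fixes the per-element size, the sketch below checks a few dtypes. The variable names and the 73-character placeholder string are made up for this example; only .itemsize and .nbytes come from NumPy.

```python
import numpy as np

# Fixed-width numeric dtypes: itemsize depends on the dtype alone.
int32_size = np.dtype(np.int32).itemsize      # 4 bytes
float64_size = np.dtype(np.float64).itemsize  # 8 bytes

# Unicode string arrays: 4 bytes per character, sized for the longest string.
one_char = np.array(["A"] * 10)
two_char = np.array(["AB"] * 10)
long_text = "x" * 73  # hypothetical 73-character string
long_arr = np.array([long_text] * 10)

print(int32_size, float64_size)
print(one_char.dtype, one_char.itemsize)
print(two_char.dtype, two_char.itemsize)
print(long_arr.itemsize, long_arr.nbytes)
```

The last line matches the 292 bytes per element (73 * 4) and 2920 bytes in total reported in the question.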

The per-element size for lists (and tuples) is trickier. A list does not contain the elements directly; rather, it contains pointers to objects that are stored elsewhere. The list size reported by sys.getsizeof covers the list header and those pointers, plus a pad. The pad lets the list grow (with .append) efficiently. That is why all your lists have the same size, regardless of the size of the first item.
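A minimal sketch of the same point, assuming CPython: lists built the same way and of the same length report the same container size, while the objects they point to differ in size. The list names and sample strings here are illustrative only.

```python
import sys

# Three lists of equal length, built the same way, holding different objects.
ints = [i for i in range(10)]
short_strs = ["A" for _ in range(10)]
long_strs = ["csedvserb revrvrrw gvrgrwgervwe" for _ in range(10)]

# getsizeof on a list counts the header and the pointer slots only,
# not the objects the pointers refer to.
container_sizes = [sys.getsizeof(lst) for lst in (ints, short_strs, long_strs)]

# The referenced objects have their own, different sizes.
short_elem_size = sys.getsizeof(short_strs[0])
long_elem_size = sys.getsizeof(long_strs[0])

print(container_sizes)
print(short_elem_size, long_elem_size)
```

The exact byte counts depend on the Python version and on whether the build is 32-bit or 64-bit, so no specific numbers are claimed here.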

My data:

In [2324]: lt=[None]*10
In [2325]: sys.getsizeof(lt)
Out[2325]: 72

In [2326]: lt=[i for i in range(10)]
In [2327]: sys.getsizeof(lt)
Out[2327]: 96

In [2328]: lt=['A' for i in range(10)]
In [2329]: sys.getsizeof(lt)
Out[2329]: 96

In [2330]: lt=['AB' for i in range(10)]
In [2331]: sys.getsizeof(lt)
Out[2331]: 96

In [2332]: lt=['ABCDEF' for i in range(10)]
In [2333]: sys.getsizeof(lt)
Out[2333]: 96

In [2334]: lt=[None for i in range(10)]
In [2335]: sys.getsizeof(lt)
Out[2335]: 96

and for the corresponding arrays:

In [2344]: lt=[None]*10; a=np.array(lt)
In [2345]: a
Out[2345]: array([None, None, None, None, None, None, None, None, None, None], dtype=object)
In [2346]: a.itemsize
Out[2346]: 4

In [2347]: lt=['AB' for i in range(10)]; a=np.array(lt)
In [2348]: a
Out[2348]:
array(['AB', 'AB', 'AB', 'AB', 'AB', 'AB', 'AB', 'AB', 'AB', 'AB'],
      dtype='<U2')
In [2349]: a.itemsize
Out[2349]: 8

When the list contains None, the array has object dtype, and the elements are all pointers (4-byte integers on a 32-bit build).
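To check this on any build, the sketch below compares the itemsize of an object array with the platform pointer size. It assumes struct.calcsize("P") as the way to obtain the pointer size; the variable names are illustrative.

```python
import struct
import numpy as np

# Size of a C pointer on this build: 4 on 32-bit, 8 on 64-bit.
pointer_size = struct.calcsize("P")

# A list of None cannot be stored as numbers or strings, so NumPy
# falls back to object dtype and stores one pointer per element.
obj_arr = np.array([None] * 10)

print(obj_arr.dtype, obj_arr.itemsize, pointer_size)
```

On a 64-bit interpreter the itemsize would be 8 rather than the 4 shown above.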
