Why isn't my dict lookup faster than my list lookup in Python?

Latest recommended article published on 2021-09-11 22:07:54

weixin_39851307

Article tags: python, dict lookup faster than list

I'm reading each line of a file into both a list and a dict,

with open("../data/title/pruned2_titleonly.txt", 'rb') as f_titles:
    titles_lst = f_titles.read().split('\n')
assert titles_lst[-1] == ''
titles_lst.pop()  # remove the last element, an empty string

titles_dict = {}
with open("../data/title/pruned2_titleonly.txt", 'rb') as f_titles:
    for i, line in enumerate(f_titles):
        titles_dict[i] = line

and I'm testing the performance by accessing each item in the list/dict in random order:

import numpy as np

n = len(titles_lst)
a = np.random.permutation(n)  # random visiting order for the indices 0..n-1

%%time
for i in xrange(10):
    t = []
    for b in a:
        t.append(titles_lst[b])
    del t

>>> CPU times: user 18.2 s, sys: 60 ms, total: 18.2 s
>>> Wall time: 18.1 s

%%time
for i in xrange(10):
    t = []
    for b in a:
        t.append(titles_dict[b])
    del t

>>> CPU times: user 41 s, sys: 208 ms, total: 41.2 s
>>> Wall time: 40.9 s

The above result seems to imply that dictionaries are not as efficient as lists for lookup tables, even though list lookups are O(n) while dict lookups are O(1). I've tested the following to see if the O(n)/O(1) performance was true... turns out it isn't...

%timeit titles_lst[n/2]
>>> 10000000 loops, best of 3: 81 ns per loop

%timeit titles_dict[n/2]
>>> 10000000 loops, best of 3: 120 ns per loop

What is the deal? If it's important to note, I am using Python 2.7.6 Anaconda distribution under Ubuntu 12.04, and I built NumPy under Intel MKL.

Solution

"The above result seems to imply that dictionaries are not as efficient as lists for lookup tables, even though list lookups are O(n) while dict lookups are O(1). I've tested the following to see if the O(n)/O(1) performance was true... turns out it isn't..."

It's not true that list lookups are O(N), in the sense of "getting an item", which is the sense your code seems to test. Determining where (if at all) an element exists can be O(N): for example, somelist.index(someval_not_in_the_list) or someval_not_in_the_list in somelist both have to scan every element. Try comparing x in somelist with x in somedict to see a major difference.
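The membership comparison suggested above could be sketched roughly as follows. This is only an illustration, written for Python 3 rather than the Python 2.7 used in the question; the helper name membership_times, the container size and the repeat count are all assumptions chosen for the example, not part of the original code.

```python
import timeit


def membership_times(n=100_000, repeats=50):
    """Time `x in somelist` against `x in somedict` for an absent value.

    Hypothetical helper for illustration; returns (list_seconds, dict_seconds).
    """
    somelist = list(range(n))
    somedict = dict.fromkeys(somelist)  # same values, used as dict keys
    missing = -1  # not in either container, so the list scan visits every element

    list_seconds = timeit.timeit(lambda: missing in somelist, number=repeats)
    dict_seconds = timeit.timeit(lambda: missing in somedict, number=repeats)
    return list_seconds, dict_seconds


def report():
    """Print both timings; absolute numbers depend on the machine."""
    list_seconds, dict_seconds = membership_times()
    print("x in somelist: %.6f s" % list_seconds)
    print("x in somedict: %.6f s" % dict_seconds)
```

Calling report() should show the list test taking far longer than the dict test, since the list has to be scanned from start to end while the dict only hashes the key and probes its table.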

But simply accessing somelist[index] is O(1) (see the Time Complexity page). Its constant factor is probably also smaller than that of a dictionary lookup, which is O(1) as well, because indexing a list does not require hashing the key.
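One way to check the constant-time claim is to time a single access at two very different container sizes. The sketch below assumes Python 3; per_lookup_ns and compare are hypothetical helper names, and the sizes are arbitrary. The lambda call adds a fixed overhead to every measurement, so the numbers are only useful relative to each other.

```python
import timeit


def per_lookup_ns(container, key, number=200_000):
    """Best-of-5 average time of container[key], in nanoseconds.

    Includes the overhead of the lambda call, which is the same for
    every container measured this way.
    """
    timer = timeit.Timer(lambda: container[key])
    best = min(timer.repeat(repeat=5, number=number))
    return best / number * 1e9


def compare(sizes=(1_000, 1_000_000)):
    """Measure list indexing and dict key lookup at the middle position.

    Returns {size: (list_ns, dict_ns)}.
    """
    results = {}
    for n in sizes:
        somelist = list(range(n))
        somedict = dict(enumerate(somelist))  # keys match the list indices
        mid = n // 2
        results[n] = (per_lookup_ns(somelist, mid), per_lookup_ns(somedict, mid))
    return results
```

If both operations are O(1), the per-lookup times returned by compare() should stay in the same range as the size grows by a factor of 1000, rather than growing with it. Any remaining gap between the list and dict columns reflects the constant factor discussed above.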
