Python Hashmap/Dictionary 使用指南

Python

来自:http://www.dotnetperls.com/dictionary-python


Built-in Dictionary List Set Tuple 2D Array Bytes Class Console Convert Datetime Duplicates Error File Find If Lambda Len Lower Map Math Namedtuple None Random Re Slice Sort Split String Strip Sub Substring TypeWhile

Dictionary.  A dictionary optimizes element lookups. It associates keys to values. Each key must have a value. Dictionaries are used in many programs.
With square brackets,  we assign and access a value at a key. With get() we can specify a default result. Dictionaries are fast. We create, mutate and test them.
Get example.  There are many ways to get values. We can use the "[" and "]" characters. We access a value directly this way. But this syntax causes a KeyError if the key is not found.

Instead:We can use the get() method with one or two arguments. This does not cause any annoying errors. It returns None.

Argument 1:The first argument to get() is the key you are testing. This argument is required.

Argument 2:The second, optional argument to get() is the default value. This is returned if the key is not found.

Based on: Python 3

Python program that gets values

plants = {}

# Add three key-value tuples to the dictionary.
plants["radish"] = 2
plants["squash"] = 4
plants["carrot"] = 7

# Get syntax 1.
print(plants["radish"])

# Get syntax 2.
print(plants.get("tuna"))
print(plants.get("tuna", "no tuna found"))

Output

2
None
no tuna found
Get, none.  In Python "None" is a special value like null or nil. I like None. It is my friend. It means no value. Get() returns None if no value is found in a dictionary. None

Note:It is valid to assign a key to None. So get() can return None, but there is actually a None value in the dictionary.


Key error.  Errors in programs are not there just to torment you. They indicate problems with a program and help it work better. A KeyError occurs on an invalid access. KeyError
Python program that causes KeyError

lookup = {"cat": 1, "dog": 2}

# The dictionary has no fish key!
print(lookup["fish"])

Output

Traceback (most recent call last):
  File "C:\programs\file.py", line 5, in <module>
    print(lookup["fish"])
KeyError: 'fish'
In-keyword.  A dictionary may (or may not) contain a specific key. Often we need to test for existence. One way to do so is with the in-keyword. In

True:This keyword returns 1 (meaning true) if the key exists as part of a key-value tuple in the dictionary.

False:If the key does not exist, the in-keyword returns 0, indicating false. This is helpful in if-statements.

Python program that uses in

animals = {}
animals["monkey"] = 1
animals["tuna"] = 2
animals["giraffe"] = 4

# Use in.
if "tuna" in animals:
    print("Has tuna")
else:
    print("No tuna")

# Use in on nonexistent key.
if "elephant" in animals:
    print("Has elephant")
else:
    print("No elephant")

Output

Has tuna
No elephant
Len built-in.  This returns the number of key-value tuples in a dictionary. The data types of the keys and values do not matter. Len also works on lists and strings.

Caution:The length returned for a dictionary does not separately consider keys and values. Each pair adds one to the length.

Python program that uses len on dictionary

animals = {"parrot": 2, "fish": 6}

# Use len built-in on animals.
print("Length:", len(animals))

Output

Length: 2
Len notes.  Let us review. Len() can be used on other data types, not just dictionaries. It acts upon a list, returning the number of elements within. It also handles tuples. Len
Keys, values.  A dictionary contains keys. It contains values. And with the keys() and values() methods, we can store these elements in lists.

Next:A dictionary of three key-value pairs is created. This dictionary could be used to store hit counts on a website's pages.

Views:We introduce two variables, named keys and values. These are not lists—but we can convert them to lists.

Convert
Python program that uses keys

hits = {"home": 125, "sitemap": 27, "about": 43}
keys = hits.keys()
values = hits.values()

print("Keys:")
print(keys)
print(len(keys))

print("Values:")
print(values)
print(len(values))

Output

Keys:
dict_keys(['home', 'about', 'sitemap'])
3
Values:
dict_values([125, 43, 27])
3
Keys, values ordering.  Elements returned by keys() and values() are not ordered. In the above output, the keys-view is not alphabetically sorted. Consider a sorted view (keep reading).
Sorted keys.  In a dictionary keys are not sorted in any way. They are unordered. Their order reflects the internals of the hashing algorithm's buckets.

But:Sometimes we need to sort keys. We invoke another method, sorted(), on the keys. This creates a sorted view.

Python program that sorts keys in dictionary

# Same as previous program.
hits = {"home": 124, "sitemap": 26, "about": 32}

# Sort the keys from the dictionary.
keys = sorted(hits.keys())

print(keys)

Output

['about', 'home', 'sitemap']
Items.  With this method we receive a list of two-element tuples. Each tuple contains, as its first element, the key. Its second element is the value.

Tip:With tuples, we can address the first element with an index of 0. The second element has an index of 1.

Program:The code uses a for-loop on the items() list. It uses the print() method with two arguments.

Python that uses items method

rents = {"apartment": 1000, "house": 1300}

# Convert to list of tuples.
rentItems = rents.items()

# Loop and display tuple items.
for rentItem in rentItems:
    print("Place:", rentItem[0])
    print("Cost:", rentItem[1])
    print("")

Output

Place: house
Cost: 1300

Place: apartment
Cost: 1000
Items, assign.  We cannot assign elements in the tuples. If you try to assign rentItem[0] or rentItem[1], you will get an error. This is the error message.
Python error:

TypeError: 'tuple' object does not support item assignment
Items, unpack.  The items() list can be used in another for-loop syntax. We can unpack the two parts of each tuple in items() directly in the for-loop.

Here:In this example, we use the identifier "k" for the key, and "v" for the value.

Python that unpacks items

# Create a dictionary.
data = {"a": 1, "b": 2, "c": 3}

# Loop over items and unpack each item.
for k, v in data.items():
    # Display key and value.
    print(k, v)

Output

a 1
c 3
b 2
For-loop.  A dictionary can be directly enumerated with a for-loop. This accesses only the keys in the dictionary. To get a value, we will need to look up the value.

Items:We can call the items() method to get a list of tuples. No extra hash lookups will be needed to access values.

Here:The plant variable, in the for-loop, is the key. The value is not available—we would need plants.get(plant) to access it.

Python that loops over dictionary

plants = {"radish": 2, "squash": 4, "carrot": 7}

# Loop over dictionary directly.
# ... This only accesses keys.
for plant in plants:
    print(plant)

Output

radish
carrot
squash
Del built-in.  How can we remove data? We apply the del method to a dictionary entry. In this program, we initialize a dictionary with three key-value tuples. Del

Then:We remove the tuple with key "windows". When we display the dictionary, it now contains only two key-value pairs.

Python that uses del

systems = {"mac": 1, "windows": 5, "linux": 1}

# Remove key-value at "windows" key.
del systems["windows"]

# Display dictionary.
print(systems)

Output

{'mac': 1, 'linux': 1}
Del, alternative.  An alternative to using del on a dictionary is to change the key's value to a special value. This is a null object refactoring strategy.
Update.  With this method we change one dictionary to have new values from a second dictionary. Update() also modifies existing values. Here we create two dictionaries.

Pets1, pets2:The pets2 dictionary has a different value for the dog key—it has the value "animal", not "canine".

Also:The pets2 dictionary contains a new key-value pair. In this pair the key is "parakeet" and the value is "bird".

Result:Existing values are replaced with new values that match. New values are added if no matches exist.

Python that uses update

# First dictionary.
pets1 = {"cat": "feline", "dog": "canine"}

# Second dictionary.
pets2 = {"dog": "animal", "parakeet": "bird"}

# Update first dictionary with second.
pets1.update(pets2)

# Display both dictionaries.
print(pets1)
print(pets2)

Output

{'parakeet': 'bird', 'dog': 'animal', 'cat': 'feline'}
{'dog': 'animal', 'parakeet': 'bird'}
Copy.  This method performs a shallow copy of an entire dictionary. Every key-value tuple in the dictionary is copied. This is not just a new variable reference.

Here:We create a copy of the original dictionary. We then modify values within the copy. The original is not affected.

Python that uses copy

original = {"box": 1, "cat": 2, "apple": 5}

# Create copy of dictionary.
modified = original.copy()

# Change copy only.
modified["cat"] = 200
modified["apple"] = 9

# Original is still the same.
print(original)
print(modified)

Output

{'box': 1, 'apple': 5, 'cat': 2}
{'box': 1, 'apple': 9, 'cat': 200}
Fromkeys.  This method receives a sequence of keys, such as a list. It creates a dictionary with each of those keys. We can specify a value as the second argument.

Values:If you specify the second argument to fromdict(), each key has that value in the newly-created dictionary.

Python that uses fromkeys

# A list of keys.
keys = ["bird", "plant", "fish"]

# Create dictionary from keys.
d = dict.fromkeys(keys, 5)

# Display.
print(d)

Output

{'plant': 5, 'bird': 5, 'fish': 5}
Dict.  With this built-in function, we can construct a dictionary from a list of tuples. The tuples are pairs. They each have two elements, a key and a value.

Tip:This is a possible way to load a dictionary from disk. We can store (serialize) it as a list of pairs.

Python that uses dict built-in

# Create list of tuple pairs.
# ... These are key-value pairs.
pairs = [("cat", "meow"), ("dog", "bark"), ("bird", "chirp")]

# Convert list to dictionary.
lookup = dict(pairs)

# Test the dictionary.
print(lookup.get("dog"))
print(len(lookup))

Output

bark
3
Memoize.  One classic optimization is called memoization. And this can be implemented easily with a dictionary. In memoization, a function (def) computes its result. Memoize

And:Once the computation is done, it stores its result in a cache. In the cache, the argument is the key. And the result is the value.


Memoization, continued.  When a memoized function is called, it first checks this cache to see if it has been, with this argument, run before.

And:If it has, it returns its cached—memoized—return value. No further computations need be done.

Note:If a function is only called once with the argument, memoization has no benefit. And with many arguments, it usually works poorly.


Get performance.  I compared a loop that uses get() with one that uses both the in-keyword and a second look up. Version 2, with the "in" operator, was faster.

Version 1:This version uses a second argument to get(). It tests that against the result and then proceeds if the value was found.

Version 2:This version uses "in" and then a lookup. Twice as many lookups occur. But fewer statements are executed.

Python that benchmarks get

import time

# Input dictionary.
systems = {"mac": 1, "windows": 5, "linux": 1}

# Time 1.
print(time.time())

# Get version.
i = 0
v = 0
x = 0
while i < 10000000:
    x = systems.get("windows", -1)
    if x != -1:
        v = x
    i += 1

# Time 2.
print(time.time())

# In version.
i = 0
v = 0
while i < 10000000:
    if "windows" in systems:
        v = systems["windows"]
    i += 1

# Time 3.
print(time.time())

Output

1345819697.257
1345819701.155 (get = 3.90 s)
1345819703.453 (in  = 2.30 s)
String key performance.  In another test, I compared string keys. I found that long string keys take longer to look up than short ones. Shorter keys are faster. Dictionary String Key
Performance, loop.  A dictionary can be looped over in different ways. In this benchmark we test two approaches. We access the key and value in each iteration.

Version 1:This version loops over the keys of the dictionary with a while-loop. It then does an extra lookup to get the value.

Version 2:This version instead uses a list of tuples containing the keys and values. It actually does not touch the original dictionary.

But:Version 2 has the same effect—we access the keys and values. The cost of calling items() initially is not counted here.

Python that benchmarks loops

import time

data = {"michael": 1, "james": 1, "mary": 2, "dale": 5}
items = data.items()

print(time.time())

# Version 1: get.
i = 0
while i < 10000000:
    v = 0
    for key in data:
        v = data[key]
    i += 1

print(time.time())

# Version 2: items.
i = 0
while i < 10000000:
    v = 0
    for tuple in items:
        v = tuple[1]
    i += 1

print(time.time())

Output

1345602749.41
1345602764.29 (version 1 = 14.88 s)
1345602777.68 (version 2 = 13.39 s)
Benchmark, loop results.  We see above that looping over a list of tuples is faster than directly looping over a dictionary. This makes sense. With the list, no lookups are done.
Frequencies.  A dictionary can be used to count frequencies. Here we introduce a string that has some repeated letters. We use get() on a dictionary to start at 0 for nonexistent values.

So:The first time a letter is found, its frequency is set to 0 + 1, then 1 + 1. Get() has a default return.

Python that counts letter frequencies

# The first three letters are repeated.
letters = "abcabcdefghi"

frequencies = {}
for c in letters:
    # If no key exists, get returns the value 0.
    # ... We then add one to increase the frequency.
    # ... So we start at 1 and progress to 2 and then 3.
    frequencies[c] = frequencies.get(c, 0) + 1

for f in frequencies.items():
    # Print the tuple pair.
    print(f)

Output

('a', 2)
('c', 2)
('b', 2)
('e', 1)
('d', 1)
('g', 1)
('f', 1)
('i', 1)
('h', 1)
A summary.  A dictionary is usually implemented as a hash table. Here a special hashing algorithm translates a key (often a string) into an integer.
For a speedup,  this integer is used to locate the data. This reduces search time. For programs with performance trouble, using a dictionary is often the initial path to optimization.
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值