Chapter 9 Dictionaries
List: A linear collection of values that stay in order
Lists index their entries based on the position in the list
Dictionary: A "bag" of values, each with its own label
Dictionaries are like bags - no order
So we index the things we put in the dictionary with a "lookup tag"
Dictionaries are like Lists except that they use keys instead of numbers to look up values.
purse = dict()  # each value is stored under a label (key) instead of a position number
purse['money'] = 12
purse['candy'] = 3
purse['tissues'] = 75
print purse
print purse['candy']
purse['candy'] = purse['candy'] + 2
print purse
output:
{'money': 12, 'tissues': 75, 'candy': 3}
3
{'money': 12, 'tissues': 75, 'candy': 5}
# The label (key) is not always a string; numbers and tuples also work as keys
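A small sketch of keys that are not strings. The names ages_by_id and grid and their contents are made up for illustration; the only rule relied on is that a key must be an immutable value such as a number, a string, or a tuple.

```python
# Keys can be integers or tuples as well as strings (hypothetical data).
ages_by_id = dict()
ages_by_id[1001] = 34          # integer key
ages_by_id[1002] = 27

grid = dict()
grid[(0, 0)] = 'start'         # tuple key
grid[(2, 3)] = 'goal'

print(ages_by_id[1002])        # 27
print(grid[(2, 3)])            # goal
```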
Empty dictionary
ooo = {}
print ooo
Tracebacks: referring to a key that is not in the dictionary raises an error (KeyError)
We can use the in operator to check whether a key is in the dictionary
ccc = dict()
print 'csev' in ccc
output:
False
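A minimal sketch of both behaviours side by side: a missing key raises KeyError, and the in operator lets you check first. The key 'fred' and the variable names are assumptions added for the example.

```python
ccc = dict()
ccc['fred'] = 1                # hypothetical entry so one lookup succeeds

try:
    value = ccc['csev']        # key is missing, so this raises KeyError
except KeyError:
    value = None               # fall back instead of stopping with a traceback

found = 'csev' in ccc          # checks the keys, not the values
if 'fred' in ccc:
    fred_value = ccc['fred']   # safe: the key is known to exist

print(found)                   # False
print(fred_value)              # 1
```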
An interesting example: count names
namelist = ['Joe', 'Joey', 'Joe', 'Mike', 'Mike', 'Joey', 'Mike', 'Yifan', 'Joe']
counts = dict()
for name in namelist:
    if name in counts:
        counts[name] = counts[name] + 1
    else:
        counts[name] = 1
print counts
Another way might be more understandable:
for name in namelist:
    if name not in counts:
        counts[name] = 1
    else:
        counts[name] = counts[name] + 1
output:
{'Yifan': 1, 'Mike': 3, 'Joe': 3, 'Joey': 2}
The get method for dictionaries
get() returns the value for a key, or a default value if the key is not in the dictionary
The following two code snippets have the same result
if name in counts:
    print counts[name]
else:
    print 0
print counts.get(name, 0)
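A short sketch of what get() returns in each case. The dictionary contents and the variable names joe, sally, and no_default are made up for illustration.

```python
counts = {'Joe': 3, 'Mike': 3}

joe = counts.get('Joe', 0)        # key present: returns its value
sally = counts.get('Sally', 0)    # key missing: returns the default
no_default = counts.get('Sally')  # no default given: returns None

print(joe)                        # 3
print(sally)                      # 0
print(no_default)                 # None
print('Sally' in counts)          # False: get() never adds the key
```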
Simplified counting with get()
namelist = ['Joe', 'Joey', 'Joe', 'Mike', 'Mike', 'Joey', 'Mike', 'Yifan', 'Joe']
counts = dict()
for name in namelist:
    counts[name] = counts.get(name,0) + 1
print counts
Definite Loops and Dictionaries
A for loop goes through all of the keys in the dictionary, and each key can be used to look up its value
namecount = {'Yifan': 1, 'Mike': 3, 'Joe': 3, 'Joey': 2}
for key in namecount:
    print key, namecount[key]
output:
Yifan 1
Mike 3
Joe 3
Joey 2
Get the list of keys (or values, or both)
namecount = {'Yifan': 1, 'Mike': 3, 'Joe': 3, 'Joey': 2}
print list(namecount)
print namecount.keys()
print namecount.values()
print namecount.items()
output:
['Yifan', 'Mike', 'Joe', 'Joey']
['Yifan', 'Mike', 'Joe', 'Joey']
[1, 3, 3, 2] # same order as the keys
[('Yifan', 1), ('Mike', 3), ('Joe', 3), ('Joey', 2)] # each (key, value) pair is a tuple
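The outputs above are from Python 2, where keys(), values(), and items() return lists. In Python 3 they return view objects instead, so wrapping each call in list() gives a real list in both versions. A minimal sketch, with variable names chosen only for the example:

```python
namecount = {'Yifan': 1, 'Mike': 3, 'Joe': 3, 'Joey': 2}

keys = list(namecount.keys())      # a real list in Python 2 and Python 3
values = list(namecount.values())
pairs = list(namecount.items())    # list of (key, value) tuples

# keys and values line up position by position, so zipping them
# back together rebuilds the same dictionary.
rebuilt = dict(zip(keys, values))
print(rebuilt == namecount)        # True
```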
Two Iteration Variables!
items() yields (key, value) pairs, so the loop can unpack each pair into two variables
namecount = {'Yifan': 1, 'Mike': 3, 'Joe': 3, 'Joey': 2}
for name, count in namecount.items():
    print name, count
output:
Yifan 1
Mike 3
Joe 3
Joey 2
An Exercise: Find the most frequent word in a text and how many times it appears
file_name = raw_input("Enter the file name:")
fhand = open(file_name,'r')
content = fhand.read()
wordslist = content.split()
count = dict()
for word in wordslist:
    count[word] = count.get(word,0) + 1
biggestcount = None
biggestword = None
for word, wordcount in count.items():
    if biggestcount is None or wordcount > biggestcount:
        biggestcount = wordcount
        biggestword = word
print biggestword, biggestcount
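Comment: Do not forget the split() function. Without it, the loop would go through the text one character at a time instead of one word at a time.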
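The exercise above can be wrapped in a function that works on a string instead of a file, which makes it easy to try out. This is only a sketch: the function name most_common_word and the sample sentence are made up for illustration, and when two words tie, the one the dictionary yields first wins.

```python
def most_common_word(text):
    """Return (word, count) for the most frequent word, or (None, None) if there are no words."""
    counts = dict()
    for word in text.split():              # split() breaks the text into words
        counts[word] = counts.get(word, 0) + 1

    best_word = None
    best_count = None
    for word, n in counts.items():
        if best_count is None or n > best_count:
            best_word = word
            best_count = n
    return best_word, best_count


sample = 'the cat and the dog and the bird'   # hypothetical input
print(most_common_word(sample))               # ('the', 3)
```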
Chapter 10 Tuples
Tuples are another kind of sequence that functions much like a list
- they have elements which are indexed starting at 0
list: [ ]
tuples: ( )
But tuples are unchangeable (immutable), similar to a string.
z = (5, 4, 3)
z[2] = 0    # TypeError: tuples do not support item assignment
z.sort()    # AttributeError: tuples have no sort() method
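A small sketch that triggers both failures without stopping the program, by catching the exceptions. The list name errors is made up for illustration. sorted() is used at the end because it builds a new list and leaves the tuple untouched.

```python
z = (5, 4, 3)
errors = []

try:
    z[2] = 0                       # item assignment is not allowed on a tuple
except TypeError:
    errors.append('TypeError')

try:
    z.sort()                       # tuples have no sort() method
except AttributeError:
    errors.append('AttributeError')

print(errors)                      # ['TypeError', 'AttributeError']
print(sorted(z))                   # [3, 4, 5]
print(z)                           # (5, 4, 3)
```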
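Tuples are more efficient than lists in memory use and performance, because Python does not have to build them to be modifiable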
(x, y) = (100, 4)
print y
Tuple assignment: all the variables on the left side are assigned in a single statement
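A brief sketch of tuple assignment, including the common trick of swapping two variables. The names a and b are just examples.

```python
(x, y) = (100, 4)      # x gets 100, y gets 4
a, b = 1, 2            # the parentheses are optional
a, b = b, a            # swap: the right side is built before anything is assigned

print(y)               # 4
print(a)               # 2
print(b)               # 1
```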
Tuples are Comparable
print (0, 1, 2) < (5, 1, 2)
print (0, 1, 2) < (0, 3, 10000)
print ('Jones', 'Sally') < ('Jones', 'Fred')
print ('Jones', 'Sally') > ('Jane', 'Sally')
print ('Jones', 'Sally') > ('Adam', 'Sally')
output:
True
True
False
True
True
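The results above follow from comparing tuples element by element: the first position where the two tuples differ decides the answer. The function below is a hand-written sketch of that rule for illustration only (the name tuple_less_than is made up); in real code you would simply use <.

```python
def tuple_less_than(left, right):
    """Illustrative re-implementation of left < right for tuples."""
    for a, b in zip(left, right):
        if a != b:
            return a < b               # first differing position decides
    # Every shared position is equal: the shorter tuple counts as smaller.
    return len(left) < len(right)


print(tuple_less_than((0, 1, 2), (5, 1, 2)))                   # True
print(tuple_less_than(('Jones', 'Sally'), ('Jones', 'Fred')))  # False
```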
Sorting Lists of Tuples
We can take advantage of the ability to sort a list of tuples to get a sorted version of a dictionary
#use d.items() and t.sort()
namedic = {'Yifan': 1, 'Mike': 3, 'Joe': 3, 'Joey': 2}
nametuple = namedic.items()
print nametuple
nametuple.sort() # sorts by keys, the first element of each tuple
print nametuple
# use d.items() and sorted(), which is more direct
namedic = {'Yifan': 1, 'Mike': 3, 'Joe': 3, 'Joey': 2}
print namedic.items()
t = sorted(namedic.items())
print t
for k,v in sorted(namedic.items()):
    print k,v
output:
[('Yifan', 1), ('Mike', 3), ('Joe', 3), ('Joey', 2)]
[('Joe', 3), ('Joey', 2), ('Mike', 3), ('Yifan', 1)]
[('Yifan', 1), ('Mike', 3), ('Joe', 3), ('Joey', 2)]
[('Joe', 3), ('Joey', 2), ('Mike', 3), ('Yifan', 1)]
Joe 3
Joey 2
Mike 3
Yifan 1
Sort by values instead of keys
namedic = {'Yifan': 1, 'Mike': 3, 'Joe': 3, 'Joey': 2}
tmp = list()
for k, v in namedic.items():
    tmp.append((v,k))
print tmp
tmp.sort(reverse=True) # reverse=True sorts from highest to lowest
print tmp
output:
[(1, 'Yifan'), (3, 'Mike'), (3, 'Joe'), (2, 'Joey')]
[(3, 'Mike'), (3, 'Joe'), (2, 'Joey'), (1, 'Yifan')]
Tuples are compared element by element, so the sort orders by the first element of each tuple, (value, key), and uses the second element only to break ties
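The (value, key) swap is one way to sort by value. A different technique, not used in these notes, is to pass a key function to sorted(). The sketch below assumes we want the highest counts first and names in alphabetical order when counts tie; the lambda and the name by_value are illustrative.

```python
namedic = {'Yifan': 1, 'Mike': 3, 'Joe': 3, 'Joey': 2}

# Sort the (key, value) pairs by descending value, then by key.
# Negating the count turns "largest first" into an ordinary ascending sort.
by_value = sorted(namedic.items(), key=lambda kv: (-kv[1], kv[0]))

print(by_value)   # [('Joe', 3), ('Mike', 3), ('Joey', 2), ('Yifan', 1)]
```

Unlike the swap, this keeps each pair in (key, value) order, at the cost of a slightly less obvious sort key.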
An Exercise: The top 10 most common words
file_name = raw_input("Enter the file name:")
fhand = open(file_name,'r')
content = fhand.read()
wordslist = content.split()
count = dict()
for word in wordslist:
    count[word] = count.get(word,0) + 1
lll = list()
for k,v in count.items():
    lll.append((v,k))
lll.sort(reverse=True)
for i in range(10):
    print lll[i]
output:
(352, 'Jan')
(324, '2008')
(245, 'by')
(243, 'Received:')
(219, '-0500')
(218, 'from')
(203, '4')
(194, 'with')
(183, 'Fri,')
(136, 'id')
# alternative: slice the first 10 tuples and unpack each one
for v,k in lll[:10]:
    print v,k
output:
352 Jan
324 2008
245 by
243 Received:
219 -0500
218 from
203 4
194 with
183 Fri,
136 id
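One caveat with range(10): if the text has fewer than 10 distinct words, lll[i] raises an IndexError, while a slice simply returns whatever is there. The sketch below wraps the exercise in a function that uses a slice; the name top_words, the parameter n, and the sample string are assumptions made for the example.

```python
def top_words(text, n=10):
    """Return up to n (count, word) tuples, most frequent first."""
    counts = dict()
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1

    pairs = [(v, k) for k, v in counts.items()]   # (count, word) so sorting uses the count
    pairs.sort(reverse=True)
    return pairs[:n]                              # a slice is safe even if len(pairs) < n


sample = 'a b a c b a d'          # hypothetical input with only 4 distinct words
print(top_words(sample, 2))       # [(3, 'a'), (2, 'b')]
print(len(top_words(sample)))     # 4
```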
Even Shorter Version
namedic = {'Yifan': 1, 'Mike': 3, 'Joe': 3, 'Joey': 2}
print sorted([(v,k) for k,v in namedic.items()]) # the [...] expression is a list comprehension
output:
[(1, 'Yifan'), (2, 'Joey'), (3, 'Joe'), (3, 'Mike')]
Comments:
List comprehension: an expression that builds a new list by evaluating an expression for each item of a sequence, optionally keeping only the items that meet a condition.
In this case, we make a list of reversed tuples and then sort it.
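To make the comparison concrete, the sketch below builds the same list of reversed tuples twice, once with an explicit loop and once with a list comprehension, and then shows a comprehension with a condition. The variable names and the threshold of 3 are made up for illustration.

```python
namedic = {'Yifan': 1, 'Mike': 3, 'Joe': 3, 'Joey': 2}

# Explicit loop
swapped_loop = []
for k, v in namedic.items():
    swapped_loop.append((v, k))

# Equivalent list comprehension
swapped_comp = [(v, k) for k, v in namedic.items()]

print(swapped_loop == swapped_comp)   # True

# A comprehension can also filter: keep only names seen at least 3 times.
frequent = sorted([k for k, v in namedic.items() if v >= 3])
print(frequent)                       # ['Joe', 'Mike']
```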