Data Processing and Visualisation with Python
Python Exercise 15
Square dictionary
Write a Python function that takes a positive integer n as a parameter and returns a dictionary whose keys are the integers 1 to n and whose values are the squares of the corresponding keys.
method 1
def squareDic(m):
    square = {}
    for i in range(1,m+1):
        # get() returns the default i**2 because key i is not stored yet
        square[i] = square.get(i,i**2)
    return square
squareDic(5)
method 2
def squareDic(n):
    d = {}
    for i in range(1,n+1):
        d[i] = i**2
    return d
squareDic(10)
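The same mapping can also be built in a single expression with a dictionary comprehension. The following is a minimal sketch; the name square_dict_comp is only illustrative and is not part of the exercise.

```python
def square_dict_comp(n):
    """Return {1: 1, 2: 4, ..., n: n**2} using a dict comprehension."""
    return {i: i ** 2 for i in range(1, n + 1)}


print(square_dict_comp(5))  # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
```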
Sum of values
Write a Python function that takes a dictionary as a parameter and returns the sum of all its values, which are assumed to be numbers.
method 1
def valueSum(d):
    s = 0
    # iterate over the keys themselves so that any key type works, not only 1..n
    for k in d:
        s += d[k]
    return s
method 2
def valueSum(D):
    return sum(D.values())
m = {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81, 10: 100}
valueSum(m)
n = {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
valueSum(n)
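The keys of a dictionary need not be the integers 1 to n, so a sum taken over the values themselves is the more general approach. A small sketch with made-up data (the name value_sum_any and the prices dictionary are assumptions for illustration only):

```python
def value_sum_any(d):
    """Sum the values of d regardless of what the keys look like."""
    total = 0
    for value in d.values():
        total += value
    return total


# Hypothetical data with string keys and mixed int/float values
prices = {'apple': 3, 'pear': 2.5, 'plum': 4}
print(value_sum_any(prices))  # 9.5
```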
Character frequency
Write a Python function to count the character frequencies in a given string, and return them in a dictionary.
def charFreq(s):
    c = {}
    for i in s:
        c[i] = c.get(i,0)+1
    return c
charFreq('This is a book')
charFreq('Zhongnan University of Economics and Law')
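The standard library offers collections.Counter, which performs the same counting loop internally. A minimal sketch for comparison (the function name char_freq_counter is an assumption, not part of the exercise):

```python
from collections import Counter


def char_freq_counter(s):
    """Count characters with collections.Counter and return a plain dict."""
    return dict(Counter(s))


freq = char_freq_counter('This is a book')
print(freq['i'], freq[' '], freq['o'])  # 2 3 2
```

Counter is itself a dict subclass, so the dict() conversion is only there to return the same type as the hand-written version.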
Concatenate dictionaries
Write a Python program to concatenate the following dictionaries into a new one.
Sample Dictionary :
dic1={1:10, 2:20}
dic2={3:30, 4:40}
dic3={5:50, 6:60}
Expected Result : {1: 10, 2: 20, 3: 30, 4: 40, 5: 50, 6: 60}
dic1={1:10, 2:20}
dic2={3:30, 4:40}
dic3={5:50, 6:60}
#-------------------Your Code----------------
#---------------End of your code-------------
method 1
dic1={1:10, 2:20}
dic2={3:30, 4:40}
dic3={5:50, 6:60}
dic = {}
dic.update(dic1)
dic.update(dic2)
dic.update(dic3)
print(dic)
method 2
dic1={1:10, 2:20}
dic2={3:30, 4:40}
dic3={5:50, 6:60}
for a,b in dic2.items():
    dic1[a]=b
for a,b in dic3.items():
    dic1[a]=b
dic1
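If the original dictionaries should stay unchanged, dictionary unpacking (available since Python 3.5) builds a new dictionary in one expression. A short sketch using the sample data from the exercise; the variable name merged is illustrative:

```python
dic1 = {1: 10, 2: 20}
dic2 = {3: 30, 4: 40}
dic3 = {5: 50, 6: 60}

# Unpack each dictionary into a new literal; later keys would win on conflicts
merged = {**dic1, **dic2, **dic3}
print(merged)  # {1: 10, 2: 20, 3: 30, 4: 40, 5: 50, 6: 60}
```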
Merging by adding
Write a Python function to combine two dictionaries adding values for common keys.
Sample input dictionaries:
d1 = {'a': 100, 'b': 200, 'c':300}
d2 = {'a': 300, 'b': 200, 'd':400}
Sample return value: {'a': 400, 'b': 400, 'd': 400, 'c': 300}
method 1
def mergeAdd(d1,d2):
    d = {}
    # keys of d2: add the values if the key is shared, otherwise copy
    for i in d2:
        if i in d1:
            d[i] = d1[i] + d2[i]
        else:
            d[i] = d2[i]
    # keys that appear only in d1
    for j in d1:
        if j not in d2:
            d[j] = d1[j]
    return d
method 2
def mergeAdd(d1,d2):
    # note: this version updates d1 in place and returns it
    for key in d2:
        d1[key] = d1.get(key,0) + d2[key]
    return d1
d1 = {'a': 100, 'b': 200, 'c':300}
d2 = {'a': 300, 'b': 200, 'd':400}
mergeAdd(d1, d2)
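A variant that leaves both inputs untouched can be written with a comprehension over the union of the two key sets. This is a sketch under the assumption that a missing key should count as 0; the name merge_add_pure is hypothetical.

```python
def merge_add_pure(d1, d2):
    """Return a new dict; values of shared keys are added, inputs stay unchanged."""
    all_keys = d1.keys() | d2.keys()  # set union of both key views
    return {k: d1.get(k, 0) + d2.get(k, 0) for k in all_keys}


d1 = {'a': 100, 'b': 200, 'c': 300}
d2 = {'a': 300, 'b': 200, 'd': 400}
result = merge_add_pure(d1, d2)
print(result == {'a': 400, 'b': 400, 'c': 300, 'd': 400})  # True
```

Because the keys come from a set, the order of the items in the result is not guaranteed; only the contents are.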
Dictionary in table
Write a Python function to print a nested dictionary in a table.
Note:
The input nested dictionary has two layers. The keys of the outer layer are consecutive integers starting from 0, with an upper bound below 100. The keys of the inner layers are the same for every outer item and are letters from 'A' to 'Z'. The values of the inner layers are integers within 100.
Sample input:
{0: {'A': 2, 'B': 9, 'C': 89, 'D': 26},
1: {'A': 51, 'B': 76, 'C': 58, 'D': 56},
2: {'A': 22, 'B': 2, 'C': 86, 'D': 74},
3: {'A': 54, 'B': 90, 'C': 90, 'D': 76},
4: {'A': 51, 'B': 26, 'C': 64, 'D': 84}}
def dicTable(D):
    print('Dictionary in table:')
    # header row: the inner keys, taken from the first outer item
    for a in D[0]:
        print('\t',a,end = '')
    # one row per outer key
    for b in D:
        print('\n',b,end = '')
        for a in D[b]:
            print('\t',D[b][a],end='')
dic = {0: {'A': 2, 'B': 9, 'C': 89, 'D': 26},
1: {'A': 51, 'B': 76, 'C': 58, 'D': 56},
2: {'A': 22, 'B': 2, 'C': 86, 'D': 74},
3: {'A': 54, 'B': 90, 'C': 90, 'D': 76},
4: {'A': 51, 'B': 26, 'C': 64, 'D': 84}}
dicTable(dic)
dic = {0: {'A': 94, 'B': 97, 'C': 35, 'D': 82, 'E': 95, 'F': 3, 'G': 85, 'H': 23},
1: {'A': 83, 'B': 0, 'C': 69, 'D': 77, 'E': 52, 'F': 86, 'G': 40, 'H': 78},
2: {'A': 98, 'B': 73, 'C': 91, 'D': 42, 'E': 59, 'F': 36, 'G': 56, 'H': 79},
3: {'A': 15, 'B': 39, 'C': 29, 'D': 72, 'E': 52, 'F': 61, 'G': 3, 'H': 49},
4: {'A': 43, 'B': 20, 'C': 43, 'D': 0, 'E': 0, 'F': 51, 'G': 31, 'H': 47},
5: {'A': 66, 'B': 30, 'C': 86, 'D': 24, 'E': 2, 'F': 30, 'G': 85, 'H': 69},
6: {'A': 43, 'B': 48, 'C': 36, 'D': 4, 'E': 1, 'F': 58, 'G': 87, 'H': 8},
7: {'A': 38, 'B': 2, 'C': 66, 'D': 18, 'E': 19, 'F': 20, 'G': 88, 'H': 96},
8: {'A': 75, 'B': 67, 'C': 83, 'D': 17, 'E': 98, 'F': 64, 'G': 66, 'H': 22},
9: {'A': 58, 'B': 7, 'C': 18, 'D': 76, 'E': 64, 'F': 40, 'G': 21, 'H': 26}}
dicTable(dic)
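Tabs keep the columns aligned only as long as every cell is shorter than a tab stop. An alternative is to pad each cell to a fixed width with format specifiers and return the table as a string. The sketch below assumes, as the note above states, that every inner dictionary has the same keys; format_table and its width parameter are hypothetical names.

```python
def format_table(D, width=4):
    """Return the nested dict D as a right-aligned text table."""
    columns = list(next(iter(D.values())))  # inner keys, assumed identical per row
    header = ' ' * width + ''.join(f'{c:>{width}}' for c in columns)
    rows = [header]
    for row_key, inner in D.items():
        cells = ''.join(f'{inner[c]:>{width}}' for c in columns)
        rows.append(f'{row_key:>{width}}' + cells)
    return '\n'.join(rows)


# Small made-up input in the same shape as the exercise data
sample = {0: {'A': 2, 'B': 9}, 1: {'A': 51, 'B': 76}}
print(format_table(sample))
```

Returning a string instead of printing makes the function easier to test and to reuse, for example when writing the table to a file.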
Hour distribution
File ‘mbox.txt’ contains a lot of emails including message body and header. In each header, the first line starting with ‘From’ contains the date and time when the email was sent. Please write a Python program to scan the file and print out the email frequency distribution by the hour.
Hint:
Create a dictionary with the keys set to the hours: 0 to 23, and count the frequencies of emails in each hour.
Optional process:
Try to visualize your result.
Hint:
If you prefer the visualization as the following output, you can use ‘\u2588’ to print █.
method 1
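fs = open("mbox.txt","r")
# l1 holds the hours 0..23, l2 the matching email counts
l1 = []
for i in range(24):
    l1.append(i)
l2 = [0]*24
for line in fs:
    line = line.strip()
    # only header lines that start with "From " (with a space) carry the send time
    if line[:5] == "From ":
        k = line.split()
        time = k[5]
        hour = time.split(':')
        h = int(hour[0])
        v = l1.index(h)
        l2[v]+=1
# print one bar per hour that received at least one email
for i in range(24):
    if l2[i] > 0:
        print(str(l1[i]).rjust(2)+":"+str(l2[i]).rjust(3), "\u2588"*(l2[i]//4+1),sep="\t")
fs.close()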
method 2
with open('mbox.txt') as fh:
    # zero-padded hour labels '00'..'23'
    hours=[]
    for i in range(24):
        hours += ['{:0>2d}'.format(i)]
    hourlist=[]
    for line in fh :
        if line.startswith('From '):
            time = line.split()[5]      # the hh:mm:ss field
            hour = time.split(':')[0]   # the two-digit hour as a string
            hourlist += [hour]
for hour in hours:
    print(str(int(hour)).rjust(2)+":"+str(hourlist.count(hour)).rjust(3)+'\u2588'*(hourlist.count(hour)//4+1))
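The hint suggests a dictionary keyed by the hours 0 to 23. A sketch of that approach is shown below as a function that accepts any iterable of lines, so it can be tried without the file. The sample lines are made up and only imitate the assumed header layout (From, address, weekday, month, day, time, year); count_hours is a hypothetical name.

```python
def count_hours(lines):
    """Count 'From ' header lines per hour; returns a dict with keys 0..23."""
    counts = {h: 0 for h in range(24)}
    for line in lines:
        if line.startswith('From '):
            fields = line.split()
            # Assumed layout: From <address> <weekday> <month> <day> <hh:mm:ss> <year>
            hour = int(fields[5].split(':')[0])
            counts[hour] += 1
    return counts


# Made-up lines standing in for the contents of mbox.txt
sample_lines = [
    'From alice@example.com Sat Jan  5 09:14:16 2008\n',
    'From: alice@example.com\n',  # skipped: starts with 'From:' not 'From '
    'From bob@example.com Fri Jan  4 18:10:48 2008\n',
    'From carol@example.com Fri Jan  4 09:02:00 2008\n',
]
counts = count_hours(sample_lines)
for hour, n in counts.items():
    if n:
        print(f'{hour:>2}:{n:>3} ' + '\u2588' * n)
```

With the real file, the same function could be called as count_hours(fh) inside a with open('mbox.txt') as fh: block, since a file object iterates over its lines.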
Recursion improvement
Remember how inefficient the recursive solution is when we calculate Fibonacci numbers? With a dictionary, we can improve it. Please rewrite the recursive Fibonacci function so that it becomes more efficient with the help of a dictionary.
Hint:
Use a dictionary memo to save the already calculated values! Each time the value of fibImp(a) is needed, check the dictionary memo first before calculating it from scratch. Each time a value is calculated, save it to the dictionary for later use.
method 1
# use the base cases to initialize the dictionary
memo = {0:0, 1:1}
def fibImp(n):
    #---------------------Your code------------------
    if n not in memo:
        memo[n] = fibImp(n-1) + fibImp(n-2)
    return memo[n]
    #--------------------End of code-----------------
method 2
memo = {0:0, 1:1}
def fibImp(n):
    if n in memo:
        return memo[n]
    else:
        # fill the dictionary iteratively from the largest cached index up to n
        t = max(memo.keys())
        while t < n :
            memo[t+1]=memo[t]+memo[t-1]
            t += 1
        return memo[n]
fibImp(1000)
fibImp(100)
fibImp(1000)
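The standard library provides functools.lru_cache, a decorator that keeps a cache of previous results in much the same spirit as the memo dictionary. The sketch below is an alternative for comparison, not the solution the exercise asks for; fib_cached is a hypothetical name.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded cache: every computed value is kept
def fib_cached(n):
    """Recursive Fibonacci; repeated subproblems are served from the cache."""
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)


print(fib_cached(10))   # 55
print(fib_cached(100))  # 354224848179261915075
```

The function is still recursive, so a very large first call may exceed Python's recursion limit. Calling it with gradually increasing arguments fills the cache and keeps the recursion shallow.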