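Requirement:
Write a function that takes one parameter of type string. The function finds every substring contained in the input, counts how many times each distinct substring occurs in the input, and outputs the results in descending order of count.
For example, if the input is abcdcccabcc, then bcd and bc are substrings but ac is not (a substring must contain at least 2 characters); ab, for instance, appears 2 times.
Here is the final code first: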
def sort_zichuan(s1: str):
    '''
    :param s1: the input string
    :return: a list of (substring, count) tuples sorted by count, descending
    '''
    s_count_dict = {}
    # i is the start index; it stops at len(s1) - 2 so at least 2 characters remain
    for i in range(0, len(s1) - 1):
        # j is the exclusive end index, so the shortest slice is s1[i:i + 2]
        for j in range(i + 2, len(s1) + 1):
            sub_s = s1[i:j]
            if sub_s in s_count_dict:
                s_count_dict[sub_s] += 1
            else:
                s_count_dict[sub_s] = 1
    return sorted(s_count_dict.items(), key=lambda item: item[1], reverse=True)


s1 = 'abcdcccabcc'
print(sort_zichuan(s1))
# [('cc', 3), ('ab', 2), ('abc', 2), ('bc', 2), ('abcd', 1), ('abcdc', 1), ('abcdcc', 1), ('abcdccc', 1), ('abcdccca', 1), ('abcdcccab', 1), ('abcdcccabc', 1), ('abcdcccabcc', 1), ('bcd', 1), ('bcdc', 1), ('bcdcc', 1), ('bcdccc', 1), ('bcdccca', 1), ('bcdcccab', 1), ('bcdcccabc', 1), ('bcdcccabcc', 1), ('cd', 1), ('cdc', 1), ('cdcc', 1), ('cdccc', 1), ('cdccca', 1), ('cdcccab', 1), ('cdcccabc', 1), ('cdcccabcc', 1), ('dc', 1), ('dcc', 1), ('dccc', 1), ('dccca', 1), ('dcccab', 1), ('dcccabc', 1), ('dcccabcc', 1), ('ccc', 1), ('ccca', 1), ('cccab', 1), ('cccabc', 1), ('cccabcc', 1), ('cca', 1), ('ccab', 1), ('ccabc', 1), ('ccabcc', 1), ('ca', 1), ('cab', 1), ('cabc', 1), ('cabcc', 1), ('abcc', 1), ('bcc', 1)]
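One detail in the output above is worth checking: 'cc' is counted 3 times because occurrences that overlap inside 'ccc' are counted separately. The small sketch below compares this with the built-in str.count, which skips overlapping matches. The helper name count_overlapping is hypothetical and is not part of the original code.

```python
def count_overlapping(text: str, sub: str) -> int:
    """Count occurrences of sub in text, including overlapping ones (hypothetical helper)."""
    return sum(1 for i in range(len(text) - len(sub) + 1) if text.startswith(sub, i))


s1 = 'abcdcccabcc'
print(count_overlapping(s1, 'cc'))  # 3, matches the function's result
print(s1.count('cc'))               # 2, str.count ignores overlapping matches
```

So the nested-loop approach in this post counts every starting position, which is what the requirement's example (ab appears 2 times, cc appears 3 times) relies on.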
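Analyzing the problem
There are three problems to solve:
1. Find the substrings
2. Count how many times each substring occurs
3. Sort
Step 1: find the substrings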
# To get all the substrings, the code looks like this
s1 = 'abcdcccabcc'
new_s1 = []  # list used to store the substrings
for i in range(0, len(s1) - 1):
    for j in range(i + 2, len(s1) + 1):
        new_s1.append(s1[i:j])
print(new_s1)
# ['ab', 'abc', 'abcd', 'abcdc', 'abcdcc', 'abcdccc', 'abcdccca', 'abcdcccab', 'abcdcccabc', 'abcdcccabcc', 'bc', 'bcd', 'bcdc', 'bcdcc', 'bcdccc', 'bcdccca', 'bcdcccab', 'bcdcccabc', 'bcdcccabcc', 'cd', 'cdc', 'cdcc', 'cdccc', 'cdccca', 'cdcccab', 'cdcccabc', 'cdcccabcc', 'dc', 'dcc', 'dccc', 'dccca', 'dcccab', 'dcccabc', 'dcccabcc', 'cc', 'ccc', 'ccca', 'cccab', 'cccabc', 'cccabcc', 'cc', 'cca', 'ccab', 'ccabc', 'ccabcc', 'ca', 'cab', 'cabc', 'cabcc', 'ab', 'abc', 'abcc', 'bc', 'bcc', 'cc']
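The loop bounds are easier to see on a shorter string: the start index i must leave room for at least two characters, and the slice end j is exclusive, so the shortest slice is s1[i:i + 2] and the longest is s1[i:len(s1)]. The sketch below wraps the same loops in a function (the name all_substrings and the min_len parameter are assumptions for illustration) and checks that a string of length n yields n(n-1)/2 slices of length 2 or more.

```python
def all_substrings(text: str, min_len: int = 2) -> list:
    """Return every contiguous slice of text with at least min_len characters."""
    n = len(text)
    return [text[i:j]
            for i in range(0, n - min_len + 1)
            for j in range(i + min_len, n + 1)]


print(all_substrings('abcd'))
# ['ab', 'abc', 'abcd', 'bc', 'bcd', 'cd']

n = len('abcdcccabcc')
print(len(all_substrings('abcdcccabcc')), n * (n - 1) // 2)  # 55 55
```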
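Step 2: count how many times each substring occurs
A substring and its count form a one-to-one pairing, much like a key and a value in a dictionary. So we use a dictionary to store them, with the substring as the key and its number of occurrences as the value.
The idea is:
When the substring is already in the dictionary, add one to the value stored under that key; otherwise set the value to 1 (a substring seen for the first time has occurred exactly once so far, and any later occurrence goes through the first branch).
Combined with the code from step 1, this becomes: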
s1 = 'abcdcccabcc'
s_count_dict = {}  # dictionary that maps each substring to its count
for i in range(0, len(s1) - 1):
    for j in range(i + 2, len(s1) + 1):
        sub_s = s1[i:j]  # the current substring
        # if the substring is already in the dictionary, add 1 to its value
        if sub_s in s_count_dict:
            s_count_dict[sub_s] += 1
        else:
            s_count_dict[sub_s] = 1
print(s_count_dict)
# {'ab': 2, 'abc': 2, 'abcd': 1, 'abcdc': 1, 'abcdcc': 1, 'abcdccc': 1, 'abcdccca': 1, 'abcdcccab': 1, 'abcdcccabc': 1, 'abcdcccabcc': 1, 'bc': 2, 'bcd': 1, 'bcdc': 1, 'bcdcc': 1, 'bcdccc': 1, 'bcdccca': 1, 'bcdcccab': 1, 'bcdcccabc': 1, 'bcdcccabcc': 1, 'cd': 1, 'cdc': 1, 'cdcc': 1, 'cdccc': 1, 'cdccca': 1, 'cdcccab': 1, 'cdcccabc': 1, 'cdcccabcc': 1, 'dc': 1, 'dcc': 1, 'dccc': 1, 'dccca': 1, 'dcccab': 1, 'dcccabc': 1, 'dcccabcc': 1, 'cc': 3, 'ccc': 1, 'ccca': 1, 'cccab': 1, 'cccabc': 1, 'cccabcc': 1, 'cca': 1, 'ccab': 1, 'ccabc': 1, 'ccabcc': 1, 'ca': 1, 'cab': 1, 'cabc': 1, 'cabcc': 1, 'abcc': 1, 'bcc': 1}
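The if/else counting can also be written with collections.Counter from the standard library. This is an alternative sketch, not the approach used in this post; it should produce the same counts.

```python
from collections import Counter

s1 = 'abcdcccabcc'
# Counter increments the count for each substring produced by the generator
s_count = Counter(s1[i:j]
                  for i in range(0, len(s1) - 1)
                  for j in range(i + 2, len(s1) + 1))

print(s_count['cc'], s_count['ab'], s_count['bcd'])  # 3 2 1
# A missing key returns 0 instead of raising KeyError
print(s_count['ac'])  # 0
# most_common(1) returns the single most frequent substring
print(s_count.most_common(1))  # [('cc', 3)]
```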
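Step 3: sort
A dictionary has no sort method of its own (since Python 3.7 it only preserves insertion order), so to sort it we can use the built-in higher-order function sorted.
The parameters of sorted are as follows:
The first parameter is an iterable; the second, key, is a function that computes the value to sort by; the third, reverse, controls whether to sort in descending order. key and reverse must be passed by keyword.
For our dictionary, the sorting code looks like this: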
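As a small illustration of those three parameters, the sketch below sorts a made-up list of words (the data is hypothetical) by length, first ascending and then descending.

```python
words = ['pear', 'fig', 'banana', 'kiwi']

# key=len sorts by word length; ties keep their original order
print(sorted(words, key=len))                # ['fig', 'pear', 'kiwi', 'banana']
# reverse=True sorts from largest to smallest
print(sorted(words, key=len, reverse=True))  # ['banana', 'pear', 'kiwi', 'fig']
# sorted returns a new list; the original is unchanged
print(words)                                 # ['pear', 'fig', 'banana', 'kiwi']
```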
sorted(s_count_dict.items(),key=lambda item:item[1],reverse=True)
Explanation: s_count_dict.items() gives us every key/value pair in the dictionary. Its content looks like this:
print(s_count_dict.items())
# dict_items([('ab', 2), ('abc', 2), ('abcd', 1), ('abcdc', 1), ('abcdcc', 1), ('abcdccc', 1), ('abcdccca', 1), ('abcdcccab', 1), ('abcdcccabc', 1), ('abcdcccabcc', 1), ('bc', 2), ('bcd', 1), ('bcdc', 1), ('bcdcc', 1), ('bcdccc', 1), ('bcdccca', 1), ('bcdcccab', 1), ('bcdcccabc', 1), ('bcdcccabcc', 1), ('cd', 1), ('cdc', 1), ('cdcc', 1), ('cdccc', 1), ('cdccca', 1), ('cdcccab', 1), ('cdcccabc', 1), ('cdcccabcc', 1), ('dc', 1), ('dcc', 1), ('dccc', 1), ('dccca', 1), ('dcccab', 1), ('dcccabc', 1), ('dcccabcc', 1), ('cc', 3), ('ccc', 1), ('ccca', 1), ('cccab', 1), ('cccabc', 1), ('cccabcc', 1), ('cca', 1), ('ccab', 1), ('ccabc', 1), ('ccabcc', 1), ('ca', 1), ('cab', 1), ('cabc', 1), ('cabcc', 1), ('abcc', 1), ('bcc', 1)])
lambda item:item[1]
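lambda a: b defines an anonymous function whose parameter is a and whose return value is b. Applied to this problem, sorted passes in each item of s_count_dict.items() in turn, such as ('ab', 2), ('abc', 2), and so on, and the lambda returns the second element of that tuple, item[1], which is the count.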
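To make the lambda less mysterious, the sketch below shows two equivalent ways to write the same key: a named function (second is a hypothetical name) and operator.itemgetter from the standard library. The sample pairs are a small subset chosen for illustration.

```python
from operator import itemgetter

pairs = [('ab', 2), ('cc', 3), ('bcd', 1)]


def second(item):
    """Named equivalent of lambda item: item[1]."""
    return item[1]


by_lambda = sorted(pairs, key=lambda item: item[1], reverse=True)
by_def = sorted(pairs, key=second, reverse=True)
by_getter = sorted(pairs, key=itemgetter(1), reverse=True)

print(by_lambda)                         # [('cc', 3), ('ab', 2), ('bcd', 1)]
print(by_lambda == by_def == by_getter)  # True
```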
reverse=True
This means sorting in reverse order (from largest to smallest).
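Substrings with the same count keep the order in which they were first inserted into the dictionary, because sorted is stable. If a deterministic tie-break is wanted, one option (an extension, not part of the original requirement) is to sort on a tuple: the negated count first, then the substring itself. The counts below are a hand-picked subset used only for illustration.

```python
counts = {'ab': 2, 'cc': 3, 'bc': 2, 'abc': 2, 'bcd': 1}

# -item[1] gives descending counts; item[0] breaks ties alphabetically
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
print(ranked)
# [('cc', 3), ('ab', 2), ('abc', 2), ('bc', 2), ('bcd', 1)]
```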