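On January 1, 2018, CHERRY's custom-build service officially went live. According to CHERRY, everything from mixing and matching the switches inside the keyboard to personalizing the case and keycaps can be chosen in "CHERRY键盘私人订制" (CHERRY Keyboard Custom Build), so you can put together a keyboard that is uniquely yours. Which raises a question: if you mostly type with a pinyin input method, how should the switches for the 26 letter keys be mixed? To decide that, you first need to know which letter keys get hit the most. OK, time to show some real skills!
First, fire up the computer, download a carefully edited txt edition of Dream of the Red Chamber (红楼梦), write the code, run it, and look at the result. Done! It's that simple, haha. The actual idea is this: convert the full text of Dream of the Red Chamber to pinyin, then count how many times each letter appears. Here is the code: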
import collections
import pprint

from pypinyin import lazy_pinyin


def getpinyin(file, pyfile):
    # Convert the text to toneless pinyin line by line, keeping only alphabetic tokens.
    with open(file, 'r', encoding='UTF-8') as src, \
            open(pyfile, 'w', encoding='UTF-8') as output:
        for line in src:
            rs = line.rstrip('\n')
            py = ''.join(x for x in lazy_pinyin(rs) if x.isalpha())
            output.write(py)


def getcharcount(filename):
    # Upper-case the pinyin text, count every character, and print the 26 most common.
    with open(filename, 'r', encoding='UTF-8') as info:
        count = collections.Counter(info.read().upper()).most_common(26)
    print(pprint.pformat(count))


if __name__ == '__main__':
    getpinyin('hlm.txt', 'data')
    getcharcount('data')
The result:
[('I', 299992),
('A', 241824),
('N', 228524),
('E', 171511),
('U', 164922),
('H', 138596),
('O', 130906),
('G', 113403),
('Y', 73109),
('Z', 68395),
('S', 62901),
('D', 59323),
('L', 58584),
('J', 48926),
('B', 41133),
('X', 35880),
('R', 31328),
('M', 28514),
('T', 28296),
('Q', 26181),
('W', 25719),
('C', 23619),
('F', 16787),
('K', 10380),
('P', 6986),
('V', 1371)]
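Raw counts are easier to reason about as shares of all letters typed. Here is a minimal sketch that takes the list above, copied in by hand as a dict, and turns it into percentages. The names `counts` and `shares` are my own and are not part of the original script.

```python
# Counts copied from the result list above.
counts = {
    'I': 299992, 'A': 241824, 'N': 228524, 'E': 171511, 'U': 164922,
    'H': 138596, 'O': 130906, 'G': 113403, 'Y': 73109, 'Z': 68395,
    'S': 62901, 'D': 59323, 'L': 58584, 'J': 48926, 'B': 41133,
    'X': 35880, 'R': 31328, 'M': 28514, 'T': 28296, 'Q': 26181,
    'W': 25719, 'C': 23619, 'F': 16787, 'K': 10380, 'P': 6986,
    'V': 1371,
}

total = sum(counts.values())

# Share of all counted letters that each key accounts for.
shares = {letter: n / total for letter, n in counts.items()}

# Print the keys from most to least used.
for letter, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{letter}: {share:.2%}")

# Combined share of the five most-used keys.
top5 = sum(sorted(shares.values(), reverse=True)[:5])
print(f"top 5 combined: {top5:.2%}")
```

On these numbers the five most frequent letters (I, A, N, E, U) together account for a little over half of all letters, so those are the keys where the choice of switch matters most.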
Another edition of Dream of the Red Chamber gives:
[('I', 291961),
('A', 238091),
('N', 222311),
('E', 170788),
('U', 159466),
('H', 135272),
('O', 128644),
('G', 110039),
('Y', 70254),
('Z', 66673),
('S', 60949),
('D', 58349),
('L', 57619),
('J', 48402),
('B', 39951),
('X', 35021),
('R', 31102),
('M', 28590),
('T', 27755),
('Q', 25357),
('W', 25016),
('C', 22680),
('F', 16099),
('K', 10159),
('P', 6811),
('V', 1333)]
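A quick way to see whether the choice of edition matters is to compare the letter rankings of the two runs. This is a rough sketch, with both result lists copied in by hand. The names `first`, `second` and `ranking` are hypothetical helpers, not from the original code.

```python
# Counts copied from the two result lists above.
first = {
    'I': 299992, 'A': 241824, 'N': 228524, 'E': 171511, 'U': 164922,
    'H': 138596, 'O': 130906, 'G': 113403, 'Y': 73109, 'Z': 68395,
    'S': 62901, 'D': 59323, 'L': 58584, 'J': 48926, 'B': 41133,
    'X': 35880, 'R': 31328, 'M': 28514, 'T': 28296, 'Q': 26181,
    'W': 25719, 'C': 23619, 'F': 16787, 'K': 10380, 'P': 6986,
    'V': 1371,
}
second = {
    'I': 291961, 'A': 238091, 'N': 222311, 'E': 170788, 'U': 159466,
    'H': 135272, 'O': 128644, 'G': 110039, 'Y': 70254, 'Z': 66673,
    'S': 60949, 'D': 58349, 'L': 57619, 'J': 48402, 'B': 39951,
    'X': 35021, 'R': 31102, 'M': 28590, 'T': 27755, 'Q': 25357,
    'W': 25016, 'C': 22680, 'F': 16099, 'K': 10159, 'P': 6811,
    'V': 1333,
}


def ranking(counts):
    """Return the letters ordered from most to least frequent."""
    return [letter for letter, _ in sorted(counts.items(), key=lambda kv: kv[1], reverse=True)]


same_order = ranking(first) == ranking(second)
print("same ranking in both editions:", same_order)
print("".join(ranking(first)))
```

The two editions differ in absolute counts but put the 26 letters in the same order, which suggests the ranking does not depend much on which edition of the text is used.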
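So that's the result. This is only one approach, offered to get the discussion started, and using Dream of the Red Chamber as the corpus is not necessarily rigorous. That's the gist of it, so make of it what you will.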