Comprehensive Exercise: Word Frequency Counting

news = '''In the modern Chinese history of fine arts education, Xu Beihong (1895~ 1953), is to adopt the western art long modern painting master, prodromal type of art educators. Xu Beihong was born in rural areas, young the family was poor, his father Xu Dazhang is a village school teacher, good at flower, figure painting. At the age of 4 Xu Beihong began reading in school, and was interested in painting. At the age of 9, from whom he learned painting at the age of 10, has been able to make father's assistant. When busy season, and poor farming, working life, so that a child nurturance hard-working, simple style and honest and upright character.
When Xu Beihong was 13 years old in the year of great famine, father to walk the political arena, rely on vend word selling paintings. At the age of 17, his father contracted the illness, go from bad to worse family, a family of eight living burden, then fell to Xu Beihong 's shoulder. He started in the primary school, middle school teachers have to Shanghai pictures, such as selling paintings. 19 years old when his father died, more impoverished family.
In 1915, Xu Beihong again goes to Shanghai, the friends help," I always in time" ( at the time of the casino corner ) stay, hard creation, at the same time to night school to learn french. He drew a horse, send aesthetic books curator Gao Jianfu. Gao Jianfu and his brother Gao Qifeng read, appreciate his art. At this time, Xu Beihong was admitted to the aurora University, but without Qian Jinxue, but Gao Qifeng grant. Then met Kang Youwei, have the opportunity to observe the Kang collection of rubbings. And in the view of art and was influenced by Kang Youwei.
In 1917, Xu Beihong went to Japan to study art in Tokyo, fall back to Beijing, invited by Cai Yuanpei Ren Peking University descriptive research mentor, and met Chen Shiceng. Chen Shiceng is a poet Chen Sanyuan's eldest son, his grandfather is the renowned Hunan governor Chen Baozhen. He is the beginning of the twentieth Century China 's most outstanding art master, but also a bole type character, Qi Baishi fame is due to his strong push tours. After Chen Shiceng's death, Qi Baishi had the poem mourning, said: "I'm no king is back, you no I do not enter."
Xu Beihong 1919France, famous French painter Yang karma. Dangan early Sciences in nineteenth Century master Connor 's gate, later became one of the leaders of the French state. He is especially good at depicting the life of the Breton fishermen and farmers, the" blessing"," bread"," for" men such as painting as early as the Xu Beihong adore. Since then, Xu Beihong every week are almost always works to Hickey Road No. 65Dangan studio for advice, and joined the painter in the tea party, with Meniere, times to your conversation greatly benefit.
1921after going to Germany, Xu Beihong studied at the studio of painter CommScope, next to Paris. In 1925 by the Singapore home. Spring of second, he went to Paris, and went to Belgium Brussels Pro painting, travelled to Switzerland, Italy.
After returning home in 1927, served as a professor in the Department of Centre College, southern Shanghai art academy fine arts Dean, Peking University School of art director. In 1933with the Chinese modern painting in France, Germany, Belgium, Italy and the Soviet Union exhibition. During the war of resistance against Japan, the work to the Southeast Asia, India and other Southeast Asia exhibition, and all income donated to the refugees.
On the eve of the liberation, the Kuomintang government to send aircraft to pick up Xu Beihong and a number of well-known professor to go to Nanjing, by Xu Beihong refused. After the liberation, he was invited to China to attend the World Peace Congress, and served as president of the China Central Academy of Fine Arts, and was elected to the National Federation of standing committee, CPPCC representatives and the Chinese National Artists Association president. 1952is ill, his life and all the country donated collection of works creation. Died in 1953, only live 59 years old. The country is this great artist in Beijing established the Xu Beihong Memorial, saved his one thousand works. He composed works amounted to thousands of pieces, training and found a large number of excellent art talents.
Xu Beihong is good at Chinese painting, oil painting, especially fine sketch. His paintings are full of passion, skill is very high. Famous paintings are" my"" after the creek, the five hundred people", traditional Chinese painting has" nine party" Gao", great determination and courage"," Reunion", tokyo. Most can reflect the Xu Beihong personality, to express his thoughts and feelings than his horse. He on equine muscle, skeletal and expression dynamics, made long-term observational study, draw thousands of sketches. So he painted horse figure painting and earned, not arrogant, not trivial subtle, reinforcing bone Zhuang, of great momentum, spirit all foot. Some figures, lions, cats and other works, is the large amount of high quality. His painting" learning from nature, seek the truth " principle.'''
sep=''','?!:.()'''
exclude={'the','and','of','to','in'}

# Replace every separator character (, . ? ! ' : and so on) with a space
for c in sep:
    news=news.replace(c,' ')

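# Convert everything to lowercase and split into words
wordList=news.lower().split()
# for w in wordList:
#     wordDict[w]=wordDict.get(w,0)+1
# for w in exclude:
#     del(wordDict[w])

# Exclude grammatical words: pronouns, articles, conjunctions
wordDict = {}
wordSet=set(wordList)-exclude
for w in wordSet:
    wordDict[w]=wordList.count(w)

# Sort by frequency, highest first
dictList=list(wordDict.items())
dictList.sort(key=lambda x:x[1],reverse=True)

# Print the 20 most frequent words
for i in range(20):
    print(dictList[i])
# print(dictList)
# for w in wordDict:
#     print(w,wordDict[w])

# Exercise requirement: save the text being analyzed as a UTF-8 encoded file
# and obtain the content for word frequency analysis by reading that file.
# The code below appends the top 25 words and their counts to news.txt.
f=open('news.txt','a',encoding='utf-8')
for i in range(25):
    f.write(dictList[i][0]+' '+str(dictList[i][1])+'\n')
f.close()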
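As an alternative to calling list.count() once per distinct word, the same tally can be built in a single pass with collections.Counter from the standard library. The sketch below is only an illustration: the function name top_words, its parameters, and the sample sentence are assumptions made for this example and are not part of the original exercise.

```python
from collections import Counter


def top_words(text, separators, stop_words, n=20):
    """Return the n most common words in text, ignoring stop_words.

    Hypothetical helper for illustration; not from the original post.
    """
    # Replace each separator with a space so adjacent words stay apart.
    for ch in separators:
        text = text.replace(ch, ' ')
    # Lowercase, split on whitespace, and drop the excluded words.
    words = [w for w in text.lower().split() if w not in stop_words]
    # Counter tallies every word in one pass over the list.
    return Counter(words).most_common(n)


sample = "The cat and the dog. The cat sleeps!"
result = top_words(sample, ",.?!:()'", {'the', 'and'}, n=2)
print(result)
```

Passing the news string together with sep and exclude from the script above should give a comparable top-20 list, although words with equal counts may come out in a different order.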

  

 



import jieba

# Read the novel as UTF-8 text
f = open('sanguoyanyi.txt', 'r',encoding='utf-8')
text = f.read()
f.close()

# Register character names so jieba keeps each one as a single token
jieba.add_word('曹操')
jieba.add_word('诸葛亮')
jieba.add_word('孔明')

punctuation = ''',。‘’“”:;()!?、 '''
# Tokens to exclude: whitespace and very common single characters
a = {'的','\n','\u3000','曰','之','不','人','军','操','一','将',
     '大','马','来','德','有','于','下','兵','此',
     '玄','公','见','为','何','中','而','可','吾',
     '出','也','以','与','上','后','今','其','去',
     '日','明','言'}

# Strip punctuation
for i in punctuation:
    text = text.replace(i, '')

# Segment the text once and reuse the result
tempwords = list(jieba.cut(text))
print(tempwords)

# Count tokens in a single pass, skipping excluded ones.
# text.count(word) would also match substrings inside longer words.
count = {}
for w in tempwords:
    if w not in a:
        count[w] = count.get(w, 0) + 1

# Sort by frequency, highest first
countList = list(count.items())
countList.sort(key=lambda x: x[1], reverse=True)
print(countList)

# Append the top 20 words and their counts to the output file
f = open('zzzCount.txt', 'a', encoding='utf-8')
for i in range(20):
    f.write(countList[i][0] + ':' + str(countList[i][1]) + '\n')
f.close()
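A note on counting segmented Chinese text: str.count() on the raw string counts substring matches, while counting over the token list counts whole words, and the two can disagree. The small sketch below shows the difference. The token list is written by hand for illustration and is only an assumption about how a segmenter might split the phrase; real jieba output may differ.

```python
text = "曹操操练兵马"

# Hypothetical segmentation result, written by hand for illustration;
# an actual segmenter such as jieba may split the phrase differently.
tokens = ["曹操", "操练", "兵马"]

# Substring counting: 操 is found inside both 曹操 and 操练.
substring_hits = text.count("操")

# Token counting: 操 never appears as a standalone token here.
token_hits = tokens.count("操")

print(substring_hits, token_hits)
```

This is one reason single characters can dominate a frequency list that is built from substring counts.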

  

Reposted from: https://www.cnblogs.com/leonHQ/p/8658889.html
