The Python package jieba does Chinese word segmentation, snownlp does Chinese natural-language processing, and pinyin provides pinyin conversion plus a Chinese-English dictionary.
pip install jieba
pip install snownlp
pip install pinyin    # pinyin conversion
The Chinese-English dictionary ships as Lib/site-packages/pinyin/cedict.py.
Test program: test_pinyin.py
# -*- coding: utf-8 -*-
import jieba
import snownlp
import pinyin
# the sentence to analyze
stmt = "我爱我的祖国"
#stmt = "我的目的地是上海浦东开发区"
seg_list = jieba.cut(stmt, cut_all=False)
words = ' '.join(seg_list)
print(words)
pyin = pinyin.get(words)
print(pyin)
print(type(seg_list))
from snownlp import SnowNLP
sn = SnowNLP(stmt)
wlist = sn.words
pinyins = sn.pinyin
print('wlist=', wlist) # word segmentation
print('pinyin:', pinyins) # pinyin
print('sentiments:', sn.sentiments) # sentiment analysis
print('keywords:', sn.keywords(3)) # keywords
print('summary:', sn.summary(3)) # summary / text abstraction
print('sim:', sn.sim(['我','国'])) # word similarity scores
# Chinese-English dictionary
from pinyin import cedict
cedict.init() # initialize the Chinese-English dictionary
for w in wlist:
    english = cedict.translate_word(w, dictionary=['simplified'])
    print(f"{w}: {english}")
Running python test_pinyin.py produces the following output:
Prefix dict has been built succesfully.
我 爱 我 的 祖国
wǒ ài wǒ de zǔguó
<class 'generator'>
wlist= ['我', '爱', '我', '的', '祖国']
pinyin: ['wo', 'ai', 'wo', 'de', 'zu', 'guo']
sentiments: 0.9304746506862666
keywords: ['祖国', '爱']
summary: ['我爱我的祖国']
sim: [0.5877866649021191, 0, 0.5877866649021191, 0, 0, 1.2992829841302609]
我: ['I', 'me', 'my']
爱: ['to love', 'to be fond of', 'to like', 'affection', 'to be inclined (to do sth)', 'to tend to (happen)']
我: ['I', 'me', 'my']
的: ['aim', 'clear']
祖国: ['motherland']
Note: seg_list is a generator, not a list.
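Because jieba.cut returns a generator, it is exhausted after a single pass: joining it a second time yields an empty string. A stdlib-only sketch of the pitfall, where fake_cut is a hypothetical stand-in for jieba.cut:

```python
def fake_cut(stmt):
    # Hypothetical stand-in for jieba.cut: yields tokens lazily, like a generator.
    for w in ["我", "爱", "我", "的", "祖国"]:
        yield w

seg_list = fake_cut("我爱我的祖国")
first = ' '.join(seg_list)   # consumes the generator
second = ' '.join(seg_list)  # generator already exhausted, so this is ''
print(repr(first))   # '我 爱 我 的 祖国'
print(repr(second))  # ''
```

When a reusable list is needed, call jieba.lcut(stmt) or wrap the generator with list(seg_list) before the first join.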