Data source
The data comes from the GitHub project https://github.com/mahavivo/english-dictionary
This post uses the file 英汉大词典_del_ipa_edited.txt, available at: https://github.com/mahavivo/english-dictionary/blob/master/%E8%8B%B1%E6%B1%89%E5%A4%A7%E8%AF%8D%E5%85%B8%EF%BC%88%E7%AC%AC%E4%BA%8C%E7%89%88%EF%BC%89/%E8%8B%B1%E6%B1%89%E5%A4%A7%E8%AF%8D%E5%85%B8_del_ipa_edited.txt
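The long URL is easier to read once its percent-encoded path is decoded. As a small sketch (the variable names are only for illustration), the standard library's urllib.parse.unquote recovers the Chinese folder and file names from the last two path segments:

```python
from urllib.parse import unquote

# The last two path segments of the URL above, copied verbatim.
ENCODED_PATH = (
    "%E8%8B%B1%E6%B1%89%E5%A4%A7%E8%AF%8D%E5%85%B8"
    "%EF%BC%88%E7%AC%AC%E4%BA%8C%E7%89%88%EF%BC%89/"
    "%E8%8B%B1%E6%B1%89%E5%A4%A7%E8%AF%8D%E5%85%B8_del_ipa_edited.txt"
)

# unquote turns each %XX sequence back into the UTF-8 text it encodes.
folder, filename = unquote(ENCODED_PATH).split("/")
print(folder)
print(filename)
```

The decoded file name is the one passed to pd.read_csv in the code of this post.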
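Processing the dictionary data
First, read the txt file with pandas, using ⬄ as the separator. Then call strip() on each word to remove leading and trailing whitespace. Finally, set the word column as the index to make later lookups easier.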
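import pandas as pd
def init_dictionary():
    # Each line of the file has the form "word⬄interpretation".
    # engine='python' is set explicitly: the non-ASCII separator is not supported by the default C parser.
    # Note: header=0 together with names replaces the first line of the file, so that line is not loaded as an entry.
    my_dictionary = pd.read_csv(
        '英汉大词典_del_ipa_edited.txt', sep='⬄', header=0,
        names=['word', 'interpretation'], engine='python')
    def check(series):
        # Strip leading and trailing whitespace from the word
        return series['word'].strip()
    my_dictionary['word'] = my_dictionary.apply(check, axis=1)
    # Use the word as the index so it can be looked up directly
    my_dictionary.set_index('word', inplace=True)
    return my_dictionary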
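As an aside, the row-by-row apply can probably be replaced with the vectorized .str.strip() accessor. The following is a minimal sketch, not the code used in this post: load_dictionary and SAMPLE are hypothetical names, and the three sample lines are made up to mimic the "word⬄interpretation" layout. Because the sample has no header line, it uses header=None instead of header=0.

```python
import io

import pandas as pd

# Hypothetical sample in the same "word⬄interpretation" layout as the real file.
SAMPLE = "apple ⬄n. 苹果\nbook⬄n. 书\n cat⬄n. 猫\n"


def load_dictionary(source):
    """Read a '⬄'-separated source and index it by the stripped word."""
    df = pd.read_csv(
        source, sep='⬄', header=None,
        names=['word', 'interpretation'], engine='python')
    # Vectorized strip over the whole column instead of apply(axis=1).
    df['word'] = df['word'].str.strip()
    return df.set_index('word')


sample_dictionary = load_dictionary(io.StringIO(SAMPLE))
print(sample_dictionary.index.tolist())
```

On a file with many thousands of entries, the vectorized call should be noticeably faster than calling a Python function once per row.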
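Looking up a word's definition
First check whether the index contains the given word, then return its interpretation, or None if the word is not in the dictionary.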
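def get_translate(dictionary, word):
    # Membership test against the index itself (no need to materialize .values)
    if word in dictionary.index:
        # Return the interpretation stored for this word
        return dictionary.loc[word].values.tolist()[0]
    else:
        return None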
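To see the lookup behaviour without downloading the file, here is a small sketch on a hand-built table. The toy DataFrame and the lookup helper are hypothetical and only stand in for the real dictionary; the helper relies on Series.get, which returns None for a missing label, and it assumes every word appears only once in the index.

```python
import pandas as pd

# Hypothetical two-entry table standing in for the real dictionary.
toy = pd.DataFrame(
    {'interpretation': ['n. 单词', 'n. 书']},
    index=pd.Index(['word', 'book'], name='word'))


def lookup(dictionary, word):
    # Series.get returns its default (None) when the label is absent,
    # so no explicit membership check is needed.
    return dictionary['interpretation'].get(word)


print(lookup(toy, 'word'))     # an existing entry
print(lookup(toy, 'missing'))  # a word that is not in the table
```

If the index contained duplicate words, the same call would return a Series of all matching interpretations rather than a single string, so the caller would need to handle that case.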
Complete example
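import pandas as pd
def init_dictionary():
    # Each line of the file has the form "word⬄interpretation".
    # engine='python' is set explicitly: the non-ASCII separator is not supported by the default C parser.
    my_dictionary = pd.read_csv(
        '英汉大词典_del_ipa_edited.txt', sep='⬄', header=0,
        names=['word', 'interpretation'], engine='python')
    def check(series):
        # Strip leading and trailing whitespace from the word
        return series['word'].strip()
    my_dictionary['word'] = my_dictionary.apply(check, axis=1)
    # Use the word as the index so it can be looked up directly
    my_dictionary.set_index('word', inplace=True)
    return my_dictionary
def get_translate(dictionary, word):
    # Return the interpretation if the word is in the index, otherwise None
    if word in dictionary.index:
        return dictionary.loc[word].values.tolist()[0]
    else:
        return None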
my_dictionary = init_dictionary()
print(get_translate(my_dictionary, 'word'))
print(get_translate(my_dictionary, 'fielsafjofoasfjiwoefj'))
print('a')