Python word dictionary: Finding dictionary words in a text file using a dictionary in Python

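This article describes how to process text files with Python, in particular how to use a dictionary to find words in text. It shows a script that uses the enchant library to validate English words, along with functions for handling HTML entities and splitting text.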

English cricket cuts ties with Zimbabwe Wednesday, 25 June, 2008 text<void(0);><void(0);> <void(0);>email <void(0);>print EMAIL THIS ARTICLE your name: your email address: recipient's name: recipient's email address: <;>add another recipient your comment: Send Mail<void(0);> close this form <http://ad.au.doubleclick.net/jump/sbs.com.au/worldnews;sz=300x250;tile=2;ord=123456789?> The England and Wales Cricket Board (ECB) announced it was suspending all ties with Zimbabwe and was cancelling Zimbabwe's tour of England next year.


The script should return:

English cricket cuts ties with Zimbabwe Wednesday

The England and Wales Cricket Board (ECB) announced it was suspending all ties with Zimbabwe and was cancelling Zimbabwe's tour of England next year
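One way to approach this kind of filtering is sketched below in Python 3, under stated assumptions: split the text into chunks at punctuation, tokenize each chunk, and keep only chunks in which most tokens pass a dictionary check. The function name `keep_wordy_chunks`, the thresholds `min_ratio` and `min_tokens`, and the tiny stand-in word set `KNOWN` are all hypothetical and chosen for illustration. This is not the poster's script, and it is not meant to reproduce the exact output above.

```python
import re


def keep_wordy_chunks(text, is_word, min_ratio=0.8, min_tokens=4):
    """Keep chunks of `text` in which most tokens pass `is_word`.

    `is_word` is any callable that takes a string and returns True/False.
    The thresholds are illustrative assumptions, not tuned values.
    """
    kept = []
    # Split at commas, colons, semicolons and sentence-ending punctuation.
    for chunk in re.split(r"[,.:;!?]", text):
        # Treat runs of letters (and apostrophes) as candidate words.
        tokens = re.findall(r"[A-Za-z']+", chunk)
        if len(tokens) < min_tokens:
            continue  # too short to judge; likely page residue
        hits = sum(1 for token in tokens if is_word(token))
        if hits / len(tokens) >= min_ratio:
            kept.append(chunk.strip())
    return kept


# Stand-in dictionary so the sketch runs without extra packages (hypothetical).
KNOWN = {"english", "cricket", "cuts", "ties", "with", "zimbabwe", "wednesday"}


def is_known(word):
    return word.lower() in KNOWN


sample = ("English cricket cuts ties with Zimbabwe Wednesday, 25 June, "
          "2008 text<void(0);> email")
result = keep_wordy_chunks(sample, is_known)
print(result)
```

If PyEnchant is installed, `enchant.Dict("en_US").check` could be passed as `is_word` in place of `is_known`. Form labels such as "your email address" are made of real words too, so a ratio test alone will not remove every piece of page residue.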


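I accepted abarnert's answer. Below is my final script. Note that it is very inefficient and should be cleaned up. Also, a disclaimer: I have not written code in a long time.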

import enchant
from enchant.tokenize import get_tokenizer
import os

def clean_files():
    os.chdir("TARGET_DIRECTORY")
    for files in os.listdir("."):
        # get the numbers out file names
        file_number = files[files.rfind("_")+1:files.rfind(".")]
        # Print status to screen
        print"Working
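The slice that pulls the number out of each file name can be looked at on its own. Below is a minimal Python 3 sketch, assuming file names shaped like `article_42.txt`. That name and the helper name `file_number` are hypothetical. It uses `os.path.splitext` rather than `rfind(".")`, because the `rfind` version silently drops the last character when a name has no extension.

```python
import os


def file_number(filename):
    """Return the text between the last underscore and the extension.

    If there is no underscore, the whole stem is returned, which matches
    what the rfind-based slice does in that case.
    """
    stem, _extension = os.path.splitext(filename)
    # rpartition yields ('', '', stem) when no underscore is present.
    return stem.rpartition("_")[2]


print(file_number("article_42.txt"))
```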
