Is there software that can search Python programs? - Text search using Python


I am working on a text search project and am using TextBlob to search for sentences in a text.

TextBlob pulls out all the sentences containing the keywords efficiently. However, for effective research I also want to pull out the sentence before and the sentence after each match, which I have been unable to figure out.

Below is the code I am using:

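from textblob import TextBlob

def extraxt_sents(Text, word):
    # Keywords arrive as one comma-separated string, e.g. "cat,dog"
    search_words = set(word.split(','))
    # Lowercase the text (keywords must be lowercase too in order to match)
    sents = ''.join([s.lower() for s in Text])
    blob = TextBlob(sents)
    # Keep every sentence that shares at least one word with the keywords
    matches = [str(s) for s in blob.sentences if search_words & set(s.words)]
    print(search_words)
    print(matches)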

Solution

If you want to get the sentences before and after the match, you can either write a loop that remembers the previous sentence, or use slices, like [from:to], on the blob.sentences list.

The best way might be to use the enumerate built-in function.

match_region = [list(map(str, blob.sentences[i-1:i+2]))  # from previous to next sentence
                for i, s in enumerate(blob.sentences)     # i is the index, s is the element
                if search_words & set(s.words)]           # same as your condition

Here, blob.sentences[i-1:i+2] will extract the sublist spanning from index i-1 (inclusive) to index i+2 (exclusive), and map turns the elements in this list into strings.
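To make the slice bounds concrete, here is a small sketch on a plain Python list standing in for blob.sentences. The sample sentences and the index i are made up for illustration.

```python
# A plain list stands in for blob.sentences; the sentences are made-up sample data.
sentences = ["first.", "second.", "third.", "fourth.", "fifth."]

i = 2  # pretend the keyword matched the sentence at index 2

# The slice covers indices i-1, i and i+1; index i+2 itself is excluded.
window = list(map(str, sentences[i - 1:i + 2]))

print(window)  # ['second.', 'third.', 'fourth.']
```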

Note: You should replace i-1 with max(0, i-1); otherwise, for a match in the first sentence, i-1 is -1, which Python interprets as the last element, so the slice usually comes back empty. If i+2 exceeds the list's length, on the other hand, this is not a problem, because slices are clamped to the end of the list.
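Putting the pieces together, the following is a minimal sketch of the whole approach with the max(0, i-1) guard applied. So that it runs without extra dependencies, it swaps TextBlob for a naive regex-based sentence splitter; split_sentences and extract_with_context are hypothetical helper names, and the sample text is invented. With TextBlob installed, blob.sentences and s.words would take the place of the two regex calls.

```python
import re


def split_sentences(text):
    """Naive splitter (an assumption for this sketch, not TextBlob's tokenizer)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_with_context(text, words):
    """Return each matching sentence together with its neighbouring sentences."""
    search_words = {w.strip().lower() for w in words.split(",")}
    sentences = split_sentences(text.lower())
    regions = []
    for i, sentence in enumerate(sentences):
        tokens = set(re.findall(r"\w+", sentence))
        if search_words & tokens:
            # max(0, i - 1) keeps the start index from wrapping around to -1;
            # an end index past the list length is clamped automatically.
            regions.append(sentences[max(0, i - 1):i + 2])
    return regions


text = "Cats sleep a lot. Dogs bark loudly. Birds sing at dawn. Fish swim."
print(extract_with_context(text, "cats"))
print(extract_with_context(text, "birds,fish"))
```

A match in the first sentence now returns that sentence plus the one after it, and a match in the last sentence returns it plus the one before it. Overlapping windows are reported once per match, so neighbouring matches share sentences.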
