Python: search a file for a list of words

weixin_39562338

First I started trying to search file for one single word with this code:

import re

shakes = open("tt.txt", "r")

for line in shakes:
    if re.match("(.*)(H|h)appy(.*)", line):
        print line,

But what if I need to check for multiple words? I was thinking that a for loop might work, searching the file once for each word in the list.

Would that be a convenient approach?

Source: https://stackoverflow.com/questions/29181206

2020-04-27 20:04

Accepted answer

Join the words in word_lst with | as the delimiter to build a single alternation pattern. The (?i) inline flag makes the match case-insensitive.

for line in shakes:
    if re.search(r"(?i)"+'|'.join(word_lst), line):
        print line,

Example:

>>> import re
>>> f = ['hello','foo','bar']
>>> s = '''hello
... hai
... Foo
... Bar'''.splitlines()
>>> for line in s:
...     if re.search(r"(?i)"+'|'.join(f), line):
...         print(line)
...
hello
Foo
Bar
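One caveat with the joined pattern is that it matches substrings (so "foo" also matches "foobar") and treats characters such as "." in a word as regex syntax. The following is a minimal sketch of one way to tighten it, not part of the original answer: the helper name build_pattern is hypothetical, the code is written for Python 3, and it assumes every word starts and ends with a letter or digit so that \b behaves as expected.

```python
import re

def build_pattern(words):
    """Compile one case-insensitive pattern matching any of `words` as a whole word."""
    # re.escape keeps characters such as "." or "+" literal instead of regex syntax
    escaped = (re.escape(w) for w in words)
    # \b anchors each alternative to word boundaries, so "foo" does not match "foobar"
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)

pattern = build_pattern(["hello", "foo", "bar"])
lines = ["hello", "hai", "Foo", "Bar", "foobar", "a.b"]
matches = [line for line in lines if pattern.search(line)]
print(matches)  # ['hello', 'Foo', 'Bar']
```

Compiling the pattern once outside the loop also avoids rebuilding the joined string for every line.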

Without regex:

>>> f = ['hello','foo','bar']
>>> s = '''hello
... hai
... Foo
... Bar'''.splitlines()
>>> for line in s:
...     if any(i.lower() in line.lower() for i in f):
...         print(line)
...
hello
Foo
Bar
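Applied to a real file rather than an in-memory list, the non-regex check could look like the sketch below. It is an illustration under stated assumptions, not the answerer's code: the function name lines_containing is hypothetical, the file is assumed to be UTF-8 text, and the demo writes a throwaway temporary file instead of reading tt.txt. It targets Python 3.

```python
import os
import tempfile

def lines_containing(path, words):
    """Yield (line_number, line) for each line of the file that contains any of `words`."""
    lowered = [w.lower() for w in words]  # lowercase the words once, not once per line
    # Assumption: the file is UTF-8 encoded text
    with open(path, "r", encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            text = line.lower()
            if any(w in text for w in lowered):
                yield number, line.rstrip("\n")

# Demo on a temporary file holding the same four lines as the example above
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("hello\nhai\nFoo\nBar\n")
result = list(lines_containing(tmp.name, ["hello", "foo", "bar"]))
os.remove(tmp.name)
print(result)  # [(1, 'hello'), (3, 'Foo'), (4, 'Bar')]
```

This reads the file in a single pass and tests every word against each line, rather than reopening the file once per word as the question proposed.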

2015-03-21
