Python: comparing files with similar names (recursively)

Latest recommended article published 2023-06-05 17:11:18

weixin_39929646


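With help from these guys, I was able to produce the following code, which reads two files (i.e. SA1.WRD and SA1.PHN), merges them, and compares the result against a sublist of words extracted from a dictionary: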


    import sys
    import os
    import re
    import itertools

    # generator function to merge sound and word files
    def takeuntil(iterable, stop):
        for x in iterable:
            yield x
            if x[1] == stop:
                break

    # open a dictionary file and create subset of words
    class_defintion = re.compile('([1-2] [lnr] t en|[1-2] t en)')
    with open('TIMITDIC.TXT') as w_list:
        entries = (line.split(' ', 1) for line in w_list)
        comp_set = [x[0] for x in entries if class_defintion.search(x[1])]

    # open word and sound files
    total_words = 0
    with open(sys.argv[1]) as unsplit_words, open(sys.argv[2]) as unsplit_sounds:
        sounds = (line.split() for line in unsplit_sounds)
        words = (line.split() for line in unsplit_words)
        output = [
            (word, " ".join(sound for _, _, sound in
                            takeuntil(sounds, stop)))
            for start, stop, word in words
        ]
        for x in output:
            total_words += 1

    # extract words from above into list of words in dictionary set
    glottal_environments = [x for x in output if x[0] in comp_set]
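To make the merge step easier to follow, here is a small sketch (Python 3) that runs the same takeuntil logic on in-memory data instead of files. The sample lines are hypothetical and only assume the "start end label" layout that the script's unpacking implies. The word and sound end times are compared as strings, exactly as in the script above.

```python
def takeuntil(iterable, stop):
    """Yield items up to and including the first one whose end time equals stop."""
    for x in iterable:
        yield x
        if x[1] == stop:
            break

# Hypothetical sample data in the assumed "start end label" layout.
word_lines = ["0 10 she", "10 25 had"]
sound_lines = ["0 4 sh", "4 10 iy", "10 15 hv", "15 20 ae", "20 25 d"]

# Both are generators, so each word resumes consuming sounds
# exactly where the previous word stopped.
sounds = (line.split() for line in sound_lines)
words = (line.split() for line in word_lines)

merged = [
    (word, " ".join(sound for _, _, sound in takeuntil(sounds, stop)))
    for start, stop, word in words
]
print(merged)  # [('she', 'sh iy'), ('had', 'hv ae d')]
```

The approach depends on sounds being a single shared iterator: if it were a list, every word would start again from the first sound.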

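I am trying to modify the part after the "#open a dictionary file" comment so that it runs over a large directory containing several subdirectories. Each subdirectory contains .txt, .wav, .wrd and .phn files. I only want to open the .wrd and .phn files, two at a time, and only when their base filenames match, i.e. SA1.wrd with SA1.phn, not SA1.wrd with SI997.phn.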

My guess at the time was to do something like this:

^{pr2}$

which returns: [('SA1.WRD', 'SA1.PHN'), ('SA2.WRD', 'SA2.PHN'), ('SI997.WRD', 'SI997.PHN')]
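For reference, a minimal sketch (Python 3) of one way a list of pairs like this could be built from a single directory listing. It is not the code from the question: the helper name pair_by_stem and the sample listing are hypothetical. Names are grouped by their base name with os.path.splitext, and only base names that have both extensions are kept.

```python
import os

def pair_by_stem(filenames, first_ext=".WRD", second_ext=".PHN"):
    """Return (first, second) tuples for names that share a base name."""
    by_stem = {}
    for name in filenames:
        stem, ext = os.path.splitext(name)
        # Extensions are compared case-insensitively; base names are not.
        by_stem.setdefault(stem, {})[ext.upper()] = name
    return [
        (found[first_ext], found[second_ext])
        for stem, found in sorted(by_stem.items())
        if first_ext in found and second_ext in found
    ]

# Hypothetical listing: SA1.WAV is ignored and SX10.WRD has no partner.
listing = ["SA1.WRD", "SA1.PHN", "SA1.WAV", "SA2.PHN",
           "SA2.WRD", "SI997.WRD", "SI997.PHN", "SX10.WRD"]
print(pair_by_stem(listing))
# [('SA1.WRD', 'SA1.PHN'), ('SA2.WRD', 'SA2.PHN'), ('SI997.WRD', 'SI997.PHN')]
```

Grouping by base name avoids relying on the order in which the directory listing happens to return the files.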

My first question is whether I am on the right track. If so, my second question is how to get each item in these tuples read as a file name.

Thanks for your help.

EDIT:

I figured I could put the code block inside a for loop:

    for f in files:
        # OPEN THE WORD AND PHONE FILES, COMPARE THEM (TAKE A WORD COUNT)
        total_words = 0
        with open(f[0]) as unsplit_words, open(f[1]) as unsplit_sounds:
            ...

However, this raises an IOError, which I suspect is caused by the single quotes around each item in each tuple.
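The quotes are unlikely to be the cause: they are only how Python prints a string inside a tuple and are not part of the file name. A more likely explanation, assuming the tuples hold bare file names collected from subdirectories, is that open() resolves a bare name against the current working directory. The sketch below (Python 3, with a temporary directory standing in for the real tree) illustrates both points.

```python
import os
import tempfile

pair = ("SA1.WRD", "SA1.PHN")
print(pair)      # the tuple's printed form shows quotes
print(pair[0])   # the string itself contains none

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: the file lives in a subdirectory, not in the cwd.
    subdir = os.path.join(root, "test1")
    os.mkdir(subdir)
    with open(os.path.join(subdir, pair[0]), "w") as handle:
        handle.write("0 10 she\n")

    # A bare name is looked up relative to the current working directory,
    # so this is normally False unless the script runs inside subdir.
    bare_name_found = os.path.exists(pair[0])

    # Joining the directory and the name gives a path that open() can find.
    full_path = os.path.join(subdir, pair[0])
    with open(full_path) as handle:
        first_line = handle.readline().strip()

print(bare_name_found, first_line)
```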

UPDATE

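I modified my original script to include os.path.join(root, f), as described below. The script now walks all the files in the directory tree, but it only processes the last two files it finds. Here is the output of print files:

[]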

[('test/test1/SI997.WRD', 'test/test1/SI997.PHN')]

[('test/test2/SI997.WRD', 'test/test2/SI997.PHN')]
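The printed output suggests that files is rebuilt for each directory visited, so a processing loop placed after the walk would only see the pairs from the last directory. The modified script is not shown, so this is a guess. A minimal sketch (Python 3) of one alternative is to create the list once before the walk and append to it. The name collect_pairs and the temporary test tree are hypothetical, and the line count stands in for the real merge logic.

```python
import os
import tempfile

def collect_pairs(top):
    """Walk top and return (wrd_path, phn_path) tuples for matching base names."""
    pairs = []  # created once, before the walk, so earlier directories are kept
    for root, dirs, names in os.walk(top):
        by_upper = {name.upper(): name for name in names}
        for name in names:
            stem, ext = os.path.splitext(name)
            if ext.upper() != ".WRD":
                continue
            partner = by_upper.get(stem.upper() + ".PHN")
            if partner is not None:
                pairs.append((os.path.join(root, name),
                              os.path.join(root, partner)))
    return sorted(pairs)

with tempfile.TemporaryDirectory() as top:
    # Build a small tree resembling the one in the printed output.
    for sub in ("test1", "test2"):
        os.mkdir(os.path.join(top, sub))
        for ext in (".WRD", ".PHN", ".WAV"):
            with open(os.path.join(top, sub, "SI997" + ext), "w") as handle:
                handle.write("0 10 she\n")

    pairs = collect_pairs(top)
    relative = [tuple(os.path.relpath(p, top) for p in pair) for pair in pairs]

    total_words = 0
    for wrd_path, phn_path in pairs:
        with open(wrd_path) as unsplit_words, open(phn_path) as unsplit_sounds:
            # The merge logic from the original script would go here.
            total_words += sum(1 for _ in unsplit_words)

print(relative, total_words)
```

Keeping the processing loop inside the walk loop would also work. Collecting the pairs first simply keeps discovery and processing separate.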
