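Some of my data is stored as CSV and Excel files, and I needed to pick out a subset of it. To save myself the manual work, I wrote a small script that reads the data names from an Excel file, finds the matching files under a given directory, and copies them into a new directory.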
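(1) Use pandas to read the Excel file
(2) Use Python's os module to search every file under a chosen root directory for the target file
(3) Use the shutil module to copy the file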
The Excel file looks like this:
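As a rough illustration of what the script expects from the sheet, the sketch below builds a small in-memory table in place of the real workbook. Only the column name 'Challenge record name' is taken from the script; the record names are made-up placeholders, and the DataFrame stands in for the result of pd.read_excel.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_excel(xlsx_path): only the column name
# 'Challenge record name' comes from the script; the values are made up.
data_frame = pd.DataFrame({'Challenge record name': ['a0001', 'b0002', 'c0003']})

# Append the extension to every record name to get the file names to look for.
file_names = [str(name) + '.wav' for name in data_frame['Challenge record name']]
print(file_names)
```

When reading a real .xlsx workbook, pd.read_excel also needs an Excel engine such as openpyxl to be installed.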
The directory where the data is stored:
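To make the search step concrete, here is a minimal sketch of what os.walk reports for a small directory tree. The folder and file names are hypothetical and are created in a temporary directory, so they do not reflect the real layout of the data.

```python
import os
import tempfile

# Build a small throwaway tree; the folder and file names are hypothetical.
with tempfile.TemporaryDirectory() as base:
    for sub, name in [('training-a', 'a0001.wav'), ('training-b', 'b0002.wav')]:
        os.makedirs(os.path.join(base, sub))
        open(os.path.join(base, sub, name), 'w').close()

    # os.walk yields one (root, dirs, files) tuple per directory it visits.
    layout = []
    for root, dirs, files in os.walk(base):
        layout.append((os.path.relpath(root, base), sorted(dirs), sorted(files)))

# Sort the result because the visiting order of sibling folders is not guaranteed.
layout.sort()
for entry in layout:
    print(entry)
```

Each tuple holds the current directory, its subdirectories, and its plain files, which is why checking a file name against the third element is enough to locate it.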
Code:
#!/usr/bin/env python3
import os, shutil
import pandas as pd


def findpath(file_dir, file_name):
    '''
    :param file_dir: the root directory to search
    :param file_name: target file name
    :return: the path of the target file, or None if it is not found
    '''
    for root, dirs, files in os.walk(file_dir):
        # print(root)   # path of the current directory
        # print(dirs)   # all subdirectories of the current directory
        # print(files)  # all non-directory files in the current directory
        if file_name in files:
            print(file_name, 'in', root)
            # os.path.join picks the right separator for the platform
            cur_file_path = os.path.join(root, file_name)
            return cur_file_path
    print('No such file!')
    return None
if __name__ == '__main__':
    xlsx_path = 'D:/Research/record_of_heart_sound/PhysioNet_Database/special test use/random_sample.xlsx'
    data_frame = pd.read_excel(xlsx_path)
    # root directory where the data is stored
    target_path = 'D:/Research/record_of_heart_sound/PhysioNet_Database'
    # build the target file names from the record-name column
    file_name = []
    for x in data_frame['Challenge record name']:
        file_name.append(x + '.wav')
    for x in file_name:
        # path of the source file
        srcFile = findpath(target_path, x)
        # path of the copy in the new directory
        targetFile = 'D:/Research/record_of_heart_sound/PhysioNet_Database/special test use/' + x
        # skip files that were not found, or that were found at the destination itself
        # (the new directory lies inside the searched root)
        if srcFile is None or os.path.abspath(srcFile) == os.path.abspath(targetFile):
            continue
        # copy to the new directory
        shutil.copyfile(srcFile, targetFile)
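The script above calls findpath once per file name, so the directory tree is walked again for every file, and the destination folder sits inside the searched root. One possible variation, sketched below, walks the tree a single time, builds a name-to-path index, and prunes the destination folder from the walk. The helper names build_index and copy_listed, and the demo file names, are hypothetical and not part of the original script.

```python
import os
import shutil
import tempfile


def build_index(root_dir, skip_dir=None):
    """Walk root_dir once and map each file name to the first path found."""
    index = {}
    skip = os.path.abspath(skip_dir) if skip_dir else None
    for root, dirs, files in os.walk(root_dir):
        if skip:
            # Prune the destination folder so earlier copies are not picked up.
            dirs[:] = [d for d in dirs
                       if os.path.abspath(os.path.join(root, d)) != skip]
        for name in files:
            index.setdefault(name, os.path.join(root, name))
    return index


def copy_listed(file_names, root_dir, dest_dir):
    """Copy every listed file found under root_dir into dest_dir.

    Returns the names that could not be found.
    """
    os.makedirs(dest_dir, exist_ok=True)
    index = build_index(root_dir, skip_dir=dest_dir)
    missing = []
    for name in file_names:
        src = index.get(name)
        if src is None:
            missing.append(name)
            continue
        shutil.copyfile(src, os.path.join(dest_dir, name))
    return missing


# Small demo on a temporary tree (all names here are hypothetical).
with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, 'training-a'))
    open(os.path.join(base, 'training-a', 'a0001.wav'), 'w').close()
    dest = os.path.join(base, 'special test use')
    missing = copy_listed(['a0001.wav', 'zzz.wav'], base, dest)
    copied = sorted(os.listdir(dest))

print('copied:', copied)
print('missing:', missing)
```

Returning the list of missing names, rather than only printing a message, makes it easier to check afterwards which records from the sheet had no matching file.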