Python: file and folder operations (a running summary, updated as I learn)

Latest recommended article published 2024-07-12 07:51:53

星夜猫

Article link: https://blog.csdn.net/qq_44418077/article/details/117173038

Python: file and folder operations

1. Common file operations
2. Common folder operations
- 2.1 Traversing a folder to get its files
- 2.2 Creating folders with os.makedirs()
References

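This post collects the Python file and folder operations I use most often, for my own future reference and for anyone else who finds them useful.
Some of the code examples are adapted from the web; apart from the sources credited inline, they are acknowledged together at the end of the post.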

1. Common file operations

1.1 Opening and closing files

f = open(InP, encoding='gb2312', errors='ignore')
...
f.close()

or, with a with statement, which closes the file automatically:

with open(InP, encoding='gb2312', errors='ignore') as f:
    ...
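To illustrate why the with form is usually the safer of the two, here is a minimal sketch. The sample file and the simulated error are hypothetical and exist only for this demonstration; the point is that the file ends up closed even when processing fails partway through.

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, 'sample.txt')

with open(sample_path, 'w', encoding='utf-8') as f:
    f.write('hello\n')

try:
    with open(sample_path, encoding='utf-8') as f:
        first_line = f.readline()
        # Simulate a failure in the middle of processing.
        raise ValueError('simulated failure while processing')
except ValueError:
    pass

# The with block closed the file even though an exception was raised.
print(f.closed)
```

With the plain open()/close() form, the same exception would skip f.close() unless the call were wrapped in try/finally.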

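1.2 Reading and writing files

1.2.1 Reading a file

f = open(InP, encoding='gb2312', errors='ignore')
for line in f:
    line = line.strip()  # remove leading/trailing whitespace, including the newline
f.close()

1.2.2 Writing a file

o = open(OutP, 'w')  # 'w' is required: the default mode 'r' is read-only
print(string, file=o)
o.close()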
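The two snippets above can be combined into a small round trip. This is only a sketch: write_lines, read_lines and the temporary demo file are hypothetical names introduced for illustration, and UTF-8 is an assumed encoding rather than the one used elsewhere in this post.

```python
import os
import tempfile

def write_lines(path, lines, encoding='utf-8'):
    """Write each string in lines to path, one per line."""
    with open(path, 'w', encoding=encoding) as out:
        for line in lines:
            print(line, file=out)  # print appends the newline for us

def read_lines(path, encoding='utf-8'):
    """Return the lines of path with surrounding whitespace stripped."""
    with open(path, encoding=encoding) as f:
        return [line.strip() for line in f]

# Hypothetical demo file in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
write_lines(demo_path, ['first', 'second'])
print(read_lines(demo_path))
```

Opening with 'a' instead of 'w' would append to an existing file rather than overwrite it.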

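1.3 Checking and changing file encodings

Sometimes we need to check the encoding of many files at once and convert them in bulk. The code below is adapted from https://blog.csdn.net/jy692405180/article/details/52496599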

import os
from chardet.universaldetector import UniversalDetector

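'''
chardet.universaldetector is a submodule of chardet.
It lets us detect a text's encoding incrementally (feeding it line by line or in chunks of our own choosing),
and detection can stop early once a confidence threshold is reached.
'''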

def get_filelist(path):
    Filelist = []
    for home, dirs, files in os.walk(path):
        for filename in files:
            # collect full paths; endswith avoids matching names such as "a.txt.bak"
            if filename.endswith(".txt"):
                Filelist.append(os.path.join(home, filename))
            # # to collect bare file names instead:
            # Filelist.append(filename)

    return Filelist

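def get_encode_info(file):
    '''
    Return the detected encoding of a text file (None if detection fails)
    '''
    with open(file, 'rb') as f:
        detector = UniversalDetector()
        for line in f:  # iterate lazily instead of loading every line with readlines()
            detector.feed(line)  # feed the detector one line at a time
            if detector.done:  # done is a bool: False by default, True once the detector is confident
                break
        detector.close()  # finalize the detection result
        return detector.result['encoding']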
     
def read_file(file):
    '''
    Read the file and return its content as bytes
    '''
    with open(file, 'rb') as f:
        return f.read()

def write_file(content, file):
    '''
    Write bytes to the file
    '''
    with open(file, 'wb') as f:
        f.write(content)

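def convert_encode(file, original_encode, des_encode):
    '''
    Convert the file from original_encode to des_encode, overwriting it in place
    '''
    file_content = read_file(file)
    # 'ignore' silently drops bytes that cannot be decoded
    file_decode = file_content.decode(original_encode, 'ignore')
    file_encode = file_decode.encode(des_encode)
    write_file(file_encode, file)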
 
if __name__ == "__main__":
    # folder to process
    filePath = r'E:\temp\temp'
    # collect the file paths
    Filelist = get_filelist(filePath)
    # target encoding
    encode_want = 'utf-8'
    # process each file
    for filename in Filelist:
        # detect the current encoding
        encode_info = get_encode_info(filename)
        print('old:', encode_info)
        # skip files whose encoding could not be detected (chardet reports None)
        if encode_info is None:
            continue
        # convert only if the file is not already in the target encoding
        if encode_info.lower() != encode_want:
            convert_encode(filename, encode_info, encode_want)

        # detect the encoding again after conversion
        encode_info = get_encode_info(filename)
        print('new:', encode_info)
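The script above overwrites each file in place and decodes with 'ignore', so a wrong guess about the encoding can silently lose characters. As a more cautious variant for cases where the source encoding is already known, the following sketch decodes strictly and swaps the result in only after it has been written successfully. convert_file_encoding and the demo file are hypothetical names for illustration; only the standard library is used, not chardet.

```python
import os
import tempfile

def convert_file_encoding(path, src_encoding, dst_encoding='utf-8'):
    """Re-encode path from src_encoding to dst_encoding.

    Decoding is strict, so a wrong src_encoding raises UnicodeDecodeError
    and the original file is left untouched.
    """
    with open(path, 'rb') as f:
        text = f.read().decode(src_encoding)
    # Write to a temporary file in the same directory, then swap it in.
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, 'wb') as tmp:
            tmp.write(text.encode(dst_encoding))
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        os.remove(tmp_path)
        raise

# Hypothetical demo: a small GB2312 file converted to UTF-8.
demo = os.path.join(tempfile.mkdtemp(), 'gb.txt')
with open(demo, 'wb') as f:
    f.write('中文'.encode('gb2312'))
convert_file_encoding(demo, 'gb2312')
with open(demo, 'rb') as f:
    print(f.read().decode('utf-8'))
```

The trade-off is that files with a few undecodable bytes are rejected rather than converted with losses, which may or may not be what a given batch job wants.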

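1.4 Checking file size

After processing a file we sometimes want some feedback, such as the file's size, or a comparison of its size before and after the change.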

Python 3.8.1 
>>> import os
>>> path = r'E:\Corpus\UCI_MachineLearningRepository\00196\ConfLongDemo_JSI.txt'
>>> os.path.getsize(path)
21546346
>>> print('File size = %.2f MB' % (os.path.getsize(path)/1024/1024))
File size = 20.55 MB
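For files of very different sizes, dividing by 1024 twice is not always the right unit. The helper below is a small sketch that picks the unit automatically; format_size is a hypothetical name, and it assumes binary units (1 KB = 1024 bytes), matching the division used in the session above.

```python
def format_size(num_bytes):
    """Format a byte count using binary units (1 KB = 1024 bytes here)."""
    size = float(num_bytes)
    for unit in ('B', 'KB', 'MB', 'GB'):
        if size < 1024:
            return '%.2f %s' % (size, unit)
        size /= 1024
    return '%.2f TB' % size

# Same byte count as in the interactive session above.
print(format_size(21546346))  # prints 20.55 MB
```

Calling it on os.path.getsize(path) before and after a conversion gives the before/after comparison mentioned above.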

1.5 Obtaining files

Usually we process text files that we already have. Sometimes, however, we first have to obtain the files and only then process them.

1.5.1 Downloading a file from a URL

from urllib.request import urlretrieve
urlretrieve(url, savepath)
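The two lines above leave url and savepath undefined. The sketch below fills them with hypothetical values so that it runs without network access: the "remote" file is a local temporary file addressed through a file:// URL, which urlretrieve also accepts. With a real download, url would be an http(s) address instead.

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlretrieve

# Hypothetical source file standing in for a remote resource.
work_dir = tempfile.mkdtemp()
source = Path(work_dir) / 'source.txt'
source.write_text('downloaded content', encoding='utf-8')

url = source.as_uri()  # a file:// URL pointing at the temporary file
savepath = os.path.join(work_dir, 'saved.txt')

# urlretrieve returns the local file name and the response headers.
saved_to, headers = urlretrieve(url, savepath)
print(saved_to == savepath)
```

The Python documentation lists urlretrieve as a legacy interface, so for new code urllib.request.urlopen may be the longer-lived choice.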

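2. Common folder operations

2.1 Traversing a folder to get its files

Batch processing often means handling a large number of files in the same folder, so being able to collect all of their paths quickly makes the batch work much easier.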

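Method 1: os.listdir()

import os

def get_files_path(file_list):
    '''
    file_list: absolute path of the folder
    return: list of absolute paths of the entries in the folder (subfolders included, not recursive)
    '''
    files_path = []
    filenames = os.listdir(file_list)
    for filename in filenames:
        # os.path.join adds the separator that plain string concatenation would miss
        filepath = os.path.join(file_list, filename)
        files_path.append(filepath)
    return files_path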

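Method 2: os.walk()

def get_files_path(InP):
    '''
    InP: target folder
    return: files_path, a list of the paths of all files under InP, subfolders included
            (the paths are absolute only if InP is absolute)
    '''
    # collected file paths
    files_path = []
    for root, dirs, files in os.walk(InP, topdown=False):
        # root: path of the folder currently being visited
        # dirs: list of subdirectory names in that folder
        # files: list of file names in that folder

        # print every subfolder
        for name in dirs:
            print(os.path.join(root, name))

        # collect every file
        for name in files:
            # print(len(os.path.join(root, name)))
            files_path.append(os.path.join(root, name))
    print('Number of file paths collected:', len(files_path))

    return files_path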
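As an alternative to os.walk(), the standard pathlib module can do the same recursive collection and filter by extension in one step. This is a sketch rather than part of the original post; get_files_by_suffix and the temporary folder layout are hypothetical.

```python
import tempfile
from pathlib import Path

def get_files_by_suffix(folder, suffix='.txt'):
    """Return sorted paths (as strings) of files under folder whose name ends with suffix."""
    return sorted(str(p) for p in Path(folder).rglob('*' + suffix) if p.is_file())

# Hypothetical folder layout built in a temporary directory.
root = Path(tempfile.mkdtemp())
(root / 'sub').mkdir()
(root / 'a.txt').write_text('a', encoding='utf-8')
(root / 'sub' / 'b.txt').write_text('b', encoding='utf-8')
(root / 'sub' / 'c.csv').write_text('c', encoding='utf-8')

found = get_files_by_suffix(root)
print([Path(p).name for p in found])
```

rglob descends into subfolders just as os.walk() does; Path(folder).glob('*.txt') would restrict the search to the top level, like os.listdir().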

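2.2 Creating folders with os.makedirs()

When writing files in bulk, we often need to put different files into different folders. The code below was copied from elsewhere for creating those folders.

def mkdir(path):
    '''
    Create the given directory, e.g. one folder per dataset
    ---
    path: path of the folder to create
    '''
    path = path.strip()  # strip leading and trailing whitespace
    path = path.rstrip("\\")  # strip trailing backslashes
    # check whether the path exists: True if it does, False otherwise
    isExists = os.path.exists(path)
    if not isExists:
        # the directory does not exist, so create it (including missing parents)
        os.makedirs(path)
        print(path + ' created')
        return True
    else:
        # the directory already exists, so do nothing and say so
        print(path + ' already exists')
        return False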
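Since Python 3.2, os.makedirs() accepts exist_ok=True, which makes the explicit existence check optional. The following sketch keeps the same True/False return convention as mkdir() above; ensure_dir and the temporary paths are hypothetical names for illustration.

```python
import os
import tempfile

def ensure_dir(path):
    """Create path (and any missing parents); return True if it was newly created."""
    existed = os.path.isdir(path)
    # exist_ok=True means no error is raised if the directory is already there.
    os.makedirs(path, exist_ok=True)
    return not existed

# Hypothetical nested target inside a temporary directory.
base = tempfile.mkdtemp()
target = os.path.join(base, 'dataset', 'raw')
print(ensure_dir(target))  # prints True: the folder was created
print(ensure_dir(target))  # prints False: it already existed
```

Note that exist_ok=True still raises FileExistsError if the path exists but is a regular file rather than a directory.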

References

1. Creating directories in Python: https://www.cnblogs.com/monsteryang/p/6574550.html
2. [Python module study] Detecting byte-string encodings with the chardet module: https://blog.csdn.net/jy692405180/article/details/52496599
3. What Python 3's urlretrieve() does and how to use it (introductory): https://blog.csdn.net/u012424313/article/details/82222188
