Python from Beginner to Advanced (14): Reading and Writing Data Files

꧁༺℘₨风、凌๓༻꧂

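1. Reading and Writing Data Files

How Python reads and writes files

Reading and writing files is a common I/O operation. Python wraps the operating system's low-level interfaces and provides file read/write methods directly, so no third-party library is needed.

The steps for reading or writing a file:
1. Get the path of a file on disk.
2. Open the file to obtain a file object (also called a file handle).
3. Read from or write to the file through the file object.
4. Close the file when finished.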

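Opening a file:

# Typical form
f = open(file, mode, encoding='utf-8')

The three main parameters are the file path, the open mode, and the text encoding.
The open modes are described in the table below: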

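Mode	Description
r	Open read-only with the file pointer at the start of the file; raises an error if the file does not exist
w	Open write-only; truncates the file if it exists, creates it if it does not
a	Open for appending with the file pointer at the end of the file; creates the file if it does not exist
r+	Like r, but also writable
w+	Like w, but also readable
a+	Like a, but also readable
b	Binary mode (the default is t, text); combine it with the modes above, e.g. rb, wb, ab, ab+. In Python 3, binary mode reads and writes bytes on every platform and does not accept an encoding argument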
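
Differences between the three read/write modes:

r+ overwrites existing characters starting at the current file pointer position (initially the start of the file);
w+ truncates the file as soon as it is opened, which suits rewriting from scratch;
a+ always writes at the end of the file, which suits appending.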
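
The difference between r+, a+ and w+ is easiest to see by running each mode against the same small file. The following is a minimal sketch, assuming a scratch file named modes.txt (a hypothetical name) inside a temporary directory so that no real file is touched.

```python
import os
import tempfile

# Work inside a temporary directory; "modes.txt" is a hypothetical scratch file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes.txt")

    with open(path, "w", encoding="utf-8") as f:
        f.write("abcdef")

    # r+: the pointer starts at position 0, so the write overwrites in place.
    with open(path, "r+", encoding="utf-8") as f:
        f.write("XY")
    with open(path, "r", encoding="utf-8") as f:
        after_r_plus = f.read()

    # a+: the write lands at the end of the file.
    with open(path, "a+", encoding="utf-8") as f:
        f.write("!")
        f.seek(0)  # move back to the start before reading
        after_a_plus = f.read()

    # w+: the file is emptied at the moment it is opened.
    with open(path, "w+", encoding="utf-8") as f:
        after_w_plus = f.read()

print(after_r_plus)        # XYcdef
print(after_a_plus)        # XYcdef!
print(repr(after_w_plus))  # ''
```

Note that r+ replaces characters rather than inserting them: the file keeps its original length unless the write runs past the end.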

Reading a file:

file = '1.txt'
file_obj = open(file, 'r', encoding='utf-8')
content = file_obj.read()
print(content)
file_obj.close()

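This opens the file read-only, reads its contents, and closes it.
With the with statement, the explicit close call can be omitted, because the file is closed automatically when the block ends:

file = '1.txt'
with open(file, 'r', encoding='utf-8') as file_obj:
    content = file_obj.read()
print(content)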
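
One reason to prefer the with form is that the file is closed even when the code inside the block fails. The sketch below illustrates this with a simulated error; the file name and the ValueError are made up for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "1.txt")  # hypothetical scratch file
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello")

    try:
        with open(path, "r", encoding="utf-8") as file_obj:
            first = file_obj.read()
            # Simulate a failure while the file is still open.
            raise ValueError("simulated failure while processing")
    except ValueError:
        pass

    # The with block closed the file on the way out, despite the exception.
    closed_after_error = file_obj.closed

print(first)               # hello
print(closed_after_error)  # True
```

With a bare open() call, the same guarantee needs an explicit try/finally around close().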

Reading line by line:

file = '1.txt'
with open(file, 'r', encoding='utf-8') as file_obj:
    content = file_obj.readline()  # read one line
    print(content)
    for line in file_obj.readlines():  # read the remaining lines
        print(line)
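
readlines() loads every remaining line into a list at once. For large files, a common alternative is to iterate over the file object itself, which yields one line at a time. A minimal sketch, assuming a three-line scratch file created only for the demonstration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lines.txt")  # hypothetical scratch file
    with open(path, "w", encoding="utf-8") as f:
        f.write("first\nsecond\nthird\n")

    stripped = []
    with open(path, "r", encoding="utf-8") as file_obj:
        for line in file_obj:  # yields one line at a time, newline included
            stripped.append(line.rstrip("\n"))

print(stripped)  # ['first', 'second', 'third']
```

Each line still carries its trailing newline, which is why print(line) in the earlier snippet shows a blank line between entries; rstrip("\n") removes it.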

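Writing a file

Writing works the same way as reading; the only difference is the mode passed to open(): 'w' or 'w+' to write a text file, 'wb' to write a binary file.
Python provides two write methods: write() and writelines().

f1 = open('1.txt', 'w')
f1.write('123')
f1.close()
------------------
f1 = open('1.txt', 'w')
f1.writelines(["1\n", "2\n", "3\n"])
f1.close()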
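
Two details of these methods are easy to miss: write() returns the number of characters written, and writelines() does not add line separators by itself. The following sketch shows both, again using a hypothetical scratch file in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.txt")  # hypothetical scratch file
    with open(path, "w", encoding="utf-8") as f:
        written = f.write("123\n")                 # returns the number of characters written
        f.writelines(["a", "b", "c"])              # no separators are added between items
        f.writelines(f"{n}\n" for n in (1, 2, 3))  # any iterable of strings works

    with open(path, "r", encoding="utf-8") as f:
        text = f.read()

print(written)     # 4
print(repr(text))  # '123\nabc1\n2\n3\n'
```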

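Creating directories
Use os to create directories recursively; with exist_ok=True, an existing directory is left untouched and no error is raised

import os

# Method excerpt: the enclosing class is not shown in the original.
def testCrdir(self, name):
    # Missing parent directories are created as well.
    os.makedirs(name, exist_ok=True)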
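
To confirm that exist_ok=True neither fails nor deletes anything, the sketch below calls os.makedirs twice on the same nested path and checks that a file placed there survives. The directory names a/b/c and the marker file are made up for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "a", "b", "c")  # hypothetical nested path
    os.makedirs(nested, exist_ok=True)         # creates a, a/b and a/b/c

    marker = os.path.join(nested, "keep.txt")
    with open(marker, "w", encoding="utf-8") as f:
        f.write("still here")

    os.makedirs(nested, exist_ok=True)         # second call: no error, nothing removed
    survived = os.path.exists(marker)

    try:
        os.makedirs(nested)                    # exist_ok defaults to False
        raised = False
    except FileExistsError:
        raised = True

print(survived)  # True
print(raised)    # True
```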

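Copying files and directories
Copy files and a directory, with the following requirements:

Use shutil to copy "copy.py" to "/tmp/copy.py"
Copy "copy.py" to "/tmp/copy2.py", preserving metadata
Recursively copy the directory "./" to "/tmp/file_test/", overwriting files that already exist there

import shutil

# Copy the file contents and permission bits
shutil.copy(
    "copy.py",
    "/tmp/copy.py"
)

# Copy the file and preserve metadata such as timestamps
shutil.copy2(
    "copy.py",
    "/tmp/copy2.py"
)

# Recursively copy the directory (dirs_exist_ok requires Python 3.8+)
shutil.copytree(
    "./",
    "/tmp/file_test/",
    dirs_exist_ok=True
)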
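
The practical difference between copy and copy2 shows up in the modification time of the copy. The sketch below back-dates a source file and compares the two results, then runs copytree twice to show the effect of dirs_exist_ok=True. All paths live in a temporary directory and are made up for the demonstration; Python 3.8 or later is assumed.

```python
import os
import shutil
import tempfile

OLD_TIME = 1_000_000_000  # arbitrary timestamp in the past (seconds since the epoch)

with tempfile.TemporaryDirectory() as tmp:
    src_dir = os.path.join(tmp, "src")
    os.makedirs(src_dir)
    src = os.path.join(src_dir, "copy.py")
    with open(src, "w", encoding="utf-8") as f:
        f.write("print('hi')\n")
    # Back-date the source so a preserved timestamp is easy to recognise.
    os.utime(src, (OLD_TIME, OLD_TIME))

    plain = shutil.copy(src, os.path.join(tmp, "copy.py"))       # returns the destination path
    with_meta = shutil.copy2(src, os.path.join(tmp, "copy2.py"))

    plain_keeps_mtime = int(os.path.getmtime(plain)) == OLD_TIME
    meta_keeps_mtime = int(os.path.getmtime(with_meta)) == OLD_TIME

    dst_dir = os.path.join(tmp, "file_test")
    shutil.copytree(src_dir, dst_dir)
    # Without dirs_exist_ok=True this second call would raise FileExistsError.
    shutil.copytree(src_dir, dst_dir, dirs_exist_ok=True)
    copied = sorted(os.listdir(dst_dir))

print(plain_keeps_mtime)  # False
print(meta_keeps_mtime)   # True
print(copied)             # ['copy.py']
```

With dirs_exist_ok=True the trees are merged: files with the same name are overwritten, while files that exist only in the destination are left alone.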

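Traversing files in Python

List the entries of a directory in a stable, sorted order

import os

# Method excerpts: the enclosing class is not shown in the original.
def testFileForeach(self, dir):
    # os.listdir returns entries in arbitrary order, so sort for a stable result
    entries = os.listdir(dir)
    entries.sort()
    return entries

def testff(self, dir):
    entries = self.testFileForeach(dir)
    for entry in entries:
        print(entry)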
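
os.listdir returns files and subdirectories alike. When only regular files are wanted, os.scandir is one option, since each entry can report its type. The helper name list_files_sorted below is hypothetical, and the directory contents are created only for the demonstration.

```python
import os
import tempfile


def list_files_sorted(directory):
    """Return the names of the regular files in `directory`, sorted by name."""
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries if entry.is_file())


with tempfile.TemporaryDirectory() as tmp:
    # Create two files out of order plus one subdirectory.
    for name in ("b.txt", "a.txt"):
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write(name)
    os.makedirs(os.path.join(tmp, "subdir"))

    names = list_files_sorted(tmp)

print(names)  # ['a.txt', 'b.txt']
```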

Walk every subdirectory of a directory and return a list of the absolute paths of all 'config.json' files

import os

# Variant 1: check the root directory, then every subdirectory reported by os.walk
def retrieve_file_paths(self, dir_name):
    file_paths = []
    abs_dir_name = os.path.abspath(dir_name)
    cfg_file = os.path.join(abs_dir_name, 'config.json')
    if os.path.exists(cfg_file):
        file_paths.append(cfg_file)

    for base, dirs, files in os.walk(abs_dir_name):
        for dir in dirs:
            cfg_file = os.path.join(base, dir, 'config.json')
            if os.path.exists(cfg_file):
                file_paths.append(cfg_file)
    print(file_paths)
    return file_paths

# Variant 2: match the file names that os.walk reports for each directory
def retrieve_file_paths(self, dir_name):
    file_paths = []
    abs_dir_name = os.path.abspath(dir_name)
    for base, dirs, files in os.walk(abs_dir_name):
        for file_name in files:
            if file_name == 'config.json':
                file_path = os.path.join(base, file_name)
                file_paths.append(file_path)
    print(file_paths)
    return file_paths

# Variant 3: test for config.json in each directory visited by os.walk (the root included)
def retrieve_file_paths(self, dir_name):
    file_paths = []
    abs_dir_name = os.path.abspath(dir_name)
    for base, dirs, files in os.walk(abs_dir_name):
        cfg_file = os.path.join(base, 'config.json')
        if os.path.exists(cfg_file):
            file_paths.append(cfg_file)
    print(file_paths)
    return file_paths
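
The same search can also be written with pathlib, whose rglob method walks the tree recursively. This is an alternative sketch and not one of the variants above; the function name find_config_files and the directory layout are made up for the demonstration.

```python
import os
import tempfile
from pathlib import Path


def find_config_files(dir_name, file_name="config.json"):
    """Return the absolute paths of every `file_name` under `dir_name`, sorted."""
    root = Path(dir_name).resolve()
    # rglob matches in the root directory as well as in every subdirectory.
    return sorted(str(p) for p in root.rglob(file_name) if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: three config.json files and one unrelated file.
    for rel in ("config.json", "a/config.json", "a/b/config.json", "c/other.json"):
        target = Path(tmp, rel)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("{}", encoding="utf-8")

    found = find_config_files(tmp)
    root = Path(tmp).resolve()
    relative = sorted(Path(p).relative_to(root).as_posix() for p in found)

print(relative)  # ['a/b/config.json', 'a/config.json', 'config.json']
```

The is_file() check skips a directory that happens to be named config.json, which the os.path.exists variants would include.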

File statistics
Count the lines, the non-empty lines, and the whitespace-separated tokens in a file

# -*- coding: UTF-8 -*-
import json

def count_file(file):
    line_count = 0
    non_empty_line_count = 0
    token_count = 0

    with open(file, 'r', encoding='utf-8') as f:
        while True:
            # Read one line at a time
            line = f.readline()
            if not line:
                break

            line_count += 1
            line_len = len(line)
            line_token_count = 0

            # TODO(You): implement the token count for a single line here

            token_count += line_token_count
            if line_token_count > 0:
                non_empty_line_count += 1

    return {
        'file': file,
        'line_count': line_count,
        'line_token_count': token_count,
        'non_empty_line_count': non_empty_line_count
    }

if __name__ == '__main__':
    ret = count_file('count_file.py')
    print('Lines:', ret['line_count'])
    print('Non-empty lines:', ret['non_empty_line_count'])
    print('Tokens:', ret['line_token_count'])
    with open('/tmp/count.json', 'w') as f:
        f.write(json.dumps(ret, indent=2, ensure_ascii=False))
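
Choose the option below that correctly implements this.
# A.
            blank = True
            for char in line:
                if char in [' ', '\t', '\b', '\n']:
                    if not blank:
                        line_token_count += 1
                    blank = True
                else:
                    blank = False
# B.
            blank = False
            for char in line:
                if char in [' ', '\t', '\b']:
                    blank = True
                else:
                    if blank:
                        line_token_count += 1
                    blank = False
# C.
            blank = True
            i = 0
            while i < line_len:
                char = line[i]
                if char in [' ', '\t', '\b', '\n']:
                    if not blank:
                        line_token_count += 1
                    blank = True
                else:
                    blank = False
                i += 1
# D.
            blank = True
            i = 0
            while i < line_len:
                char = line[i]
                if char in [' ', '\t', '\b']:
                    blank = True
                else:
                    if blank:
                        line_token_count += 1
                    blank = False
                i += 1
# Answer: A and C. They are the same logic written as a for loop and a while loop: a token is counted each time a run of non-blank characters ends, which is correct for every line that ends with '\n' (a final line with no trailing newline would lose its last token). B starts with blank=False, so a token at the very start of a line is never counted ("a b\n" yields 1). D does not treat '\n' as blank, so an empty line "\n" is counted as one token.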
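
Outside of the exercise, the per-line count is usually written with str.split(), which splits on any run of whitespace and ignores leading and trailing blanks. The sketch below is an alternative to the character-by-character options, not the exercise's intended solution; the function name count_file_with_split and the sample text are made up for the demonstration.

```python
import os
import tempfile


def count_file_with_split(file):
    """Count lines, non-empty lines and whitespace-separated tokens in `file`."""
    line_count = 0
    non_empty_line_count = 0
    token_count = 0

    with open(file, "r", encoding="utf-8") as f:
        for line in f:
            line_count += 1
            # split() with no arguments splits on any run of whitespace.
            line_token_count = len(line.split())
            token_count += line_token_count
            if line_token_count > 0:
                non_empty_line_count += 1

    return {
        "line_count": line_count,
        "non_empty_line_count": non_empty_line_count,
        "line_token_count": token_count,
    }


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical sample file
    with open(path, "w", encoding="utf-8") as f:
        # Four lines: two tokens, none, two tokens, one token (no trailing newline).
        f.write("a b\n\n  c\td  \nlast")
    stats = count_file_with_split(path)

print(stats)
# {'line_count': 4, 'non_empty_line_count': 3, 'line_token_count': 5}
```

Because split() does not depend on a trailing newline, the last line is counted correctly even when the file does not end with '\n'.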

Compressing a directory
Use shutil to zip a directory, showing a progress bar while compressing

# -*- coding: UTF-8 -*-
import os
import shutil
import logging
from progress.bar import IncrementalBar
logger = logging.getLogger(__name__)

def count_files_in_dir(dir):
    totalFiles = 0
    for base, dirs, files in os.walk(dir):
        totalFiles += len(files)
    return totalFiles

def zip_with_progress(dir_path, zip_file):
    bar = None
    total_files = count_files_in_dir(dir_path)

    def progress(*args, **kwargs):
        # TODO(You): show the progress bar
        pass

    # While shutil.make_archive runs, temporarily replace logger.info so that each log message drives the progress bar
    old_info = logger.info
    logger.info = lambda *args, **kwargs: progress(*args, **kwargs)
    # make_archive appends '.zip' itself, so pass the target path without its extension
    shutil.make_archive(os.path.splitext(zip_file)[0], 'zip', dir_path, logger=logger)
    logger.info = old_info

    if bar is not None:
        bar.finish()

if __name__ == '__main__':
    zip_with_progress('./', '/tmp/test_file_zip.zip')
    print()

Choose the option below that correctly implements this.
# A.
def progress(*args, **kwargs):
    if not args[0].startswith('adding'):
        return

    nonlocal bar, total_files
    print('@Start compressing: {}'.format(zip_file))
    bar = IncrementalBar('Compressing:', max=total_files)
    bar.next(1)
# B.
def progress(*args, **kwargs):
    if not args[0].startswith('adding'):
        return

    if bar is None:
        print('@Start compressing: {}'.format(zip_file))
        bar = IncrementalBar('Compressing:', max=total_files)
    bar.next(1)
# C.
def progress(*args, **kwargs):
    if not args[0].startswith('adding'):
        return

    nonlocal bar, total_files
    if bar is None:
        print('@Start compressing: {}'.format(zip_file))
        bar = IncrementalBar('Compressing:', max=total_files)
    bar.next(1)
# D.
def progress(*args, **kwargs):
    if not args[0].startswith('adding'):
        return

    nonlocal bar, total_files
    if bar is None:
        print('@Start compressing: {}'.format(zip_file))
        bar = IncrementalBar('Compressing:', max=total_files)
    bar.next(1)
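
Two things decide whether a progress callback of this shape works: the nonlocal declaration, without which assigning to bar inside progress makes it a local variable and the `if bar is None` test raises UnboundLocalError, and the `if bar is None` guard, without which a new bar is created on every message. The sketch below shows the same pattern using only the standard library, with a plain counter standing in for IncrementalBar. The function name zip_and_count, the logger name, and the directory layout are assumptions made for the demonstration.

```python
import logging
import os
import shutil
import tempfile
import zipfile


def zip_and_count(dir_path, base_name):
    """Zip `dir_path` into base_name + '.zip'; return (archive path, 'adding' message count)."""
    logger = logging.getLogger("zip_progress_sketch")  # hypothetical logger name
    added = None  # stands in for the progress bar; created lazily like `bar`

    def progress(msg, *args, **kwargs):
        nonlocal added      # rebind the variable of the enclosing function
        if not msg.startswith("adding"):
            return
        if added is None:   # initialise once, on the first 'adding' message
            added = 0
        added += 1

    old_info = logger.info
    logger.info = progress
    try:
        archive = shutil.make_archive(base_name, "zip", dir_path, logger=logger)
    finally:
        logger.info = old_info  # restore even if archiving fails
    return archive, added


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    os.makedirs(os.path.join(src, "sub"))
    for rel in ("one.txt", "two.txt", os.path.join("sub", "three.txt")):
        with open(os.path.join(src, rel), "w", encoding="utf-8") as f:
            f.write(rel)

    # Write the archive outside the directory being compressed.
    archive_path, adding_messages = zip_and_count(src, os.path.join(tmp, "out"))
    with zipfile.ZipFile(archive_path) as zf:
        members = zf.namelist()

print(os.path.basename(archive_path))  # out.zip
print(adding_messages)
```

shutil logs an "adding" message for directories as well as for files, so the number of messages can exceed the file count returned by a helper such as count_files_in_dir.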
