Learning Python VI --- Files and Directories

Latest recommended article published on 2023-05-03 18:26:47

Hungryof

This post is mainly a set of notes I took while reading python之旅. It pulls out the key points so that I can look them up quickly later.

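Reading and Writing Text Files

Reading Files

f = None
try:
    f = open('/path/to/file', 'r')    # open the file
    data = f.read()                   # read the file's contents
finally:
    if f:
        f.close()                     # make sure the file gets closed

# Why not use with? It calls close() automatically.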
with open('/path/to/file', 'r') as f:
    data = f.read()
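
To check that with really does close the file, even when the body raises, here is a minimal sketch. It works on a throwaway temporary file so it runs anywhere; the file name demo.txt and the variable names are made up for illustration.

```python
import os
import tempfile

# Use a temporary directory so the example does not depend on a real path.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'demo.txt')    # hypothetical file name

with open(path, 'w') as f:
    f.write('hello\n')

closed_after_with = f.closed                # True: leaving the block closed f

# The file is also closed when the body raises an exception.
try:
    with open(path, 'r') as g:
        raise ValueError('something went wrong while reading')
except ValueError:
    pass

closed_after_error = g.closed               # True as well

print(closed_after_with, closed_after_error)
```

This is the same guarantee the try/finally version gives, without having to write the cleanup by hand.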

Symbol	Meaning
'r'	Read mode
'w'	Write mode
'a'	Append mode
'b'	Binary mode
'+'	Read/write mode
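
The mode characters in the table can be combined. The sketch below tries a few of them on a temporary file; the file name modes.txt is just an assumption for the demo.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')   # hypothetical file name

with open(path, 'w') as f:      # 'w' creates the file, or empties it if it exists
    f.write('one\n')

with open(path, 'a') as f:      # 'a' keeps the existing content and writes at the end
    f.write('two\n')

with open(path, 'r+') as f:     # 'r+' opens an existing file for reading and writing
    f.write('ONE')              # overwrites the first three characters
    f.seek(0)                   # go back to the start before reading
    content = f.read()

with open(path, 'rb') as f:     # 'rb' reads the same file as raw bytes
    raw = f.read()

print(repr(content))
print(type(raw))
```

Note that 'r+' does not truncate the file, whereas 'w+' would empty it first.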

Read style	Method
Everything at once	read() or readlines()
By bytes, a fixed size at a time	read(size)
One line at a time	readline()
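
To make the difference between these calls concrete, here is a small sketch that runs each of them on the same three-line temporary file. The file name and contents are invented for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'sample.txt')   # hypothetical file name
with open(path, 'w') as f:
    f.write('a\nbb\nccc\n')

with open(path, 'r') as f:
    everything = f.read()           # the whole file as one string

with open(path, 'r') as f:
    all_lines = f.readlines()       # the whole file as a list of lines

with open(path, 'r') as f:
    first_four = f.read(4)          # only the first 4 characters
    rest_of_line = f.readline()     # continues from the current position
    next_line = f.readline()        # the following full line

print(repr(everything))
print(all_lines)
print(repr(first_four), repr(rest_of_line), repr(next_line))
```

All of these calls share one file position, which is why readline() after read(4) returns only the remainder of the second line.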

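Reading Everything at Once

A note on readlines(): it simply reads the whole file line by line and puts the lines in a list. Suppose data.txt contains the following tab-separated values: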

10  1   9   9
6   3   2   8
20  10  3   23
1   4   1   10
10  8   6   3
10  2   1   6

with open('data.txt', 'r') as f:
    lines = f.readlines()
    line_num = len(lines)
    print(lines)
    print(line_num)


['10\t1\t9\t9\n', '6\t3\t2\t8\n', '20\t10\t3\t23\n', '1\t4\t1\t10\n', '10\t8\t6\t3\n', '10\t2\t1\t6']
6

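Reading by Bytes

When a file is too large to load at once, it is better to use a fixed-size buffer and read the file piece by piece. (In text mode on Python 3, read(size) counts characters; open the file with 'rb' to count bytes.)

# Simple version
with open('path/to/file', 'r') as f:
    while True:
        piece = f.read(1024)        # read up to 1024 units at a time (1 KB when they are bytes)
        if not piece:
            break
        print(piece)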

# Using a generator
def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

with open('path/to/file', 'r') as f:
    for piece in read_in_chunks(f):
        print(piece)
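
The same loop can also be written with the two-argument form of the built-in iter(), which keeps calling a function until it returns a sentinel value. This is only an alternative sketch, not something from the original notes; the file name big.txt and its contents are assumptions for the demo.

```python
import os
import tempfile
from functools import partial

path = os.path.join(tempfile.mkdtemp(), 'big.txt')   # hypothetical file name
with open(path, 'w') as f:
    f.write('x' * 2500)             # 2500 characters, so three pieces are needed

chunk_sizes = []
with open(path, 'r') as f:
    # iter(callable, sentinel) calls f.read(1024) until it returns ''
    for piece in iter(partial(f.read, 1024), ''):
        chunk_sizes.append(len(piece))

print(chunk_sizes)
```

For a file opened in binary mode the sentinel would be b'' instead of ''.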

Reading Line by Line

with open('data.txt', 'r') as f:
    while True:
        line = f.readline()     # read one line at a time
        if not line:
            break
        print(line, end='')     # end='' avoids a second newline, since line already ends with one

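File Iterators

In Python, file objects are iterable, so we can use them directly in a for loop. Iteration goes line by line, so the effect is the same as calling readline() repeatedly, only more concise.

# f is itself a file iterator, and it iterates line by line
with open('data.txt', 'r') as f:
    for line in f:
        print(line, end='')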
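
As a slightly more practical use of line-by-line iteration, the sketch below parses tab-separated numbers like the ones in data.txt and sums each row. It writes its own small copy of the data to a temporary file, so the path and the three rows used here are assumptions for the demo.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.txt')   # temporary stand-in for data.txt
with open(path, 'w') as f:
    f.write('10\t1\t9\t9\n6\t3\t2\t8\n20\t10\t3\t23\n')

rows = []
with open(path, 'r') as f:
    for line in f:                          # one line per iteration
        fields = line.split()               # split on whitespace, dropping the newline
        rows.append([int(x) for x in fields])

row_sums = [sum(row) for row in rows]

print(rows)
print(row_sums)
```

Only one line is held in memory at a time, so the same loop works for files far larger than this one.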

Writing Files

# Overwrite mode
with open('/Users/ethan/data2.txt', 'w') as f:
    f.write('one\n')
    f.write('two')

# Append mode
with open('/Users/ethan/data2.txt', 'a') as f:
    f.write('three\n')
    f.write('four')
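
Note that write() never adds a newline for you: in the snippet above 'two' has no trailing '\n', so the appended 'three' ends up on the same line. The sketch below shows two common ways to write one item per line; the file name and the list of words are made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data2.txt')   # hypothetical file name
words = ['one', 'two', 'three']

with open(path, 'w') as f:
    # writelines() adds no newlines either, so append them explicitly
    f.writelines(word + '\n' for word in words)

with open(path, 'a') as f:
    print('four', file=f)       # print() appends '\n' by default

with open(path, 'r') as f:
    content = f.read()

print(repr(content))
```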

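Reading and Writing Binary Files

For binary files, add an extra 'b' to the mode:

with open('test.png', 'rb') as f:
    image_data = f.read()    # image_data is a byte string, not a text string

Note that reading binary data returns a byte string, not a text string. We often want to encode it, for example with base64, like this: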

import base64

with open('test.png', 'rb') as f:
    image_data = f.read()
    base64_data = base64.b64encode(image_data)    # encode with base64
    print(base64_data)
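
To show that the encoding is reversible, here is a small round trip. Instead of a real test.png it uses the 8-byte signature that every PNG file starts with as stand-in data; that substitution is an assumption made so the example runs without an image on disk.

```python
import base64

# The fixed 8-byte PNG signature, standing in for real image data.
image_data = b'\x89PNG\r\n\x1a\n'

encoded = base64.b64encode(image_data)   # bytes containing only ASCII characters
text = encoded.decode('ascii')           # a str, convenient for JSON, HTML and so on
decoded = base64.b64decode(text)         # back to the original bytes

print(text)
print(decoded == image_data)
```

In Python 3, b64encode() returns bytes, so the decode('ascii') step is needed whenever a text string is wanted.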

Writing Binary Files

with open('test.png', 'rb') as f:
    image_data = f.read()

with open('/Users/ethan/test2.png', 'wb') as f:
    f.write(image_data)
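
Reading the whole image with read() is fine for small files. For large ones, the fixed-size buffer idea from earlier applies to binary files too. The helper copy_in_chunks below is a hypothetical name of my own, and the file names and payload are invented for the demo.

```python
import os
import tempfile

def copy_in_chunks(src_path, dst_path, chunk_size=1024):
    """Copy a binary file piece by piece (hypothetical helper for illustration)."""
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        while True:
            chunk = src.read(chunk_size)    # at most chunk_size bytes
            if not chunk:                   # b'' means end of file
                break
            dst.write(chunk)

tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, 'source.bin')   # hypothetical file names
dst_path = os.path.join(tmp_dir, 'copy.bin')

payload = bytes(range(256)) * 10                 # 2560 bytes of sample data
with open(src_path, 'wb') as f:
    f.write(payload)

copy_in_chunks(src_path, dst_path)

with open(dst_path, 'rb') as f:
    copied = f.read()

print(len(copied), copied == payload)
```

In real code the standard library's shutil.copyfile does this job already; the loop is spelled out here only to show the pattern.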

The os Module

The os module mainly provides common file and directory operations; see the official documentation for details.

# Absolute paths
>>> import os                          # remember to import the os module
>>> os.path.abspath('hello.py')
'/Users/ethan/coding/python/hello.py'
>>> os.path.abspath('web')
'/Users/ethan/coding/python/web'
>>> os.path.abspath('.')                # absolute path of the current directory
'/Users/ethan/coding/python'

# Get the directory part of a file or folder path
>>> os.path.dirname('/Users/ethan/coding/python/hello.py')
'/Users/ethan/coding/python'
>>> os.path.dirname('/Users/ethan/coding/python/')
'/Users/ethan/coding/python'
>>> os.path.dirname('/Users/ethan/coding/python')
'/Users/ethan/coding'

# Get the file name or folder name
>>> os.path.basename('/Users/ethan/coding/python/hello.py')
'hello.py'
>>> os.path.basename('/Users/ethan/coding/python/')
''
>>> os.path.basename('/Users/ethan/coding/python')
'python'

# os.path.splitext: split the extension from the rest of the path
>>> os.path.splitext('/Users/ethan/coding/python/hello.py')
('/Users/ethan/coding/python/hello', '.py')
>>> os.path.splitext('/Users/ethan/coding/python')
('/Users/ethan/coding/python', '')
>>> os.path.splitext('/Users/ethan/coding/python/')
('/Users/ethan/coding/python/', '')

# os.path.split: split the directory from the file name
>>> os.path.split('/Users/ethan/coding/python/hello.py')
('/Users/ethan/coding/python', 'hello.py')
>>> os.path.split('/Users/ethan/coding/python/')
('/Users/ethan/coding/python', '')
>>> os.path.split('/Users/ethan/coding/python')
('/Users/ethan/coding', 'python')
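
These functions are often used together. As a small sketch, the code below takes a path apart and builds a sibling path with a different extension. The helper name change_extension and the '.bak' suffix are my own choices for illustration; the path is the one used in the examples above and does not need to exist, since these functions only manipulate strings.

```python
import os

def change_extension(path, new_ext):
    """Return path with its extension replaced by new_ext (hypothetical helper)."""
    root, _old_ext = os.path.splitext(path)
    return root + new_ext

path = '/Users/ethan/coding/python/hello.py'

directory, filename = os.path.split(path)    # directory part and last component
stem, ext = os.path.splitext(filename)       # name without extension, and extension
backup = change_extension(path, '.bak')

print(directory)
print(filename, stem, ext)
print(backup)
```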

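os.walk is the function commonly used to traverse a directory tree. It is a generator that yields a 3-tuple (dirpath, dirnames, filenames) for every directory it visits. dirpath is a string holding the path of the directory currently being visited (it is absolute only when the starting path is absolute); dirnames is a list of the names of the subdirectories in dirpath; filenames is a list of the names of the files in dirpath.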

>>> for root, dirs, files in os.walk('/Users/ethan/coding'):
...     print(root)
...     print(dirs)
...     print(files)
...
/Users/ethan/coding
['python']
[]
/Users/ethan/coding/python
['web2']
['hello.py']
/Users/ethan/coding/python/web2
[]
[]
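
A typical use of os.walk is collecting every file with a given extension. The sketch below builds a small temporary tree that loosely mirrors the one above and searches it. The helper name find_files and the extra files app.py and notes.txt are assumptions added for the demo.

```python
import os
import tempfile

def find_files(top, extension):
    """Return sorted paths of all files under top ending with extension (hypothetical helper)."""
    matches = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if name.endswith(extension):
                # filenames are bare names, so join them with the current directory
                matches.append(os.path.join(root, name))
    return sorted(matches)

# Build a throwaway directory tree to search.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'python', 'web2'))
for rel in [os.path.join('python', 'hello.py'),
            os.path.join('python', 'web2', 'app.py'),
            os.path.join('python', 'notes.txt')]:
    with open(os.path.join(top, rel), 'w') as f:
        f.write('')

found = find_files(top, '.py')
relative = [os.path.relpath(p, top) for p in found]

print(relative)
```

Because files only holds bare names, os.path.join(root, name) is needed to get a usable path for each match.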